An intuitive way to search for images is to use queries composed of an example image and a complementary text. While the former provides rich and implicit context for the search, the latter explicitly calls for new traits or specifies how some elements of the example image should be changed to retrieve the desired target image. This is the problem of image search with free-form text modifiers, which consists in ranking a collection of images by relevance with respect to a bi-modal query.