We consider image classification at large scale, i.e., when millions of images are involved. First, we study image classification accuracy as a function of the signature dimensionality and the training set size. We show experimentally that the larger the training set, the greater the benefit of high-dimensional signatures. Second, we explore data compression on very large signatures (on the order of 10^5 dimensions). We show how the gain in storage can be traded against a loss in accuracy and/or an increase in CPU cost. We experiment with two lossy compression strategies: a dimensionality reduction technique known as the hash kernel and an encoding technique based on product quantizers. We report results on very large databases showing that we can reduce the storage of our signatures by a factor of 64 to 128 with little loss in accuracy.
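To make the two compression strategies concrete, the following is a minimal NumPy sketch and not the implementation evaluated in this work. The function names (`hash_kernel`, `train_pq`, `pq_encode`, `pq_decode`) and all parameter values are hypothetical. Two simplifying assumptions are made: seeded random bucket and sign tables stand in for real hash functions, and the product-quantizer codebooks are learned with a plain Lloyd k-means on each sub-vector.

```python
import numpy as np


def hash_kernel(X, out_dim, seed=0):
    """Signed feature hashing: map (N, D) signatures to (N, out_dim).

    Assumption: seeded random tables stand in for the hash functions that
    pick a bucket and a sign for each input dimension.
    """
    rng = np.random.default_rng(seed)
    D = X.shape[1]
    bucket = rng.integers(0, out_dim, size=D)   # target bucket per input dim
    sign = rng.choice([-1.0, 1.0], size=D)      # random sign per input dim
    out = np.zeros((out_dim, X.shape[0]))
    # Unbuffered add, so input dims that collide in a bucket are accumulated.
    np.add.at(out, bucket, (X * sign).T)
    return out.T


def train_pq(X, n_sub, n_centroids, n_iter=10, seed=0):
    """Learn one k-means codebook per sub-vector (plain Lloyd iterations)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    assert D % n_sub == 0, "D must be divisible by the number of sub-vectors"
    assert n_centroids <= 256, "codes are stored as uint8 in this sketch"
    sub = X.reshape(N, n_sub, D // n_sub)
    codebooks = []
    for m in range(n_sub):
        S = sub[:, m, :]
        # Initialise centroids with distinct training points.
        C = S[rng.choice(N, n_centroids, replace=False)].copy()
        for _ in range(n_iter):
            d2 = ((S[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            assign = d2.argmin(1)
            for k in range(n_centroids):
                pts = S[assign == k]
                if len(pts):                    # keep old centroid if empty
                    C[k] = pts.mean(0)
        codebooks.append(C)
    return np.stack(codebooks)                  # (n_sub, n_centroids, D // n_sub)


def pq_encode(X, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    n_sub, _, d = codebooks.shape
    sub = X.reshape(X.shape[0], n_sub, d)
    d2 = ((sub[:, :, None, :] - codebooks[None]) ** 2).sum(-1)
    return d2.argmin(-1).astype(np.uint8)       # (N, n_sub), one byte each


def pq_decode(codes, codebooks):
    """Approximate reconstruction: concatenate the selected centroids."""
    n_sub = codebooks.shape[0]
    parts = codebooks[np.arange(n_sub), codes]  # (N, n_sub, d)
    return parts.reshape(codes.shape[0], -1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 64))          # toy stand-in for signatures
    hashed = hash_kernel(X, out_dim=16)
    codebooks = train_pq(X, n_sub=8, n_centroids=16)
    codes = pq_encode(X, codebooks)
    ratio = X.astype(np.float32).nbytes / codes.nbytes
    print(hashed.shape, codes.shape, ratio)
```

In this sketch the hash kernel reduces the number of dimensions, whereas the product quantizer keeps the dimensionality and stores one small integer per sub-vector. Decoding the codes before they can be used is one source of the added CPU cost mentioned above. The codebook size and sub-vector length are illustrative choices only.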