Vector quantization (VQ) is an effective way to save bandwidth and storage in speech and image coding. Traditional VQ methods can be divided into seven main types according to their codebook generation procedures: tree-structured VQ, direct sum VQ, Cartesian product VQ, lattice VQ, classified VQ, feedback VQ, and fuzzy VQ. Over the past decade, quantization-based approximate nearest neighbor (ANN) search has developed rapidly, and many methods have emerged for searching large-scale image datasets using binary codes held in memory. Their most notable characteristic is the use of multiple codebooks, which has given rise to two kinds of codebook: the linear combination codebook and the joint codebook. This may be a trend for the future. However, these methods only trade off speed, accuracy, and memory consumption in ANN search, and sometimes one of the three suffers. Finding a vector quantization method that balances speed and accuracy while consuming a moderate amount of memory therefore remains an open problem.
The boom of Internet and multimedia technology has led to an explosion of multimedia information, especially images, creating an urgent need to quickly retrieve similar images and images of interest from huge image collections. Content-based high-dimensional indexing holds the key to achieving this goal by efficiently organizing the content of images and storing it in computer memory. Over the past decades, many important developments in high-dimensional image indexing have occurred to cope with the 'curse of dimensionality'. High-dimensional indexing mechanisms can be divided into three main categories: tree-based indexes, hashing-based indexes, and visual-words-based inverted indexes. In this paper we review the technologies in each of these three categories and make several recommendations for future research.