Hi! I am Xiao Luo, a PhD student in Computer Science and Engineering at The Ohio State University. My research focuses on building efficient, accurate, and scalable vector database systems for large-scale similarity search.
More specifically, I am interested in the following research directions:
Efficient Similarity Evaluation
Designing high-performance distance evaluation primitives by leveraging modern CPU vectorization techniques (e.g., AVX2, AVX-512), with a focus on accelerating small-table lookups and quantized distance computations such as Product Quantization (PQ) and Scalar Quantization (SQ).
Efficient Approximate Nearest Neighbor Search
Developing approximate nearest neighbor search systems that tightly integrate graph-based indexing structures with quantization techniques, enabling fast and accurate retrieval.
Generalization and Robustness
Building robust and scalable vector database systems that generalize well across diverse datasets and application scenarios, ensuring stable performance without extensive per-dataset tuning.
Education
Project
Accelerating Quantized Graph Approximate Nearest Neighbor Search
(Under Review at VLDB 2026)
OQGLib
- Investigated the use of advanced SIMD instructions (e.g., AVX2, AVX-512) to accelerate quantized approximate nearest neighbor (ANN) search on graph-based indices.
- Proposed a novel SIMD-based quantization scan technique that supports high-bit quantization per subspace and high-precision distance approximation, maximizing distance evaluation accuracy while preserving SIMD efficiency.
- Refined graph-based ANN indices with quantization-aware designs, achieving both high accuracy and high throughput.
- Designed a new ANN index that combines SIMD-accelerated quantization, memory prefetching and management, and graph degree augmentation, achieving up to 6× speedup (4× on average) over recent state-of-the-art systems (e.g., VSAG, SymphonyQG, and Glass-NSG) across 30+ datasets, including SIFT, GIST, GloVe, and Tiny5M.
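
As a rough illustration of the table-lookup distance evaluation that quantized search builds on, here is a minimal scalar sketch of Product Quantization with per-query lookup tables, in plain Python. It is not the OQGLib implementation: the function names (`build_lookup_tables`, `adc_distance`, `encode`) and the toy codebooks are hypothetical, and a real system would vectorize the lookups with SIMD instructions instead of Python lists.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(vector, codebooks):
    """Quantize a vector: for each subspace, store the index of the nearest centroid.

    codebooks[m] is the list of centroids for subspace m (hypothetical layout).
    """
    sub_dim = len(codebooks[0][0])
    code = []
    for m, centroids in enumerate(codebooks):
        v_sub = vector[m * sub_dim:(m + 1) * sub_dim]
        nearest = min(range(len(centroids)),
                      key=lambda k: squared_l2(v_sub, centroids[k]))
        code.append(nearest)
    return code


def build_lookup_tables(query, codebooks):
    """Precompute, per subspace, the distance from the query slice to every centroid."""
    sub_dim = len(codebooks[0][0])
    tables = []
    for m, centroids in enumerate(codebooks):
        q_sub = query[m * sub_dim:(m + 1) * sub_dim]
        tables.append([squared_l2(q_sub, c) for c in centroids])
    return tables


def adc_distance(code, tables):
    """Approximate query-to-vector distance as a sum of one table lookup per subspace."""
    return sum(tables[m][code[m]] for m in range(len(code)))


# Toy example: 4-dimensional vectors, 2 subspaces, 2 centroids per subspace.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [2.0, 2.0]],
]
code = encode([1.0, 1.0, 0.0, 0.0], codebooks)
tables = build_lookup_tables([0.0, 0.0, 2.0, 2.0], codebooks)
approx = adc_distance(code, tables)
```

The tables are built once per query, so each candidate vector costs only one lookup and one addition per subspace. That inner loop is the part SIMD techniques aim to speed up.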
ANN Indices Ensemble Analysis
(Under Review at VLDB 2026)
- Studied when and how to train multiple ANN indices and ensemble their candidate results to improve robustness and retrieval accuracy.
- Analyzed the trade-offs between index size, construction cost, and search accuracy in multi-index ANN systems.
- Demonstrated that an ensemble of multiple small indices (e.g., two HNSW indices with modest construction and search parameters) can achieve accuracy comparable to a single large index, while requiring only ~30% of the construction time.
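
To make the idea of combining candidate results concrete, the sketch below merges candidate ids returned by several indices, removes duplicates, and re-ranks them by exact distance. This is one simple combination strategy assumed for illustration, not necessarily the one studied in the project; `ensemble_search` and the toy data are hypothetical, and the candidate lists stand in for the output of real indices such as HNSW.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ensemble_search(query, data, candidate_lists, k):
    """Union the candidate ids from all indices, then keep the k closest by exact distance.

    candidate_lists holds one list of vector ids per index (hypothetical input format).
    Ties are broken by the smaller id so the result is deterministic.
    """
    merged = set()
    for candidates in candidate_lists:
        merged.update(candidates)
    return heapq.nsmallest(
        k, merged, key=lambda i: (squared_l2(query, data[i]), i)
    )


# Toy example: two "indices" that each return two candidates.
data = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.5, 0.0]]
top2 = ensemble_search([0.0, 0.0], data, [[1, 2], [3, 2]], k=2)
```

Each index alone misses one of the two closest candidates here, but their union contains both, which is the intuition behind ensembling several small indices.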
Award
- Our team placed 5th in the 2023 ACM/IEEE TinyML Design Contest at ICCAD.
I handled feature engineering and neural network architecture search, and hand-implemented float16 quantization on the device to save memory.
Code
Fun Fact
I enjoy solving algorithmic problems in my spare time and writing blog posts to document my thoughts and solutions.




