Hi! I am Xiao Luo, a PhD student in Computer Science and Engineering at The Ohio State University. My research focuses on building efficient, accurate, and scalable vector database systems for large-scale similarity search.

More specifically, I am interested in the following research directions:

  • Efficient Similarity Evaluation
    Designing high-performance distance evaluation primitives by leveraging modern CPU vectorization techniques (e.g., AVX2, AVX-512), with a focus on accelerating small-table lookups and quantized distance computations such as Product Quantization (PQ) and Scalar Quantization (SQ).

  • Efficient Approximate Nearest Neighbor Search
    Developing approximate nearest neighbor search systems that tightly integrate graph-based indexing structures with quantization techniques, enabling fast and accurate retrieval.

  • Generalization and Robustness
    Building robust and scalable vector database systems that generalize well across diverse datasets and application scenarios, ensuring stable performance without extensive per-dataset tuning.

Education

PhD in Computer Science and Engineering
The Ohio State University
2024 – Present
Master of Electrical and Computer Engineering
Georgia Institute of Technology, Atlanta, USA
2022 – 2024
Bachelor of Engineering, Software Engineering
Sichuan University, Chengdu, China
2018 – 2022

Project

OQGLib

(Under Review at VLDB 2026)

  • Investigated the use of advanced SIMD instructions (e.g., AVX2, AVX-512) to accelerate quantized approximate nearest neighbor (ANN) search on graph-based indices.
  • Proposed a novel SIMD-based quantization scan technique that supports high-bit quantization per subspace and high-precision distance approximation, maximizing distance evaluation accuracy while preserving SIMD efficiency.
  • Refined graph-based ANN indices with quantization-aware designs, achieving both high accuracy and high throughput.
  • Designed a new ANN index that combines SIMD-accelerated quantization, memory prefetching and management, and graph degree augmentation, achieving up to a 6× speedup (4× on average) over recent state-of-the-art systems (e.g., VSAG, SymphonyQG, and Glass-NSG) across 30+ datasets, including SIFT, GIST, GloVe, and Tiny5M.

ANN Indices Ensemble Analysis

(Under Review at VLDB 2026)

  • Studied when and how to train multiple ANN indices and combine their candidate results to improve robustness and retrieval accuracy.
  • Analyzed the trade-offs between index size, construction cost, and search accuracy in multi-index ANN systems.
  • Demonstrated that an ensemble of multiple small indices (e.g., two HNSW indices with modest construction and search parameters) can achieve accuracy comparable to a single large index, while requiring only ~30% of the construction time.

Award

Fun Fact

I enjoy solving algorithmic problems in my spare time and writing blog posts to document my thoughts and solutions.