# HNSW

> Source: https://aiwiki.ai/wiki/hnsw
> Updated: 2026-06-21
> Categories: Algorithms, Information Retrieval, Machine Learning
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

**Hierarchical Navigable Small World** (**HNSW**) is an [approximate nearest-neighbor](/wiki/approximate_nearest_neighbor) search algorithm that indexes high-dimensional vectors in a multi-layer proximity graph and answers similarity queries in expected `O(log N)` time. It was introduced by **Yury Malkov** and **Dmitry Yashunin** in a March 2016 arXiv preprint and formally published in *IEEE Transactions on Pattern Analysis and Machine Intelligence* in 2018 [1]. The authors describe the method as "fully graph-based, without any need for additional search structures," and report that it "is able to strongly outperform previous opensource state-of-the-art vector-only approaches" [1]. HNSW is the single most widely deployed algorithm for high-dimensional vector similarity search and is the default index in most modern [vector databases](/wiki/vector_database) and [semantic search](/wiki/semantic_search) systems, including [Pinecone](/wiki/pinecone), [Weaviate](/wiki/weaviate), [Qdrant](/wiki/qdrant), [Milvus](/wiki/milvus), [Vespa](/wiki/vespa), [Elasticsearch](/wiki/elasticsearch), [OpenSearch](/wiki/opensearch), [Redis](/wiki/redis), and PostgreSQL's [pgvector](/wiki/pgvector) extension [2][3][4][5].

The algorithm builds a layered graph where each vector is a node connected to a bounded number of neighbors. Higher layers contain progressively fewer points and serve as a coarse navigation structure, while the bottom layer contains every vector and provides fine-grained refinement. A nearest-neighbor query starts at the top, performs greedy descent toward the query, and finishes with a beam search at the base layer. Construction is incremental: each insertion samples a top layer from a geometric distribution, descends through the existing graph, and links the new point to its nearest neighbors at each layer where it appears. Performance scales as `O(log N)` in the expected case for both insertion and query; recall and throughput are tuned through three main hyperparameters: `M`, `efConstruction`, and `efSearch`.

HNSW has become a central component of the modern [retrieval-augmented generation](/wiki/retrieval_augmented_generation_rag) stack, where dense [vector embeddings](/wiki/vector_embeddings) produced by language and multimodal models are indexed and queried at low latency to supply external context to large language models.

## What problem does HNSW solve?

Given a set of `N` points in a metric space and a query point `q`, the [nearest-neighbor search](/wiki/nearest_neighbor_search) problem asks for the point in the set closest to `q` under a chosen distance function. Exact methods such as [k-d trees](/wiki/kd_tree) and [ball trees](/wiki/ball_tree) work well in low dimensions but degrade rapidly as dimensionality grows. Beyond roughly twenty dimensions, distance-based pruning loses effectiveness and tree methods devolve to near-linear scans. This curse of dimensionality motivates approximate algorithms that trade a small amount of recall for orders-of-magnitude gains in speed and scalability [6][7].

Dense [embeddings](/wiki/embeddings) used in modern AI routinely live in 100 to 3,000 dimensions. Common embedding models such as [text-embedding-3](/wiki/text_embedding_3) from OpenAI, [BGE](/wiki/bge), [E5](/wiki/e5), and various [sentence-transformers](/wiki/sentence_transformers) checkpoints output 384, 768, 1,024, 1,536, or 3,072 dimensional vectors. At these dimensionalities, exact search over millions or billions of points is too slow for interactive applications.

Before HNSW, several families of approximate methods were in widespread use:

- **[Locality-sensitive hashing](/wiki/locality_sensitive_hashing)** ([LSH](/wiki/lsh)): hash functions designed so that nearby points collide with high probability. LSH offers strong theoretical guarantees but typically has lower empirical recall per byte of memory than graph-based methods.
- **[Inverted file index](/wiki/inverted_file_index)** (IVF): partition the vector space using k-means and search only the partitions closest to the query. Often combined with quantization.
- **[Product quantization](/wiki/product_quantization)** (PQ): split each vector into subvectors and quantize each independently, enabling compact codes and fast asymmetric distance computation.
- **Navigable Small World** (NSW): the predecessor of HNSW, proposed by Malkov, Ponomarenko, Logvinov, and Krylov in 2014 [8]. NSW maintains a single proximity graph that grows by inserting each new point and connecting it to its nearest neighbors among already-inserted points. NSW reached state-of-the-art accuracy on several benchmarks but suffered from polylogarithmic search complexity with a meaningful constant and from poor performance on clustered data.

HNSW addresses these limitations by introducing a hierarchical structure that explicitly separates long-range and short-range edges across distinct layers. As the paper puts it, the method produces graphs "having the links separated by their characteristic distance scales," and "starting search from the upper layer together with utilizing the scale separation boosts the performance compared to NSW and allows a logarithmic complexity scaling" [1]. On the [ANN-Benchmarks](https://ann-benchmarks.com/) leaderboards maintained by Aumueller, Bernhardsson, and Faltings, HNSW consistently sits on the Pareto frontier of recall-versus-throughput trade-offs across diverse datasets, including SIFT-1M, GIST-1M, and the deep image embeddings benchmark [9].

## What ideas is HNSW built on?

HNSW draws on three threads of prior work:

1. **Watts-Strogatz small-world networks** [10]. In their 1998 *Nature* paper, Watts and Strogatz showed that adding a small fraction of long-range random edges to a regular lattice produces a graph with high local clustering and short average path length, with random walks reaching any target in `O(log N)` expected steps. HNSW exploits this property at each layer to enable fast greedy routing from any starting point. See [small-world network](/wiki/small_world_network) and [Watts-Strogatz](/wiki/watts_strogatz).

2. **Kleinberg's navigable small worlds** [11]. Jon [Kleinberg](/wiki/kleinberg) proved in 2000 that a small-world graph is *navigable* by a decentralized greedy algorithm only if its long-range edges follow a specific distribution: in a `d`-dimensional grid, the probability of a long-range edge to a vertex at distance `r` should fall off as `r^(-d)`. HNSW's heuristic neighbor selection approximates this property in a metric space without requiring an explicit grid embedding.

3. **[Skip lists](/wiki/skip_list)** [12]. The skip list, introduced by William Pugh in 1990, is a probabilistic data structure that maintains multiple linked lists at increasing levels of coarseness. HNSW transposes this idea from one-dimensional ordered keys to general metric spaces: instead of a chain of skip pointers, each layer is a small-world proximity graph; instead of comparison-based jumps, each step follows the edge to the closest neighbor. The paper notes that this "similarity of the algorithm to the skip list structure allows straightforward balanced distributed implementation" [1].

The synthesis of these ideas, plus a layer-assignment distribution and a heuristic neighbor selection rule, is the core contribution of the HNSW paper.

## How does the HNSW algorithm work?

### Multi-layer structure

HNSW maintains a graph `G` partitioned into layers `0, 1, 2, ...`. Layer 0 contains every point in the index. Each higher layer is a strict subset of the layer below it, and the number of points in each layer decays exponentially. Edges exist only within a layer: a point present in layers 0 through `l` has a separate adjacency list at each layer.

For every point, the maximum layer is sampled at insertion time from a geometric distribution. Concretely, the insertion algorithm draws a uniform random number `u` in `(0, 1]` and computes:

`level = floor(-ln(u) * mL)`

where `mL = 1 / ln(M)` is a normalization constant. With this choice, the probability of being assigned to layer `l` decays as `1 / M^l`, so layer 1 contains roughly `1 / M` of the points, layer 2 roughly `1 / M^2`, and so on. The expected total number of layers is `O(log_M N)`, which gives the algorithm its logarithmic search complexity.

A single global entry point is maintained, pointing to a node in the topmost non-empty layer. When a freshly inserted point reaches a layer above the current entry point, the entry point is updated.

### Insertion

The insertion procedure for a new point `q` proceeds as follows:

1. **Sample top layer** `L` for `q` using the geometric formula above.
2. **Greedy phase across upper layers.** Starting from the global entry point `ep` in the current top layer, descend layer by layer down to layer `L + 1`. At each layer, perform a greedy walk: repeatedly move to the neighbor of the current node closest to `q` until no neighbor is closer. Pass the closest node down to the next layer as the new entry point.
3. **Beam search across lower layers.** From layer `min(L, top)` down to layer 0, perform an `efConstruction`-sized search. The search maintains two priority queues: a candidate set `C` and a dynamic result set `W` of the `efConstruction` closest nodes seen so far. At each step, the closest unvisited candidate is popped from `C`, its neighbors are visited, and any neighbor closer than the worst element of `W` (or while `W` is not full) is inserted into both `C` and `W`. The search terminates when no candidate in `C` is closer than the worst element of `W`.
4. **Select neighbors.** From the resulting candidate set, select up to `M` (or `Mmax0` at layer 0) neighbors. The original paper proposes two heuristics: a simple top-`M` selection, and a diversity-aware heuristic that excludes a candidate if the new edge would be redundant given an already-selected neighbor closer to that candidate. The diversity heuristic is the default in `hnswlib` and is one reason HNSW outperforms simpler proximity graphs on clustered data.
5. **Bidirectional connection and pruning.** Add edges between `q` and the selected neighbors. For each neighbor whose degree now exceeds `Mmax` (or `Mmax0` at layer 0), prune its adjacency list using the same heuristic.

The upper-layer descent costs `O(log N)` distance computations on average, while the layer-0 beam search dominates wall-clock cost and is controlled by `efConstruction`.

### Search

The k-nearest-neighbor query uses the same machinery as insertion, with two passes:

1. Starting at the global entry point in the topmost layer, perform a greedy walk down to layer 1, always moving to the neighbor closest to the query.
2. At layer 0, run a beam search with dynamic candidate-list size `efSearch`, returning the `k` closest points found.

Increasing `efSearch` produces better recall by exploring more of the layer-0 graph, at the cost of more distance computations per query. Because `efSearch` is set per query, recall and throughput can be tuned at runtime without rebuilding the index.

A range-search variant is also supported, with the beam search continuing until no candidate is closer than a user-supplied radius.

## What are the HNSW hyperparameters?

HNSW has a small number of hyperparameters that determine its memory footprint, build time, and query performance. The table below summarizes typical recommended ranges drawn from the paper, `hnswlib`, and Faiss documentation [1][2][13].

| Parameter | Meaning | Typical range | Effect |
|---|---|---|---|
| `M` | Maximum out-degree per node at layers `> 0` | 8 to 64 | Higher `M` increases recall, memory, and build time |
| `Mmax0` | Maximum out-degree at layer 0 | typically `2 * M` | Layer 0 connections dominate memory; default in `hnswlib` is `2 * M` |
| `efConstruction` | Dynamic candidate list size during insertion | 64 to 800 | Higher values improve graph quality; build time grows roughly linearly |
| `efSearch` (`ef`) | Dynamic candidate list size during query | 16 to 800 | Tunable per query; trades recall against latency |
| `mL` | Layer-assignment normalization | `1 / ln(M)` | Default rarely changed; controls layer growth rate |
| `seed` | RNG seed for layer sampling | implementation-defined | Determines reproducibility |

Most implementations default to `M = 16` and `efConstruction = 200`. For high-dimensional embedding workloads, `M = 32` or `M = 48` with `efConstruction = 400` is common. `efSearch` is tuned per workload to hit a target recall.

## Which distance metrics does HNSW support?

The HNSW algorithm is metric-agnostic; it only requires a function returning a non-negative real number for any pair of points. The standard distance functions supported by mainstream implementations are:

- **[Euclidean distance](/wiki/euclidean_distance)** (`L2`), the squared or unsquared `L2` norm of the difference vector.
- **Inner product**, used either as a similarity (negated for use as a distance) or paired with normalization to yield cosine similarity.
- **[Cosine similarity](/wiki/cosine_similarity)**, equivalent to inner product on `L2`-normalized vectors.
- **Hamming distance** for binary vectors.

The original paper notes that HNSW is also well suited to non-metric similarity measures, since neither construction nor search relies on the triangle inequality except as a heuristic. The companion library [nmslib](/wiki/nmslib) was designed for exactly this case [14]. In production vector databases the dominant distance is cosine, because most modern embedding models are trained with cosine or inner-product objectives.

## What is the complexity of HNSW?

Let `N` be the number of indexed points, `d` the dimensionality of each vector, and assume distance computations cost `O(d)`.

- **Insertion**: `O(d * efConstruction * log N)` expected. The `log N` factor arises from the average number of layers; at each layer the beam search examines roughly `efConstruction` nodes.
- **Search**: `O(d * efSearch * log N)` expected, with `efSearch` substituted for `efConstruction` at layer 0.
- **Memory**: `O(N * d)` for vectors plus `O(N * M)` graph edges. Each edge in `hnswlib` consumes 4 bytes (a 32-bit node id). For `M = 16`, this is around 192 bytes of graph per node on top of raw vector storage, often 1.0 to 1.5 times the raw vector size for typical embedding dimensionalities.
- **Index build** scales as `O(N * d * efConstruction * log N)`. Building an index over 100 million 768-dimensional vectors with `M = 32` and `efConstruction = 200` typically takes hours on a single multi-core machine.

Empirically, for a fixed recall target HNSW offers some of the best query throughput per unit memory of any in-memory ANN method, especially on `L2` and cosine metrics in moderate-to-high dimensions [9].

## Where is HNSW implemented?

HNSW has been implemented in many open-source libraries and embedded inside most commercial vector databases.

| Implementation | Language | License | Notes |
|---|---|---|---|
| [hnswlib](/wiki/hnswlib) | C++ with Python bindings | Apache 2.0 | Reference implementation by Yury Malkov; header-only library [2] |
| [nmslib](/wiki/nmslib) | C++ with Python bindings | Apache 2.0 | Broader Non-Metric Space Library that includes HNSW and many other methods [14] |
| [Faiss](/wiki/faiss) | C++ with Python bindings | MIT | [Meta](/wiki/meta) AI Research; provides `IndexHNSWFlat`, `IndexHNSWPQ`, `IndexHNSWSQ` variants combining HNSW with quantization [13] |
| [Lucene](/wiki/lucene) 9.1+ | Java | Apache 2.0 | Apache Lucene shipped HNSW in March 2022; powers vector search in Elasticsearch 8.x and OpenSearch [15] |
| [Vespa](/wiki/vespa) | C++ with Java config | Apache 2.0 | Native real-time HNSW with concurrent updates and metadata filtering [4] |
| [Qdrant](/wiki/qdrant) | Rust | Apache 2.0 | Production-grade vector DB built around HNSW with payload filtering [18] |
| [Milvus](/wiki/milvus) | C++/Go | Apache 2.0 | Default index for many workloads; also supports IVF, PQ, DiskANN |
| [Weaviate](/wiki/weaviate) | Go | BSD-3 | Native HNSW with custom tombstone-based deletion |
| [Pinecone](/wiki/pinecone) | Proprietary | closed | Managed vector database; documented as graph-based with HNSW lineage |
| [pgvector](/wiki/pgvector) | C (PostgreSQL extension) | PostgreSQL license | Added HNSW index in version 0.5.0, released August 28, 2023, alongside its earlier IVFFlat index [16] |
| [Redis](/wiki/redis) Stack / RediSearch | C | RSAL | HNSW available as a vector index type in RediSearch |
| Elasticsearch | Java | Elastic License v2 | Wraps Lucene's HNSW; introduced in 8.0, GA in 8.4 |
| OpenSearch | Java | Apache 2.0 | Wraps Lucene's HNSW and Faiss HNSW via the k-NN plugin [19] |

Most implementations follow the original paper closely, with two common deviations: lazy deletion via tombstones, and disk-friendly graph layouts.

## How does HNSW compare to DiskANN, ScaNN, and other methods?

HNSW has inspired and competed with several follow-up approaches:

- **[DiskANN](/wiki/diskann)**. Subramanya, Devvrit, Simhadri, Krishnaswamy, and Kadekodi proposed DiskANN at NeurIPS 2019 [17]. Built on the Vamana graph construction algorithm, it is engineered to keep most of the index on SSD while serving billion-scale workloads with a small RAM footprint. [Microsoft Research](/wiki/microsoft_research) released a reference implementation, and the algorithm underlies Microsoft Bing and Azure Cognitive Search.
- **[NGT](/wiki/ngt)**. Developed at [Yahoo Japan](/wiki/yahoo_japan), NGT is a tree-graph hybrid that uses an ANNG (approximate `k`-NN graph) plus a tree for entry-point selection. NGT-ONNG is competitive with HNSW on some ANN-Benchmarks workloads.
- **[ScaNN](/wiki/scann)**. Google's 2020 algorithm combines IVF-style coarse partitioning with anisotropic vector quantization. It is not graph-based but reaches strong recall-throughput points on certain benchmarks.
- **HNSW with quantization**. `IndexHNSWPQ` in Faiss compresses vectors with [product quantization](/wiki/product_quantization), reducing memory by 4 to 32 times at the cost of recall. Vector databases such as Qdrant and Weaviate also offer 1-bit or 2-bit quantization on top of HNSW, often combined with rescoring on full-precision vectors.
- **NSG and Vamana**. Alternative graph constructions that prune edges aggressively to keep degrees low while maintaining good navigability. Vamana is the construction used by DiskANN.
- **NSW**. The single-layer predecessor [8], largely superseded.

In practice, HNSW remains the default in-memory choice for dense [vector search](/wiki/vector_search) at the scale of millions to a few hundred million vectors. DiskANN tends to win at billion scale when disk residency matters, while quantization-heavy methods like ScaNN are preferred when memory is the binding constraint.

## What are the limitations of HNSW?

Despite its strong default behavior, HNSW has several practical limitations that operators must plan for:

- **Memory pressure.** Most implementations hold both vectors and the graph in RAM. Per-vector overhead beyond the raw embedding ranges from roughly 24 bytes (small `M`, layer 0 only) to several hundred bytes. For 100 million 1,536-dimensional float32 vectors, raw storage alone is around 614 GB; the graph adds tens of gigabytes more.
- **Indexing time.** While the asymptotic complexity is `O(N * log N)`, the constant is non-trivial. Bulk-loading 100 million or more vectors can take hours even on parallelized implementations.
- **Mutability.** HNSW supports inserts well. Deletions are typically implemented as soft tombstones that mark a node as removed but leave the graph intact; the deleted node is still traversed at query time. Hard deletes that reconnect surrounding nodes are expensive; full reindexing is often the cleanest option for large-scale rewrites.
- **Update churn.** Heavy overwrite workloads can degrade graph quality, since stale edges accumulate. Production systems typically schedule periodic compaction or rebuilds.
- **Filtered search.** Combining vector similarity with structured metadata predicates (timestamp ranges, tenant IDs, ACLs) is non-trivial. Pre-filtering can disconnect the graph and force expensive fallbacks; post-filtering wastes work and may miss enough candidates to satisfy the requested `k`. Vector databases use filterable HNSW variants, multi-graph partitioning, and hybrid query planners.
- **Disk residency.** Native HNSW assumes random-access reads to graph and vector data. SSD-based variants exist but generally rely on different graph constructions tuned for sequential access patterns; this is the niche DiskANN was designed to fill.
- **Worst-case behavior.** Recall is not guaranteed in the strict sense; pathological data distributions can leave a few queries with poor results. Increasing `efSearch` typically eliminates these tails at the cost of latency.
- **Patent and licensing.** The `hnswlib` reference implementation is released under the Apache 2.0 license and the algorithm has been published openly. There is no known patent encumbrance on the core method.

## How is HNSW used in AI and LLM workflows?

The rise of large language models and dense retrieval has pulled HNSW from a specialized algorithm into a core piece of production infrastructure. Several patterns recur:

- **Retrieval-augmented generation.** A typical [RAG](/wiki/retrieval_augmented_generation_rag) stack chunks documents into passages, embeds each chunk with a model such as `text-embedding-3-small`, BGE, or E5, and stores the resulting vectors in a vector database with HNSW indexing. At query time, the user prompt is embedded, the top `k` matching passages are retrieved, and they are concatenated into the language model's context window.
- **Semantic search.** Beyond RAG, HNSW indexes power [semantic search](/wiki/semantic_search) over product catalogs, documentation portals, code repositories, and chat history. Many search platforms now ship HNSW alongside classical inverted indices for hybrid lexical-plus-semantic ranking.
- **Multimodal retrieval.** Embeddings produced by CLIP-style models, image encoders, and audio encoders are also indexed with HNSW for cross-modal lookup.
- **Agentic workflows.** Frameworks like [LangChain](/wiki/langchain) and [LlamaIndex](/wiki/llamaindex) abstract over vector databases, but the underlying index is almost always HNSW, either in a managed product such as Pinecone or self-hosted options such as Qdrant, Weaviate, Milvus, or pgvector.
- **On-device and edge.** Small HNSW deployments power local-first applications such as notes apps, IDE plugins, and chat clients that embed documents locally.

## Who created HNSW?

**Yury Malkov** is the lead author of HNSW and of its predecessor NSW. After working at Mail.ru and Sense.lab on large-scale similarity search, he later joined OpenAI. He maintains `hnswlib` and contributed to `nmslib`. His body of work on graph-based ANN search includes both theoretical analysis and the reference implementations most production systems are built on.

**Dmitry Yashunin** co-authored the HNSW paper, with work focused on algorithm design and high-performance computing for similarity search. The Malkov and Yashunin paper has accumulated thousands of citations since its 2016 preprint and is one of the most-cited works in modern information retrieval.

The earlier NSW paper had additional co-authors **Alexander Ponomarenko**, **Andrey Logvinov**, and **Vladimir Krylov** [8].

## See also

- [Approximate nearest neighbor](/wiki/approximate_nearest_neighbor)
- [Vector database](/wiki/vector_database)
- [Vector search](/wiki/vector_search)
- [Semantic search](/wiki/semantic_search)
- [Locality-sensitive hashing](/wiki/locality_sensitive_hashing)
- [Inverted file index](/wiki/inverted_file_index)
- [Product quantization](/wiki/product_quantization)
- [DiskANN](/wiki/diskann)
- [ScaNN](/wiki/scann)
- [Faiss](/wiki/faiss)
- [pgvector](/wiki/pgvector)
- [Retrieval-augmented generation](/wiki/retrieval_augmented_generation_rag)

## References

[1] Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs. *IEEE Transactions on Pattern Analysis and Machine Intelligence*, 42(4), 824-836. arXiv:1603.09320 (first version 2016). https://arxiv.org/abs/1603.09320

[2] hnswlib: header-only C++/Python implementation by Yury Malkov. https://github.com/nmslib/hnswlib

[3] Pinecone documentation, vector database concepts. https://docs.pinecone.io/

[4] Vespa documentation, approximate nearest neighbor search using HNSW. https://docs.vespa.ai/en/approximate-nn-hnsw.html

[5] Weaviate documentation, ANN index types and HNSW configuration. https://weaviate.io/developers/weaviate/concepts/vector-index

[6] Beyer, K., Goldstein, J., Ramakrishnan, R., & Shaft, U. (1999). When Is Nearest Neighbor Meaningful? *International Conference on Database Theory*. https://link.springer.com/chapter/10.1007/3-540-49257-7_15

[7] Indyk, P., & Motwani, R. (1998). Approximate nearest neighbors: towards removing the curse of dimensionality. *STOC 1998*. https://dl.acm.org/doi/10.1145/276698.276876

[8] Malkov, Y., Ponomarenko, A., Logvinov, A., & Krylov, V. (2014). Approximate nearest neighbor algorithm based on navigable small world graphs. *Information Systems*, 45, 61-68. https://www.sciencedirect.com/science/article/abs/pii/S0306437913001300

[9] Aumueller, M., Bernhardsson, E., & Faltings, A. ANN-Benchmarks: a benchmarking tool for approximate nearest neighbor algorithms. https://ann-benchmarks.com/

[10] Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of small-world networks. *Nature*, 393, 440-442. https://www.nature.com/articles/30918

[11] Kleinberg, J. M. (2000). The small-world phenomenon: an algorithmic perspective. *STOC 2000*. https://dl.acm.org/doi/10.1145/335305.335325

[12] Pugh, W. (1990). Skip lists: a probabilistic alternative to balanced trees. *Communications of the ACM*, 33(6), 668-676. https://dl.acm.org/doi/10.1145/78973.78977

[13] Faiss documentation, indexing 1G vectors and HNSW index types. https://github.com/facebookresearch/faiss/wiki/Indexing-1G-vectors

[14] Boytsov, L., & Naidan, B. nmslib: Non-Metric Space Library. https://github.com/nmslib/nmslib

[15] Apache Lucene 9.1 release notes, KnnVectorQuery and HNSW. https://lucene.apache.org/core/9_1_0/changes/Changes.html

[16] pgvector 0.5.0 release notes, HNSW index support (released August 28, 2023). https://github.com/pgvector/pgvector/releases/tag/v0.5.0

[17] Subramanya, S. J., Devvrit, F., Simhadri, H. V., Krishnaswamy, R., & Kadekodi, R. (2019). DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node. *NeurIPS 2019*. https://proceedings.neurips.cc/paper/2019/hash/09853c7fb1d3f8ee67a61b6bf4a7f8e6-Abstract.html

[18] Qdrant documentation, HNSW index parameters and filtering. https://qdrant.tech/documentation/concepts/indexing/

[19] OpenSearch k-NN plugin documentation. https://opensearch.org/docs/latest/search-plugins/knn/index/