Up to 3x faster stored-vector queries in Elasticsearch

Finding documents similar to a stored vector in Elasticsearch used to require two round trips: Fetch the vector with GET, and then send it back in a k-nearest neighbor (kNN) query. Elasticsearch 9.4 collapses that flow into one request with query_vector_builder.lookup, simplifying the API and improving latency by up to 3x in a two-node Google Cloud Platform (GCP) benchmark.

Why stored-vector search used to require two requests

Previously, when you wanted to find documents similar to a stored vector, you needed to:

Call GET to fetch the vector value from Elasticsearch.
Call _search referencing that vector value in Elasticsearch:
- Serialize the vector value via JSON.

This means paying serialization and network costs twice:

Serialization and deserialization of the vector for both requests.
Network latency costs in both directions.
Potential egress costs in cloud deployments.

In Python, the pattern would be:

from elasticsearch import Elasticsearch

es = Elasticsearch(HOST)

# 1) Fetch the seed vector from Elasticsearch
seed_doc = es.get(
    index=source_index,
    id=seed_id,
    _source_includes=[vector_field],
)
query_vector = seed_doc["_source"][vector_field]

# 2) Send it back in a kNN query
resp = es.search(
    index=target_index,
    query={
        "knn": {
            "field": vector_field,
            "k": 10,
            "num_candidates": 100,
            "query_vector": query_vector,
        }
    },
)

While these two calls seem cheap, the overhead is unnecessary. Let’s make this better.

How query_vector_builder.lookup works in Elasticsearch 9.4

In Elasticsearch 9.4, we added lookup to simplify the API and eliminate unnecessary costs:

from elasticsearch import Elasticsearch

es = Elasticsearch(HOST)

resp = es.search(
    index="products",
    query={
        "knn": {
            "field": "product-vector",
            "k": 10,
            "num_candidates": 100,
            "query_vector_builder": {
                "lookup": {
                    "index": "seed-products",
                    "id": "product-123",
                    "path": "product-vector"
                }
            },
        }
    },
)

This request now grabs the dense_vector value stored in the product-vector field, in the document with ID product-123 in the seed-products index. This example is a “more like this” search, finding the nearest vectors to the one with ID product-123. You can refer to any index, effectively using lookup as a query vector store.

How much latency lookup vector search can remove

The goal is to simplify the experience and make it faster. The performance gains aren't just from eliminating the client round trip. Many Elasticsearch instances involve multiple nodes, and traffic between nodes can carry its own serialization and network costs. Elasticsearch actively biases execution toward the local node, which cuts network serialization costs on the server side, too.

To illustrate the potential performance improvements, here’s a benchmark we ran. We used a modified version of our so_vector, where instead of using the query vectors, one path did the GET and then _search pattern and the other used lookup. Running on two nodes in the same zone in GCP, the results were strong. Latency was consistently improved by almost 3x. Even when nodes are within the same data center and the same availability zone, network and serialization costs can have a real impact.

Percentile	get-then-knn (ms)	lookup-knn (ms)	Reduction	Speedup
p50	10.3796	3.14093	69.74%	3.30x
p90	25.4429	5.89807	76.82%	4.31x
p99	27.7167	8.07109	70.88%	3.43x
max (p100)	28.522	12.6497	55.65%	2.25x

This benchmark ran with 2M documents, and the latency improvement will depend on your overall search costs. Even when the speedup is smaller, lookup still removes the extra client-side request. Less code, fewer round trips.

A simpler path for stored-vector search

Sometimes small changes can have an outsized impact. While this is a simple feature, I hope it removes some unnecessary friction in your Elasticsearch usage and makes us that much more lovable.

Ready to try this out on your own? Start a free trial.

Elasticsearch has integrations for tools from LangChain, Cohere and more. Join our advanced semantic search webinar to build your next GenAI app!

Related content

Multilingual image search with Jina CLIP v2 and Elasticsearch

Integrations Vector Database

June 1, 2026

Multilingual image search with Jina CLIP v2 and Elasticsearch

Build a multilingual image search system using Jina CLIP v2 and Elasticsearch. Query your image collection in 89 languages with no translation pipeline, and use Matryoshka Representations to cut index size by 75%

By: Jeffrey Rengifo

Elasticsearch DiskBBQ: 40% faster vector scoring with native SIMD Blocks

Vector Database

June 2, 2026

Elasticsearch DiskBBQ: 40% faster vector scoring with native SIMD Blocks

A deep dive into how DiskBBQ's block layout, doc ID compression modes and native SIMD kernels combine to deliver 40% improved vector scoring throughput for DiskBBQ in 9.4.

By: Benjamin Trent

Cutting Elasticsearch DiskBBQ query quantization time by 5x

Vector Database

May 27, 2026

Cutting Elasticsearch DiskBBQ query quantization time by 5x

See how asymmetric quantization cuts DiskBBQ query quantization overhead from about 20% to 4% with little recall impact.

BT TV

By: Benjamin Trent and Thomas Veasey

How we doubled vector search throughput on Elasticsearch Serverless

Vector Database Inside Elastic

May 28, 2026

How we doubled vector search throughput on Elasticsearch Serverless

How we brought Elasticsearch's native SIMD scoring engine to serverless, and why serverless is where vector search innovation happens next.

CH LD

By: Chris Hegarty and Lorenzo Dematte

Elasticsearch Vector DiskBBQ filter search is now 3–5x faster

Vector Database

May 13, 2026

Elasticsearch Vector DiskBBQ filter search is now 3–5x faster

Learn how Elasticsearch 9.4 makes restrictive filtered DiskBBQ vector search 3–5x faster and more stable by avoiding wasted centroid and postings-list work when selectivity is high.

By: Benjamin Trent

Up to 3x faster stored-vector queries in Elasticsearch

Why stored-vector search used to require two requests

How query_vector_builder.lookup works in Elasticsearch 9.4

How much latency lookup vector search can remove

A simpler path for stored-vector search

Related content

Multilingual image search with Jina CLIP v2 and Elasticsearch

Elasticsearch DiskBBQ: 40% faster vector scoring with native SIMD Blocks

Cutting Elasticsearch DiskBBQ query quantization time by 5x

How we doubled vector search throughput on Elasticsearch Serverless

Elasticsearch Vector DiskBBQ filter search is now 3–5x faster

Ready to build state of the art search experiences?