Elasticsearch

ElasticsearchVectorStore #

Bases: BaseVectorStore

Provides functionality to interact with Elasticsearch for storing and querying document embeddings.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| index_name | str | Name of the Elasticsearch index. |
| url | str | Elasticsearch instance URL. |
| embed_model | BaseEmbedding | Embedding model used to compute vectors. |
| user | str | Elasticsearch username. |
| password | str | Elasticsearch password. |
| batch_size | int | Batch size for bulk operations. Defaults to 200. |
| ssl | bool | Whether to use SSL. Defaults to False. |
| distance_strategy | str | Distance strategy for similarity search. Currently supports "cosine", "dot_product", and "l2_norm". Defaults to "cosine". |
| text_field | str | Name of the field containing text. Defaults to "text". |
| vector_field | str | Name of the field containing vector embeddings. Defaults to "embedding". |
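To make the roles of text_field, vector_field, and distance_strategy more concrete, here is a minimal sketch of an index mapping these attributes could translate to. The helper `build_index_mapping` and its `dims` parameter are hypothetical and not part of this class; the sketch only assumes that the vectors live in an Elasticsearch `dense_vector` field, whose `similarity` option accepts the same three names. The mapping the store actually creates may differ.

```python
from typing import Any

# Strategy names listed in the attributes above; Elasticsearch `dense_vector`
# fields accept the same names for their `similarity` option.
SUPPORTED_STRATEGIES = {"cosine", "dot_product", "l2_norm"}


def build_index_mapping(
    dims: int,
    distance_strategy: str = "cosine",
    text_field: str = "text",
    vector_field: str = "embedding",
) -> dict[str, Any]:
    """Hypothetical helper: build a mapping for a text field plus an embedding field."""
    if distance_strategy not in SUPPORTED_STRATEGIES:
        raise ValueError(f"Unsupported distance strategy: {distance_strategy!r}")
    return {
        "properties": {
            text_field: {"type": "text"},
            vector_field: {
                "type": "dense_vector",
                "dims": dims,  # must match the embedding model's output size
                "index": True,
                "similarity": distance_strategy,
            },
        }
    }


# 384 is only an illustrative dimension, not a value taken from the library.
mapping = build_index_mapping(dims=384, distance_strategy="dot_product")
print(mapping["properties"]["embedding"]["similarity"])
```

Keeping the field names and the similarity as parameters mirrors how the attributes above are exposed, so one index layout can be reused with different embedding models.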

Example
from beekeeper.embeddings.huggingface import HuggingFaceEmbedding
from beekeeper.vector_stores.elasticsearch import ElasticsearchVectorStore

embedding = HuggingFaceEmbedding()
es_vector_store = ElasticsearchVectorStore(
    index_name="beekeeper-index",
    url="http://localhost:9200",
    embed_model=embedding,
)

model_post_init #

model_post_init(__context)

Initialize the Elasticsearch client after Pydantic validation.

add_documents #

add_documents(documents: list[Document], create_index_if_not_exists: bool = True) -> list[str]

Add documents to the Elasticsearch index.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| documents | list[Document] | List of documents to add to the index. | required |
| create_index_if_not_exists | bool | Whether to create the index if it doesn't exist. Defaults to True. | True |
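The batch_size attribute suggests that documents are sent to Elasticsearch in chunks rather than in a single request. The following is a minimal sketch of that chunking idea, assuming a simple fixed-size split; `iter_batches` is a hypothetical helper, not a function of this library, and the actual bulk logic may differ.

```python
from collections.abc import Iterator, Sequence
from typing import TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int = 200) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each holding at most `batch_size` entries."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]


# Placeholder strings stand in for Document objects here.
docs = [f"doc-{i}" for i in range(450)]
sizes = [len(batch) for batch in iter_batches(docs, batch_size=200)]
print(sizes)  # [200, 200, 50]
```

Smaller batches keep each bulk request light at the cost of more round trips; larger ones do the opposite.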

query_documents #

query_documents(query: str, top_k: int = 4) -> list[DocumentWithScore]

Performs a similarity search for the top-k most similar documents.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| query | str | Query text. | required |
| top_k | int | Number of top results to return. Defaults to 4. | 4 |

Returns:

| Type | Description |
| --- | --- |
| list[DocumentWithScore] | List of the most similar documents. |
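As an illustration of what a top-k similarity search computes, here is a small in-memory sketch covering the three distance strategies named in the attributes. It is not how this class queries Elasticsearch (the server does the ranking there); the function `top_k`, the `STRATEGIES` table, and the toy vectors are all assumptions made for the example.

```python
import math
from collections.abc import Callable, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; larger values mean more similar."""
    norm = math.hypot(*a) * math.hypot(*b)
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def dot_product(a: Vector, b: Vector) -> float:
    """Dot product; larger values mean more similar."""
    return sum(x * y for x, y in zip(a, b))


def l2_norm(a: Vector, b: Vector) -> float:
    """Euclidean distance; smaller values mean more similar."""
    return math.dist(a, b)


# Strategy name -> (scoring function, whether larger values rank first).
STRATEGIES: dict[str, tuple[Callable[[Vector, Vector], float], bool]] = {
    "cosine": (cosine, True),
    "dot_product": (dot_product, True),
    "l2_norm": (l2_norm, False),
}


def top_k(
    query_vec: Vector,
    doc_vecs: dict[str, Vector],
    k: int = 4,
    strategy: str = "cosine",
) -> list[tuple[str, float]]:
    """Return the k best (doc_id, value) pairs under the chosen strategy."""
    score_fn, larger_is_better = STRATEGIES[strategy]
    scored = [(doc_id, score_fn(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=larger_is_better)
    return scored[:k]


doc_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
best = top_k([1.0, 0.1], doc_vecs, k=2)
print([doc_id for doc_id, _ in best])  # ['a', 'c']
```

Note that the raw values returned here are not Elasticsearch scores; the sketch only shows the ranking order each strategy implies.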

delete_documents #

delete_documents(ids: list[str]) -> None

Delete documents from the Elasticsearch index.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| ids | list[str] | List of document IDs to delete. | required |
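Deleting by ID can be expressed as a sequence of bulk delete actions. The sketch below builds such actions in the dictionary format accepted by the `elasticsearch.helpers.bulk` helper of the official Python client. Whether delete_documents uses this mechanism internally is an assumption; `delete_actions` is a hypothetical helper, and the index name is reused from the example above.

```python
from collections.abc import Iterable, Iterator
from typing import Any


def delete_actions(index_name: str, ids: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Hypothetical helper: yield one bulk delete action per document ID."""
    for doc_id in ids:
        yield {"_op_type": "delete", "_index": index_name, "_id": doc_id}


actions = list(delete_actions("beekeeper-index", ["doc-1", "doc-2"]))
print(actions[0])

# With a live cluster and the official client installed, the actions could be
# sent with: elasticsearch.helpers.bulk(client, actions)
```

Generating the actions lazily keeps memory use flat when the ID list is large.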

get_all_documents #

get_all_documents(include_fields: list[str] | None = None) -> list[Document]

Get all documents from the vector store.