Elasticsearch

ElasticsearchVectorStore

Bases: BaseVectorStore

Stores document embeddings in an Elasticsearch index and queries them by vector similarity.

Attributes:

Name	Type	Description
`index_name`	`str`	Name of the Elasticsearch index.
`url`	`str`	Elasticsearch instance URL.
`embed_model`	`BaseEmbedding`	Embedding model used to compute vectors.
`user`	`str`	Elasticsearch username.
`password`	`str`	Elasticsearch password.
`batch_size`	`int`	Batch size for bulk operations. Defaults to `200`.
`ssl`	`bool`	Whether to use SSL. Defaults to `False`.
`distance_strategy`	`str`	Distance strategy for similarity search. Supported values are `"cosine"`, `"dot_product"`, and `"l2_norm"`. Defaults to `"cosine"`.
`text_field`	`str`	Name of the field containing text. Defaults to `text`.
`vector_field`	`str`	Name of the field containing vector embeddings. Defaults to `embedding`.
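
The three `distance_strategy` values correspond to standard vector measures. The sketch below shows their textbook definitions in plain Python. It is illustrative only: the function names are hypothetical, and it does not reproduce how Elasticsearch or this class converts a measure into a relevance score.

```python
import math


def dot_product(a, b):
    # Sum of element-wise products; larger means more similar.
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    # Dot product of the two vectors after normalizing each to unit length.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def l2_norm(a, b):
    # Euclidean distance; smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
doc = [0.0, 2.0]

print(cosine(query, doc))       # orthogonal vectors
print(dot_product(query, doc))
print(l2_norm(query, doc))
```

Note that cosine and dot product grow with similarity while Euclidean distance shrinks, so the strategies are not interchangeable without care about how results are ordered.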

Example

from beekeeper.embeddings.huggingface import HuggingFaceEmbedding
from beekeeper.vector_stores.elasticsearch import ElasticsearchVectorStore

# Embedding model used to compute vectors.
embedding = HuggingFaceEmbedding()

# Connect to a local Elasticsearch instance; all other attributes keep their defaults.
es_vector_store = ElasticsearchVectorStore(
    index_name="beekeeper-index",
    url="http://localhost:9200",
    embed_model=embedding,
)

model_post_init

model_post_init(__context)

Initialize the Elasticsearch client after Pydantic validation.

add_documents

add_documents(documents: list[Document], create_index_if_not_exists: bool = True) -> list[str]

Add documents to the Elasticsearch index.

Parameters:

Name	Type	Description	Default
`documents`	`list[Document]`	List of documents to add to the index.	required
`create_index_if_not_exists`	`bool`	Whether to create the index if it does not exist.	`True`
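
The class exposes a `batch_size` attribute for bulk operations (default `200`). How `add_documents` uses it internally is not documented here, so the following is only a sketch of the general idea of splitting a list into fixed-size batches. `iter_batches` is a hypothetical helper, not part of the library.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int = 200) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Placeholder strings stand in for Document objects.
docs = [f"doc-{i}" for i in range(450)]
sizes = [len(batch) for batch in iter_batches(docs, batch_size=200)]
print(sizes)  # [200, 200, 50]
```

With this kind of batching, 450 documents and a batch size of 200 would result in three bulk requests, the last one partially filled.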

query_documents

query_documents(query: str, top_k: int = 4) -> list[DocumentWithScore]

Perform a similarity search and return the `top_k` most similar documents.

Parameters:

Name	Type	Description	Default
`query`	`str`	Query text.	required
`top_k`	`int`	Number of results to return.	`4`

Returns:

Type	Description
`list[DocumentWithScore]`	List of the most similar documents with their similarity scores.
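
To illustrate what `top_k` means, the sketch below picks the highest-scoring items from a list of `(id, score)` pairs. It assumes that a higher score means a closer match. In practice the ranking is done by the search backend, and `top_k_by_score` is a hypothetical helper, not a library function.

```python
import heapq


def top_k_by_score(scored, top_k=4):
    """Return the `top_k` (id, score) pairs with the highest scores, best first."""
    return heapq.nlargest(top_k, scored, key=lambda pair: pair[1])


# Made-up scores for illustration only.
scored = [("a", 0.12), ("b", 0.91), ("c", 0.55), ("d", 0.78), ("e", 0.30)]
best = top_k_by_score(scored, top_k=4)
print(best)  # [('b', 0.91), ('d', 0.78), ('c', 0.55), ('e', 0.3)]
```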

delete_documents

delete_documents(ids: list[str]) -> None

Delete documents from the Elasticsearch index.

Parameters:

Name	Type	Description	Default
`ids`	`list[str]`	List of document IDs to delete.	required

get_all_documents

get_all_documents(include_fields: list[str] | None = None) -> list[Document]

Get all documents from the vector store.