Answer context similarity
AnswerContextSimilarityEvaluator #
Bases: BaseEvaluator
Measures how closely the contexts relate to the given answer. A higher value suggests that a greater proportion of the context is present in the LLM's response.
Attributes:

| Name | Type | Description |
|---|---|---|
| `embed_model` | `BaseEmbedding` | The embedding model used to compute vector representations. |
| `similarity_mode` | `SimilarityMode` | Similarity strategy to use. Supported options are defined by `SimilarityMode`. |
| `score_threshold` | `float` | Threshold determining whether a context segment "passes". Must be between 0.0 and 1.0. |
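Assuming `score_threshold` acts as a simple cutoff (an assumption; the exact semantics are defined by the library), the pass decision could be sketched as:

```python
def passes(score: float, score_threshold: float = 0.5) -> bool:
    """Hypothetical rule: a result "passes" when the similarity score
    meets or exceeds the threshold. The default value here is assumed."""
    if not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold must be between 0.0 and 1.0")
    return score >= score_threshold
```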
Example

```python
from beekeeper.core.evaluation import AnswerContextSimilarityEvaluator
from beekeeper.embedding.huggingface import HuggingFaceEmbedding

embedding = HuggingFaceEmbedding()
answer_ctx_evaluator = AnswerContextSimilarityEvaluator(embed_model=embedding)
```
evaluate #
```python
evaluate(query: str | None = None, generated_text: str | None = None, contexts: list[str] | None = None, **kwargs: Any) -> dict
```
Evaluate the given inputs and return evaluation results.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `generated_text` | `str` | LLM response based on the given context. | `None` |
| `contexts` | `list[str]` | List of contexts used to generate the LLM response. | `None` |
Example

```python
evaluation_result = answer_ctx_evaluator.evaluate(
    contexts=["context 1", "context 2"],
    generated_text="The capital of France is Paris.",
)
print(f"Score: {evaluation_result['score']}")
print(f"Passing: {evaluation_result['passing']}")
```
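To make the shape of the returned dict concrete, here is a minimal self-contained sketch of what such an evaluation might compute. A toy character-count embedding and averaged cosine similarity stand in for the real embedding model and similarity strategy; `embed`, `cosine`, and `evaluate_sketch` are hypothetical names for illustration, not beekeeper APIs:

```python
import math


def embed(text: str) -> list[float]:
    # Toy stand-in for a real embedding model: counts of each
    # lowercase letter. Illustration only.
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two vectors; 0.0 for zero vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def evaluate_sketch(
    generated_text: str, contexts: list[str], score_threshold: float = 0.5
) -> dict:
    # Average the answer-to-context similarities, then apply the
    # threshold to decide "passing". The default threshold is assumed.
    ans = embed(generated_text)
    score = sum(cosine(ans, embed(c)) for c in contexts) / len(contexts)
    return {"score": score, "passing": score >= score_threshold}
```

The real evaluator delegates the embedding step to `embed_model` and the comparison to `similarity_mode`, but the score/passing structure of the result mirrors the example above.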