watsonx.governance
Monitors#
WatsonxPromptMonitor #
Bases: PromptObservability
Provides functionality to interact with IBM watsonx.governance for monitoring prompts executed within IBM watsonx.ai LLMs.
Note
One of the following parameters is required to create a prompt monitor:
project_id or space_id, but not both.
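The rule in the note can be checked before a monitor is constructed. The following is a minimal sketch, not part of Beekeeper: `resolve_scope` is a hypothetical helper that accepts exactly one of the two IDs and reports which one was given. Whether the library itself raises an error for an invalid combination is not stated here.

```python
from typing import Optional, Tuple


def resolve_scope(
    project_id: Optional[str] = None, space_id: Optional[str] = None
) -> Tuple[str, str]:
    """Hypothetical helper: return ("project", id) or ("space", id).

    Raises ValueError unless exactly one of the two IDs is provided.
    """
    # bool(a) == bool(b) is true when both are set or both are missing.
    if bool(project_id) == bool(space_id):
        raise ValueError("Provide either project_id or space_id, but not both.")
    if project_id:
        return ("project", project_id)
    return ("space", space_id)


print(resolve_scope(space_id="SPACE_ID"))
```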
Attributes:
| Name | Type | Description |
|---|---|---|
| api_key | str | The API key for IBM watsonx.governance. |
| space_id | str | The space ID in watsonx.governance. |
| project_id | str | The project ID in watsonx.governance. |
| region | Region | The region where watsonx.governance is hosted when using IBM Cloud. Defaults to |
| cpd_creds | CloudPakforDataCredentials | The Cloud Pak for Data environment credentials. |
| subscription_id | str | The subscription ID associated with the records being logged. |
| service_instance_id | str | The service instance ID. |
Example
from beekeeper.observability.watsonx.supporting_classes.enums import Region
from beekeeper.observability.watsonx import (
WatsonxPromptMonitor,
CloudPakforDataCredentials,
)
# watsonx.governance (IBM Cloud)
wxgov_client = WatsonxPromptMonitor(
api_key="API_KEY", space_id="SPACE_ID", region=Region.US_SOUTH
)
# watsonx.governance (CP4D)
cpd_creds = CloudPakforDataCredentials(
url="CPD_URL",
username="USERNAME",
password="PASSWORD",
version="5.2",
instance_id="openshift",
)
wxgov_client = WatsonxPromptMonitor(space_id="SPACE_ID", cpd_creds=cpd_creds)
model_post_init #
model_post_init(__context: Any) -> None
Initialize computed fields after Pydantic validation.
create_prompt_monitor #
create_prompt_monitor(name: str, model_id: str, task_id: TaskType | str, description: str = '', model_parameters: dict | None = None, prompt_template: PromptTemplate | str | None = None, prompt_variables: list[str] | None = None, locale: str = 'en', context_fields: list[str] | None = None, question_field: str | None = None) -> dict
Creates an IBM Prompt Template Asset and sets up a monitor for it.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The name of the Prompt Template Asset. | required |
| model_id | str | The ID of the model associated with the prompt. | required |
| task_id | TaskType | The task identifier. | required |
| description | str | A description of the Prompt Template Asset. | '' |
| model_parameters | dict | A dictionary of model parameters and their respective values. | None |
| prompt_template | PromptTemplate | The prompt template. | None |
| prompt_variables | list[str] | A list of values for prompt input variables. | None |
| locale | str | Locale code for the input/output language, e.g., "en", "pt", "es". | 'en' |
| context_fields | list[str] | A list of fields that will provide context to the prompt. Applicable only for the "retrieval_augmented_generation" task type. | None |
| question_field | str | The field containing the question to be answered. Applicable only for the "retrieval_augmented_generation" task type. | None |
Example
from beekeeper.observability.watsonx.supporting_classes.enums import TaskType
wxgov_client.create_prompt_monitor(
name="IBM prompt template",
model_id="ibm/granite-3-2b-instruct",
task_id=TaskType.RETRIEVAL_AUGMENTED_GENERATION,
prompt_template="You are a helpful AI assistant that provides clear and accurate answers. {context}. Question: {input_query}.",
prompt_variables=["context", "input_query"],
context_fields=["context"],
question_field="input_query",
)
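In the example above, every placeholder in prompt_template has a matching entry in prompt_variables. The sketch below shows one way to check that correspondence before creating the monitor. It assumes the template uses `str.format`-style braces, as the example does; `template_variables` is a hypothetical helper and not a Beekeeper function.

```python
from string import Formatter


def template_variables(template: str) -> list[str]:
    """Return the placeholder names found in a str.format-style template."""
    names = []
    for _, field_name, _, _ in Formatter().parse(template):
        # field_name is None for trailing literal text with no placeholder.
        if field_name and field_name not in names:
            names.append(field_name)
    return names


template = (
    "You are a helpful AI assistant that provides clear and accurate answers. "
    "{context}. Question: {input_query}."
)
prompt_variables = ["context", "input_query"]

# Any placeholder that has no declared prompt variable ends up in `missing`.
missing = set(template_variables(template)) - set(prompt_variables)
print(sorted(missing))
```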
store_payload_records #
store_payload_records(request_records: list[dict], subscription_id: str | None = None) -> list[str]
Stores records to the payload logging system.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request_records | list[dict] | A list of records to be logged. Each record is represented as a dictionary. | required |
| subscription_id | str | The subscription ID associated with the records being logged. | None |
Example
wxgov_client.store_payload_records(
request_records=[
{
"context1": "value_context1",
"context2": "value_context2",
"input_query": "What's Beekeeper Framework?",
"generated_text": "Beekeeper is a data framework to make AI easier to work with.",
"input_token_count": 25,
"generated_token_count": 150,
}
],
subscription_id="5d62977c-a53d-4b6d-bda1-7b79b3b9d1a0",
)
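Each payload record in the example combines the prompt variable values with the model output and token counts. A minimal sketch of assembling such a record follows. `build_payload_record` is a hypothetical helper, and the key names are taken from the example above rather than from a documented schema.

```python
def build_payload_record(
    prompt_values: dict,
    generated_text: str,
    input_token_count: int,
    generated_token_count: int,
) -> dict:
    """Hypothetical helper: merge prompt variable values with the model output."""
    record = dict(prompt_values)  # copy so the caller's dict is not mutated
    record["generated_text"] = generated_text
    record["input_token_count"] = input_token_count
    record["generated_token_count"] = generated_token_count
    return record


request_records = [
    build_payload_record(
        {"context1": "value_context1", "input_query": "What's Beekeeper Framework?"},
        generated_text="Beekeeper is a data framework to make AI easier to work with.",
        input_token_count=25,
        generated_token_count=150,
    )
]
print(request_records[0]["generated_token_count"])
```

The resulting list can then be passed as request_records to store_payload_records.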
store_feedback_records #
store_feedback_records(request_records: list[dict], subscription_id: str | None = None) -> dict
Stores records to the feedback logging system.
Info
- For prompt monitors created using Beekeeper, the label field is reference_output.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request_records | list[dict] | A list of records to be logged, where each record is represented as a dictionary. | required |
| subscription_id | str | The subscription ID associated with the records being logged. | None |
Example
wxgov_client.store_feedback_records(
request_records=[
{
"context1": "value_context1",
"context2": "value_context2",
"input_query": "What's Beekeeper Framework?",
"reference_output": "Beekeeper is a data framework to make AI easier to work with.",
"generated_text": "Beekeeper is a data framework to make AI easier to work with.",
}
],
subscription_id="5d62977c-a53d-4b6d-bda1-7b79b3b9d1a0",
)
WatsonxExternalPromptMonitor #
Bases: PromptObservability
Provides functionality to interact with IBM watsonx.governance for monitoring prompts executed on external LLMs.
Note
One of the following parameters is required to create a prompt monitor:
project_id or space_id, but not both.
Attributes:
| Name | Type | Description |
|---|---|---|
| api_key | str | The API key for IBM watsonx.governance. |
| space_id | str | The space ID in watsonx.governance. |
| project_id | str | The project ID in watsonx.governance. |
| region | Region | The region where watsonx.governance is hosted when using IBM Cloud. Defaults to |
| cpd_creds | CloudPakforDataCredentials | The Cloud Pak for Data environment credentials. |
| subscription_id | str | The subscription ID associated with the records being logged. |
| service_instance_id | str | The service instance ID. |
Example
from beekeeper.observability.watsonx.supporting_classes.enums import Region
from beekeeper.observability.watsonx import (
WatsonxExternalPromptMonitor,
CloudPakforDataCredentials,
)
# watsonx.governance (IBM Cloud)
wxgov_client = WatsonxExternalPromptMonitor(
api_key="API_KEY", space_id="SPACE_ID", region=Region.US_SOUTH
)
# watsonx.governance (CP4D)
cpd_creds = CloudPakforDataCredentials(
url="CPD_URL",
username="USERNAME",
password="PASSWORD",
version="5.2",
instance_id="openshift",
)
wxgov_client = WatsonxExternalPromptMonitor(
space_id="SPACE_ID", cpd_creds=cpd_creds
)
model_post_init #
model_post_init(__context: Any) -> None
Initialize computed fields after Pydantic validation.
create_prompt_monitor #
create_prompt_monitor(name: str, model_id: str, task_id: TaskType | str, detached_model_provider: str, description: str = '', model_parameters: dict | None = None, detached_model_name: str | None = None, detached_model_url: str | None = None, detached_prompt_url: str | None = None, detached_prompt_additional_info: dict | None = None, prompt_template: PromptTemplate | str | None = None, prompt_variables: list[str] | None = None, locale: str = 'en', context_fields: list[str] | None = None, question_field: str | None = None) -> dict
Creates a detached (external) prompt template asset and attaches a monitor to the specified prompt template asset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The name of the External Prompt Template Asset. | required |
| model_id | str | The ID of the model associated with the prompt. | required |
| task_id | TaskType | The task identifier. | required |
| detached_model_provider | str | The external model provider. | required |
| description | str | A description of the External Prompt Template Asset. | '' |
| model_parameters | dict | Model parameters and their respective values. | None |
| detached_model_name | str | The name of the external model. | None |
| detached_model_url | str | The URL of the external model. | None |
| detached_prompt_url | str | The URL of the external prompt. | None |
| detached_prompt_additional_info | dict | Additional information related to the external prompt. | None |
| prompt_template | PromptTemplate | The prompt template. | None |
| prompt_variables | list[str] | Values for the prompt variables. | None |
| locale | str | Locale code for the input/output language, e.g., "en", "pt", "es". | 'en' |
| context_fields | list[str] | A list of fields that will provide context to the prompt. Applicable only for the "retrieval_augmented_generation" task type. | None |
| question_field | str | The field containing the question to be answered. Applicable only for the "retrieval_augmented_generation" task type. | None |
Example
from beekeeper.observability.watsonx.supporting_classes.enums import TaskType
wxgov_client.create_prompt_monitor(
name="Detached prompt (model AWS Anthropic)",
model_id="anthropic.claude-v2",
task_id=TaskType.RETRIEVAL_AUGMENTED_GENERATION,
detached_model_provider="AWS Bedrock",
detached_model_name="Anthropic Claude 2.0",
detached_model_url="https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-claude.html",
prompt_template="You are a helpful AI assistant that provides clear and accurate answers. {context}. Question: {input_query}.",
prompt_variables=["context", "input_query"],
context_fields=["context"],
question_field="input_query",
)
store_payload_records #
store_payload_records(request_records: list[dict], subscription_id: str | None = None) -> list[str]
Stores records to the payload logging system.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request_records | list[dict] | A list of records to be logged, where each record is represented as a dictionary. | required |
| subscription_id | str | The subscription ID associated with the records being logged. | None |
Example
wxgov_client.store_payload_records(
request_records=[
{
"context1": "value_context1",
"context2": "value_context2",
"input_query": "What's Beekeeper Framework?",
"generated_text": "Beekeeper is a data framework to make AI easier to work with.",
"input_token_count": 25,
"generated_token_count": 150,
}
],
subscription_id="5d62977c-a53d-4b6d-bda1-7b79b3b9d1a0",
)
store_feedback_records #
store_feedback_records(request_records: list[dict], subscription_id: str | None = None) -> dict
Stores records to the feedback logging system.
Info
- Feedback data for an external prompt must include the model output, named generated_text.
- For prompt monitors created using Beekeeper, the label field is reference_output.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request_records | list[dict] | A list of records to be logged, where each record is represented as a dictionary. | required |
| subscription_id | str | The subscription ID associated with the records being logged. | None |
Example
wxgov_client.store_feedback_records(
request_records=[
{
"context1": "value_context1",
"context2": "value_context2",
"input_query": "What's Beekeeper Framework?",
"reference_output": "Beekeeper is a data framework to make AI easier to work with.",
"generated_text": "Beekeeper is a data framework to make AI easier to work with.",
}
],
subscription_id="5d62977c-a53d-4b6d-bda1-7b79b3b9d1a0",
)
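Because feedback records for an external prompt need both generated_text and reference_output, it can help to check a batch locally before sending it. The following sketch is an illustration only: `find_incomplete_records` is a hypothetical helper, not a Beekeeper function, and the server-side validation may differ.

```python
# Field names taken from the Info note for external prompt monitors.
REQUIRED_FEEDBACK_FIELDS = ("generated_text", "reference_output")


def find_incomplete_records(records: list[dict]) -> list[tuple[int, list[str]]]:
    """Return (index, missing_fields) for every record lacking a required field."""
    problems = []
    for index, record in enumerate(records):
        missing = [name for name in REQUIRED_FEEDBACK_FIELDS if name not in record]
        if missing:
            problems.append((index, missing))
    return problems


records = [
    {"input_query": "q1", "generated_text": "a1", "reference_output": "a1"},
    {"input_query": "q2", "reference_output": "a2"},  # model output is missing
]
print(find_incomplete_records(records))
```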
Custom Metrics#
WatsonxCustomMetricsManager #
Bases: BaseModel
Provides functionality to set up a custom metric to measure your model's performance with IBM watsonx.governance.
Attributes:
| Name | Type | Description |
|---|---|---|
| api_key | str | The API key for IBM watsonx.governance. |
| region | Region | The region where watsonx.governance is hosted when using IBM Cloud. Defaults to |
| cpd_creds | CloudPakforDataCredentials | IBM Cloud Pak for Data environment credentials. |
| service_instance_id | str | The service instance ID. |
Example
from beekeeper.observability.watsonx.supporting_classes.enums import Region
from beekeeper.observability.watsonx import (
WatsonxCustomMetricsManager,
CloudPakforDataCredentials,
)
# watsonx.governance (IBM Cloud)
wxgov_client = WatsonxCustomMetricsManager(
api_key="API_KEY", region=Region.US_SOUTH
)
# watsonx.governance (CP4D)
cpd_creds = CloudPakforDataCredentials(
url="CPD_URL",
username="USERNAME",
password="PASSWORD",
version="5.2",
instance_id="openshift",
)
wxgov_client = WatsonxCustomMetricsManager(cpd_creds=cpd_creds)
associate_monitor_instance #
associate_monitor_instance(integrated_system_id: str, monitor_definition_id: str, subscription_id: str)
Associates the specified monitor definition with the specified subscription.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| integrated_system_id | str | The ID of the integrated system. | required |
| monitor_definition_id | str | The ID of the custom metric monitor instance. | required |
| subscription_id | str | The ID of the subscription to associate the monitor with. | required |
Example
wxgov_client.associate_monitor_instance(
integrated_system_id="019667ca-5687-7838-8d29-4ff70c2b36b0",
monitor_definition_id="custom_llm_quality",
subscription_id="0195e95d-03a4-7000-b954-b607db10fe9e",
)
create_metric_definition #
create_metric_definition(name: str, metrics: list[WatsonxMetricSpec], integrated_system_url: str, integrated_system_credentials: IntegratedSystemCredentials, schedule: bool = False) -> dict[str, Any]
Creates a custom metric definition for IBM watsonx.governance.
This must be done before using custom metrics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The name of the custom metric group. | required |
| metrics | list[WatsonxMetricSpec] | A list of metrics to be measured. | required |
| schedule | bool | Enable or disable the scheduler. Defaults to False. | False |
| integrated_system_url | str | The URL of the external metric provider. | required |
| integrated_system_credentials | IntegratedSystemCredentials | The credentials for the integrated system. | required |
Example
from beekeeper.observability.watsonx import (
WatsonxMetricSpec,
IntegratedSystemCredentials,
WatsonxMetricThreshold,
)
wxgov_client.create_metric_definition(
name="Custom Metric - Custom LLM Quality",
metrics=[
WatsonxMetricSpec(
name="context_quality",
applies_to=[
"retrieval_augmented_generation",
"summarization",
],
thresholds=[
WatsonxMetricThreshold(
threshold_type="lower_limit", default_value=0.75
)
],
)
],
integrated_system_url="IS_URL", # URL to the endpoint computing the metric
integrated_system_credentials=IntegratedSystemCredentials(
auth_type="basic", username="USERNAME", password="PASSWORD"
),
)
model_post_init #
model_post_init(__context: Any) -> None
Initialize computed fields after Pydantic validation.
store_metric_data #
store_metric_data(monitor_instance_id: str, run_id: str, request_records: dict[str, float | int])
Stores computed metrics data to the specified monitor instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| monitor_instance_id | str | The unique ID of the monitor instance. | required |
| run_id | str | The ID of the monitor run that generated the metrics. | required |
| request_records | dict[str, float \| int] | A dictionary containing the metrics to be published. | required |
Example
wxgov_client.store_metric_data(
monitor_instance_id="01966801-f9ee-7248-a706-41de00a8a998",
run_id="RUN_ID",
request_records={"context_quality": 0.914, "sensitivity": 0.85},
)
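The request_records argument maps each metric name to a single number. How that number is derived is up to the metric provider; the sketch below assumes a plain mean over per-record scores, which is a choice made for illustration and not something the API prescribes. `aggregate_scores` is a hypothetical helper.

```python
from statistics import fmean


def aggregate_scores(records: list[dict], metric_names: list[str]) -> dict[str, float]:
    """Average each named metric over the records that report it."""
    aggregated = {}
    for name in metric_names:
        values = [record[name] for record in records if name in record]
        if values:  # skip metrics that no record reports
            aggregated[name] = round(fmean(values), 3)
    return aggregated


per_record_scores = [
    {"context_quality": 0.9, "sensitivity": 0.8},
    {"context_quality": 0.7, "sensitivity": 0.9},
]
request_records = aggregate_scores(per_record_scores, ["context_quality", "sensitivity"])
print(request_records)
```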
store_record_metric_data #
store_record_metric_data(custom_data_set_id: str, reference_data_set_id: str, computed_on: DataSetType | str, run_id: str, request_records: list[dict])
Stores computed metrics data to the specified transaction record.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| custom_data_set_id | str | The ID of the custom metric data set. | required |
| reference_data_set_id | str | The dataset ID on which the metric was calculated. | required |
| computed_on | DataSetType | The dataset on which the metric was calculated (e.g., payload or feedback). | required |
| run_id | str | The ID of the monitor run that generated the metrics. | required |
| request_records | list[dict] | A list of dictionaries containing the records to be stored. | required |
Example
wxgov_client.store_record_metric_data(
custom_data_set_id="CUSTOM_DATASET_ID",
reference_data_set_id="COMPUTED_ON_DATASET_ID",
computed_on="payload",
run_id="RUN_ID",
request_records=[
{
"reference_record_id": "COMPUTED_ON_RECORD_ID",
"record_timestamp": "2025-12-09T00:00:00Z",
"context_quality": 0.786,
"pii": 0.05,
}
],
)
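The record_timestamp in the example is a UTC timestamp in ISO 8601 form with a trailing "Z". A minimal sketch of producing records in that shape follows. Both helpers are hypothetical, and the key names are copied from the example above rather than from a documented schema.

```python
from datetime import datetime, timezone


def to_record_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as UTC ISO 8601 with a trailing 'Z'."""
    if moment.tzinfo is None:
        raise ValueError("Expected a timezone-aware datetime.")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_metric_record(reference_record_id: str, moment: datetime, scores: dict) -> dict:
    """Combine the record reference, its timestamp, and the metric scores."""
    record = {
        "reference_record_id": reference_record_id,
        "record_timestamp": to_record_timestamp(moment),
    }
    record.update(scores)
    return record


record = build_metric_record(
    "COMPUTED_ON_RECORD_ID",
    datetime(2025, 12, 9, tzinfo=timezone.utc),
    {"context_quality": 0.786, "pii": 0.05},
)
print(record["record_timestamp"])  # 2025-12-09T00:00:00Z
```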
Credentials#
CloudPakforDataCredentials #
Bases: BaseModel
Encapsulates the credentials required for IBM Cloud Pak for Data.
Attributes:
| Name | Type | Description |
|---|---|---|
| url | str | The host URL of the Cloud Pak for Data environment. |
| api_key | str | The API key for the environment, if IAM is enabled. |
| username | str | The username for the environment. |
| password | str | The password for the environment. |
| bedrock_url | str | The Bedrock URL. Required only when IAM integration is enabled on CP4D 4.0.x clusters. |
| instance_id | str | The instance ID. |
| version | str | The version of Cloud Pak for Data. |
| disable_ssl_verification | bool | Indicates whether to disable SSL certificate verification. Defaults to |
IntegratedSystemCredentials #
Bases: BaseModel
Encapsulates the credentials for an Integrated System based on the authentication type.
Depending on the auth_type, only a subset of the properties is required.
Attributes:
| Name | Type | Description |
|---|---|---|
| auth_type | str | The type of authentication. Currently supports "basic" and "bearer". |
| username | str | The username for Basic Authentication. |
| password | str | The password for Basic Authentication. |
| token_url | str | The URL of the authentication endpoint used to request a Bearer token. |
| token_method | str | The HTTP method (e.g., "POST", "GET") used to request the Bearer token. Defaults to "POST". |
| token_headers | dict | Optional headers to include when requesting the Bearer token. Defaults to |
| token_payload | str \| dict | The body or payload to send when requesting the Bearer token. Can be a string (e.g., raw JSON). Defaults to |
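Since only a subset of the properties is needed for each auth_type, a local pre-check can catch missing values early. The sketch below is an illustration under assumptions: the mapping of required fields (username and password for "basic", token_url for "bearer") is inferred from the attribute descriptions and is not confirmed by the library, and `missing_credential_fields` is a hypothetical helper.

```python
# Assumed mapping; the reference only says the required subset depends on auth_type.
REQUIRED_BY_AUTH_TYPE = {
    "basic": ("username", "password"),
    "bearer": ("token_url",),
}


def missing_credential_fields(auth_type: str, **fields) -> list[str]:
    """Return the names of assumed-required fields that are absent or empty."""
    if auth_type not in REQUIRED_BY_AUTH_TYPE:
        raise ValueError(f"Unsupported auth_type: {auth_type!r}")
    return [name for name in REQUIRED_BY_AUTH_TYPE[auth_type] if not fields.get(name)]


print(missing_credential_fields("basic", username="USERNAME"))
print(missing_credential_fields("bearer", token_url="TOKEN_URL"))
```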
Supporting Classes#
WatsonxMetricSpec #
Bases: BaseModel
Defines the IBM watsonx.governance global monitor metric.
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | The name of the metric. |
| applies_to | list[str] | A list of task types that the metric applies to. Currently supports: "summarization", "generation", "question_answering", "extraction", and "retrieval_augmented_generation". |
| thresholds | list[WatsonxMetricThreshold] | A list of metric thresholds associated with the metric. |
Example
from beekeeper.observability.watsonx import (
WatsonxMetricSpec,
WatsonxMetricThreshold,
)
WatsonxMetricSpec(
name="context_quality",
applies_to=["retrieval_augmented_generation", "summarization"],
thresholds=[
WatsonxMetricThreshold(threshold_type="lower_limit", default_value=0.75)
],
)
WatsonxMetricThreshold #
Bases: BaseModel
Defines the metric threshold for IBM watsonx.governance.
Attributes:
| Name | Type | Description |
|---|---|---|
| threshold_type | str | The threshold type. Can be either |
| default_value | float | The metric threshold value. |
Example
from beekeeper.observability.watsonx import WatsonxMetricThreshold
WatsonxMetricThreshold(threshold_type="lower_limit", default_value=0.8)
Enums#
Region #
Bases: str, Enum
Supported IBM watsonx.governance regions.
Defines the available regions where watsonx.governance SaaS services are deployed.
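Because Region derives from both str and Enum, its members behave like plain strings in comparisons. The sketch below uses a stand-in class that mirrors the documented members; it is not the Beekeeper class itself.

```python
from enum import Enum


class Region(str, Enum):
    """Stand-in that mirrors the documented members; not the Beekeeper class."""

    US_SOUTH = "us-south"
    EU_DE = "eu-de"
    AU_SYD = "au-syd"


print(Region.US_SOUTH == "us-south")  # True: the str mixin allows string comparison
print(Region("eu-de") is Region.EU_DE)  # True: lookup by value returns the member
print(Region.AU_SYD.value)  # au-syd
```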
Attributes:
| Name | Type | Description |
|---|---|---|
| US_SOUTH | str | "us-south". |
| EU_DE | str | "eu-de". |
| AU_SYD | str | "au-syd". |
TaskType #
Bases: Enum
Supported IBM watsonx.governance tasks.
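Unlike Region, TaskType derives from Enum only, so a member does not compare equal to its string value. Methods that accept `TaskType | str` therefore need some form of conversion. The sketch below shows one way to do it with a stand-in class holding a subset of the documented members; `coerce_task` is a hypothetical helper, and how Beekeeper converts the value internally is not stated here.

```python
from enum import Enum
from typing import Union


class TaskType(Enum):
    """Stand-in with a subset of the documented members; not the Beekeeper class."""

    SUMMARIZATION = "summarization"
    RETRIEVAL_AUGMENTED_GENERATION = "retrieval_augmented_generation"


def coerce_task(task_id: Union[TaskType, str]) -> TaskType:
    """Hypothetical helper: accept a member or its string value."""
    if isinstance(task_id, TaskType):
        return task_id
    return TaskType(task_id)  # raises ValueError for unknown values


print(coerce_task("summarization") is TaskType.SUMMARIZATION)  # True
print(TaskType.SUMMARIZATION == "summarization")  # False: no str mixin
```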
Attributes:
| Name | Type | Description |
|---|---|---|
| QUESTION_ANSWERING | str | "question_answering" |
| SUMMARIZATION | str | "summarization" |
| RETRIEVAL_AUGMENTED_GENERATION | str | "retrieval_augmented_generation" |
| CLASSIFICATION | str | "classification" |
| GENERATION | str | "generation" |
| CODE | str | "code" |
| EXTRACTION | str | "extraction" |