watsonx.governance

Monitors#

WatsonxPromptMonitor #

Bases: PromptObservability

Provides functionality to interact with IBM watsonx.governance for monitoring prompts executed within IBM watsonx.ai LLMs.

Note

Exactly one of project_id or space_id is required to create a prompt monitor; do not provide both.
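The rule in the note above can be checked before constructing a monitor. The following is a minimal sketch, not the library's own validation: `resolve_scope` is a hypothetical helper, and the `("project", id)` / `("space", id)` return shape is an assumption chosen for illustration.

```python
from typing import Optional, Tuple


def resolve_scope(
    project_id: Optional[str] = None, space_id: Optional[str] = None
) -> Tuple[str, str]:
    """Return ("project", id) or ("space", id); reject zero or two IDs.

    Hypothetical helper: it only mirrors the documented either/or rule.
    """
    # Both None or both set means the caller broke the "exactly one" rule.
    if (project_id is None) == (space_id is None):
        raise ValueError("Provide exactly one of project_id or space_id.")
    if project_id is not None:
        return ("project", project_id)
    return ("space", space_id)
```

Calling `resolve_scope(space_id="SPACE_ID")` returns `("space", "SPACE_ID")`, while passing both IDs or neither raises `ValueError`.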

Attributes:

Name Type Description
api_key str

The API key for IBM watsonx.governance.

space_id str

The space ID in watsonx.governance.

project_id str

The project ID in watsonx.governance.

region Region

The region where watsonx.governance is hosted when using IBM Cloud. Defaults to us-south.

cpd_creds CloudPakforDataCredentials

The Cloud Pak for Data environment credentials.

subscription_id str

The subscription ID associated with the records being logged.

service_instance_id str

The service instance ID.

Example
from beekeeper.observability.watsonx.supporting_classes.enums import Region

from beekeeper.observability.watsonx import (
    WatsonxPromptMonitor,
    CloudPakforDataCredentials,
)

# watsonx.governance (IBM Cloud)
wxgov_client = WatsonxPromptMonitor(
    api_key="API_KEY", space_id="SPACE_ID", region=Region.US_SOUTH
)

# watsonx.governance (CP4D)
cpd_creds = CloudPakforDataCredentials(
    url="CPD_URL",
    username="USERNAME",
    password="PASSWORD",
    version="5.2",
    instance_id="openshift",
)

wxgov_client = WatsonxPromptMonitor(space_id="SPACE_ID", cpd_creds=cpd_creds)

model_post_init #

model_post_init(__context: Any) -> None

Initialize computed fields after Pydantic validation.

create_prompt_monitor #

create_prompt_monitor(name: str, model_id: str, task_id: TaskType | str, description: str = '', model_parameters: dict | None = None, prompt_template: PromptTemplate | str | None = None, prompt_variables: list[str] | None = None, locale: str = 'en', context_fields: list[str] | None = None, question_field: str | None = None) -> dict

Creates an IBM Prompt Template Asset and sets up a monitor for it.

Parameters:

Name Type Description Default
name str

The name of the Prompt Template Asset.

required
model_id str

The ID of the model associated with the prompt.

required
task_id TaskType | str

The task identifier.

required
description str

A description of the Prompt Template Asset.

''
model_parameters dict

A dictionary of model parameters and their respective values.

None
prompt_template PromptTemplate | str

The prompt template.

None
prompt_variables list[str]

A list of the prompt input variable names.

None
locale str

Locale code for the input/output language, e.g., "en", "pt", "es".

'en'
context_fields list[str]

A list of fields that will provide context to the prompt. Applicable only for the retrieval_augmented_generation task type.

None
question_field str

The field containing the question to be answered. Applicable only for the retrieval_augmented_generation task type.

None
Example
from beekeeper.observability.watsonx.supporting_classes.enums import TaskType

wxgov_client.create_prompt_monitor(
    name="IBM prompt template",
    model_id="ibm/granite-3-2b-instruct",
    task_id=TaskType.RETRIEVAL_AUGMENTED_GENERATION,
    prompt_template="You are a helpful AI assistant that provides clear and accurate answers. {context}. Question: {input_query}.",
    prompt_variables=["context", "input_query"],
    context_fields=["context"],
    question_field="input_query",
)
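In the example above, every placeholder in prompt_template also appears in prompt_variables. The sketch below checks that correspondence up front. It assumes the template uses `str.format`-style curly-brace placeholders, as the example does; `template_variables` is a hypothetical helper, and the source does not say whether the library performs this check itself.

```python
from string import Formatter


def template_variables(template: str) -> list:
    """Return the named placeholders of a str.format-style template, in order."""
    names = []
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
    for _, field_name, _, _ in Formatter().parse(template):
        if field_name and field_name not in names:
            names.append(field_name)
    return names


template = (
    "You are a helpful AI assistant that provides clear and accurate answers. "
    "{context}. Question: {input_query}."
)
prompt_variables = ["context", "input_query"]

# Placeholders used in the template but not declared as prompt variables.
missing = set(template_variables(template)) - set(prompt_variables)
```

An empty `missing` set means the declared variables cover every placeholder in the template.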

store_payload_records #

store_payload_records(request_records: list[dict], subscription_id: str | None = None) -> list[str]

Stores records to the payload logging system.

Parameters:

Name Type Description Default
request_records list[dict]

A list of records to be logged. Each record is represented as a dictionary.

required
subscription_id str

The subscription ID associated with the records being logged.

None
Example
wxgov_client.store_payload_records(
    request_records=[
        {
            "context1": "value_context1",
            "context2": "value_context2",
            "input_query": "What's Beekeeper Framework?",
            "generated_text": "Beekeeper is a data framework to make AI easier to work with.",
            "input_token_count": 25,
            "generated_token_count": 150,
        }
    ],
    subscription_id="5d62977c-a53d-4b6d-bda1-7b79b3b9d1a0",
)

store_feedback_records #

store_feedback_records(request_records: list[dict], subscription_id: str | None = None) -> dict

Stores records to the feedback logging system.

Info
  • For prompt monitors created using Beekeeper, the label field is reference_output.

Parameters:

Name Type Description Default
request_records list[dict]

A list of records to be logged, where each record is represented as a dictionary.

required
subscription_id str

The subscription ID associated with the records being logged.

None
Example
wxgov_client.store_feedback_records(
    request_records=[
        {
            "context1": "value_context1",
            "context2": "value_context2",
            "input_query": "What's Beekeeper Framework?",
            "reference_output": "Beekeeper is a data framework to make AI easier to work with.",
            "generated_text": "Beekeeper is a data framework to make AI easier to work with.",
        }
    ],
    subscription_id="5d62977c-a53d-4b6d-bda1-7b79b3b9d1a0",
)

WatsonxExternalPromptMonitor #

Bases: PromptObservability

Provides functionality to interact with IBM watsonx.governance for monitoring prompts executed on external LLMs.

Note

Exactly one of project_id or space_id is required to create a prompt monitor; do not provide both.

Attributes:

Name Type Description
api_key str

The API key for IBM watsonx.governance.

space_id str

The space ID in watsonx.governance.

project_id str

The project ID in watsonx.governance.

region Region

The region where watsonx.governance is hosted when using IBM Cloud. Defaults to us-south.

cpd_creds CloudPakforDataCredentials

The Cloud Pak for Data environment credentials.

subscription_id str

The subscription ID associated with the records being logged.

service_instance_id str

The service instance ID.

Example
from beekeeper.observability.watsonx.supporting_classes.enums import Region

from beekeeper.observability.watsonx import (
    WatsonxExternalPromptMonitor,
    CloudPakforDataCredentials,
)

# watsonx.governance (IBM Cloud)
wxgov_client = WatsonxExternalPromptMonitor(
    api_key="API_KEY", space_id="SPACE_ID", region=Region.US_SOUTH
)

# watsonx.governance (CP4D)
cpd_creds = CloudPakforDataCredentials(
    url="CPD_URL",
    username="USERNAME",
    password="PASSWORD",
    version="5.2",
    instance_id="openshift",
)

wxgov_client = WatsonxExternalPromptMonitor(
    space_id="SPACE_ID", cpd_creds=cpd_creds
)

model_post_init #

model_post_init(__context: Any) -> None

Initialize computed fields after Pydantic validation.

create_prompt_monitor #

create_prompt_monitor(name: str, model_id: str, task_id: TaskType | str, detached_model_provider: str, description: str = '', model_parameters: dict | None = None, detached_model_name: str | None = None, detached_model_url: str | None = None, detached_prompt_url: str | None = None, detached_prompt_additional_info: dict | None = None, prompt_template: PromptTemplate | str | None = None, prompt_variables: list[str] | None = None, locale: str = 'en', context_fields: list[str] | None = None, question_field: str | None = None) -> dict

Creates a detached (external) prompt template asset and attaches a monitor to the specified prompt template asset.

Parameters:

Name Type Description Default
name str

The name of the External Prompt Template Asset.

required
model_id str

The ID of the model associated with the prompt.

required
task_id TaskType | str

The task identifier.

required
detached_model_provider str

The external model provider.

required
description str

A description of the External Prompt Template Asset.

''
model_parameters dict

Model parameters and their respective values.

None
detached_model_name str

The name of the external model.

None
detached_model_url str

The URL of the external model.

None
detached_prompt_url str

The URL of the external prompt.

None
detached_prompt_additional_info dict

Additional information related to the external prompt.

None
prompt_template PromptTemplate | str

The prompt template.

None
prompt_variables list[str]

A list of the prompt input variable names.

None
locale str

Locale code for the input/output language, e.g., "en", "pt", "es".

'en'
context_fields list[str]

A list of fields that will provide context to the prompt. Applicable only for the retrieval_augmented_generation task type.

None
question_field str

The field containing the question to be answered. Applicable only for the retrieval_augmented_generation task type.

None
Example
from beekeeper.observability.watsonx.supporting_classes.enums import TaskType

wxgov_client.create_prompt_monitor(
    name="Detached prompt (model AWS Anthropic)",
    model_id="anthropic.claude-v2",
    task_id=TaskType.RETRIEVAL_AUGMENTED_GENERATION,
    detached_model_provider="AWS Bedrock",
    detached_model_name="Anthropic Claude 2.0",
    detached_model_url="https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-claude.html",
    prompt_template="You are a helpful AI assistant that provides clear and accurate answers. {context}. Question: {input_query}.",
    prompt_variables=["context", "input_query"],
    context_fields=["context"],
    question_field="input_query",
)

store_payload_records #

store_payload_records(request_records: list[dict], subscription_id: str | None = None) -> list[str]

Stores records to the payload logging system.

Parameters:

Name Type Description Default
request_records list[dict]

A list of records to be logged, where each record is represented as a dictionary.

required
subscription_id str

The subscription ID associated with the records being logged.

None
Example
wxgov_client.store_payload_records(
    request_records=[
        {
            "context1": "value_context1",
            "context2": "value_context2",
            "input_query": "What's Beekeeper Framework?",
            "generated_text": "Beekeeper is a data framework to make AI easier to work with.",
            "input_token_count": 25,
            "generated_token_count": 150,
        }
    ],
    subscription_id="5d62977c-a53d-4b6d-bda1-7b79b3b9d1a0",
)

store_feedback_records #

store_feedback_records(request_records: list[dict], subscription_id: str | None = None) -> dict

Stores records to the feedback logging system.

Info
  • Feedback data for external prompts must include the model output in a field named generated_text.
  • For prompt monitors created using Beekeeper, the label field is reference_output.

Parameters:

Name Type Description Default
request_records list[dict]

A list of records to be logged, where each record is represented as a dictionary.

required
subscription_id str

The subscription ID associated with the records being logged.

None
Example
wxgov_client.store_feedback_records(
    request_records=[
        {
            "context1": "value_context1",
            "context2": "value_context2",
            "input_query": "What's Beekeeper Framework?",
            "reference_output": "Beekeeper is a data framework to make AI easier to work with.",
            "generated_text": "Beekeeper is a data framework to make AI easier to work with.",
        }
    ],
    subscription_id="5d62977c-a53d-4b6d-bda1-7b79b3b9d1a0",
)
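The Info notes for this method name two fields that feedback records for external prompts should carry: generated_text and the label field reference_output. The sketch below checks records before they are sent. `missing_feedback_fields` is a hypothetical helper, not part of Beekeeper, and it checks only the two field names stated in the notes.

```python
# Field names taken from the Info notes; nothing else is checked.
REQUIRED_FEEDBACK_FIELDS = ("reference_output", "generated_text")


def missing_feedback_fields(record: dict) -> list:
    """Return the required feedback fields that are absent from a record."""
    return [name for name in REQUIRED_FEEDBACK_FIELDS if name not in record]


records = [
    {
        "input_query": "What's Beekeeper Framework?",
        "reference_output": "Beekeeper is a data framework.",
        "generated_text": "Beekeeper is a data framework.",
    },
    {"input_query": "A record with no label or model output."},
]

# Map the index of each incomplete record to the fields it is missing.
problems = {
    index: missing
    for index, record in enumerate(records)
    if (missing := missing_feedback_fields(record))
}
```

Here `problems` flags only the second record, so it can be fixed or dropped before calling store_feedback_records.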

Custom Metrics#

WatsonxCustomMetricsManager #

Bases: BaseModel

Provides functionality to set up a custom metric to measure your model's performance with IBM watsonx.governance.

Attributes:

Name Type Description
api_key str

The API key for IBM watsonx.governance.

region Region

The region where watsonx.governance is hosted when using IBM Cloud. Defaults to us-south.

cpd_creds CloudPakforDataCredentials

IBM Cloud Pak for Data environment credentials.

service_instance_id str

The service instance ID.

Example
from beekeeper.observability.watsonx.supporting_classes.enums import Region

from beekeeper.observability.watsonx import (
    WatsonxCustomMetricsManager,
    CloudPakforDataCredentials,
)

# watsonx.governance (IBM Cloud)
wxgov_client = WatsonxCustomMetricsManager(
    api_key="API_KEY", region=Region.US_SOUTH
)

# watsonx.governance (CP4D)
cpd_creds = CloudPakforDataCredentials(
    url="CPD_URL",
    username="USERNAME",
    password="PASSWORD",
    version="5.2",
    instance_id="openshift",
)

wxgov_client = WatsonxCustomMetricsManager(cpd_creds=cpd_creds)

associate_monitor_instance #

associate_monitor_instance(integrated_system_id: str, monitor_definition_id: str, subscription_id: str)

Associates the specified monitor definition with the specified subscription.

Parameters:

Name Type Description Default
integrated_system_id str

The ID of the integrated system.

required
monitor_definition_id str

The ID of the custom metric monitor definition.

required
subscription_id str

The ID of the subscription to associate the monitor with.

required
Example
wxgov_client.associate_monitor_instance(
    integrated_system_id="019667ca-5687-7838-8d29-4ff70c2b36b0",
    monitor_definition_id="custom_llm_quality",
    subscription_id="0195e95d-03a4-7000-b954-b607db10fe9e",
)

create_metric_definition #

create_metric_definition(name: str, metrics: list[WatsonxMetricSpec], integrated_system_url: str, integrated_system_credentials: IntegratedSystemCredentials, schedule: bool = False) -> dict[str, Any]

Creates a custom metric definition for IBM watsonx.governance.

This must be done before using custom metrics.

Parameters:

Name Type Description Default
name str

The name of the custom metric group.

required
metrics list[WatsonxMetricSpec]

A list of metrics to be measured.

required
schedule bool

Enable or disable the scheduler. Defaults to False.

False
integrated_system_url str

The URL of the external metric provider.

required
integrated_system_credentials IntegratedSystemCredentials

The credentials for the integrated system.

required
Example
from beekeeper.observability.watsonx import (
    WatsonxMetricSpec,
    IntegratedSystemCredentials,
    WatsonxMetricThreshold,
)

wxgov_client.create_metric_definition(
    name="Custom Metric - Custom LLM Quality",
    metrics=[
        WatsonxMetricSpec(
            name="context_quality",
            applies_to=[
                "retrieval_augmented_generation",
                "summarization",
            ],
            thresholds=[
                WatsonxMetricThreshold(
                    threshold_type="lower_limit", default_value=0.75
                )
            ],
        )
    ],
    integrated_system_url="IS_URL",  # URL to the endpoint computing the metric
    integrated_system_credentials=IntegratedSystemCredentials(
        auth_type="basic", username="USERNAME", password="PASSWORD"
    ),
)

model_post_init #

model_post_init(__context: Any) -> None

Initialize computed fields after Pydantic validation.

store_metric_data #

store_metric_data(monitor_instance_id: str, run_id: str, request_records: dict[str, float | int])

Stores computed metrics data to the specified monitor instance.

Parameters:

Name Type Description Default
monitor_instance_id str

The unique ID of the monitor instance.

required
run_id str

The ID of the monitor run that generated the metrics.

required
request_records dict[str, float | int]

A dictionary containing the metrics to be published, mapping each metric name to its numeric value.

required
Example
wxgov_client.store_metric_data(
    monitor_instance_id="01966801-f9ee-7248-a706-41de00a8a998",
    run_id="RUN_ID",
    request_records={"context_quality": 0.914, "sensitivity": 0.85},
)
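The signature types request_records as a mapping from metric name to a float or int. The sketch below is one way to catch non-numeric values before publishing. `validate_metric_values` is a hypothetical helper, and rejecting booleans (which Python treats as integers) is a design choice of this sketch, not documented library behavior.

```python
def validate_metric_values(request_records: dict) -> dict:
    """Return the mapping unchanged if every value is an int or float."""
    for name, value in request_records.items():
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(
                f"Metric {name!r} must be int or float, got {type(value).__name__}"
            )
    return request_records


metrics = validate_metric_values({"context_quality": 0.914, "sensitivity": 0.85})
```

A value such as the string `"0.9"` raises `TypeError` instead of reaching the service.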

store_record_metric_data #

store_record_metric_data(custom_data_set_id: str, reference_data_set_id: str, computed_on: DataSetType | str, run_id: str, request_records: list[dict])

Stores computed metrics data to the specified transaction record.

Parameters:

Name Type Description Default
custom_data_set_id str

The ID of the custom metric data set.

required
reference_data_set_id str

The dataset ID on which the metric was calculated.

required
computed_on DataSetType | str

The dataset on which the metric was calculated (e.g., payload or feedback).

required
run_id str

The ID of the monitor run that generated the metrics.

required
request_records list[dict]

A list of dictionaries containing the records to be stored.

required
Example
wxgov_client.store_record_metric_data(
    custom_data_set_id="CUSTOM_DATASET_ID",
    reference_data_set_id="COMPUTED_ON_DATASET_ID",
    computed_on="payload",
    run_id="RUN_ID",
    request_records=[
        {
            "reference_record_id": "COMPUTED_ON_RECORD_ID",
            "record_timestamp": "2025-12-09T00:00:00Z",
            "context_quality": 0.786,
            "pii": 0.05,
        }
    ],
)
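Each record in the example above combines a reference record ID, a timestamp, and one or more metric scores. The sketch below assembles such a record from a timezone-aware `datetime`. `make_metric_record` is a hypothetical helper. The timestamp format is copied from the example (`2025-12-09T00:00:00Z`); whether the service accepts other formats is not stated in the source.

```python
from datetime import datetime, timezone


def make_metric_record(
    reference_record_id: str, scores: dict, timestamp: datetime
) -> dict:
    """Merge metric scores with the record ID and a UTC timestamp string."""
    if timestamp.tzinfo is None:
        # Naive datetimes are ambiguous, so this sketch refuses them.
        raise ValueError("timestamp must be timezone-aware")
    utc = timestamp.astimezone(timezone.utc)
    return {
        "reference_record_id": reference_record_id,
        "record_timestamp": utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
        **scores,
    }


record = make_metric_record(
    "COMPUTED_ON_RECORD_ID",
    {"context_quality": 0.786, "pii": 0.05},
    datetime(2025, 12, 9, tzinfo=timezone.utc),
)
```

The resulting `record` has the same keys as the entry in the example and can be placed in the request_records list.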

Credentials#

CloudPakforDataCredentials #

Bases: BaseModel

Encapsulates the credentials required for IBM Cloud Pak for Data.

Attributes:

Name Type Description
url str

The host URL of the Cloud Pak for Data environment.

api_key str

The API key for the environment, if IAM is enabled.

username str

The username for the environment.

password str

The password for the environment.

bedrock_url str

The Bedrock URL. Required only when IAM integration is enabled on CP4D 4.0.x clusters.

instance_id str

The instance ID.

version str

The version of Cloud Pak for Data.

disable_ssl_verification bool

Indicates whether to disable SSL certificate verification. Defaults to True.

IntegratedSystemCredentials #

Bases: BaseModel

Encapsulates the credentials for an Integrated System based on the authentication type.

Depending on the auth_type, only a subset of the properties is required.

Attributes:

Name Type Description
auth_type str

The type of authentication. Currently supports "basic" and "bearer".

username str

The username for Basic Authentication.

password str

The password for Basic Authentication.

token_url str

The URL of the authentication endpoint used to request a Bearer token.

token_method str

The HTTP method (e.g., "POST", "GET") used to request the Bearer token. Defaults to "POST".

token_headers dict

Optional headers to include when requesting the Bearer token. Defaults to None.

token_payload str | dict

The body or payload to send when requesting the Bearer token. Can be a string (e.g., raw JSON). Defaults to None.
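The attribute descriptions suggest which fields belong to each auth_type, though the source does not list the required subset explicitly. The sketch below encodes one reading: "basic" needs username and password, and "bearer" needs token_url, with the remaining token fields optional. Both the mapping and the `missing_credentials` helper are assumptions for illustration, not the library's validation logic.

```python
# Assumed required fields per auth type, inferred from the attribute descriptions.
REQUIRED_BY_AUTH_TYPE = {
    "basic": ("username", "password"),
    "bearer": ("token_url",),
}


def missing_credentials(auth_type: str, **fields) -> list:
    """Return the assumed-required fields that are absent or empty."""
    if auth_type not in REQUIRED_BY_AUTH_TYPE:
        raise ValueError(f"Unsupported auth_type: {auth_type!r}")
    return [
        name for name in REQUIRED_BY_AUTH_TYPE[auth_type] if not fields.get(name)
    ]


basic_gaps = missing_credentials("basic", username="USERNAME")
bearer_gaps = missing_credentials("bearer", token_url="TOKEN_URL")
```

In this sketch `basic_gaps` reports the missing password and `bearer_gaps` is empty.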

Supporting Classes#

WatsonxMetricSpec #

Bases: BaseModel

Defines the IBM watsonx.governance global monitor metric.

Attributes:

Name Type Description
name str

The name of the metric.

applies_to list[str]

A list of task types that the metric applies to. Currently supports: "summarization", "generation", "question_answering", "extraction", and "retrieval_augmented_generation".

thresholds list[WatsonxMetricThreshold]

A list of metric thresholds associated with the metric.

Example
from beekeeper.observability.watsonx import (
    WatsonxMetricSpec,
    WatsonxMetricThreshold,
)

WatsonxMetricSpec(
    name="context_quality",
    applies_to=["retrieval_augmented_generation", "summarization"],
    thresholds=[
        WatsonxMetricThreshold(threshold_type="lower_limit", default_value=0.75)
    ],
)

WatsonxMetricThreshold #

Bases: BaseModel

Defines the metric threshold for IBM watsonx.governance.

Attributes:

Name Type Description
threshold_type str

The threshold type. Can be either lower_limit or upper_limit.

default_value float

The metric threshold value.

Example
from beekeeper.observability.watsonx import WatsonxMetricThreshold

WatsonxMetricThreshold(threshold_type="lower_limit", default_value=0.8)
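The source does not state how the two threshold types are evaluated. The sketch below assumes the conventional reading: a metric violates a lower_limit when it falls below the threshold value and violates an upper_limit when it rises above it. `violates_threshold` is a hypothetical function for illustration only.

```python
def violates_threshold(threshold_type: str, limit: float, value: float) -> bool:
    """Return True if value breaches the limit under the assumed semantics."""
    if threshold_type == "lower_limit":
        return value < limit  # too low
    if threshold_type == "upper_limit":
        return value > limit  # too high
    raise ValueError(f"Unknown threshold_type: {threshold_type!r}")


# A context_quality score of 0.75 against a lower limit of 0.8.
breached = violates_threshold("lower_limit", 0.8, 0.75)
```

Under this assumption a score exactly equal to the limit does not count as a violation.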

Enums#

Region #

Bases: str, Enum

Supported IBM watsonx.governance regions.

Defines the available regions where watsonx.governance SaaS services are deployed.
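Because the class derives from both `str` and `Enum`, its members behave like plain strings in comparisons. The sketch below uses a local stand-in that mirrors the documented members rather than importing the library class, so it runs without Beekeeper installed.

```python
from enum import Enum


class Region(str, Enum):
    """Local stand-in mirroring the documented members."""

    US_SOUTH = "us-south"
    EU_DE = "eu-de"
    AU_SYD = "au-syd"


# Look up a member from its string value, for example one read from config.
region = Region("eu-de")

# The str mixin makes members compare equal to their raw values.
same_as_string = Region.US_SOUTH == "us-south"
```

Passing a value that is not a member, such as `Region("us-east")`, raises `ValueError`.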

Attributes:

Name Type Description
US_SOUTH str

"us-south".

EU_DE str

"eu-de".

AU_SYD str

"au-syd".

TaskType #

Bases: Enum

Supported IBM watsonx.governance tasks.
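The create_prompt_monitor signatures accept task_id as either a TaskType member or a string. The sketch below shows one way to normalize both forms to the string value. It uses a local stand-in enum that mirrors the documented members, and `task_value` is a hypothetical helper; how the library converts the argument internally is not stated in the source.

```python
from enum import Enum
from typing import Union


class TaskType(Enum):
    """Local stand-in mirroring the documented members."""

    QUESTION_ANSWERING = "question_answering"
    SUMMARIZATION = "summarization"
    RETRIEVAL_AUGMENTED_GENERATION = "retrieval_augmented_generation"
    CLASSIFICATION = "classification"
    GENERATION = "generation"
    CODE = "code"
    EXTRACTION = "extraction"


def task_value(task_id: Union[TaskType, str]) -> str:
    """Return the string value for a TaskType member or a task name."""
    if isinstance(task_id, TaskType):
        return task_id.value
    # Enum lookup by value raises ValueError for unknown task names.
    return TaskType(task_id).value
```

Both `task_value(TaskType.CODE)` and `task_value("code")` return `"code"`, and an unknown name raises `ValueError` early.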

Attributes:

Name Type Description
QUESTION_ANSWERING str

"question_answering"

SUMMARIZATION str

"summarization"

RETRIEVAL_AUGMENTED_GENERATION str

"retrieval_augmented_generation"

CLASSIFICATION str

"classification"

GENERATION str

"generation"

CODE str

"code"

EXTRACTION str

"extraction"