json_schema evaluator docs are misleading #681

Open
mshavliuk opened this issue Feb 17, 2025 · 0 comments

mshavliuk commented Feb 17, 2025

Description

Following the instructions in docs/reference/sdk_reference/langchain_evaluators.mdx, I tried to create a 'json_schema' evaluator:

from langsmith.evaluation import LangChainStringEvaluator

LangChainStringEvaluator("json_schema")

This produced the following error:

Traceback (most recent call last):
  File "/Users/user/PycharmProjects/ml-sandbox/src/bug.py", line 3, in <module>
    LangChainStringEvaluator("json_schema")
  File "/Users/user/PycharmProjects/ml-sandbox/.venv/lib/python3.12/site-packages/langsmith/evaluation/integrations/_langchain.py", line 165, in __init__
    self.evaluator = load_evaluator(evaluator, **(config or {}))  # type: ignore[assignment, arg-type]
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/PycharmProjects/ml-sandbox/.venv/lib/python3.12/site-packages/langchain/evaluation/loading.py", line 127, in load_evaluator
    raise ValueError(
ValueError: Unknown evaluator type: json_schema
Valid types are: [<EvaluatorType.QA: 'qa'>, <EvaluatorType.COT_QA: 'cot_qa'>, <EvaluatorType.CONTEXT_QA: 'context_qa'>, <EvaluatorType.PAIRWISE_STRING: 'pairwise_string'>, <EvaluatorType.SCORE_STRING: 'score_string'>, <EvaluatorType.LABELED_PAIRWISE_STRING: 'labeled_pairwise_string'>, <EvaluatorType.LABELED_SCORE_STRING: 'labeled_score_string'>, <EvaluatorType.AGENT_TRAJECTORY: 'trajectory'>, <EvaluatorType.CRITERIA: 'criteria'>, <EvaluatorType.LABELED_CRITERIA: 'labeled_criteria'>, <EvaluatorType.STRING_DISTANCE: 'string_distance'>, <EvaluatorType.PAIRWISE_STRING_DISTANCE: 'pairwise_string_distance'>, <EvaluatorType.EMBEDDING_DISTANCE: 'embedding_distance'>, <EvaluatorType.PAIRWISE_EMBEDDING_DISTANCE: 'pairwise_embedding_distance'>, <EvaluatorType.JSON_VALIDITY: 'json_validity'>, <EvaluatorType.JSON_EQUALITY: 'json_equality'>, <EvaluatorType.JSON_EDIT_DISTANCE: 'json_edit_distance'>, <EvaluatorType.JSON_SCHEMA_VALIDATION: 'json_schema_validation'>, <EvaluatorType.REGEX_MATCH: 'regex_match'>, <EvaluatorType.EXACT_MATCH: 'exact_match'>]

Partial solution
Looking at the error message, I found that it includes 'json_schema_validation', which looks like the correct candidate, so I used the following code:


from pydantic import BaseModel

from langsmith import Client
from langsmith.evaluation import LangChainStringEvaluator


class SummaryResponse(BaseModel):
    summary: str
    reason: str


json_schema = LangChainStringEvaluator(
    "json_schema_validation",
    config={
        "schema": SummaryResponse.model_json_schema()
    }
)

langsmith_client = Client()
experiment_results = langsmith_client.evaluate(
    simple,  # target function/chain under evaluation
    data="dataset_name",
    evaluators=[json_schema],
)

However, this led to another error:

Error running evaluator <DynamicRunEvaluator evaluate> on run 8ba9cf6c-90bf-4c51-ad37-e2ab01290413: ValueError('JsonSchemaEvaluator requires a reference string.')
Traceback (most recent call last):
  File "/Users/user/PycharmProjects/ml-sandbox/.venv/lib/python3.12/site-packages/langsmith/evaluation/_runner.py", line 1634, in _run_evaluators
    evaluator_response = evaluator.evaluate_run(
                         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/PycharmProjects/ml-sandbox/.venv/lib/python3.12/site-packages/langsmith/evaluation/evaluator.py", line 331, in evaluate_run
    result = self.func(
             ^^^^^^^^^^
  File "/Users/user/PycharmProjects/ml-sandbox/.venv/lib/python3.12/site-packages/langsmith/run_helpers.py", line 629, in wrapper
    raise e
  File "/Users/user/PycharmProjects/ml-sandbox/.venv/lib/python3.12/site-packages/langsmith/run_helpers.py", line 626, in wrapper
    function_result = run_container["context"].run(func, *args, **kwargs)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/PycharmProjects/ml-sandbox/.venv/lib/python3.12/site-packages/langsmith/evaluation/integrations/_langchain.py", line 260, in evaluate
    results = self.evaluator.evaluate_strings(**eval_inputs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/PycharmProjects/ml-sandbox/.venv/lib/python3.12/site-packages/langchain/evaluation/schema.py", line 219, in evaluate_strings
    self._check_evaluation_args(reference=reference, input=input)
  File "/Users/user/PycharmProjects/ml-sandbox/.venv/lib/python3.12/site-packages/langchain/evaluation/schema.py", line 127, in _check_evaluation_args
    raise ValueError(f"{self.__class__.__name__} requires a reference string.")
ValueError: JsonSchemaEvaluator requires a reference string.

Looking deeper at the implementation of LangChainStringEvaluator.evaluate, I can see that the reference passed to the underlying StringEvaluator is taken from the example in the dataset, which is None in my case. I expected it to use the JSON schema given in the config instead, but I couldn't figure out how to make that work.
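
For what it's worth, here is an untested sketch of a possible workaround. As far as I can tell, LangChainStringEvaluator also accepts a prepare_data callback that builds the prediction/reference/input fields from the run and example itself, so the schema could be supplied as the reference explicitly instead of being read from the dataset. I have not verified this against the current docs, and simple_target below is a hypothetical stand-in for the real chain:

import json

from pydantic import BaseModel

from langsmith import Client
from langsmith.evaluation import LangChainStringEvaluator


class SummaryResponse(BaseModel):
    summary: str
    reason: str


def simple_target(inputs: dict) -> dict:
    # Hypothetical stand-in for the real chain; it should return an object
    # matching the SummaryResponse schema.
    return {"summary": "...", "reason": "..."}


json_schema = LangChainStringEvaluator(
    "json_schema_validation",
    # Build the evaluate_strings arguments ourselves so the schema is passed
    # as the reference rather than the dataset example's output.
    prepare_data=lambda run, example: {
        # prediction: the JSON produced by the target
        # (assumes the whole run output is the object to validate)
        "prediction": json.dumps(run.outputs),
        # reference: the JSON schema the prediction must conform to
        "reference": json.dumps(SummaryResponse.model_json_schema()),
    },
)

langsmith_client = Client()
experiment_results = langsmith_client.evaluate(
    simple_target,
    data="dataset_name",
    evaluators=[json_schema],
)

If prepare_data is not available in this version, a plain custom evaluator function that calls jsonschema.validate on the run output directly should achieve the same result.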

Context
LangSmith version: 0.3.8
