
Input to PromptTemplate is missing variables - while trying out examples

Open raiden076 opened this issue 11 months ago • 9 comments

version==1.40.1

Describe the bug: I am trying to run SmartScraperGraph with a schema defined and am receiving this error:

Error during chain execution: 'Input to PromptTemplate is missing variables {'"properties"', '"foo"', '"$defs"'}. Expected: ['"$defs"', '"foo"', '"properties"', 'question'] Received: ['question']\nNote: if you intended {"properties"} to be part of the string and not a variable, please escape it with double curly braces like: '{{"properties"}}'.\nFor troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/INVALID_PROMPT_INPUT ' Traceback (most recent call last):

To Reproduce: follow this example: https://docs-oss.scrapegraphai.com/docs/Examples/Groq/smart_scraper_schema_groq

Expected behavior: I expected to receive data conforming to the given Pydantic model.

Detailed error:

.venvarkaprav0@Book:~/salesup/icp-filter$ /home/arkaprav0/salesup/icp-filter/.venv/bin/python /home/arkaprav0/salesup/icp-filter/app-test.py
Max input tokens for model groq/llama-3.2-3b-preview not found,
                    please specify the model_tokens parameter in the llm section of the graph configuration.
                    Using default token size: 8192
Error during chain execution: 'Input to PromptTemplate is missing variables {\'"properties"\', \'"foo"\', \'"$defs"\'}.  Expected: [\'"$defs"\', \'"foo"\', \'"properties"\', \'question\'] Received: [\'question\']\nNote: if you intended {"properties"} to be part of the string and not a variable, please escape it with double curly braces like: \'{{"properties"}}\'.\nFor troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/INVALID_PROMPT_INPUT '
Traceback (most recent call last):
  File "/home/arkaprav0/salesup/icp-filter/app-test.py", line 61, in <module>
    result = smart_scraper_graph.run()
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/scrapegraphai/graphs/smart_scraper_graph.py", line 296, in run
    self.final_state, self.execution_info = self.graph.execute(inputs)
                                            ~~~~~~~~~~~~~~~~~~^^^^^^^^
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/scrapegraphai/graphs/base_graph.py", line 358, in execute
    return self._execute_standard(initial_state)
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/scrapegraphai/graphs/base_graph.py", line 303, in _execute_standard
    raise e
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/scrapegraphai/graphs/base_graph.py", line 276, in _execute_standard
    result, node_exec_time, cb_data = self._execute_node(
                                      ~~~~~~~~~~~~~~~~~~^
        current_node, state, llm_model, llm_model_name
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/scrapegraphai/graphs/base_graph.py", line 200, in _execute_node
    result = current_node.execute(state)
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/scrapegraphai/nodes/generate_answer_node.py", line 209, in execute
    answer = self.invoke_with_timeout(
        chain, {"question": user_prompt}, self.timeout
    )
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/scrapegraphai/nodes/generate_answer_node.py", line 79, in invoke_with_timeout
    response = chain.invoke(inputs)
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/langchain_core/runnables/base.py", line 3022, in invoke
    input = context.run(step.invoke, input, config, **kwargs)
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/langchain_core/prompts/base.py", line 210, in invoke
    return self._call_with_config(
           ~~~~~~~~~~~~~~~~~~~~~~^
        self._format_prompt_with_error_handling,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
        serialized=self._serialized,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/langchain_core/runnables/base.py", line 1922, in _call_with_config
    context.run(
    ~~~~~~~~~~~^
        call_func_with_variable_args,  # type: ignore[arg-type]
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<4 lines>...
        **kwargs,
        ^^^^^^^^^
    ),
    ^
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/langchain_core/runnables/config.py", line 396, in call_func_with_variable_args
    return func(input, **kwargs)  # type: ignore[call-arg]
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/langchain_core/prompts/base.py", line 184, in _format_prompt_with_error_handling
    _inner_input = self._validate_input(inner_input)
  File "/home/arkaprav0/salesup/icp-filter/.venv/lib/python3.13/site-packages/langchain_core/prompts/base.py", line 178, in _validate_input
    raise KeyError(
        create_message(message=msg, error_code=ErrorCode.INVALID_PROMPT_INPUT)
    )
KeyError: 'Input to PromptTemplate is missing variables {\'"properties"\', \'"foo"\', \'"$defs"\'}.  Expected: [\'"$defs"\', \'"foo"\', \'"properties"\', \'question\'] Received: [\'question\']\nNote: if you intended {"properties"} to be part of the string and not a variable, please escape it with double curly braces like: \'{{"properties"}}\'.\nFor troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/INVALID_PROMPT_INPUT '
.venvarkaprav0@Book:~/salesup/icp-filter$ 

Code I am using:

"""
Basic example of scraping pipeline using SmartScraper with schema
"""
import os
from typing import List
from pydantic import BaseModel, Field
from dotenv import load_dotenv
from scrapegraphai.graphs import SmartScraperGraph
from scrapegraphai.utils import prettify_exec_info

load_dotenv()

# ************************************************
# Define the output schema for the graph
# ************************************************

class Project(BaseModel):
    title: str = Field(description="The title of the project")
    description: str = Field(description="The description of the project")

class Projects(BaseModel):
    projects: List[Project]

# ************************************************
# Define the configuration for the graph
# ************************************************

groq_key = os.getenv("GROQ_APIKEY")

graph_config = {
    "llm": {
        "model": "groq/llama-3.2-3b-preview",
        "api_key": groq_key,
        "temperature": 0,
    },
    "loader_kwargs": {
        # https://github.com/microsoft/playwright/issues/14023
        "args": ["--disable-gpu", "--disable-dev-shm-usage"],
    },
    "headless": False,
}

# ************************************************
# Create the SmartScraperGraph instance and run it
# ************************************************

smart_scraper_graph = SmartScraperGraph(
    prompt="List me all the projects with their description.",
    # also accepts a string with the already downloaded HTML code
    source="https://perinim.github.io/projects/",
    schema=Projects,
    config=graph_config
)

result = smart_scraper_graph.run()
print(result)

# ************************************************
# Get graph execution info
# ************************************************

graph_exec_info = smart_scraper_graph.get_execution_info()
print(prettify_exec_info(graph_exec_info))

raiden076 avatar Feb 27 '25 16:02 raiden076

Hey there, @raiden076! 👋 I'm here to assist with any bugs, questions, or contributions. If you need help, feel free to ask!

I found a similar open issue regarding the "Input to PromptTemplate is missing variables" error. The suggested solution is to ensure that all required input keys are present in the state dictionary before executing the node. Specifically, make sure that the variable names in your prompt dictionary match exactly with what the PromptTemplate expects. In your case, it should include "properties", "foo", and "$defs" along with question [1].

Here's a checklist to help troubleshoot the issue:

  1. Check Variable Names: Ensure that the variable names in your prompt dictionary match exactly with what the PromptTemplate expects.

  2. Correct Template Usage: Make sure that the PromptTemplate is being used correctly in your code. The template should have placeholders for all expected variables.

  3. State Dictionary: Verify that your state dictionary includes all necessary keys and their corresponding values before the PromptTemplate is executed.

  4. Escaping Variables: If any of the missing variables are intended to be part of the string rather than variables, escape them with double curly braces (e.g., {{"properties"}}).

  5. Library Version: Ensure that there are no other conflicting dependencies or configurations in your environment.

If the issue persists, double-check the configuration and the way the PromptTemplate is being initialized and used in your code. Comparing your implementation with a working example from the documentation might also be helpful.
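Item 4 of the checklist (escaping) can be sketched with plain `str.format`, which follows the same brace rules as f-string style prompt templates. This is only an illustration: `escape_braces` and `schema_text` are hypothetical names, not part of scrapegraphai or LangChain, and the schema is a shortened stand-in.

```python
def escape_braces(text: str) -> str:
    """Double every brace so str.format treats it as a literal character."""
    return text.replace("{", "{{").replace("}", "}}")


# Hypothetical, shortened JSON schema text that is meant to appear verbatim
# in the prompt (an assumption for illustration, not the library's own text).
schema_text = '{"properties": {"title": {"type": "string"}}, "required": ["title"]}'

# Only {question} is left as a real placeholder; the schema braces are escaped.
template = (
    "Here is the output schema:\n"
    + escape_braces(schema_text)
    + "\nQuestion: {question}"
)

rendered = template.format(question="List me all the projects.")
print(rendered)
```

After formatting, the doubled braces collapse back to single ones, so the model still sees the original JSON schema text.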


dosubot[bot] avatar Feb 27 '25 17:02 dosubot[bot]

It seems scrapegraphai==1.40.1 fixes this issue.

redliu312 avatar Feb 28 '25 00:02 redliu312

It seems scrapegraphai==1.40.1 fixes this issue.

Nope. Once you specify a schema, the error is thrown by LangChain.

gray19950103 avatar Feb 28 '25 08:02 gray19950103

IMPORTANT: The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]} the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:

{"properties": {"title": {"title": "Title", "type": "string"}, "price": {"title": "Price", "type": "integer"}, "year": {"title": "Year", "type": "integer"}, "mileage": {"title": "Mileage", "type": "integer"}, "transfers": {"title": "Transfers", "type": "integer"}, "brand": {"title": "Brand", "type": "string"}, "series": {"title": "Series", "type": "string"}}, "required": ["title", "price", "year", "mileage", "transfers", "brand", "series"]}

scrapegraphai adds this text to the prompt template, which seems to lead LangChain to look for input variables named "properties" and "foo", and these are never provided.
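A minimal sketch of that diagnosis, using only the standard library. It assumes the template is parsed with the same brace rules as `str.format` (which, as far as I know, is what f-string style templates do); `template_variables` and the shortened `format_instructions` text are hypothetical stand-ins, not scrapegraphai or LangChain code.

```python
from string import Formatter


def template_variables(template: str) -> set[str]:
    """Collect the top-level replacement-field names that str.format would see."""
    return {name for _, name, _, _ in Formatter().parse(template) if name is not None}


# Shortened stand-in for the format instructions quoted above (assumption).
format_instructions = (
    'As an example, for the schema {"properties": {"foo": {"type": "array"}}} '
    'the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema.'
)

# The JSON braces are left unescaped, so they are parsed as placeholders.
template = format_instructions + "\nQuestion: {question}"

expected = template_variables(template)
received = {"question"}
missing = expected - received
print("Expected:", sorted(expected))
print("Missing: ", sorted(missing))
```

The quoted keys show up as "variables" with their quotation marks included, which matches the names listed in the error message. A schema with nested models would presumably add `"$defs"` in the same way.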

gray19950103 avatar Feb 28 '25 08:02 gray19950103

Does it work now?

VinciGit00 avatar Mar 01 '25 20:03 VinciGit00

Does it work now?

But I am on the latest version, 1.40.1. And yes, I just checked again: the issue is still here, @VinciGit00.

raiden076 avatar Mar 02 '25 09:03 raiden076

Any update here? @VinciGit00 👀

raiden076 avatar Mar 03 '25 15:03 raiden076

I could reproduce this issue with today's release: v1.42.1.

alvin-sunco avatar Mar 12 '25 16:03 alvin-sunco

The issue exists in 1.43.0 as well: when a schema is provided, the error is thrown.

--- Executing GenerateAnswer Node ---
Error during chain execution: 'Input to PromptTemplate is missing variables {\'"properties"\', \'"$defs"\', \'"foo"\'}.  Expected: [\'"$defs"\', \'"foo"\', \'"properties"\', \'question\'] Received: [\'question\']\nNote: if you intended {"properties"} to be part of the string and not a variable, please escape it with double curly braces like: \'{{"properties"}}\'.\nFor troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/INVALID_PROMPT_INPUT '
Traceback (most recent call last):
  File "/Users/davidguthu/Projects/scraper-research/scrapegraphai/demo.py", line 106, in <module>
    result = smart_scraper_graph.run()
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/scrapegraphai/graphs/smart_scraper_graph.py", line 296, in run
    self.final_state, self.execution_info = self.graph.execute(inputs)
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/scrapegraphai/graphs/base_graph.py", line 358, in execute
    return self._execute_standard(initial_state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/scrapegraphai/graphs/base_graph.py", line 303, in _execute_standard
    raise e
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/scrapegraphai/graphs/base_graph.py", line 276, in _execute_standard
    result, node_exec_time, cb_data = self._execute_node(
                                      ^^^^^^^^^^^^^^^^^^^
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/scrapegraphai/graphs/base_graph.py", line 200, in _execute_node
    result = current_node.execute(state)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/scrapegraphai/nodes/generate_answer_node.py", line 209, in execute
    answer = self.invoke_with_timeout(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/scrapegraphai/nodes/generate_answer_node.py", line 79, in invoke_with_timeout
    response = chain.invoke(inputs)
               ^^^^^^^^^^^^^^^^^^^^
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/langchain_core/runnables/base.py", line 3027, in invoke
    input = context.run(step.invoke, input, config, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/langchain_core/prompts/base.py", line 210, in invoke
    return self._call_with_config(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/langchain_core/runnables/base.py", line 1927, in _call_with_config
    context.run(
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/langchain_core/runnables/config.py", line 396, in call_func_with_variable_args
    return func(input, **kwargs)  # type: ignore[call-arg]
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/langchain_core/prompts/base.py", line 184, in _format_prompt_with_error_handling
    _inner_input = self._validate_input(inner_input)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/davidguthu/.local/share/virtualenvs/scrapegraphai-uKyFzAAI/lib/python3.12/site-packages/langchain_core/prompts/base.py", line 178, in _validate_input
    raise KeyError(
KeyError: 'Input to PromptTemplate is missing variables {\'"properties"\', \'"$defs"\', \'"foo"\'}.  Expected: [\'"$defs"\', \'"foo"\', \'"properties"\', \'question\'] Received: [\'question\']\nNote: if you intended {"properties"} to be part of the string and not a variable, please escape it with double curly braces like: \'{{"properties"}}\'.\nFor troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/INVALID_PROMPT_INPUT '

davidguthu avatar Mar 15 '25 17:03 davidguthu

Hi, @raiden076. I'm Dosu, and I'm helping the Scrapegraph-ai team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • The issue involves an error with SmartScraperGraph in version 1.40.1 when using a defined schema.
  • Missing expected variables in PromptTemplate, such as '"properties"', '"foo"', and '"$defs"'.
  • You followed documentation examples, but the error persists.
  • @redliu312 thought it was resolved in version 1.40.1, but @gray19950103 confirmed the error still occurs.
  • @alvin-sunco and @davidguthu confirmed the issue persists in versions 1.42.1 and 1.43.0.

Next Steps:

  • Please let me know if this issue is still relevant to the latest version of the Scrapegraph-ai repository by commenting here.
  • If no updates are provided, the issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!

dosubot[bot] avatar Jun 14 '25 16:06 dosubot[bot]