Closed
Description
Describe the bug
OmniScraperGraph throws an error. Tested with the minimal example from the GitHub repository:
omni_scraper_openai.py
To Reproduce
mkdir test
cd test
python3 -m venv venv
source venv/bin/activate
pip install scrapegraphai \
"scrapegraphai[burr]" \
"scrapegraphai[more-browser-options]" \
"scrapegraphai[other-language-models]" \
langchain_google_vertexai --no-cache # to have a clean environment
# It does not start without all of these libraries. This is potentially a bug in itself.
playwright install
# Use the provided OpenAI example from GitHub
# Set up the OpenAI key in .env
python3 omni_scraper_openai.py
Output
--- Executing Fetch Node ---
--- (Fetching HTML from: https://perinim.github.io/projects/) ---
--- Executing Parse Node ---
--- Executing ImageToText Node ---
Traceback (most recent call last):
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/scrapegraphai/nodes/base_node.py", line 112, in get_input_keys
input_keys = self._parse_input_keys(state, self.input)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/scrapegraphai/nodes/base_node.py", line 236, in _parse_input_keys
raise ValueError("No state keys matched the expression.")
ValueError: No state keys matched the expression.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/lollo/Desktop/test/test.py", line 42, in <module>
result = omni_scraper_graph.run()
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/scrapegraphai/graphs/omni_scraper_graph.py", line 124, in run
self.final_state, self.execution_info = self.graph.execute(inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/scrapegraphai/graphs/base_graph.py", line 263, in execute
return self._execute_standard(initial_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/scrapegraphai/graphs/base_graph.py", line 185, in _execute_standard
raise e
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/scrapegraphai/graphs/base_graph.py", line 169, in _execute_standard
result = current_node.execute(state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/scrapegraphai/nodes/image_to_text_node.py", line 54, in execute
input_keys = self.get_input_keys(state)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/scrapegraphai/nodes/base_node.py", line 116, in get_input_keys
raise ValueError(f"Error parsing input keys for {self.node_name}: {str(e)}")
ValueError: Error parsing input keys for ImageToText: No state keys matched the expression.
Adding burr arguments to graph config:
"burr_kwargs": {
"project_name": "test-scraper",
"app_instance_id":"1234",
}
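For context, here is a minimal sketch of where this fragment would sit in the graph configuration. Only the `burr_kwargs` entry comes from this report; the `llm` block, the model name, and the `OPENAI_APIKEY` variable name are assumptions based on the usual shape of the example scripts and may differ from the script actually used.

```python
import os

# Assumed config layout: only "burr_kwargs" is taken from this report.
graph_config = {
    "llm": {
        # The environment variable name and model are placeholders.
        "api_key": os.environ.get("OPENAI_APIKEY", ""),
        "model": "gpt-4o",
    },
    "burr_kwargs": {
        "project_name": "test-scraper",
        "app_instance_id": "1234",
    },
}

print(sorted(graph_config))
```

The dictionary would then be passed as the `config` argument when constructing the graph, as in the example script.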
Starting action: Fetch
--- Executing Fetch Node ---
--- (Fetching HTML from: https://perinim.github.io/projects/) ---
********************************************************************************
-------------------------------------------------------------------
Oh no an error! Need help with Burr?
Join our discord and ask for help! https://discord.gg/4FxBMyzW5n
-------------------------------------------------------------------
> Action: `Fetch` encountered an error!<
> State (at time of action):
{'__SEQUENCE_ID': 0,
'url': 'https://perinim.github.io/projects/',
'user_prompt': "'List me all the projects with their titles and im..."}
> Inputs (at time of action):
{}
********************************************************************************
Traceback (most recent call last):
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/burr/core/application.py", line 561, in _step
new_state = _run_reducer(next_action, self._state, result, next_action.name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/burr/core/application.py", line 199, in _run_reducer
_validate_reducer_writes(reducer, new_state, name)
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/burr/core/application.py", line 174, in _validate_reducer_writes
raise ValueError(
ValueError: State is missing write keys after running: Fetch. Missing keys are: {'link_urls', 'img_urls'}. Has writes: ['doc', 'link_urls', 'img_urls']
Finishing action: Fetch
Traceback (most recent call last):
File "/Users/lollo/Desktop/test/test.py", line 46, in <module>
result = omni_scraper_graph.run()
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/scrapegraphai/graphs/omni_scraper_graph.py", line 124, in run
self.final_state, self.execution_info = self.graph.execute(inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/scrapegraphai/graphs/base_graph.py", line 260, in execute
result = bridge.execute(initial_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/scrapegraphai/integrations/burr_bridge.py", line 215, in execute
last_action, result, final_state = self.burr_app.run(
^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/burr/telemetry.py", line 273, in wrapped_fn
return call_fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/burr/core/application.py", line 893, in run
next(gen)
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/burr/core/application.py", line 838, in iterate
prior_action, result, state = self.step(inputs=inputs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/burr/core/application.py", line 515, in step
out = self._step(inputs=inputs, _run_hooks=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/burr/core/application.py", line 568, in _step
raise e
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/burr/core/application.py", line 561, in _step
new_state = _run_reducer(next_action, self._state, result, next_action.name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/burr/core/application.py", line 199, in _run_reducer
_validate_reducer_writes(reducer, new_state, name)
File "/Users/lollo/Desktop/test/venv/lib/python3.11/site-packages/burr/core/application.py", line 174, in _validate_reducer_writes
raise ValueError(
ValueError: State is missing write keys after running: Fetch. Missing keys are: {'link_urls', 'img_urls'}. Has writes: ['doc', 'link_urls', 'img_urls']
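The Burr error above says that the Fetch action declares three write keys but the resulting state lacks two of them. The first traceback may stem from the same gap (ImageToText finding no matching input key), although it does not name the key. A minimal sketch of that kind of check follows. The helper `missing_write_keys` is hypothetical and is not Burr's implementation, and the state contents are placeholders.

```python
def missing_write_keys(declared_writes, state):
    """Return the declared write keys that are absent from the state."""
    return set(declared_writes) - set(state)


# Declared writes are copied from the error message; state values are placeholders.
declared = ["doc", "link_urls", "img_urls"]
state_after_fetch = {
    "url": "https://perinim.github.io/projects/",
    "doc": ["..."],
}

missing = missing_write_keys(declared, state_after_fetch)
if missing:
    print(f"Missing keys: {sorted(missing)}")
```

Running a check like this on the state right after Fetch would help confirm whether `link_urls` and `img_urls` are ever produced on this code path.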
Desktop
- OS: macOS (Intel)
- ScrapeGraphAI version: 1.13.3 / 1.14.0 / 1.14.1 / 1.15.0 (tested on all)
- Python 3.11.9 (not Python 3.12; see "Package google-crc32c does not support Python 12" #568)