Firecrawl URL Importing is throwing error

Open ukiras123 opened this issue 1 year ago • 7 comments

Description

Installation

  • [x] pip install goldenverba
  • [ ] pip install from source
  • [ ] Docker installation

If you installed via pip, please specify the version:

Weaviate Deployment

  • [x] Local Deployment
  • [ ] Docker Deployment
  • [x] Cloud Deployment

Configuration

Reader: Chunker: Embedder: Retriever: Generator:

Steps to Reproduce

Try to import using a Firecrawl URL. It throws this error:

ℹ FileStatus.ERROR | 2024-09-10T04:36:25.979Z | Import for New Firecrawl Job failed: Reader Firecrawl failed with: 1 validation error for FileConfig metadata Field required [type=missing, input_value={'fileID': '2024-09-10T04... settings', 'took': 0}}}, input_type=dict] For further information visit https://errors.pydantic.dev/2.9/v/missing | 0
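
The message above is a Pydantic "missing" error: the dict handed to FileConfig had no "metadata" key, and the model treats that field as required. The sketch below reproduces the same class of error in isolation. The FileConfig defined here is a hypothetical two-field stand-in, not Verba's actual model, and the payload is made up for illustration; it assumes Pydantic v2, which matches the errors.pydantic.dev/2.9 link in the log.

```python
from pydantic import BaseModel, ValidationError


class FileConfig(BaseModel):
    """Hypothetical stand-in for illustration; not Verba's real FileConfig."""

    fileID: str
    metadata: str  # no default value, so Pydantic treats the field as required


# Illustrative payload that omits "metadata", as the error message suggests.
payload = {"fileID": "example-id"}

try:
    FileConfig.model_validate(payload)
    errors = []
except ValidationError as exc:
    # errors() returns one dict per failed field, with "type", "loc" and "msg".
    errors = exc.errors()

for err in errors:
    print(err["type"], err["loc"], err["msg"])
```

Reading the "loc" entry of each error is usually the quickest way to see which key the reader failed to supply.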

Additional context

It does not work with any website I tried. Firecrawl has also released a new version, so we might want to update to it as well.

ukiras123 • Sep 10 '24 04:09

ℹ FileStatus.ERROR | 2024-09-10T04:36:25.979Z | Import for New Firecrawl Job failed: Reader Firecrawl failed with: 1 validation error for FileConfig metadata Field required [type=missing, input_value={'fileID': '2024-09-10T04... settings', 'took': 0}}}, input_type=dict] For further information visit https://errors.pydantic.dev/2.9/v/missing | 0

ukiras123 • Sep 10 '24 04:09

Thanks for the issue! We'll look into it

thomashacker • Sep 13 '24 03:09

Same issue while trying to import a GitHub repo that contains Python code.

narmaku • Nov 18 '24 02:11

Thanks a lot! There was a bug in the Firecrawl Reader code; it should be fixed in the upcoming version 2.1.

thomashacker • Dec 09 '24 11:12

Version 2.1 got released, let me know if the error still persists and feel free to reopen

thomashacker • Dec 10 '24 10:12

> Version 2.1 got released, let me know if the error still persists and feel free to reopen

Getting this on 2.1.2 trying to import a GitHub repo (using this repo as an example).

ℹ Found existing Client
ℹ FileStatus.STARTING | 2025-02-18T07:57:31.247Z | Starting Import | 0
ℹ Fetched 240 document paths from https://api.github.com/repos/weaviate/Verba/git/trees/main?recursive=1
ℹ FileStatus.ERROR | 2025-02-18T07:57:31.247Z | Import for New Git Job failed: Reader Git failed with: Couldn't load retrieve .github/ISSUE_TEMPLATE/verba-feature-template.md: 1 validation error for FileConfig metadata   Field required [type=missing, input_value={'fileID': '2025-02-18T07... detected', 'took': 0}}}, input_type=dict]     For further information visit https://errors.pydantic.dev/2.10/v/missing | 0
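
The Git reader log shows the same "metadata Field required" failure as the Firecrawl one, which suggests the per-file dict is built without a "metadata" key before it is validated. One possible stopgap, sketched below, is to fill in a default before validation. The helper name with_default_metadata, the empty-string default, and the sample payload are all assumptions made for illustration; the thread does not show how Verba builds this dict or how the 2.1 fix was implemented.

```python
from typing import Any


def with_default_metadata(payload: dict[str, Any], default: str = "") -> dict[str, Any]:
    """Return a copy of payload that always has a "metadata" key.

    Hypothetical helper: whether an empty string is an acceptable default
    for Verba's FileConfig is an assumption, not something the thread confirms.
    """
    patched = dict(payload)  # shallow copy, so the caller's dict is not mutated
    patched.setdefault("metadata", default)  # only fills the key when it is absent
    return patched


# Illustrative payload, loosely modelled on the truncated input_value in the log.
incomplete = {"fileID": "example-id", "filename": "verba-feature-template.md"}
patched = with_default_metadata(incomplete)
```

A payload that already carries "metadata" passes through unchanged, so the helper is safe to apply to every file regardless of which reader produced it.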

hskiba • Feb 18 '25 08:02

Thanks for the report, I'll look into it!

thomashacker • Feb 18 '25 12:02