crawl4ai
[Bug]: Wrong function params shown in docs for BM25 Content Filter
crawl4ai version
0.6.2
Expected Behavior
The docs include an example showing how to use the BM25 content filter. This snippet is taken from it:
bm25_filter = BM25ContentFilter(
    user_query="machine learning",
    bm25_threshold=1.2,
    use_stemming=True
)
Any program that uses this snippet should run without errors.
Current Behavior
The current constructor has no parameter called use_stemming; it has a language parameter instead. This is the __init__ signature of the BM25 content filter:
def __init__(
    self,
    user_query: str = None,
    bm25_threshold: float = 1.0,
    language: str = "english",
)
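The mismatch can be reproduced without installing anything. The sketch below uses a stand-in class (BM25ContentFilterStandIn, a hypothetical name) that copies only the signature quoted above; its body is a placeholder and says nothing about how the real filter works. Calling it the way the docs example does should fail with a TypeError, while the same call with language goes through.

```python
# Stand-in that mirrors only the __init__ signature quoted in this issue.
# It is NOT the real crawl4ai class; the body is a hypothetical placeholder.
class BM25ContentFilterStandIn:
    def __init__(
        self,
        user_query: str = None,
        bm25_threshold: float = 1.0,
        language: str = "english",
    ):
        self.user_query = user_query
        self.bm25_threshold = bm25_threshold
        self.language = language


def try_docs_snippet():
    """Call the constructor as the docs example does; return the error text, if any."""
    try:
        BM25ContentFilterStandIn(
            user_query="machine learning",
            bm25_threshold=1.2,
            use_stemming=True,  # keyword from the docs, absent from the signature
        )
    except TypeError as exc:
        return str(exc)
    return None


error_message = try_docs_snippet()
print(error_message)

# Same call, using the keyword that the quoted signature does define.
working = BM25ContentFilterStandIn(
    user_query="machine learning",
    bm25_threshold=1.2,
    language="english",
)
```

Until the docs are updated, replacing use_stemming=True with a language argument (or dropping it to use the default) appears to be the workaround implied by the signature.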
Is this reproducible?
Yes
Inputs Causing the Bug
Steps to Reproduce
1. Go to https://docs.crawl4ai.com/core/markdown-generation/
2. Scroll down to the 5.1 BM25ContentFilter section
3. The snippet shown there uses the incorrect parameter.
Code snippets
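A small check along these lines could flag this kind of docs drift. It is only a sketch: unknown_kwargs and reported_init are hypothetical names, and reported_init is a stand-in carrying the signature quoted in this issue rather than the library's own code.

```python
import inspect


def unknown_kwargs(func, kwargs):
    """Return the keyword names in kwargs that func's signature does not accept."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter accepts any keyword, so nothing can be flagged.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    accepted = {
        name
        for name, p in params.items()
        if p.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    }
    return [name for name in kwargs if name not in accepted]


# Stand-in with the signature quoted in this issue (not the real method).
def reported_init(
    self,
    user_query: str = None,
    bm25_threshold: float = 1.0,
    language: str = "english",
):
    pass


# Keyword arguments exactly as they appear in the docs example.
docs_kwargs = {
    "user_query": "machine learning",
    "bm25_threshold": 1.2,
    "use_stemming": True,
}

rejected = unknown_kwargs(reported_init, docs_kwargs)
print(rejected)
```

With crawl4ai installed, the real class's __init__ could be passed in place of reported_init to run the same check against the installed version.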
OS
Windows using WSL (Ubuntu 22.04.5)
Python version
3.10.12
Browser
No response
Browser version
No response
Error logs & Screenshots (if applicable)
Traceback (most recent call last):
File "/home/****///testing_scraping.py", line 32, in