python-boilerpipe

Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages

Results 28 python-boilerpipe issues

Hey guys, first of all, thanks for python-boilerpipe. I'm trying to use Boilerpipe but can't properly extract some documents...
> from boilerpipe.extract import Extractor
> extractorType="DefaultExtractor"
> sourceUrl = 'http://www.indiatimes.com/news/india/arvind-kejriwal-to-seek-political-sanyas-127620.html'
>...

This could be pretty cool, and I see it's available in boilerpipe itself.

I am having problems importing boilerpipe:

    In [2]: from boilerpipe.extract import Extractor
    RuntimeError Traceback (most recent call last) in ()
    ----> 1 from boilerpipe.extract import Extractor
    /usr/local/lib/python2.7/dist-packages/boilerpipe/__init__.py in ()...

Hello, thanks for this wrapper; it is wonderfully useful. Installing it took me a few minutes, so maybe these notes will help someone else (and perhaps could be added to the...

I used your files to install boilerpipe with success: ![image](https://cloud.githubusercontent.com/assets/7326531/2915504/2b434e24-d6ae-11e3-8583-97187c555eb9.png) I had previously created the JAVA_HOME variable under the (x86) Program Files. When I try to run the "from boilerpipe.extract...

Hey, python-boilerpipe works perfectly from the console and as a script, but when I try it out with my Flask application it breaks. It breaks when I try to instantiate...

Hi, when running this code on my Ubuntu 12.04 micro-instance:

    #!/usr/bin/python
    # import boilerpipe
    from boilerpipe.extract import Extractor
    extractor = Extractor(extractor='ArticleExtractor', url="http://europe.wsj.com/home-page")
    extracted_text = extractor.getText()
    print extracted_text
    extracted_html =...

Hello, firstly, thank you for python-boilerpipe. When I use wget to get the page http://www.flipkart.com/dell-xps-13-laptop-2nd-gen-ci7-4gb-256gb-ssd-win7-hp/p/itmdg387gmhzhx3m, save it to my disk, and then try to open it with python-boilerpipe using...