Google Code Exporter

11117 issues from Google Code Exporter

Hi, I am new to using this extractor. While trying to run a simple extractor using only boilerpipe-1.2.1.jar, I am getting an unsupported Content type error....

Type-Defect
Priority-Medium
auto-migrated

What steps will reproduce the problem? 1. If boilerpipe is at a higher precedence than the CyberNeko library, it causes a parsing issue on user input with unbalanced tags...

Type-Defect
Priority-Medium
auto-migrated

What steps will reproduce the problem? 1. Call ArticleExtractor.getInstance().getText() on the example data (Stability.html). What is the expected output? What do you see instead? The extraction takes a very...

Type-Defect
Priority-Medium
auto-migrated

What steps will reproduce the problem? 1. Missing de.l3s.boilerpipe.sax.ImageExtractor. What is the expected output? What do you see instead? Rebuilding jar from source has the missing de.l3s.boilerpipe.sax.ImageExtractor class file....

Type-Defect
Priority-Medium
auto-migrated

With boilerpipe-1.2.0.jar, ArticleExtractor.INSTANCE.getText(new java.net.URL("http://t.co/3RplOLjc")) produces:
ERROR java.lang.IllegalArgumentException: protocol = http host = null
    at de.l3s.boilerpipe.sax.HTMLFetcher.fetch (HTMLFetcher.java:33)
    at de.l3s.boilerpipe.extractors.ExtractorBase.getText (ExtractorBase.java:87)
This happens for many other URLs, e.g. http://t.co/5vuYimwn http://t.co/Dy5yolLs http://t.co/ShWhtFjP...

Type-Defect
Priority-Medium
auto-migrated

What steps will reproduce the problem? 1. Extract content from the page (in Chinese) with ArticleExtractor: http://www.ccgp.gov.cn/cggg/zybx/zbgg/201407/t20140731_3655909.shtml What is the expected output? What do you see instead? Footnote is...

Type-Defect
Priority-Medium
auto-migrated

What steps will reproduce the problem? 1. Give the URL as: http://www.newyorker.com/news/amy-davidson/shattered-school-gaza-2 2. Keep the extractor strategy as article extractor. 3. Extract. What is the expected output? What do you see...

Type-Defect
Priority-Medium
auto-migrated

What steps will reproduce the problem? 1. String content = CommonExtractors.DEFAULT_EXTRACTOR.getText(new URL("http://www.nytimes.com/2014/06/06/business/gm-ignition-switch-internal-recall-investigation-report.html?hp")); 2. System.out.println(content); 3. It prints nothing. When I run with the above URL, it's not extracting anything. I have...

Type-Defect
Priority-Medium
auto-migrated

Hi, I have a problem when I include Xerces.jar in the build path in order to use boilerpipe. It gives me an error when trying to run an Android application: UNEXPECTED TOP-LEVEL...

Type-Defect
Priority-Medium
auto-migrated

What steps will reproduce the problem? 1.2.0 doesn't exist in the central Maven repository. Original issue reported on code.google.com by `[email protected]` on 26 Mar 2014 at 5:22

Type-Defect
Priority-Medium
auto-migrated