Google Code Exporter

11117 issues from Google Code Exporter

I am trying to use boilerpipe to extract articles from URLs whose content is in a non-English language. However, it generates some ASCII text; see this example (http://boilerpipe-web.appspot.com/extract?url=http%3A%2F%2Fwww.sandesh.com%2Farticle.aspx%3Fnewsid%3D2905443&extractor=ArticleExtractor&output=htmlFragment&extractImages=). I saw this issue (https://code.google.com/p/boilerpipe/issues/detail?id=16&q=non%20english). I...

Type-Defect
Priority-Medium
auto-migrated

What steps will reproduce the problem? (using ver. 1.2.0) 1. HTMLParse "http://worldwidescience.org/topicpages/s.html". ArticleExtractor is just fine for demonstration purposes. With 8 GB of JVM memory, this will result in an out...

Type-Defect
Priority-Medium
auto-migrated

If you use HttpURLConnection instead of URLConnection in Java, you can access the requested web page. Try the following code: HttpURLConnection httpcon = (HttpURLConnection)...

Type-Defect
Priority-Medium
auto-migrated
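
The HttpURLConnection suggestion in the entry above is cut off, so its actual code is not known. The following is only a minimal sketch of the general idea, under stated assumptions: the class name `HttpFetchSketch`, the helper `open`, the `User-Agent` header, and the timeout values are all hypothetical and do not come from the original issue.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical sketch; not the code from the original issue.
class HttpFetchSketch {

    // Prepares a GET request without sending it yet.
    // The User-Agent header and the 10-second timeouts are assumptions.
    static HttpURLConnection open(String address, String userAgent) throws IOException {
        // For http:// and https:// addresses, openConnection() returns
        // an HttpURLConnection, so the cast below is safe for those schemes.
        HttpURLConnection httpcon =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        httpcon.setRequestMethod("GET");
        httpcon.setRequestProperty("User-Agent", userAgent);
        httpcon.setConnectTimeout(10_000); // milliseconds
        httpcon.setReadTimeout(10_000);    // milliseconds
        return httpcon;
    }
}
```

Nothing is sent until the response is requested, for example through `httpcon.getInputStream()`. That stream could then be wrapped in a `Reader` and passed to an extractor's `getText(Reader)` method, which another entry in this list mentions for `DefaultExtractor`.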

Hello Boilerpipe, when ArticleExtractor.INSTANCE.getText(url) is called for a web page that contains code (like the one below), the function does not return the whole text. The expected returned text [1]...

Type-Defect
Priority-Medium
auto-migrated

As the title says, I could not find a way to fetch the text and its hyperlinks from a web page simultaneously. Has anyone found a solution...

Type-Defect
Priority-Medium
auto-migrated

What steps will reproduce the problem? 1. Process the attached file using DefaultExtractor.getText(Reader r) 2. 3. What is the expected output? What do you see instead? An exception of...

Type-Defect
Priority-Medium
auto-migrated

What steps will reproduce the problem? 1. Process a BoilerplateBlockFilter with e.g. labelToKeep = "de.l3s.boilerpipe/HEADING" 2. See that text blocks with the HEADING label are not kept. What is...

Type-Defect
Priority-Medium
auto-migrated

Hi, in https://code.google.com/p/boilerpipe/source/browse/trunk/boilerpipe-core/src/main/de/l3s/boilerpipe/filters/simple/MinClauseWordsFilter.java I hit an infinite loop at lines 73-82. The text variable is "Ledere, tjenestemænd og lærere, der ikke er medlemmer af Lærernes...

Type-Defect
Priority-Medium
auto-migrated

I am using Boilerpipe through both the web API and the library API. For example, on the site http://www.davidicke.com/forum/showthread.php?page=2&t=72909, the Boilerpipe web API works properly while the boilerpipe API returns the error "java.io.IOException:...

Type-Defect
Priority-Medium
auto-migrated

What steps will reproduce the problem? 1. Attempt to use boilerplate & RichFaces 3.x together. What is the expected output? What do you see instead? First problem you're likely...

Type-Defect
Priority-Medium
auto-migrated