justext

A Go package that implements the JusText boilerplate removal algorithm

Results 14 justext issues

Hi there! 😊 This repo seems to depend on `github.com/levigross/exp-html` which doesn't ship a license file. This was identified in our CI pipeline using `github.com/google/go-licenses`. To me this looks like...

Document the source code and provide a useful set of examples. Update the README. Use GitHub project pages for a code-level explanation of the algorithm and examples of use.

project-structure

```
func removeComments(root *html.Node) {
	var toBeRemoved []*html.Node
	var markRemovableNodes = func(node *html.Node) {
		if node.Type == html.CommentNode {
			toBeRemoved = append(toBeRemoved, node)
		}
	}
	nodeIter(root, markRemovableNodes)
	for _, node...
```
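The snippet above is cut off, but the visible part follows a collect-then-remove pattern: walk the tree, record the comment nodes, and only detach them after the walk has finished, so the traversal never runs over a child list that is being modified. A minimal, self-contained sketch of that pattern follows. It assumes a simplified `Node` type standing in for `html.Node`; `Node`, `appendChild`, `removeChild`, `countNodes` and `buildSample` are hypothetical names for illustration, not the package's actual code.

```go
package main

import "fmt"

// NodeType and Node are hypothetical stand-ins for the HTML node type used
// by the package; only the fields this sketch needs are modelled.
type NodeType int

const (
	ElementNode NodeType = iota
	TextNode
	CommentNode
)

type Node struct {
	Type     NodeType
	Data     string
	Parent   *Node
	Children []*Node
}

// appendChild attaches c as the last child of n.
func (n *Node) appendChild(c *Node) {
	c.Parent = n
	n.Children = append(n.Children, c)
}

// removeChild detaches c from n if c is one of n's children.
func (n *Node) removeChild(c *Node) {
	for i, child := range n.Children {
		if child == c {
			n.Children = append(n.Children[:i], n.Children[i+1:]...)
			c.Parent = nil
			return
		}
	}
}

// nodeIter visits n and all of its descendants depth-first.
func nodeIter(n *Node, visit func(*Node)) {
	visit(n)
	for _, c := range n.Children {
		nodeIter(c, visit)
	}
}

// removeComments collects every comment node during the walk and detaches
// them afterwards, so no child list changes while it is being iterated.
func removeComments(root *Node) {
	var toBeRemoved []*Node
	nodeIter(root, func(n *Node) {
		if n.Type == CommentNode {
			toBeRemoved = append(toBeRemoved, n)
		}
	})
	for _, n := range toBeRemoved {
		if n.Parent != nil {
			n.Parent.removeChild(n)
		}
	}
}

// countNodes returns how many nodes of type t exist under root (inclusive).
func countNodes(root *Node, t NodeType) int {
	count := 0
	nodeIter(root, func(n *Node) {
		if n.Type == t {
			count++
		}
	})
	return count
}

// buildSample builds a small tree with two comments and two text nodes.
func buildSample() *Node {
	root := &Node{Type: ElementNode, Data: "div"}
	root.appendChild(&Node{Type: CommentNode, Data: "header"})
	root.appendChild(&Node{Type: TextNode, Data: "hello"})
	p := &Node{Type: ElementNode, Data: "p"}
	root.appendChild(p)
	p.appendChild(&Node{Type: CommentNode, Data: "inline"})
	p.appendChild(&Node{Type: TextNode, Data: "world"})
	return root
}

func main() {
	root := buildSample()
	removeComments(root)
	fmt.Println("comments left:", countNodes(root, CommentNode))
	fmt.Println("text nodes left:", countNodes(root, TextNode))
}
```

Deferring the removal keeps the walker simple: `nodeIter` never has to cope with a child disappearing in the middle of its loop.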

examples (showing filename: `grep` output)
```
512/3390ce13a50c7593b9ab6fcd539043ab: <style> .s9DpES {display: none; } </style>
512/3390ce13a50c7593b9ab6fcd539043ab: <style> .jsOffDisplayBlock { display: block; } .jsOffDisplayInline { display: inline; } .jsOffVisibility { visibility: visible;...
```

I believe the latest version of go-bindata takes different arguments and generates significantly different code than the code you have in defaultTemplate.go and detailedTemplate.go. You may want to consider updating...

project-structure

This includes making the project installable via "go install github.com/JalfResi/gojustext"

project-structure

Should stopword languages be in subpackages? Investigate (would speed up compile time!)
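One layout worth weighing while investigating is the registration pattern the standard library uses for `database/sql` drivers and `image` formats: each language lives in its own subpackage and registers its word list from `init()`, so a program only compiles the languages it imports. The sketch below shows only the registry side in a single file. `Register`, `IsStopword` and `Languages` are hypothetical names, the word list is a placeholder, and whether this actually shortens compile time is exactly what the issue asks to investigate.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

var (
	mu       sync.RWMutex
	registry = map[string]map[string]struct{}{}
)

// Register stores the stopword list for lang. In a split layout each
// language subpackage would call it from its own init function.
func Register(lang string, words []string) {
	set := make(map[string]struct{}, len(words))
	for _, w := range words {
		set[w] = struct{}{}
	}
	mu.Lock()
	defer mu.Unlock()
	registry[lang] = set
}

// IsStopword reports whether word is a stopword in lang. ok is false when
// no list has been registered for lang.
func IsStopword(lang, word string) (stop, ok bool) {
	mu.RLock()
	defer mu.RUnlock()
	set, ok := registry[lang]
	if !ok {
		return false, false
	}
	_, stop = set[word]
	return stop, true
}

// Languages returns the registered language names in sorted order.
func Languages() []string {
	mu.RLock()
	defer mu.RUnlock()
	langs := make([]string, 0, len(registry))
	for lang := range registry {
		langs = append(langs, lang)
	}
	sort.Strings(langs)
	return langs
}

// This init stands in for a hypothetical language subpackage; the three
// words are placeholders, not the package's real list.
func init() {
	Register("English", []string{"the", "and", "of"})
}

func main() {
	fmt.Println(Languages())
	fmt.Println(IsStopword("English", "the"))
	fmt.Println(IsStopword("Klingon", "the"))
}
```

With this shape a caller would opt in to a language through a blank import of its subpackage, the same way database drivers are pulled in.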

project-structure