Sawood Alam
Semicolons are valid characters in URIs, so splitting on them while parsing the `Link` header results in an unexpected token and a wrong `url` attribute (notice the `;oldid=934259284` portion in the example below). ``` ; rel="original",...
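One way to avoid the problem described above is to match each `<...>` target as a whole before looking at its parameters, so a `;` inside the URI is never treated as a separator. The following is a minimal Ruby sketch of that idea, not the code of the library under discussion; the helper name `parse_link_header` and the `example.org` URLs are hypothetical, and only the `;oldid=934259284` fragment comes from the report.

```ruby
# Hypothetical helper (an assumption for illustration, not a real library API):
# parse a Link header without splitting inside the <...> target.
def parse_link_header(header)
  links = []
  # Group 1 is the URI between angle brackets; it may contain ';' or ','.
  # Group 2 is the run of ';name=value' parameters that follows it.
  link_re = /<([^>]*)>((?:\s*;\s*[^;,="\s]+(?:\s*=\s*(?:"[^"]*"|[^;,"\s]*))?)*)/
  param_re = /;\s*([^;,="\s]+)(?:\s*=\s*(?:"([^"]*)"|([^;,"\s]*)))?/

  header.scan(link_re) do |url, params|
    attrs = {}
    params.scan(param_re) do |name, quoted, token|
      # A parameter value is either a quoted string or a bare token.
      attrs[name.downcase] = quoted || token || ""
    end
    links << { url: url, attrs: attrs }
  end
  links
end

# Example input with made-up URLs; the first target contains a semicolon.
header = '<https://example.org/w/index.php?title=Example;oldid=934259284>; rel="original", ' \
         '<https://example.org/timemap>; rel="timemap"; type="application/link-format"'
links = parse_link_header(header)
links.each { |link| puts "#{link[:url]} #{link[:attrs]}" }
```

Because the target is consumed up to its closing `>`, the `url` attribute keeps the `;oldid=934259284` portion intact, and quoted parameter values may contain commas without ending the link.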
It would be great to have multiple workflow files equipped with the comment runner but with different setups (e.g., one with all the dependencies and services installed to run...
This PR implements the feature requested in #112, but in a different way. It declares a special file ``$TMPDIR/comment-buffer.txt`` as a temporary buffer; whatever content is written to it will be posted as...
Skip the job when the trigger phrase is missing in a comment.
It would be really helpful to be able to interact with page revisions in some applications. Are there any plans to add this in some capacity soon?
I am using the Docker image `andypetrella/spark-notebook:0.6.3-scala-2.11.7-spark-1.6.1-hadoop-2.7.2` as my base image. After adding my code and other dependencies, I set the `ADD_JARS` env var. ``` ENV ADD_JARS=/path/to/warcbase-fatjar.jar ``` Then I...
[WARC](http://iipc.github.io/warc-specifications/) is a well-known format for storing crawled captures. It can store an arbitrary number of HTTP requests and responses, as well as other network interactions such as DNS lookups, along with their...
I am aware of the examples that show arrowheads with some extra code, but it would be great to have a property in the edge configuration that allows us to...
Currently, the following code is used to split the document into tokens/words for training and classification. ```ruby str.gsub(/[^\p{WORD}\s]/, '').downcase.split ``` This covers the general case, but there could be situations where...
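To make the behavior of the quoted expression concrete, the sketch below wraps it in a method and contrasts it with a caller-supplied tokenizer. The names `default_tokenize`, `tokenize`, and `hyphen_aware` and the `tokenizer:` keyword are assumptions made for illustration; they are not the library's API and not necessarily what this issue goes on to propose.

```ruby
# The expression quoted above, wrapped in a method for illustration.
# It drops every character that is neither a word character nor whitespace,
# lowercases the result, and splits on whitespace.
def default_tokenize(str)
  str.gsub(/[^\p{WORD}\s]/, '').downcase.split
end

# Hypothetical extension point (an assumption, not the library's API):
# accept any callable that maps a string to an array of tokens.
def tokenize(str, tokenizer: method(:default_tokenize))
  tokenizer.call(str)
end

# An example custom tokenizer that keeps hyphenated compounds together.
hyphen_aware = ->(s) { s.downcase.scan(/\p{Word}+(?:-\p{Word}+)*/) }

text = "Hello, World! A state-of-the-art parser."
default_tokens = tokenize(text)
custom_tokens  = tokenize(text, tokenizer: hyphen_aware)
puts default_tokens.inspect
puts custom_tokens.inspect
```

With the default expression, punctuation is removed before splitting, so a hyphenated compound collapses into a single run of letters; the custom callable shows one case where a different splitting rule could be preferable.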
Before releasing `2.2.0`, the magical `train_*` and `untrain_*` style methods should be deprecated. * [x] Add deprecation warning * [ ] Add tests to not use these methods * [x] Update...