cpuhrsch

Results 33 issues of cpuhrsch

find_match searches a list of strings and returns the first entry that partially or fully contains the given string match.

cla signed
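The find_match behavior described above can be sketched in a few lines. This is a minimal sketch, not the PR's code: the argument order, the parameter names, and returning None when nothing matches are all assumptions.

```python
from typing import Optional, Sequence


def find_match(match: str, candidates: Sequence[str]) -> Optional[str]:
    """Return the first candidate that contains `match` as a substring.

    A full match is just the special case where the candidate equals `match`.
    Returning None when nothing matches is an assumption of this sketch.
    """
    for candidate in candidates:
        if match in candidate:
            return candidate
    return None
```

Because the search stops at the first hit, the order of the list decides which entry is returned when several entries contain the string.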

This PR changes the IMDB download to use the stored filename to detect whether the data has already been downloaded. This can further prevent unnecessary queries to Google Drive.

cla signed
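The idea of checking for a stored filename before downloading can be sketched as follows. The helper name needs_download and its parameters are hypothetical, and the sketch only tests for the file's existence; it does not reproduce the actual torchtext download code.

```python
import os


def needs_download(root: str, filename: str) -> bool:
    """Return True if `filename` is not yet present under `root`.

    Hypothetical helper: a caller would skip the remote request
    entirely whenever this returns False.
    """
    return not os.path.isfile(os.path.join(root, filename))
```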

This can reduce build time

cla signed

Edit raw.translation dataset to return a RawTextIterableDataset, which uses worker information to restrict the underlying iterator to a subset such that DataLoader won't return duplicate entries, if given an instance...

cla signed
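One common way to restrict an iterator to a per-worker subset is strided slicing, sketched below. This is an assumption about the approach, not the PR's implementation: shard_for_worker is a hypothetical helper, and it takes worker_id and num_workers as plain arguments. In PyTorch these values are available inside a worker through torch.utils.data.get_worker_info().

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def shard_for_worker(items: Iterable[T], worker_id: int, num_workers: int) -> Iterator[T]:
    """Yield every num_workers-th item, starting at index worker_id.

    Each worker sees a disjoint subset, and together the workers
    cover every item exactly once.
    """
    return islice(items, worker_id, None, num_workers)
```

With three workers over ten items, worker 0 receives indices 0, 3, 6, 9, worker 1 receives 1, 4, 7, and worker 2 receives 2, 5, 8, so no entry is duplicated.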

This PR typedefs strings that are meant to be constant, i.e. read-only. They can then be optionally replaced by std::string_view. Also adds the ever so important "#pragma once" to the...

cla signed

Create a single union regular expression that uses a lambda to query a dictionary of patterns for the correct replacement. This yields a significant speedup, but it is different from the...

cla signed
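The union-regex technique can be sketched as below. This is a simplified sketch under stated assumptions: the dictionary keys are treated as literal strings (escaped with re.escape) rather than arbitrary patterns, and build_replacer is a hypothetical name.

```python
import re
from typing import Callable, Dict


def build_replacer(replacements: Dict[str, str]) -> Callable[[str], str]:
    """Compile all keys into one alternation and replace in a single pass.

    Assumption: keys are literal strings, not regular expressions.
    """
    # Longer keys first, so the alternation prefers the longest literal
    # when one key is a prefix of another.
    keys = sorted(replacements, key=len, reverse=True)
    union = re.compile("|".join(re.escape(key) for key in keys))

    def replace(text: str) -> str:
        # The lambda looks up the matched text in the dictionary.
        return union.sub(lambda m: replacements[m.group(0)], text)

    return replace
```

A single pass does not rescan text it has already replaced, so its output can differ from applying the same substitutions one after another. For example, with {"a": "b", "b": "c"} the input "ab" becomes "bc" here, whereas sequential application would give "cc".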

Right now the default is sometimes "train", "test", "valid" and sometimes (more commonly) "train", "valid", "test". We should pick a single convention (this PR opts for the latter) to...

cla signed

Language modeling datasets construct *all* datasets even if only a subset is requested. They also store the fully numericalized version of the dataset if it's stored as "a single line"...

cla signed

a) The documentation doesn't clearly state that one factory function is meant to be used to construct a Vocabulary from a dataset (e.g. AG_NEWS) and another is meant to be...

cla signed