Tom Switzer
This isn't merge-ready yet, but since I know a few other people are interested (e.g. @non), I figured I'd make the PR now so I can get some more...
To support faster training (especially in local mode) we could be using compressed Bonsai trees directly in Brushfire. There's really no advantage to Brushfire's native tree type so we should...
Our tree generators for ScalaCheck are fairly complex, so if we created a new `brushfire-laws` package (or something similar) that included them, they could be reused elsewhere.
This is super early work, but we have a CsvTrainerJob, which can run on ~arbitrary CSVs, with the labels provided by the user. The actual types of the values will...
It seems like we'll need an ordering at some point in order for an error to be useful. In a world of incoherent type classes, it may be prudent to...
If we follow through with #40 and #42, then prediction will be handled by `sumLeaves` and each current `Voter` will essentially become a `VectorSpace` instance + a method to turn...
If data is missing, we should return true (indicating the edge should be followed). This works in conjunction with #40.
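A minimal sketch of that rule, assuming a missing value is modelled as `None`. The `Predicate`, `LessThan`, and `IsEq` names here are hypothetical and only for illustration; they are not Brushfire's actual API.

```scala
// Hypothetical predicate type for illustration; not Brushfire's real API.
sealed trait Predicate[V] {
  // A missing value (None) always passes, so the edge is followed.
  def apply(value: Option[V]): Boolean = value match {
    case None    => true
    case Some(v) => test(v)
  }

  // The actual comparison, only consulted when a value is present.
  protected def test(v: V): Boolean
}

final case class LessThan[V](threshold: V)(implicit ord: Ordering[V])
    extends Predicate[V] {
  protected def test(v: V): Boolean = ord.lt(v, threshold)
}

final case class IsEq[V](expected: V) extends Predicate[V] {
  protected def test(v: V): Boolean = v == expected
}
```

With this shape, `val p = LessThan(10)` gives `p(None) == true`, `p(Some(3)) == true`, and `p(Some(12)) == false`, so a row with a missing feature would follow every edge out of a split rather than being dropped.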
It seems to me that it may be nice to chunk the structure into simpler blocks, made up of `(Offset, Chunk Offset Dictionary, Bits)`, encoding those into the array, and...
Rather than requiring a `String`, we could actually take chunked input, such as what's accumulated in `inferDelimitedFormat` in the iteratee module. This would avoid a possibly large string concatenation. It...
The idea here is that instead of having the parser produce fully parsed rows, we may be able to parse things a bit quicker by simply parsing the entire row, without...