David Hou
There's no general policy on breaking changes in CONTRIBUTING.md; I think it'd be useful to have one.
WIP. Code/progress is not interesting at the moment; posting just for reference. Approach:
- _pool 4x4 with stride=2*stride
- permute kernel HW to the first 2 dims
- winograd by...
This is part 1/k for winograd (#1037). I will be splitting off orthogonal changes from that PR so that I can get some feedback / have less going...
this used to at least produce non-NaN losses. testing it...
these should all pass (or assert); not comprehensive!
Ensure occupancy. Optimize layout for group. Don't waste big memory. Probably really slow without beam.
There are many places in the codebase where fp16 is not adequate for some particular calculation. sum() is handled well; it selects the least upper dtype with float32. We might have...
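For reference, a rough sketch of the "least upper dtype with float32" idea for reductions (the reduce_acc_dtype helper is hypothetical, not tinygrad's API; numpy is used only for illustration):

```python
import numpy as np

# Hypothetical rule: reductions accumulate in the least upper bound of the
# input dtype and float32, so fp16 inputs get a float32 accumulator while
# float64 inputs keep float64.
def reduce_acc_dtype(dtype):
    return np.promote_types(dtype, np.float32)

print(reduce_acc_dtype(np.float16))  # float32
print(reduce_acc_dtype(np.float64))  # float64

x = np.full(4096, 0.1, dtype=np.float16)
acc = x.sum(dtype=reduce_acc_dtype(x.dtype))  # accumulate in float32
out = acc.astype(x.dtype)                     # cast back to the input dtype if desired
```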
when a tensor has a different dtype than default_float, backward will initialize the gradient to the wrong dtype (default_float instead of self.dtype)
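A toy sketch of the pattern (the Toy class is hypothetical, not tinygrad's Tensor): the seed gradient in backward should follow the tensor's own dtype rather than default_float.

```python
import numpy as np

DEFAULT_FLOAT = np.float32  # stand-in for default_float

class Toy:
    """Hypothetical tensor-ish class, only to show the seed-gradient dtype issue."""
    def __init__(self, data, dtype=DEFAULT_FLOAT):
        self.data = np.asarray(data, dtype=dtype)
        self.dtype = self.data.dtype
        self.grad = None

    def backward(self):
        # Buggy: np.ones_like(self.data, dtype=DEFAULT_FLOAT) would seed the
        # gradient with default_float even for fp16 tensors.
        # Correct: seed with the tensor's own dtype.
        self.grad = np.ones_like(self.data, dtype=self.dtype)

t = Toy([1.0, 2.0], dtype=np.float16)
t.backward()
print(t.grad.dtype)  # float16, matching the tensor instead of default_float
```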
a previously failing test and a quick fix. checking for unsafe pads can just be its own pass. need to think about this expand rule!
if you write t.expand(), the backward is sum(), but if t is fp16, then the sum will also be in fp16 and may lose precision or overflow. need to measure...
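A quick numpy illustration (not tinygrad code) of why this is risky: the backward of expand is a sum over the expanded axis, and fp16 cannot hold a large accumulated value.

```python
import numpy as np

# Forward: expanding a (1,)-shaped t to (100_000,) just broadcasts it.
# Backward: the gradient of expand is a sum over the expanded axis.
upstream_grad = np.ones(100_000, dtype=np.float16)

grad_fp16 = upstream_grad.sum(dtype=np.float16)  # accumulate in fp16
grad_fp32 = upstream_grad.sum(dtype=np.float32)  # accumulate in fp32

print(grad_fp16)  # inf or a badly truncated value: fp16 tops out at 65504
print(grad_fp32)  # 100000.0
```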