Gokul
Also make sure there is a checkpoint with a -latest suffix for the model so that users can easily download the latest one.
@Abder88 Currently, Tigon supports the following Java object types for emitting from one flowlet to another: all primitive types and boxed types; String, enum, arrays, and collections; user-defined POJO...
With this change in https://github.com/pytorch-labs/torchtune/pull/289, we won't be testing HF dataset download in unit tests. But we do want a nightly run that tests the HF load_dataset API.
Looks like adding the index URL (for torchdata) is preventing other dependencies from being installed. Will figure out how to fix this.
@rlrs Would it be possible to test it after my latest commit ([b9b045d](https://github.com/pytorch/torchtitan/pull/279/commits/b9b045d32933c2824ae6f667e944a51c3255a2d1))? I missed adding that part.
@tianyu-l Addressed PR comments (thank you!), added a unit test, and made changes to the GitHub workflows to allow running those unit tests. Let me know if the changes look okay....
@rlrs Thank you for your great analysis here (https://github.com/pytorch/torchtitan/pull/279#issuecomment-2104797493). It helped us narrow down the issue, which basically boiled down to DCP's in-place loading of checkpoints. StatefulDataLoader doesn't currently return...
@johnament Please review this PR that fixes the NOTICE file for Apache Tephra when you get a chance. Thank you!
@johnament Thanks for the review, John. Please take another look when you get a chance.
Thanks for the PR! Reviewing it now