CodeGen
Parallel dataset generated by TransCoder-ST
Hi! I'm wondering if you could release the parallel dataset generated by TransCoder-ST. Generating and filtering translations from the whole GitHub dataset takes a long time.
Thanks!