
code refactoring

Open Muntahabintealam opened this issue 2 years ago • 2 comments

I want to fine-tune StarCoder for code refactoring tasks. Is this possible? If so, how can I get a dataset, and can I use my own code as the dataset?

Muntahabintealam, Jul 17 '23 09:07

Hi. If you want to fine-tune StarCoder to map an instruction plus code to modified code (e.g. "Change the value of max_length to be equal to 2048 in the following code \n <CODE>" -> "<CODE> with max_length = 2048"), you should be able to do it. I am not sure how or where to find such a dataset, but once you have one, you can use the code provided in this repository to perform the fine-tuning.
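To illustrate the mapping described above, here is a minimal sketch of how one instruction + code pair could be turned into a training record and serialized as JSON Lines. The field names `prompt` and `completion` and the helper names `make_example` and `to_jsonl` are assumptions made for this illustration, not a format the repository requires; check the fine-tuning script for the column names it actually expects.

```python
import json


def make_example(instruction: str, code: str, refactored: str) -> dict:
    """Build one record: instruction and original code in, modified code out.

    The "prompt"/"completion" keys are hypothetical; rename them to match
    whatever the fine-tuning script reads.
    """
    prompt = f"{instruction}\n{code}"
    return {"prompt": prompt, "completion": refactored}


def to_jsonl(examples: list[dict]) -> str:
    """Serialize records as JSON Lines, one JSON object per line."""
    return "".join(json.dumps(example) + "\n" for example in examples)


# One record built from the max_length example in this thread.
# The starting value 512 is made up for the illustration.
example = make_example(
    instruction=(
        "Change the value of max_length to be equal to 2048 "
        "in the following code"
    ),
    code="max_length = 512\n",
    refactored="max_length = 2048\n",
)

jsonl_text = to_jsonl([example])
```

Your own code can be used the same way, provided you can pair each original snippet with its refactored version and a short instruction describing the change.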

ArmelRandy, Jul 17 '23 12:07

Should there be any specific format for the dataset for code refactoring tasks? Can this dataset be used for code refactoring tasks? https://github.com/eth-sri/TFix

Muntahabintealam, Jul 27 '23 09:07