
Resolved TODO in tokenize_ftdp_datasets.py

Open neverbiasu opened this issue 1 year ago • 0 comments

  • [x] This PR addresses a TODO in the tokenize_ftdp_datasets.py file.

Solution: I used the lstrip method to strip every leading newline character from the content string, so the string can no longer start with \n\n. Only leading newlines are removed; other leading whitespace and any trailing newlines are left untouched.

content = content.lstrip('\n')
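A minimal sketch of what that line does, using a made-up sample string (the actual data handled by tokenize_ftdp_datasets.py is not shown in this issue, so `sample` and the other names here are illustrative assumptions):

```python
# Hypothetical sample; the real content strings are not shown in this issue.
sample = "\n\nHello\nworld\n"

# lstrip('\n') removes every leading newline, however many there are.
stripped = sample.lstrip('\n')
print(repr(stripped))  # 'Hello\nworld\n'

# Stripping stops at the first character that is not '\n',
# so a leading space (or '\r') protects whatever follows it.
mixed = "\n \nHello".lstrip('\n')
print(repr(mixed))  # ' \nHello'
```

If leading spaces or \r\n line endings can also appear, a plain lstrip() with no argument would remove all leading whitespace instead; whether that is desirable depends on the dataset.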

neverbiasu, Apr 17 '24 15:04