MidiTok

MIDI / symbolic music tokenizers for Deep Learning models 🎶

Results 20 MidiTok issues

![image](https://user-images.githubusercontent.com/103300829/180384656-fd6ecaff-058c-4db3-8a64-e7a81c15f8dd.png) ValueError: invalid literal for int() with base 10: 'Ignore'

I have noticed that there is a significant performance gap between two different scripts I am using to tokenize my dataset. The first script, which filters MIDI files and handles...

enhancement
good first issue

Following the discussions in #147, this PR implements a new PyTorch `Dataset` class that splits the MIDI itself instead of splitting the tokens, as was done previously. ----...
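To make the distinction concrete, here is a toy sketch contrasting the two splitting strategies. It is not the PR's implementation: the helper names `split_tokens` and `split_notes_by_time`, the `(onset_tick, pitch)` note tuples, and the token strings are all hypothetical and only serve the illustration.

```python
def split_tokens(tokens, chunk_len):
    """Token-level split: cut an already tokenized sequence into fixed-size chunks."""
    return [tokens[i:i + chunk_len] for i in range(0, len(tokens), chunk_len)]


def split_notes_by_time(notes, window_ticks):
    """Music-level split: group (onset_tick, pitch) notes into time windows.

    Each window would then be tokenized on its own (hypothetical workflow).
    """
    windows = {}
    for onset, pitch in notes:
        windows.setdefault(onset // window_ticks, []).append((onset, pitch))
    return [windows[key] for key in sorted(windows)]


# Token-level split can cut through the tokens that describe a single note.
tokens = [
    "Pitch_60", "Velocity_90", "Duration_1.0.4",
    "Pitch_62", "Velocity_90", "Duration_1.0.4",
]
print(split_tokens(tokens, 4))

# Music-level split keeps every note whole inside its window.
notes = [(0, 60), (480, 62), (960, 64)]
print(split_notes_by_time(notes, 960))
```

In the first call, the chunk boundary falls after `Pitch_62`, so that note's velocity and duration end up in the next chunk. Splitting the music first avoids this kind of cut.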

Tokenization tests with this feature are not passing right now; `_time_ticks_to_tokens` has to be adapted to handle overlapping resolutions ---- 📚 Documentation preview 📚: https://miditok--144.org.readthedocs.build/en/144/

I am using the Structured tokenizer for a project and I noticed that encoded files follow the order `Timeshift, Pitch, Velocity, Duration`. In the docs, however, it indicates that the...

After tokenizing a song with a trained tokenizer, the "tokens" array contains only the base tokens, while the "ids" array is fine and contains the newly generated vocabulary. I was wondering if this...

The error: ```UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0 in position 0: invalid continuation byte``` The stack trace: ``` [00:05:34] Pre-processing sequences ████ 93583 / 93583 [00:00:04] Tokenize words ████...

I am hoping to learn how to use MidiTok to process MIDI files for use with PyTorch, but I have not been able to get the Example_HuggingFace_Mistral_Transformer.ipynb notebook to complete...

Hi! Thank you for this amazing library, @Natooz. I would like to understand the duration token! ![Screenshot from 2024-05-31 13-43-23](https://github.com/Natooz/MidiTok/assets/16807496/69f76791-789b-48cf-9a40-20dd8ec39328) Bar_None TimeSig_4/4 Position_0 Tempo_121.29 Pitch_69 Velocity_91 Duration_4.0.4 Bar_None TimeSig_4/4 Position_0...
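As a hedged illustration of how a token such as `Duration_4.0.4` might be read: the sketch below assumes the three dot-separated numbers mean whole beats, extra samples, and samples per beat, which is my understanding of MidiTok's convention and should be checked against its documentation. The helper `parse_duration_token` is hypothetical and is not part of MidiTok.

```python
from fractions import Fraction


def parse_duration_token(token: str) -> Fraction:
    """Return the duration in beats for a token like 'Duration_4.0.4'.

    Assumed layout (not confirmed by the issue): beats.samples.resolution,
    i.e. `beats` whole beats plus `samples / resolution` of a beat.
    """
    prefix, _, value = token.partition("_")
    if prefix != "Duration":
        raise ValueError(f"not a Duration token: {token!r}")
    beats, samples, resolution = (int(part) for part in value.split("."))
    return beats + Fraction(samples, resolution)


print(parse_duration_token("Duration_4.0.4"))  # 4
print(parse_duration_token("Duration_1.3.8"))  # 11/8
```

Under that assumption, `Duration_4.0.4` would be a note lasting four beats, which is one full bar in the `TimeSig_4/4` context shown above.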

With some files it runs successfully, but for a bigger dataset I get this error. Thanks in advance.

bug