Librispeech_train_4_1030.subword Not compatible
Hello all, I am a beginner in this domain and this is my first experiment. After training the model, I got the .h5 model file. But during testing, the vocabulary in Librispeech_train_4_1030.subword is not read correctly: only the first character of each entry is read, e.g. for 'That' it reads only 'T', not the whole entry.
The subwords file looks like this:
SubwordTextEncoder
Metadata: {}
'the_' 'and_' 'of_' 'to_' 's_' 'd_' 'a_' 'e_' 't_' 'y_' 'in_' 'ed_' 'r_' 'i_' 'he_' 'that' 'was_' 'ing_' 'it_' 'n_' 'her_' 'his_' 'with' 'on_' 'as_' 'for_' 'you_' 'had_' 'h_' 'is_' 'not_' 'l_' 'but_'
Error
INFO:tensorflow:Reading /mydata/hassan/data/LibriSpeech/test-clean/transcript.tsv ...
{"'": 773}
['a', 'n', 'd', ' ', 'o', 'f', 't', 'e', 'n', ' ', 'h', 'a', 's', ' ', 'm', 'y', ' ', 'm', 'o', 't', 'h', 'e', 'r', ' ', 's', 'a', 'i', 'd', ' ', 'w', 'h', 'i', 'l', 'e', ' ', 'o', 'n', ' ', 'h', 'e', 'r', ' ', 'l', 'a', 'p', ' ', 'i', ' ', 'l', 'a', 'i', 'd', ' ', 'm', 'y', ' ', 'h', 'e', 'a', 'd', ' ', 's', 'h', 'e', ' ', 'f', 'e', 'a', 'r', 'e', 'd', ' ', 'f', 'o', 'r', ' ', 't', 'i', 'm', 'e', ' ', 'i', ' ', 'w', 'a', 's', ' ', 'n', 'o', 't', ' ', 'm', 'a', 'd', 'e', ' ', 'b', 'u', 't', ' ', 'f', 'o', 'r', ' ', 'e', 't', 'e', 'r', 'n', 'i', 't', 'y']
Traceback (most recent call last):
  File "examples/conformer/test.py", line 73, in <module>
    fire.Fire(main)
  File "/mydata/anaconda3/envs/tensorflow_25/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/mydata/anaconda3/envs/tensorflow_25/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/mydata/anaconda3/envs/tensorflow_25/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "examples/conformer/test.py", line 67, in main
    test_data_loader = test_dataset.create(batch_size)
  File "/mydata/hassan/TensorFlowASR/tensorflow_asr/datasets/asr_dataset.py", line 384, in create
    self.read_entries()
  File "/mydata/hassan/TensorFlowASR/tensorflow_asr/datasets/asr_dataset.py", line 141, in read_entries
    self.entries[i][-1] = " ".join([str(x) for x in self.text_featurizer.extract(line[-1]).numpy()])
  File "/mydata/hassan/TensorFlowASR/tensorflow_asr/featurizers/text_featurizers.py", line 208, in extract
    indices = [self.tokens2indices[token] for token in text]
  File "/mydata/hassan/TensorFlowASR/tensorflow_asr/featurizers/text_featurizers.py", line 208, in <listcomp>
    indices = [self.tokens2indices[token] for token in text]
KeyError: 'a'
Many Thanks
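The `KeyError: 'a'` above can be reproduced in isolation. The following is a minimal sketch, not the TensorFlowASR code: `build_char_vocab` and `encode` are hypothetical helpers, and the sketch assumes that a character-level featurizer keys each vocabulary line by its first character. That assumption is consistent with the `{"'": 773}` printed in the log, where every quoted entry collapses onto the single key `'`. Only the lookup in `encode` mirrors the failing line shown in the traceback.

```python
def build_char_vocab(lines):
    """Hypothetical character-level vocabulary builder (not the library's code)."""
    tokens2indices = {}
    index = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumption: only the first character of each line becomes the key.
        tokens2indices[line[0]] = index
        index += 1
    return tokens2indices


def encode(text, tokens2indices):
    # Same lookup pattern as the failing line in the traceback.
    return [tokens2indices[token] for token in text]


# Quoted entries, as in the first subwords file: every line starts with "'".
quoted_vocab = build_char_vocab(["'the_'", "'and_'", "'of_'", "'to_'"])
print(quoted_vocab)  # a single key, the quote character

try:
    encode("and often", quoted_vocab)
except KeyError as err:
    missing = err.args[0]
    print("KeyError:", repr(missing))

# Unquoted entries, as in the second file: lookups succeed for the letters
# that happen to start a line, but the indices are per character, not per subword.
plain_vocab = build_char_vocab(["the_", "and_", "of_", "to_"])
print(plain_vocab)
```

Under this assumption, the unquoted file no longer raises, but the resulting character indices have no relation to the subword units the model was trained on, which may be why the decoded output is unusable.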
Hello, I have received your email. I will reply to you as soon as possible. Best wishes!
But when I use a simple alphabet, or a subwords file like the one below, it works, but the output is very poor.
SubwordTextEncoder
Metadata: {}
the_ and_ of_ to_ s_ d_ a_ e_ t_ y_ in_ ed_ r_ i_ he_ that was_ ing_ it_ n_ her_ his_ with on_ as_ for_ you_ had_ h_ is_ not_ l_ but_ at_
The output looks like this:
farewell madam ihlldihslk'iiaaalh'ioo'' ''ih'''iiiiiiiiii
though i may be but an ungracious adviser you will allow me therefore to subscribe myself with the best wishes for your happiness here and hereafter your true friend robert southey gyhihllllllllzz lk' ehh' iihmmiihiilqqlql iilvo''io'iio''lualhlhhiiiiiiiiiiiiiiiihvhiio l h' l l ''''''''''qlzzpslp o'elvgvqqk''' r''hhibb''''jbj j rhnmiiiik'iihlliiiiqvmihlqpiioaqlq'iihbbjojqquiihvveiypjq''jo'iiogulvepuuueawgiiieuulvbeiioqliiiiqqqeoqewtaeiigq'''''qlvjl'''''''''''iolllqlhlzlpqebehvgulgllllll ioop
sir march sixteenth rihlllllllql l kp r'iiiiiiihviigiiqqqwioolqiiqqqeqqqqlqlvjlglgllllllll
i had not ventured to hope for such a reply so considerate in its tone so noble in its spirit ihlihnihvhiiioojiihiiiiiihiiyh'ihvuiiguuurrrik'igiik' b d''''iiiiiiioqlk'iiiooo''ebtylliihiiiiqqihaaqqk'iiqqqllllzzk'iook'iogmiiiiiiqiihhrraaalliiik'igii
i know the first letter i wrote to you was all senseless trash from beginning to end but i am not altogether the idle dreaming being it would seem to denote ihnihlk' lnlpaptiihhriiihs''''illmiik''''''iiiiik' l l ''' lqlpal'''jo'illl'''''''''''''''''''ql ek''''''iiqqqq''''ioozeqvvv'''ioollpiik''ddddihhiiik''''''iiiozeztiiiiiiiihvhmiihp''''''''ik' joalk'vglgllllalia
i thought it therefore my duty when i left school to become a governess ihllllllllhhik'eeq'iiioasbgvasl'''''iiiiiheihiiojiihuiiihhyoqlmihvmehvaaaq''io'ioguo'''''jo' lhlllll'ii
help!!!
Please try v2.x; version 1.x is no longer supported.
Version 2.x uses the wordpiece and sentencepiece implementations from tensorflow-text directly, and pretrained models are also available.
Closing this issue. Feel free to reopen it if you have further questions.