[references] Recognition - Allow built-in datasets usage by sarjil77 · Pull Request #1904 · mindee/doctr
@felixdittrich92, I am now able to use the built-in datasets, but I am facing one issue:
I get the following error with some of the datasets. It works fine for SVHN, but datasets whose labels contain spaces between words fail like this:
ValueError: some characters cannot be found in 'vocab'. Please check the input string ACUTE CADCOVASCULAR and the vocabulary 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~°£€¥¢฿àâéèêëîïôùûüçÀÂÉÈÊËÎÏÔÙÛÜÇ
The error is caused by the input string ACUTE CADCOVASCULAR: the space between the two words is not part of the vocab. When I tried adding the space to the vocab,
I got a different error:
the vocabulary size in your model does not match the pre-trained checkpoint.
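
To make both errors easier to reproduce, here is a minimal sketch in plain Python (no doctr imports). `VOCAB` is a stand-in that I assume matches the vocabulary printed in the error above, and `missing_chars` is a hypothetical helper, not a doctr function. It shows that the space is the only character missing from the vocab, and that adding it makes the vocab one character longer, which is presumably why the size check against the pre-trained checkpoint fails.

```python
import string

# Stand-in for the vocabulary shown in the error message (assumed identical).
VOCAB = (
    string.digits
    + string.ascii_letters
    + string.punctuation
    + "°£€¥¢฿àâéèêëîïôùûüçÀÂÉÈÊËÎÏÔÙÛÜÇ"
)


def missing_chars(label: str, vocab: str) -> list[str]:
    """Hypothetical helper: distinct characters of `label` that are not in `vocab`."""
    return sorted(set(label) - set(vocab))


label = "ACUTE CADCOVASCULAR"

# With the original vocab, the space is the only unknown character.
print(missing_chars(label, VOCAB))

# Adding the space fixes the label check ...
vocab_with_space = VOCAB + " "
print(missing_chars(label, vocab_with_space))

# ... but the vocab is now one character longer than before.
print(len(VOCAB), len(vocab_with_space))
```

So the two options I see are either dropping or splitting labels that contain spaces, or training with the extended vocab without relying on the pre-trained weights as they are; I am not sure which one is preferred here.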
Please have a look at this. :)