The model worked fine on a local Windows machine once I passed the path to the espeak-ng library as described in the installation guide (https://bootphon.github.io/phonemizer/install.html), but I could not get it to work in a VM running Ubuntu 22.04.4 LTS (x86-64). When running my script to transcribe phonemes with wav2vec2phoneme, I got the following error:
Traceback (most recent call last):
  File "/dialrec/phoneme_transcription/phoneme_recognizers/transcribe.py", line 50, in <module>
    phoneme_recognizer = Wav2Vec2Phoneme()
  File "/dialrec/phoneme_transcription/phoneme_recognizers/wav2vec2phoneme.py", line 24, in __init__
    self.processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-xlsr-53-espeak-cv-ft")
  File "/usr/local/lib/python3.10/site-packages/transformers/models/wav2vec2/processing_wav2vec2.py", line 52, in from_pretrained
    return super().from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/processing_utils.py", line 465, in from_pretrained
    args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/processing_utils.py", line 511, in _get_arguments_from_pretrained
    args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
  File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 837, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2086, in from_pretrained
    return cls._from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2325, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/models/wav2vec2_phoneme/tokenization_wav2vec2_phoneme.py", line 153, in __init__
    self.init_backend(self.phonemizer_lang)
  File "/usr/local/lib/python3.10/site-packages/transformers/models/wav2vec2_phoneme/tokenization_wav2vec2_phoneme.py", line 202, in init_backend
    self.backend = BACKENDS[self.phonemizer_backend](phonemizer_lang, language_switch="remove-flags")
  File "/usr/local/lib/python3.10/site-packages/phonemizer/backend/espeak/espeak.py", line 45, in __init__
    super().__init__(
  File "/usr/local/lib/python3.10/site-packages/phonemizer/backend/espeak/base.py", line 39, in __init__
    super().__init__(
  File "/usr/local/lib/python3.10/site-packages/phonemizer/backend/base.py", line 77, in __init__
    raise RuntimeError(  # pragma: nocover
RuntimeError: espeak not installed on your system
For installing espeak, I followed these steps:
- apt-get install espeak-ng
- pip3 install phonemizer
- pip3 install espeakng (also tried pip3 install py-espeak-ng)
Espeak is definitely installed under /usr/lib/x86_64-linux-gnu/libespeak-ng.so.1 and /usr/bin/espeak-ng.
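One way to double-check this from inside the same Python interpreter that runs the script is sketched below. It uses only the standard library; the helper name `check_espeak` is hypothetical, and the sketch assumes that the library has to be loadable through `ctypes` for phonemizer to find it.

```python
import ctypes
import ctypes.util
import shutil


def check_espeak(library_path=None, binary_name="espeak-ng"):
    """Report whether this interpreter can see the espeak-ng binary and library.

    `check_espeak` is a hypothetical helper written for this post, not part
    of phonemizer or transformers.
    """
    report = {}
    # Full path of the executable on PATH, or None if it is not found.
    report["binary"] = shutil.which(binary_name)
    # Use the explicit path if given, otherwise ask the dynamic linker.
    candidate = library_path or ctypes.util.find_library("espeak-ng")
    report["library"] = candidate
    report["library_loadable"] = False
    if candidate:
        try:
            # Loading fails with OSError if the file is missing or unusable.
            ctypes.CDLL(candidate)
            report["library_loadable"] = True
        except OSError as exc:
            report["error"] = str(exc)
    return report


if __name__ == "__main__":
    print(check_espeak("/usr/lib/x86_64-linux-gnu/libespeak-ng.so.1"))
```

If `library_loadable` comes back `False` here, the script is probably running in an environment (for example a container) that does not see the same files as the shell where I ran `apt-get`.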
I tried the following:
- Running the script without any additional steps.
- Setting the environment variables PHONEMIZER_ESPEAK_LIBRARY='/usr/lib/x86_64-linux-gnu/libespeak-ng.so.1' and PHONEMIZER_ESPEAK_PATH='/usr/bin/espeak-ng'.
- Setting the environment variables directly in the script with:
    os.environ['PHONEMIZER_ESPEAK_LIBRARY'] = '/usr/lib/x86_64-linux-gnu/libespeak-ng.so.1'
    os.environ['PHONEMIZER_ESPEAK_PATH'] = '/usr/bin/espeak-ng'
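For reference, the in-script variant can be written roughly as follows. This is a minimal sketch, not my exact code: `configure_espeak` is a hypothetical helper, and it assumes the variables only need to be set before the processor (and with it the phonemizer backend) is created. It fails early if the library file is not visible to the running interpreter.

```python
import os


def configure_espeak(library_path, binary_path=None):
    """Point phonemizer at espeak-ng through environment variables.

    `configure_espeak` is a hypothetical helper for illustration only.
    """
    # Fail early if the library is not visible to this interpreter.
    if not os.path.isfile(library_path):
        raise FileNotFoundError(f"espeak-ng library not found: {library_path}")
    os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = library_path
    if binary_path is not None:
        os.environ["PHONEMIZER_ESPEAK_PATH"] = binary_path


# Intended usage, called before Wav2Vec2Processor.from_pretrained(...):
# configure_espeak(
#     "/usr/lib/x86_64-linux-gnu/libespeak-ng.so.1",
#     "/usr/bin/espeak-ng",
# )
```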
I would appreciate any help. Thanks in advance.