AimeSpeech: Speech Recognition (STT), Speech Synthesis (TTS), Speaker Identification

AimeSpeech is a core speech processing framework in the Aimenicorn software ecosystem. It includes a speech recognition (speech-to-text, STT) engine, a speech synthesis (text-to-speech, TTS) engine, a speaker identification library, and other advanced speech processing libraries. AimeSpeech supports English, Japanese, and Vietnamese, and can run either on-premises or in the cloud. It is used in a range of multimodal AI products from Aimesoft, such as AimeHotel, AimeReception, and Aime AIShop.




    Speech recognition (Speech-to-Text, STT)
    Emotion analysis
    Speech synthesis (Text-to-Speech, TTS)
    Speaker identification
    Sentiment analysis from speech features
    Supports English, Japanese, Vietnamese


Aimesoft’s technologies behind AimeSpeech

    Speech Recognition Technologies: DNN (Deep Neural Networks), LSTM, GMM, HMM, acoustic modelling, language modelling
    Natural Language Processing Technologies: tokenization, POS tagging, keyword extraction, word normalization
    Speech Synthesis Technologies: HMM, Festival, Kaldi, Tacotron 2, DNN, vocoder
    Speaker Identification Technologies: GMM, VGG network


Various applications of AimeSpeech