The FIRST Company to Develop Multimodal AI with 200+ Installations Worldwide


AimeSpeech: Speech Recognition (STT), Speech Synthesis (TTS), Speaker Identification


AimeSpeech is a core speech processing framework inside the Aimenicorn software ecosystem. It includes a speech recognition (speech-to-text, STT) engine, a speech synthesis (text-to-speech, TTS) engine, a speaker identification library, and other advanced speech processing libraries. AimeSpeech supports English, Japanese, Vietnamese, and Korean. It can run on-premises or in the cloud, and it works on PC/Mac as well as on smartphones.

AimeSpeech is used in several of Aimesoft's multimodal AI products, such as AimeHotel, AimeReception, and Aime AIShop.


Please use the Google Chrome browser on a PC or Mac (not on a smartphone) to view the demo web pages below that use AimeSpeech. On smartphones, you need to integrate through the AimeSpeech SDK/API instead.




AimeSpeech Features

  • Speech recognition (Speech-to-Text, STT)
  • Emotion analysis
  • Speech synthesis (Text-to-Speech, TTS)
  • Speaker identification
  • Sentiment analysis from speech features
  • Supports English, Japanese, Vietnamese

AimeSpeech Technologies

Technologies Aimesoft uses to build AimeSpeech

  • Speech Recognition Technologies
    DNN (Deep Neural Networks), LSTM, GMM, HMM, acoustic modelling, language model
  • Natural Language Processing Technologies
    Tokenization, POS tagging, keyword extraction, word normalization
  • Speech Synthesis Technologies
    HMM, Festival, Kaldi, Tacotron2, DNN, vocoder
  • Speaker Identification Technologies
    GMM, VGG network
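
To give a feel for the natural language processing steps named above, here is a minimal Python sketch of tokenization, word normalization, and keyword extraction applied to a recognized transcript. It is an illustration only and not Aimesoft's implementation: the function names, the stopword list, and the sample sentence are all assumptions, normalization is reduced to lowercasing, keyword extraction is a simple frequency count, and POS tagging is left out because it needs a trained model.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "is", "at", "to", "of", "and", "i", "please", "for"}


def tokenize(text):
    """Split text into word tokens made of letters, digits, and apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", text)


def normalize(tokens):
    """Normalize tokens by lowercasing them (a deliberately simple stand-in)."""
    return [t.lower() for t in tokens]


def extract_keywords(tokens, top_n=3):
    """Return the most frequent tokens that are not stopwords."""
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


# Assumed example of a transcript produced by a speech recognition engine.
transcript = "Please book a room for two nights, a room with a view."
tokens = normalize(tokenize(transcript))
keywords = extract_keywords(tokens, top_n=1)
print(keywords)
```

A production system would typically replace each step with a language-specific component, since Japanese and Vietnamese text cannot be tokenized with a simple pattern like the one used here.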

AimeSpeech Benefits

Various applications of AimeSpeech

TTS demo

Copyright © 2024 Aimesoft. All Rights Reserved.