CVG 1.3.1 (14-Aug-2020)

Customized Speech-to-Text and Text-to-Speech

CVG already integrates speech-to-text and text-to-speech services from various vendors such as Microsoft, Google, Amazon and Nuance.

To improve the user experience, CVG now supports customized speech recognition. By providing hints, you can boost the transcription accuracy of specific words or phrases, such as domain-specific terms and rare words.

In addition, every voice offered by a speech cloud service can now be used for speech synthesis, which allows you to individualize your voicebot further.

Speech Cloud Profiles

To customize existing models of speech cloud vendors we have introduced the concept of speech cloud profiles.

Profiles are created on a per-customer basis and can then be used by all of that customer's accounts and projects. The available options depend on the kind of profile, Synthesizer (TTS) or Transcriber (STT), and on the vendor of the profile. Projects can select either the vendor’s default profile or any specific profile that has been configured for the customer.

Transcriber Profiles (Speech-to-Text)

All transcriber profiles allow you to configure the languages supported by the transcription engine. The default profile does not restrict the languages: any language supported by the vendor will work.

Microsoft in particular supports the configuration of a customized endpoint. This enables customers to use customized speech models for improved recognition performance.

Synthesizer Profiles (Text-to-Speech)

All synthesizer profiles allow the configuration of a ‘voice to language’ relation. Any voice provided by the vendor will work.

Contact us to configure a specific voice for your voicebot.

Play an Audio File

The new /call/play endpoint makes it possible to play an audio file in a call.

Note the following requirements and limitations:

  • The audio file must be hosted at an Internet-accessible HTTP(S) endpoint. In case of HTTPS the domain hosting the audio file must present a valid, trusted SSL certificate. Self-signed certificates cannot be used.

  • The audio file must be a valid WAV file (Waveform Audio File Format).

  • The audio must be PCM A-law or µ-law with 16 bits per sample and a sample rate of 8000 Hz or 16000 Hz.
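
The format requirements above can be checked locally before an audio file is hosted. The following is a minimal sketch, not part of CVG: the function names read_wav_format and check_wav_format are hypothetical, and the set of accepted format tags (1 = PCM, 6 = A-law, 7 = µ-law, as defined for RIFF/WAVE files) is an assumption about how the encoding requirement is meant. It only reads the header of the file and compares the sample rate and bits per sample with the values listed above.

```python
import io
import struct

# Values taken from the requirements listed above.
ALLOWED_SAMPLE_RATES = {8000, 16000}
REQUIRED_BITS_PER_SAMPLE = 16
# Standard RIFF/WAVE format tags. Which of these CVG accepts is an assumption.
FORMAT_NAMES = {1: "PCM", 6: "A-law", 7: "mu-law"}


def read_wav_format(stream):
    """Return the fields of the 'fmt ' chunk of a seekable binary RIFF/WAVE stream."""
    header = stream.read(12)
    if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            raise ValueError("no 'fmt ' chunk found")
        chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
        if chunk_id == b"fmt ":
            body = stream.read(chunk_size)
            if len(body) < 16:
                raise ValueError("truncated 'fmt ' chunk")
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack(
                "<HHIIHH", body[:16]
            )
            return {
                "format_tag": tag,
                "channels": channels,
                "sample_rate": rate,
                "bits_per_sample": bits,
            }
        # Skip any other chunk; chunks are padded to an even number of bytes.
        stream.seek(chunk_size + (chunk_size & 1), io.SEEK_CUR)


def check_wav_format(fmt):
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []
    if fmt["format_tag"] not in FORMAT_NAMES:
        problems.append("unexpected format tag %d" % fmt["format_tag"])
    if fmt["sample_rate"] not in ALLOWED_SAMPLE_RATES:
        problems.append("unsupported sample rate %d Hz" % fmt["sample_rate"])
    if fmt["bits_per_sample"] != REQUIRED_BITS_PER_SAMPLE:
        problems.append("unexpected bit depth %d" % fmt["bits_per_sample"])
    return problems
```

To use the sketch, open the file in binary mode, for example open("prompt.wav", "rb") with a file name of your choice, pass the file object to read_wav_format, and pass the result to check_wav_format. The check covers only the header fields; it does not verify that the file is reachable over HTTP(S) or that the certificate is trusted.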

Fixed Bugs

Dialogs History displays Results

Metadata of Dialogs is again displayed in the CVG Console under Dialogs History, even if it contains callers with suppressed phone numbers.