CVG 1.35.1 (04-Oct-2023)

Over the past few months, we’ve received consistent feedback on two features many of you desired. We’re pleased to announce that with the latest release of our Cognitive Voice Gateway, we’ve addressed these requests. Firstly, the voice recognition for specific words and phrases can now be improved with phrase lists without the need to train a customized speech model. Secondly, we’ve significantly reduced the access times for querying dialog data (Call Data Records), especially for large projects.

Additionally, we are continuously expanding our ChatGPT integration. This includes the provision of the current time and the use of function calls for forwarding calls even when using Azure OpenAI.

Besides Custom SIP Headers, you can now also use UUI Headers during call acceptance and call forwarding to transfer additional information about the call. Lastly, we’ve made further improvements to the search functionality in the CVG UI.

Dive in and explore the exciting updates we’ve rolled out for you. We’re sure you’ll love them!

New release with improved voice recognition and faster access to call data records

Phrase lists for optimized Speech Recognition

Both Google and Microsoft Speech-to-Text (STT) allow you to specify phrase lists as hints about which phrases should be preferred during transcription. Some examples of phrases that should be preferred over very similar sounding phrases in certain contexts are:

  • “flour” if it should be recognized in preference to “flower”

  • “cell” if it should be recognized in preference to “sell”

  • “fairy” if it should be recognized in preference to “ferry”.

You can now specify such “phrase lists” by creating corresponding speech profiles in the CVG UI with the relevant phrases.

If you don’t enter any credentials (such as REST endpoint ID or Azure region), you can use existing speech profiles (such as the standard speech profiles provided by VIER) and just add phrases. This results in a new profile based on the existing profile.

Speech profile for Microsoft STT with phrase list

Such a speech profile must then be selected for the corresponding CVG project so that the phrases stored there are recognized preferentially. This can be done either directly in the CVG UI or via a transcriber switch using the /call/transcription/switch endpoint or your Conversational AI.

Speech Profile with Phrase Lists assigned to a CVG project

Hints for optimized speech recognition are just hints

Do not expect that the words specified in the phrase list will always be recognized. A phrase list only increases the probability that they are.

Do not expect that you can teach the STT unlimited new words by providing phrase lists. To some extent, new vocabulary such as certain product names may well be recognized much better via such phrase lists. However, especially for special pronunciation, we still recommend the use of customized speech models.

Contact us if you need to provide such phrase lists directly and ad hoc via API or from your Conversational AI in the future.

Significantly reduced Response Times for Call Data Records (CDRs)

Dialog data, or Call Data Records (CDRs), contain metadata related to a call. These CDRs provide the necessary information for itemized call records, while the billing data provides aggregated usage information for a CVG project.

You can access these CDRs via the UI by selecting the “Dialogs” tab within a project. Alternatively, the data can be queried through the API endpoint /cdr/dialogs/{resellerToken}.

Both the display in the UI and the query via the API have now been significantly accelerated.

Improvements of ChatGPT Integration

Let ChatGPT know Date and Time

In some use cases ChatGPT needs to know the current date and time. A simple example is a greeting at the beginning of a conversation that depends on the time of day (“Good morning”, “Good day”, “Good evening”).

You can therefore now pass the time as a parameter both within the system message and within the greeting prompt. Just use the placeholder {time, [timezone]}. The timezone can be specified as timezone ID (e.g. Europe/Berlin or US/Eastern) or as zone offset (e.g. +01:00 or -05:00).

Providing date and time in ChatGPT greeting prompt
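The two timezone notations accepted by the placeholder can be illustrated with a small Python sketch. This is not how CVG resolves the placeholder internally; the helper names resolve_timezone and local_time are hypothetical. It only shows how a timezone ID and a zone offset each lead to a local time that a greeting could depend on.

```python
import re
from datetime import datetime, timedelta, timezone, tzinfo
from zoneinfo import ZoneInfo  # needs a timezone database on the system

# Matches zone offsets such as +01:00 or -05:00.
_OFFSET = re.compile(r"^([+-])(\d{2}):(\d{2})$")


def resolve_timezone(spec: str) -> tzinfo:
    """Turn a zone offset or a timezone ID into a tzinfo object (hypothetical helper)."""
    match = _OFFSET.match(spec)
    if match:
        sign = 1 if match.group(1) == "+" else -1
        delta = timedelta(hours=int(match.group(2)), minutes=int(match.group(3)))
        return timezone(sign * delta)
    # Anything else is treated as a timezone ID, e.g. Europe/Berlin.
    return ZoneInfo(spec)


def local_time(spec: str, now_utc: datetime) -> datetime:
    """Convert a timezone-aware UTC timestamp to the requested timezone."""
    return now_utc.astimezone(resolve_timezone(spec))


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for spec in ("+01:00", "-05:00"):
        print(spec, local_time(spec, now).strftime("%Y-%m-%d %H:%M"))
```

A zone offset is fixed all year, whereas a timezone ID such as Europe/Berlin follows daylight saving time, so the ID is usually the safer choice for greetings.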

Call Forwarding via Function Calls with Azure

For several months now, bots built with ChatGPT have been able to use Function Calls to forward calls, as long as OpenAI is the provider of ChatGPT.

Azure has since followed suit and also supports function calls. Bots built with ChatGPT can now use function calls to forward calls even when Azure is the provider. We recommend this type of forwarding, as the approach using JSON generation was only an interim solution.

Function Calls require specific GPT models

The use of Function Calls requires GPT models that were released on 13 June 2023 or later. Currently, Azure does not offer these models in the "West Europe" region, but does offer them in "France Central", for example. Please deploy an appropriate model in Azure OpenAI and link it in the appropriate CVG project before switching to Function Calls for call forwarding.

Custom SIP Headers and UUI Header

If calls are to be forwarded to another destination after using the voicebot, it is essential to be able to transfer call metadata with the call. Until now you could use custom SIP headers for this purpose.

However, since some target systems only evaluate UUI headers, you can now also set and get UUI headers.

Limit of 128 bytes of data

Custom SIP headers and the UUI header combined must not exceed 128 bytes of data.
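A bot can check this limit before it forwards a call. The following Python sketch is only an illustration: the release notes do not say exactly how the 128 bytes are counted, so it assumes, conservatively, that the UTF-8 bytes of all custom header names, their values and the UUI value are summed. The function names are hypothetical.

```python
from typing import Mapping, Optional

MAX_HEADER_BYTES = 128  # limit stated in these release notes


def header_payload_size(custom_sip_headers: Mapping[str, str],
                        user_to_user_information: Optional[str] = None) -> int:
    """Sum the UTF-8 bytes of header names, values and the UUI value.

    Assumption: names count towards the limit as well. This is a
    conservative guess, not documented CVG behavior.
    """
    size = sum(len(name.encode("utf-8")) + len(value.encode("utf-8"))
               for name, value in custom_sip_headers.items())
    if user_to_user_information:
        size += len(user_to_user_information.encode("utf-8"))
    return size


def fits_header_limit(custom_sip_headers: Mapping[str, str],
                      user_to_user_information: Optional[str] = None) -> bool:
    """Return True if the combined data stays within the 128-byte limit."""
    return header_payload_size(custom_sip_headers,
                               user_to_user_information) <= MAX_HEADER_BYTES


if __name__ == "__main__":
    headers = {"X-CustomerId": "4711"}
    print(header_payload_size(headers, "abc"), fits_header_limit(headers, "abc"))
```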

UUI header can be set with /call/forward and /call/bridge

With the additional optional parameter userToUserInformation at /call/forward and /call/bridge you can set the user-to-user information header.
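As a rough sketch, a request body for /call/forward that carries the new parameter could be assembled as shown here in Python. Only userToUserInformation is taken from these release notes; the other field names (dialogId, destinationNumber) and the helper build_forward_body are assumptions for illustration and should be checked against the CVG API documentation.

```python
import json
from typing import Optional


def build_forward_body(dialog_id: str,
                       destination_number: str,
                       user_to_user_information: Optional[str] = None) -> str:
    """Build a JSON body for /call/forward (field names partly assumed)."""
    body = {
        "dialogId": dialog_id,                    # assumed field name
        "destinationNumber": destination_number,  # assumed field name
    }
    # The parameter is optional, so it is only added when a value is given.
    if user_to_user_information is not None:
        body["userToUserInformation"] = user_to_user_information
    return json.dumps(body)


if __name__ == "__main__":
    print(build_forward_body("dialog-1", "+4912345", "ticket=4711"))
```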

UUI header of incoming calls available for bots

For incoming calls where the UUI header is set, CVG now provides you with this UUI header via the new userToUserInformation parameter in the /session request.

Custom SIP header and UUI header can be set with /call/refer

As for /call/forward and /call/bridge, you can now set Custom SIP header(s) and the User-to-User Information header when you transfer a call via /call/refer.

Adding custom Health Events to a Dialog

In some cases it makes sense for a bot to generate a health event itself and save it with the dialog. An example is when a service the bot depends on is not available or does not provide a result (e.g. a large language model like ChatGPT does not generate a completion).

To allow your bots (and also the integrations we provide e.g. to botario, ChatGPT, Cognigy and Rasa) to “write” such health events, there is now the new endpoint /health/{resellerToken}/dialog/{dialogId} to be called via POST.
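A call to this endpoint might look roughly like the following Python sketch, which only builds the request and does not send it. The path and the POST method come from these release notes; the base URL, the body field message and the helper name are placeholders, since the request schema is not described here.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://cvg.example.invalid"  # placeholder, not a real CVG host


def build_health_event_request(reseller_token: str,
                               dialog_id: str,
                               message: str) -> Request:
    """Prepare a POST to /health/{resellerToken}/dialog/{dialogId}."""
    url = (f"{BASE_URL}/health/{quote(reseller_token, safe='')}"
           f"/dialog/{quote(dialog_id, safe='')}")
    # Assumed body: the real schema may require other or additional fields.
    body = json.dumps({"message": message}).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


if __name__ == "__main__":
    req = build_health_event_request("my-token", "abc-123",
                                     "LLM returned no completion")
    print(req.get_method(), req.full_url)
```

Sending the prepared request (for example with urllib.request.urlopen) would additionally require whatever authentication your CVG account uses.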

Such custom health events are handled in the same way as the standard CVG health events, both in the Health API and in the UI (display of dialogs, status page of a project). With this, we would like to make it even easier for you to analyze bugs.

As of yet, our bot integrations do not generate such health events, but now that the stage is set, we will add that soon. We hope that you will also make full use of this possibility.

Improved Handling of Recording Issues

A bot built with ChatGPT will work even if it cannot start a recording.

SIP Refer in Rasa Integration

Call forwarding via SIP Refer is also supported within the VIER Voice Extension for Rasa.

UI Improvements

Keyboard Shortcuts in Search Result List

The usual keyboard shortcuts now work in the search result list:

  • With up and down you can select the previous / next result.

  • With enter you can select a result.

The first result is pre-selected so you can press enter right away.

Other Fixes and Improvements

No Health Event when stopping a Playback that is not running

Because a playback may already have finished when /call/play/stop is called, a 404 response no longer triggers a health event.

No Attempts to stop Recordings that are already stopped

To avoid unnecessary health events, the system no longer attempts to end a recording if no recording is active for this dialog, e.g. because the maximum recording duration has been exceeded.