
TTS Stripping text from Markdown (Italian)

Open alessiodecastro opened this issue 4 years ago • 3 comments

Hi, I'm reporting an issue that occurs when using the Web Chat JS library (latest version: https://cdn.botframework.com/botframework-webchat/latest/webchat.js). I've enabled Web Chat to use Cognitive Services Speech Services as described here: https://github.com/microsoft/BotFramework-WebChat/blob/master/docs/SPEECH.md



(async function () {
            const styleOptions = {
                botAvatarImage: 'xxxxxxxxx.png',
                botAvatarInitials: 'BT',
                userAvatarImage: 'xxxxxxxx.png',
                userAvatarInitials: 'US',
                bubbleBackground: 'rgba(5, 126, 252, .24)',
                bubbleFromUserBackground: 'rgba(255, 165, 0, .1)',
                hideUploadButton: true
            };

            const store = window.WebChat.createStore({}, ({ dispatch }) => next => action => {
                if (action.type === 'DIRECT_LINE/CONNECT_FULFILLED') {
                    // When we receive DIRECT_LINE/CONNECT_FULFILLED action, we will send an event activity using WEB_CHAT/SEND_EVENT
                    dispatch({
                        type: 'WEB_CHAT/SEND_EVENT',
                        payload: {
                            name: 'webchat/join',
                            value: { language: window.navigator.language }
                        }
                    });
                }
                return next(action);
            });

            window.WebChat.renderWebChat(
                {
                    // This is for the dev/test environment; in production, switch to the token exchange approach!
                    directLine: window.WebChat.createDirectLine({
                        secret: 'xxxxxxxxxxxxxxxxxxxxxx'
                    }),
                    locale: 'it-IT',
                    styleOptions,
                    store,
                    webSpeechPonyfillFactory: await window.WebChat.createCognitiveServicesSpeechServicesPonyfillFactory({
                        credentials: {
                            region: 'westeurope',
                            subscriptionKey: 'xxxxxxxxxxxxxxxxxxxxxx'
                        },
                        textNormalization: 'lexical'
                    })
                },
                document.getElementById('webchat')
            );

            document.querySelector('#webchat > *').focus();

        })().catch(err => console.error(err));


One of the supported TTS features is "TTS Stripping text from Markdown". It works well when the language is set to "en-US", but after switching to "it-IT" the Markdown is no longer stripped.

Outside of this Bot Framework scenario, I tested the same thing on the official TTS presentation page (https://azure.microsoft.com/en-us/services/cognitive-services/text-to-speech/#features). In the demo section of that page, enter something like "This is really ***very*** important markdown text." and set the language to English (United States): the voice reads the text correctly, ignoring the Markdown syntax. Now switch the language to Italian, keeping the same text: the voice also reads out the Markdown syntax.

Is there a known language limitation for this feature, or is something wrong with my configuration?

Thank you.

alessiodecastro avatar Apr 20 '21 16:04 alessiodecastro

I also opened an issue at https://github.com/microsoft/cognitive-services-speech-sdk-js/issues/362

alessiodecastro avatar Apr 20 '21 16:04 alessiodecastro

I will investigate:

  • Build a bot that replies with an activity in en-US and in it-IT, with some words wrapped in ***
    • This activity should not have a speak property
  • On the network tab, check what we are sending to Cognitive Services TTS
    • Did we strip it out? Or did Cognitive Services strip it out?
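The repro activities described in the steps above could be sketched roughly as follows. This is only an illustration: createReproActivities and stillHasEmphasis are hypothetical helper names, not part of Web Chat or the Bot Framework SDK, and the captured payload strings are made up for the example. Only the activity fields (type, textFormat, locale, text) come from the Bot Framework activity schema.

```javascript
// Hypothetical helper: the same Markdown text, one message activity per locale.
// The speak property is deliberately omitted so the Markdown-stripping path is exercised.
function createReproActivities(markdown, locales) {
  return locales.map(locale => ({
    type: 'message',
    textFormat: 'markdown',
    locale,
    text: markdown
  }));
}

// Hypothetical check for a payload copied from the browser network tab:
// returns true if asterisk emphasis markers are still present in what was sent to TTS.
function stillHasEmphasis(payload) {
  return /\*{1,3}\S/.test(payload);
}

const activities = createReproActivities(
  'This is really ***very*** important markdown text.',
  ['en-US', 'it-IT']
);

console.log(activities.map(activity => activity.locale));
```

If stillHasEmphasis returns true for the request body sent for it-IT but false for en-US, that would suggest the stripping difference is on the client side rather than inside Cognitive Services.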

@alessiodecastro you may want to use the speak property to specify the text for TTS. This is recommended because stripping Markdown may not work in all scenarios, especially for Markdown with HTML syntax (see #3615).
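As a minimal sketch of that workaround, the bot can send the Markdown in text and a plain-text version in speak. The helper names stripEmphasis and createSpokenMessage are hypothetical, and the stripper is deliberately naive: it only removes asterisk emphasis and is not a full Markdown parser.

```javascript
// Hypothetical helper: removes *, ** and *** emphasis markers around a span of text.
// It ignores an asterisk followed by whitespace, so list bullets are left alone.
function stripEmphasis(markdown) {
  return markdown.replace(/(\*{1,3})(?=\S)([^*]+)\1/g, '$2');
}

// Build a message activity that carries both the display text (Markdown)
// and the text to be spoken (plain), so TTS never sees the Markdown syntax.
function createSpokenMessage(markdown, locale) {
  return {
    type: 'message',
    textFormat: 'markdown',
    locale,
    text: markdown,
    speak: stripEmphasis(markdown)
  };
}

const message = createSpokenMessage(
  'This is really ***very*** important markdown text.',
  'it-IT'
);

console.log(message.speak);
```

In a real bot it is probably safer to author the speak text explicitly per message rather than derive it with a regular expression.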

compulim avatar May 06 '21 20:05 compulim

This is unlikely to fit into our next release (R14, around end of June). As we have a workaround for the customer (use the speak property), this is not blocking them for now.

As we haven't set up the R15 board yet, I will put it under R14 Candidates for now.

compulim avatar Jun 02 '21 22:06 compulim