VoiceXML 2.1 Development Guide Home  |  Frameset Home


Tutorial: NBest Post-Processing

This tutorial is meant for advanced VoiceXML coders and builds on concepts from previous tutorials. If you have not completed the preceding tutorials, go back and work through them now, before you swim too far...there isn't a lifeguard on duty. A basic knowledge of JavaScript, specifically array and loop constructs, is also recommended for completion of this tutorial.

In this tutorial, we will:

Step 1: Why use an NBest list?

Sometimes in our VoiceXML applications, we will find that we cannot get a grammar as accurate as we would like, especially if the grammar contains complex, multi-word utterances that are phonetically similar. For illustrative purposes, we will dissect a snippet of a pop song lyric interpretation gone horribly, horribly awry. Assume our grammar looks like this:


<grammar xml:lang="en-us" root="TOPLEVEL">
    <rule id="TOPLEVEL" scope="public">
      <one-of>
          <item> big old jet airliner <tag>Steve Miller</tag></item>
          <item> bingo jet said right on <tag>Bingo</tag></item>
          <item> jango feet left a light on <tag>Jango</tag></item>
          <item> big old bet on a rhino <tag>Rhino</tag></item>
      </one-of>
    </rule>
</grammar>


Chances are, upon an utterance of "big old jet airliner," the VoiceXML interpreter will become confused and serve up whichever one it thinks is best, regardless of whether or not it was the correct match, and then move on to the next action in our script. Frustrating, eh?

Good thing for us there is a technique known as 'NBest searching' in spoken language processing; this technique can find all the utterance matches that sound similar and group them together in an array of values, that is programmatically available to the developer. This search through possible correct matches is based on a number of things, such as word graphs, confidence scoring, and word lattices of a spoken utterance. Crazy sounding jibba-jabba? You bet it is! But, we don't need to know the difference between a word lattice and a bag of sand crabs. We simply need to add in simple client-side JavaScript.
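To make the idea concrete, here is a rough sketch of what an NBest list looks like by the time it reaches our script. In VoiceXML the list arrives as the application.lastresult$ array, ordered best match first, and each element carries utterance, confidence, inputmode, and interpretation properties. The confidence values below are invented for illustration, and describeNBest is a hypothetical helper name, not part of any platform API.

```javascript
// Hypothetical NBest list, shaped like application.lastresult$.
// The confidence values are made up for illustration only.
var nbest = [
  { utterance: "big old jet airliner",    confidence: 0.62, inputmode: "voice", interpretation: "Steve Miller" },
  { utterance: "bingo jet said right on", confidence: 0.55, inputmode: "voice", interpretation: "Bingo" },
  { utterance: "big old bet on a rhino",  confidence: 0.41, inputmode: "voice", interpretation: "Rhino" }
];

// Build one readable line per candidate, best match first.
function describeNBest(list) {
  var lines = [];
  for (var i = 0; i < list.length; i++) {
    lines.push(i + ": " + list[i].utterance + " (" + list[i].confidence + ")");
  }
  return lines;
}

var summary = describeNBest(nbest);
```

The point of the sketch is only that the recognizer hands back several ranked guesses rather than one, and that ordinary array and loop constructs are enough to walk through them.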

Once you have mastered NBest lists, you will find them an extraordinarily powerful tool, especially when constructing an alphanumeric grammar or a large company directory grammar, where we might have several similar-sounding names.

Step 2: Creating an Initial VoiceXML File

If we are going to be using NBest post-processing, we need a form-field-grammar combo as a starting point to get the ball rolling. We also want to write a marginally ambiguous grammar, giving us several possible matches for the same utterance -- so we will deliberately try to confuse the VoiceXML interpreter! And what better way to confuse anything, or anyone, than by using Brian Wilson and the Beach Boys as an example. The surf's up, so let's define our grammar file and catch us a wave (or a beach bunny):


<?xml version= "1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" root = "SONG">
    <rule id="SONG" scope="public">
        <one-of>
            <item> little deuce coop </item>
            <item> little deux coup </item>
            <item> little doo scoop </item>
            <item> little boot scoop </item>
            <item> little blue sloop </item>
        </one-of>
    </rule>
</grammar>


Next, our VoiceXML file itself:


<?xml version="1.0" encoding="UTF-8"?>
<vxml version = "2.1">
<meta name="maintainer" content="yourEmail@here.com"/>

<property name="maxnbest" value="5"/>
<property name="com.nuance.rec.DoNBest" value="1" />



  <var name="resultLen"/>
  <var name="resultArray"/>
  <var name="myArray"/>

  <form id="F1">
    <field name="F_1" slot="F_1">
      <prompt bargein="true">
        what the heck were the beach boys talking about in
        that song? I bet you don't know either.
      </prompt>

      <grammar src="Lyrics.grammar#SONG" type="text/gsl"/>

      <filled>
      </filled>
    </field>

    <block name="B_1">
    </block>

    <field name="Bool_0" type="boolean">
      <prompt>
      </prompt>
      <filled>
      </filled>
    </field>
  </form>
</vxml>


We have added document-level <property> elements to enable the NBest process and to set the maximum number of possible matches in our NBest list to '5'. We have also declared a few empty variables for later use; resultLen and resultArray will hold the length of our result array and the array itself, respectively. We also put in a <block> to hold our post-processing script (to delete unwanted utterances) and a Boolean <field> to prompt the caller to confirm the utterance before moving on. Fear not, we will fill in all the blanks, and have a complete script before the sundown beach-blanket bingo party.

In plain English, what we are doing is finding out how many possible matches we have for an utterance. If we have but one match, then that is just fine as paint -- we won't need any NBest processing at all. Ah, but if we have more than one match, we will group them together in an array and return to our regularly scheduled VoiceXML program. But wait, we still have a bit more JavaScript to add within our <block>, remember? Let's skip right on down and see what this is all about:


<block name="B_1">
  <script>
    <![CDATA[
      // Remove element n from the array, shifting the later elements down
      function deleteElement(array, n) {

        // Store the current length of the array in a variable called "length"
        var length = array.length;

        // If n is greater than or equal to the array length, or less than zero, do nothing
        if (n >= length || n < 0)
          return;

        // Loop from n to the array length minus one, incrementing i each time
        for (var i = n; i < length - 1; i++)

          // The array element (i) takes the value of the element after it
          array[i] = array[i + 1];

        // When the loop is done, shorten the array by one, which removes the last slot
        array.length--;
      }

      // Call the function to discard the rejected top match
      deleteElement(resultArray, 0);
    ]]>
  </script>
</block>


The skinny on the above script is that we first set up a guard condition, so the function does nothing if the index it is given falls outside the array. Once we have this safeguard in place, we set up the "guts" of our function, deleting the first array element and moving all of the remaining elements up a notch. For example, if there is an array of size three, then array element one becomes array element zero, array element two becomes array element one, and the array shrinks to size two.
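The shifting behavior is easy to check outside of a VoiceXML interpreter. The sketch below restates deleteElement and runs it on a plain array of strings standing in for the NBest results; the sample values and the variable names candidates, viaSplice, and untouched are assumptions for illustration. It also shows that the built-in Array splice method produces the same result, which may be a simpler option where the interpreter's ECMAScript engine supports it.

```javascript
// Same logic as the tutorial's function, with braces added for clarity.
function deleteElement(array, n) {
  var length = array.length;
  // Ignore indexes that fall outside the array.
  if (n >= length || n < 0) {
    return;
  }
  // Shift every element after n down by one position.
  for (var i = n; i < length - 1; i++) {
    array[i] = array[i + 1];
  }
  // Drop the last slot, which now duplicates its neighbor.
  array.length--;
}

// Stand-in for the NBest results (illustrative values only).
var candidates = ["little deuce coop", "little doo scoop", "little blue sloop"];
deleteElement(candidates, 0);

// The built-in equivalent: remove one element starting at index 0.
var viaSplice = ["little deuce coop", "little doo scoop", "little blue sloop"];
viaSplice.splice(0, 1);

// An out-of-range index leaves the array alone.
var untouched = ["little deuce coop"];
deleteElement(untouched, 5);
```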

We will then control the application flow so we only hit this block of JavaScript when our caller specifies a returned utterance is incorrect. Let's complete our Boolean field to flesh this out, and plug in our nifty new JavaScript code, sans the wordy comments from the peanut gallery.


Step 3: Catching Some (Ar)rays

Okay, so here is our half-finished code, all laid out with our JavaScript nested in the appropriate snuggly places. Of course, we sneaked a few extra things in there as well.


<?xml version="1.0" encoding="UTF-8"?>
<vxml version = "2.1">

<meta name="maintainer" content="yourEmail@here.com"/>

<property name="maxnbest" value="5"/>
<property name="com.nuance.rec.DoNBest" value="1" />


  <var name="resultLen"/>
  <var name="resultArray"/>

  <form id="F1">
    <field name="F_1" slot="F_1">
      <prompt bargein="true">
        what the heck were the beach boys talking about in
        that song? I bet you don't know either.
      </prompt>

      <grammar src="Lyrics.grammar#SONG" type="text/gsl"/>

      <catch event="nomatch noinput">Sorry, I didn't get that.
        <reprompt/>
      </catch>

      <filled>
        <assign name="resultArray" expr="application.lastresult$"/>
        <assign name="resultLen" expr="resultArray.length"/>
        <goto nextitem="Bool_0"/>
      </filled>
    </field>

    <block name="B_1">
      <script>
        <![CDATA[
          // Remove element n from the array, shifting the later elements down
          function deleteElement(array, n) {

            // Store the current length of the array in a variable called "length"
            var length = array.length;

            // If n is greater than or equal to the array length, or less than zero, do nothing
            if (n >= length || n < 0)
              return;

            // Loop from n to the array length minus one, incrementing i each time
            for (var i = n; i < length - 1; i++)

              // The array element (i) takes the value of the element after it
              array[i] = array[i + 1];

            // When the loop is done, shorten the array by one, which removes the last slot
            array.length--;
          }

          // Call the function to discard the rejected top match
          deleteElement(resultArray, 0);
        ]]>
      </script>
    </block>

    <field name="Bool_0" type="boolean">
      <prompt>
      </prompt>
      <filled>
        <if cond="(Bool_0 == false) &amp;&amp; (resultArray.length == 1)">
          <clear namelist="Bool_0 F_1"/>
          <prompt>
            I am having trouble getting your response, mushmouth.
            lets start over, shall we?
          </prompt>
          <goto next="#F1"/>
        </if>
        <if cond="Bool_0==true">
          <prompt>
          </prompt>
        <else/>
          <clear namelist="Bool_0"/>
          <goto nextitem="B_1"/>
        </if>
      </filled>
    </field>
  </form>
  <form id="F2">
    <block>
      <prompt>
      </prompt>
      <goto next="#F1"/>
    </block>
  </form>
</vxml>


Notice that we assign the value of our shadow variable "application.lastresult$" to the variable named "resultArray". This is an important step, as this variable now holds the entire array of candidate matches filled upon the caller's utterance. We also need the length of the array, so we can determine just how many valid matches we have for a particular utterance; thus, we assign "resultArray.length" to "resultLen".

Our last <field> construct is meant to query the caller and confirm the original utterance. If only one possible match remains in our array and the caller replies with a negative answer, we want to send the caller back to the beginning, as this indicates that there are no other matches left.

In order to handle this, we have added a conditional statement based on our "resultArray.length", and cleared the form item variables for both the Boolean field and the first field so we may visit them again. Of course, we still need to explicitly tell the interpreter what to do next, so we add a <goto> statement directing the application flow back to the top of our form.
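The confirmation flow described above can be traced in plain ECMAScript. This is only a simulation under stated assumptions: confirmFromNBest is a hypothetical name, the answers array stands in for the caller's yes/no replies (true for "yes"), and the function works on a copy of the results rather than on the live array.

```javascript
// Simulate the confirm-or-discard loop over an NBest list.
function confirmFromNBest(resultArray, answers) {
  var remaining = resultArray.slice(0); // copy, so the input is not modified
  if (remaining.length === 0) {
    return { status: "restart", utterance: null };
  }
  for (var turn = 0; turn < answers.length; turn++) {
    if (answers[turn]) {
      // Caller said yes: the current top candidate is the match.
      return { status: "confirmed", utterance: remaining[0].utterance };
    }
    if (remaining.length === 1) {
      // Caller rejected the last remaining candidate: start over.
      return { status: "restart", utterance: null };
    }
    // Caller said no: discard the top candidate and ask about the next one.
    remaining.shift();
  }
  // Ran out of simulated answers before reaching a decision.
  return { status: "pending", utterance: null };
}

var songs = [
  { utterance: "little deuce coop" },
  { utterance: "little doo scoop" },
  { utterance: "little blue sloop" }
];

var thirdTry = confirmFromNBest(songs, [false, false, true]);
var gaveUp = confirmFromNBest(songs, [false, false, false]);
var firstTry = confirmFromNBest(songs, [true]);
```

Each false answer removes the current top candidate, and a false answer with only one candidate left sends the caller back to the start, which mirrors the two conditions in the Boolean field's <filled> section.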

Still with us? Brian Wilson sure isn't, but if you have followed along this far, we still have just a little bit of work left to do, and then it's "fun, fun, fun." Until, of course, your Daddy takes the T-Bird away.


Step 4: <IF> wishes were Beach Boys...

...then no one would wish for anything again, ever. Okay, that was maybe a bit harsh. Maybe. Someone shake Brian awake, and we can move onto the final stages of our NBest application. Also, remember the beach bunnies dig guys with a solid grasp of NBest, so let's not tarry.

So what does our completed code look like? Let's take a peek and see:


<?xml version="1.0" encoding="UTF-8" ?>
<vxml version="2.1">
<meta name="maintainer" content="yourEmail@here.com"/>

<property name="maxnbest" value="5"/>
<property name="com.nuance.rec.DoNBest" value="1" />


<var name="resultLen"/>
<var name="resultArray"/>
<var name="myArray"/>

<form id="F1">

  <field name="F_1" slot="F_1">
    <prompt bargein="true">
      what the heck were the beach boys talking about in
        that song? I bet you don't know either.
    </prompt>


    <grammar src="Lyrics.grammar#SONG" type="text/gsl"/>

    <catch event="nomatch noinput">Sorry, I didn't get that.
      <reprompt/>
    </catch>

    <filled>

      <assign name="resultArray" expr="application.lastresult$"/>
      <assign name="resultLen" expr="resultArray.length"/>
      <goto nextitem="Bool_0"/>
    </filled>


  </field>

  <block name="B_1">
    <script>
      <![CDATA[
        // Remove element n from the array, shifting the later elements down
        function deleteElement(array, n) {

          // Store the current length of the array in a variable called "length"
          var length = array.length;

          // If n is greater than or equal to the array length, or less than zero, do nothing
          if (n >= length || n < 0)
            return;

          // Loop from n to the array length minus one, incrementing i each time
          for (var i = n; i < length - 1; i++)

            // The array element (i) takes the value of the element after it
            array[i] = array[i + 1];

          // When the loop is done, shorten the array by one, which removes the last slot
          array.length--;
        }

        // Call the function to discard the rejected top match
        deleteElement(resultArray, 0);
      ]]>
    </script>
  </block>

  <field name="Bool_0" type="boolean">
    <prompt>did you say  <value expr="resultArray[0].utterance"/>  ? </prompt>

    <filled>
      <if cond="(Bool_0 == false) &amp;&amp; (resultArray.length == 1)">
        <clear namelist ="Bool_0 F_1"/>
        <prompt>
          I am having trouble getting your response, mushmouth.
          lets start over, shall we?
        </prompt>
        <goto next="#F1"/>
      </if>


      <if cond="Bool_0==true">
        <prompt>
        Excellent, Brian Wilson will be proud.
        When he regains consciousness, that is.
        </prompt>

      <else/>
        <clear namelist="Bool_0"/>
        <goto nextitem="B_1"/>
      </if>

    </filled>
  </field>
</form>

<form id="F2">

  <block>
    <prompt>
      You are now free to turn your thoughts to less confusing
      channels, such as attempting to discern the true gender
      of Janet Reno.
  </prompt>
    <goto next="#F1"/>
  </block>

</form>
</vxml>




Oh yes, we are now seriously enabled VoiceXML coders. We can use our newly acquired NBest skills to determine accurate matches from just about any grammar. The world is your oyster...


Download the Code!

  Source Code


What we covered:




  ANNOTATIONS: EXISTING POSTS
bfoster63
7/21/2004 10:06 AM (EDT)
I get an "internal error" message when trying this code over the phone.
MattHenry
7/21/2004 2:33 PM (EDT)
Hi there,

Sorry about that. The problem lies with the fact that I didn't encode some operators within a conditional statement:

<if cond="(Bool_0 == false) && (resultArray.length == 1)">

should be using the '&amp;' entity in place of each '&' instead.


I'll see that this is corrected as soon as I can get to it; thanks for the heads-up.


~Matt


henryanelson
3/22/2006 12:23 PM (EST)
I simplified the JavaScript.


<?xml version="1.0" encoding="UTF-8" ?>


<vxml version="2.1">

  <meta name="maintainer" content="YOUREMAILADDRESS@HERE.com"/>

  <property name="maxnbest" value="5"/>
  <property name="com.nuance.rec.DoNBest" value="1" />


  <var name="resultLen"/>
  <var name="resultArray"/>
  <var name="currIndex"/>

<form id="F1">

  <field name="F_1" slot="F_1">
    <prompt bargein="true">
      what the heck were the beach boys talking about in
        that song? i bet you dont know either.
    </prompt>

    <property name="maxnbest" value="5"/>

    <grammar src="Lyrics.grammar#SONG" type="text/gsl"/>

    <catch event="nomatch noinput">Sorry, I didn't get that.
      <reprompt/>
    </catch>

    <filled>

      <assign name="resultArray" expr="application.lastresult$"/>
      <assign name="resultLen" expr="resultArray.length"/>
      <assign name="currIndex" expr="1"/>
      <goto nextitem="Bool_0"/>

    </filled>


  </field>

  <block name="B_1">
    <script>
      <![CDATA[
        currIndex++;
      ]]>
    </script>
  </block>

  <field name="Bool_0" type="boolean">
    <prompt>
    did you say  <value expr="resultArray[currIndex-1].utterance"/>  ?
    </prompt>

    <filled>
      <if cond="(Bool_0 == false) &amp;&amp; (currIndex >= resultArray.length)">
        <clear namelist ="Bool_0 F_1"/>
        <prompt>
          i am having trouble getting your response, mushmouth.
          lets start over, shall we?
        </prompt>
        <goto next="#F1"/>
      </if>


      <if cond="Bool_0==true">
        <prompt>
        Excellent, Brian Wilson will be proud.
        When he regains consciousness, that is.
        </prompt>

      <else/>
        <clear namelist="Bool_0"/>
        <goto nextitem="B_1"/>
      </if>

    </filled>
  </field>
</form>

<form id="F2">

  <block>
    <prompt>
      You are now free to turn your thoughts to less confusing
      channels, such as attempting to discern the true gender
      of janet reno.
  </prompt>
    <goto next="#F1"/>
  </block>

</form>
</vxml>
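The index-based approach in the listing above can be sketched in plain ECMAScript as follows. The function name nextStep and its return shape are assumptions made for illustration; the comparison against resultArray.length and the 1-based currIndex follow the listing. One possible advantage of this design is that the result array is only read, never modified.

```javascript
// One confirmation turn. currIndex is 1-based: the candidate that was just
// offered to the caller is resultArray[currIndex - 1].
function nextStep(resultArray, currIndex, saidYes) {
  if (saidYes) {
    return {
      action: "confirmed",
      utterance: resultArray[currIndex - 1].utterance,
      currIndex: currIndex
    };
  }
  if (currIndex >= resultArray.length) {
    // Every candidate has been rejected: start over.
    return { action: "restart", utterance: null, currIndex: 1 };
  }
  // Offer the next candidate without touching the array.
  return {
    action: "ask",
    utterance: resultArray[currIndex].utterance,
    currIndex: currIndex + 1
  };
}

var songs = [
  { utterance: "little deuce coop" },
  { utterance: "little doo scoop" }
];

var first = nextStep(songs, 1, false);               // caller rejects candidate 1
var second = nextStep(songs, first.currIndex, true); // caller accepts candidate 2
var exhausted = nextStep(songs, 2, false);           // caller rejects the last candidate
```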


MattHenry
3/23/2006 10:27 PM (EST)


Hiya Henry,

Thanks man, you saved me some work. I had been meaning to retool the clunky JS in that tutorial for quite awhile, and just never got around to it. When we retool the docs for the Prophecy platform, I'll be sure to include your Code Stylings into the next-gen tutorial.

Cheers!

~Matthew Henry
hertanto
7/1/2008 8:57 PM (EDT)
How do we get 'goose poop', 'doo scoop', 'boot scoop', 'blue sloop'?
I want to be able to say those in the end instead of just "      You are now free to turn your thoughts to less confusing
      channels, such as attempting to discern the true gender
      of janet reno."
voxeoJeffK
7/1/2008 11:29 PM (EDT)
Hi Hertanto,

I hope I'm understanding your question properly. Do you want to output the utterance that matched? For instance if the caller says 'blue sloop' and it is recognized within the active grammar then that is captured in the variable:

application.lastresult$.utterance

which is available to you for output. If I have misunderstood you please provide me with a few more details on what you are trying to accomplish, and I will do my best to help.

Regards,
Jeff Kustermann
Voxeo Support
brunoalex
8/4/2008 9:37 AM (EDT)
Hi Jeff,
          Thanks a lot for the tutorial. I am wondering if I have
a richer interpretation with more than one tag, how can I offer the user one of the specific interpretation results instead of the
utterance. For instance, i might have two movies with the same name
and distinct years of filming(1945 and 1985) and I would like to give him the year for choice. I know I could do this
distinction elsewhere, but for specific reasons I need to keep at the grammar level.
Thanks a lot.
Bruno
voxeojeremyr
8/4/2008 12:07 PM (EDT)
Hi Bruno,

This certainly could be done.  What you would want to do is to use the interpretation instead of the utterance.  For example, I took the code and logged the following:
<log expr="'interpretation value is: ' + resultArray[0].interpretation.F_1"/>

You would still need to have a conditional to ask for the year if the array value is greater than one.  If you wanted to capture it all in one utterance you can use a mixed initiative grammar.  We have a good tutorial of that here:
http://docs.voxeo.com/voicexml/2.0/t_20.htm
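As a rough sketch of the idea above, the snippet below collects a slot value from each NBest interpretation and turns the distinct values into a question for the caller. Everything specific here is an assumption for illustration: the slot names title and year, the placeholder movie title, and the helper names yearChoices and buildPrompt. The real slot names depend on what the grammar returns.

```javascript
// Hypothetical NBest results: same title, different "year" slot.
var results = [
  { utterance: "example movie", interpretation: { title: "example movie", year: "1945" } },
  { utterance: "example movie", interpretation: { title: "example movie", year: "1985" } }
];

// Collect the distinct year values, preserving NBest order.
function yearChoices(list) {
  var seen = {};
  var years = [];
  for (var i = 0; i < list.length; i++) {
    var interp = list[i].interpretation;
    if (interp && typeof interp.year !== "undefined" && !seen[interp.year]) {
      seen[interp.year] = true;
      years.push(interp.year);
    }
  }
  return years;
}

// Turn the choices into prompt text.
function buildPrompt(years) {
  if (years.length === 0) {
    return "Sorry, I could not find a matching year.";
  }
  if (years.length === 1) {
    return "Did you mean the " + years[0] + " version?";
  }
  return "Which year did you mean: " + years.join(" or ") + "?";
}

var question = buildPrompt(yearChoices(results));
```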

Thanks,
Jeremy Richmond
Voxeo Support
mtatum111
9/24/2008 10:55 AM (EDT)
can you verify something for me?
I am wondering if the index starts at 0 or 1 for the following
application.lastresult$

So for instance
application.lastresult$confidence[0]
or
application.lastresult$confidence[1]
which one will get the confidence of the first element.

I am thinking the index starts at 0, but just want to make sure.

Thanks in advance!
mtatum111
9/24/2008 10:57 AM (EDT)
Oops, I forgot the period.

It should be
application.lastresult$.confidence[0]

vs.

application.lastresult$.confidence[1]
VoxeoDustin
9/24/2008 11:00 AM (EDT)
Hey Melissa,

This is an ECMAScript array, so the index will always start at 0 for any shadow variables in VoiceXML.
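A small sketch of the zero-based indexing, using a plain array as a stand-in for application.lastresult$ (the values are invented). Note where the index goes: it belongs on the array itself, before the property name, so the first element's confidence is read as lastresult$[0].confidence.

```javascript
// Stand-in for application.lastresult$; the values are made up.
var lastresult$ = [
  { utterance: "little deuce coop", confidence: 0.71 },
  { utterance: "little doo scoop",  confidence: 0.48 }
];

// The best match lives at index 0, the runner-up at index 1.
var best = lastresult$[0].confidence;
var runnerUp = lastresult$[1].confidence;

// Indexing the property instead of the array does not work:
// confidence is a number, so confidence[0] is undefined.
var wrong = lastresult$[0].confidence[0];
```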

Cheers,
Dustin
declanh
11/26/2008 10:09 AM (EST)
Hi,
I'm hoping you can help me out.
I'm using dynamic grammars with some preset values (caller details returned from a DB lookup based on an earlier reco session) to differentiate in my application between a random utterence and one that matches the preset values. I do this by specifying the preset values in the grammar to return to application.lastresult$[0].interpretation.slot instead of just application.lastresult$[0].interpretation.
My problem is in the VXML, if I try to check if the .slot is filled, I get a semantic error (Object does not exist) if the caller utterence does not match one of the preset values. Which makes sense as it does not exist when the grammar has returned an interpretation other than one of the presets.
My application needs this so we can take a different path if we have the .slot preset value.

<assign name="PresetResult" expr="application.lastresult$[0].interpretation.slot"/>
<assign name="PresetResultLen" expr="PresetResult.length"/>
<if cond="PresetResultLen == 1">
******DO SOME STUFF******
<else/>
<assign name="OtherResult" expr="application.lastresult$[0].interpretation"/>
</if>
Is there a way to test if the application.lastresult$[0].interpretation.slot object exists?
Possibly using some Javascript?
Thanks in advance for any comments,
Declan
P.S. Love the tutorials :-)
MattHenry
11/26/2008 11:00 AM (EST)


Hi there Declan,

I'd think that a simple check of the string length would suffice in this case, where we have the variable declared at the application or document scope, and we assign the lastresult$ array position to it upon being filled. It seems like you might already be doing something similar, so it might be that I am not fully understanding what you are attempting (all we have is the snippet provided). In the event that this suggestion doesn't pan out, perhaps you can provide us with a simple test case that better illustrates the scenario.

~Matthew Henry
declanh
12/5/2008 7:12 AM (EST)
Hi Matthew,
The issue I'm experiencing is if I try to check the length of the "slot" it fails with a semantic error if this is not filled.
(Actually, its the assign step that fails for that).
This part works for both return types:
<assign name="resultArray" expr="application.lastresult$"/>
<assign name="resultLen" expr="resultArray.length"/>

while this fails if the slot is empty:
<assign name="PresetResult" expr="application.lastresult$[0].interpretation.slot"/>
<assign name="PresetResultLen" expr="PresetResult.length"/>

I do this by specifying the preset values in the grammar to return to application.lastresult$[0].interpretation.slot instead of just application.lastresult$[0].interpretation.
Here's a snippit from the top of the grammar for illustration:
<rule id="r1">
<one-of>
<item repeat="0-2" repeat-prob=".2">
<ruleref uri="#preset"/>
</item>
<item repeat="0-2" repeat-prob=".8">
<ruleref uri="#number"/>
</item>
</one-of>
</rule>
<rule id="preset">
<item repeat="0-1"><ruleref uri="#fill_phone"/></item>
<item>
<item>
<ruleref uri="#preset_num"/>
<tag>assign(p1 $return)</tag>
</item>
<tag>
<![CDATA[<_value $p1>]]>
</tag>
</item>
</rule>
<rule id="preset_num">
<one-of>
<item>
X X X X X X X X X X
<tag>return (presetX)</tag>
</item>
<item><item>"045864393"</item><item repeat="0-1"><item repeat="0-1">"that's"</item></item><tag><slot"045864393"></tag></item></one-of>
</rule>

I'm currently getting around this by catching the error event and assuming that the slot was not filled in that case. The I proceed to
Obviously thats a horrible hack! (See below)

    <filled>
<assign name="resultArray" expr="application.lastresult$"/>
<assign name="resultLen" expr="resultArray.length"/>

<assign name="PresetResult" expr="application.lastresult$[0].interpretation.slot"/>
<assign name="PresetResultLen" expr="PresetResult.length"/>   
<if cond = "PresetResultLen > 0" >
    <submit namelist="PresetResultReceived PresetResult"/>
    </if>
    </filled>
    </field>
    <error count='1'>
<goto nextitem="Normalresults"/>
    </error>
VoxeoDustin
12/5/2008 4:28 PM (EST)
Hey Declan,

One way to handle this more gracefully may be to simply check whether the variable is defined by using the typeof operator. This should avoid the semantic error:

<if cond="(typeof myVar == 'undefined')">
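The typeof check above can be wrapped in a small helper, sketched here in plain ECMAScript. The name getSlot and the sample objects are hypothetical; the idea is simply to test each level with typeof before reading from it, so a missing slot yields null instead of an error.

```javascript
// Return the named slot from a result's interpretation, or null if the
// result, the interpretation object, or the slot itself is missing.
function getSlot(result, slotName) {
  if (typeof result === "undefined" || result === null) {
    return null;
  }
  var interp = result.interpretation;
  if (typeof interp !== "object" || interp === null) {
    // A plain string interpretation has no slots to read.
    return null;
  }
  if (typeof interp[slotName] === "undefined") {
    return null;
  }
  return interp[slotName];
}

// Illustrative inputs only.
var withSlot = { interpretation: { slot: "045864393" } };
var withoutSlot = { interpretation: { other: "12345" } };
var plainString = { interpretation: "12345" };

var found = getSlot(withSlot, "slot");
var missing = getSlot(withoutSlot, "slot");
var notAnObject = getSlot(plainString, "slot");
```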

Let me know if this is helpful.

Cheers,
Dustin
declanh
12/15/2008 9:58 AM (EST)
Hi Dustin,
That did the trick perfectly.
Thanks very much,
Declan
srini_v
10/14/2010 12:41 PM (EDT)
Hi,

If we get multiple interpretations as a result of the maxnbest property, We access the results like application.lastresult$[2].interpretation, same way can we access the results using fieldname$[2].interpretation, fieldname$[2].confidence.

Thanks in advance!

Regards,
Srini
VoxeoDante
10/14/2010 3:13 PM (EDT)
Hello,

The array results will not be stored in the field name in the same way.  Is there some reason that the application variable containing the results will not work for you?  If so, please let us know, and we can help you come up with a solution that will work.

As always, we are standing by to assist.

Regards,
Dante Vitulano
Customer Engineer II
Voxeo Corp.

Have you visited our Voxeo Developers Corner for tips, tricks and tutorials about developing applications on Voxeo platforms?

http://blogs.voxeo.com/voxeodeveloperscorner/
srini_v
10/15/2010 5:30 AM (EDT)
Hi Dante,

  Thanks for the support. Every thing is alright with the application variables.

Thanks,
Srini


© 2013 Voxeo Corporation  |  Voxeo IVR  |  VoiceXML & CCXML IVR Developer Site