VoiceXML 2.1 Development Guide Home  |  Frameset Home


Tutorial: NBest Post-Processing

This tutorial is meant for advanced VoiceXML coders, and is based on concepts learned from previous tutorials. If you have not completed the preceding tutorials, go back and work through them now, before you swim too far...there isn't a lifeguard on duty. A basic knowledge of JavaScript, specifically array and loop constructs, is also recommended for completion of this tutorial.

In this tutorial, we will:

Step 1: Why use an NBest list?

Sometimes in our VoiceXML applications, we will find we cannot get a grammar as accurate as we would like, especially if we have complex, multi-word utterances in our grammar that are phonetically similar. For illustrative purposes, we will dissect a pop song lyric interpretation gone horribly, horribly awry. Assume our grammar looks like this:


  <![CDATA[
    [
      (big old jet airliner)        { <F_1 "Steve Miller"> }
      (bingo jet said right on)    { <F_1 "Bingo">        }
      (jango fett left a light on) { <F_1 "Jango">        }
      (big old bet on a rhino)    { <F_1 "Rhino">        }
    ]
  ]]>


Chances are, upon an utterance of "big old jet airliner," the VoiceXML interpreter will become confused and serve up whichever one it thinks is best, regardless of whether or not it was the correct match, and then move on to the next action in our script. Frustrating, eh?

Good thing for us, there is a technique known as 'NBest searching' in spoken language processing. This technique finds all the utterance matches that sound similar and groups them together in an array of values that is programmatically available to the developer (you). This search through possible matches is based on a number of things, such as word graphs, confidence scoring, and word lattices of a spoken utterance. Crazy sounding jibba-jabba? You bet it is! But we don't need to know the difference between a word lattice and a bag of sand crabs. We simply need to add in some simple client-side JavaScript.

Once we have mastered NBest lists, we will find them an extraordinarily powerful tool, especially when constructing an alphanumeric grammar, or a large company directory grammar where we might have several similar sounding names.
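To make the idea a bit more concrete, here is a rough sketch, in plain JavaScript, of what an NBest list looks like by the time it reaches our script. The "utterance" and "confidence" property names are the ones VoiceXML defines for each entry of "application.lastresult$"; the sample values and the bestGuess() and runnersUp() helpers are invented purely for illustration.

```javascript
// Hypothetical stand-in for an NBest list after the caller says
// "big old jet airliner". The confidence values are made up.
var sampleNBest = [
  { utterance: "big old jet airliner",    confidence: 0.61 },
  { utterance: "big old bet on a rhino",  confidence: 0.48 },
  { utterance: "bingo jet said right on", confidence: 0.37 }
];

// Hypothetical helper: the entry at index 0 is the recognizer's top pick.
function bestGuess(nbest) {
  return nbest.length > 0 ? nbest[0].utterance : undefined;
}

// Hypothetical helper: the remaining entries are the alternatives we
// can fall back on if the caller rejects the top pick.
function runnersUp(nbest) {
  var others = [];
  for (var i = 1; i < nbest.length; i++) {
    others.push(nbest[i].utterance);
  }
  return others;
}
```

The point of the sketch is simply that the recognizer hands us an ordered list rather than a single answer, so a wrong first guess does not have to be the end of the story.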

Step 2: Authoring our initial VoiceXML file

If we are going to be using NBest post-processing, we need a form-field-grammar combo as a starting point to get the ball rolling. We also want to write a marginally ambiguous grammar, giving us several possible matches for the same utterance -- so we will deliberately try to confuse the VoiceXML interpreter! And what better way to confuse anything or anyone than by using Brian Wilson and the Beach Boys as an example? The surf's up, so let's define our grammar file and catch us a wave (or a beach bunny):


SONG [
    (little deuce coop)    { <F_1 "beach boys"> }
    (little goose poop)    { <F_1 "goose poop"> }
    (little doo scoop)      { <F_1 "doo scoop">  }
    (little boot scoop)    { <F_1 "boot scoop"> }
    (little blue sloop)    { <F_1 "blue sloop"> }
]


Next, our XML file itself:


<?xml version="1.0" encoding="UTF-8"?>
<vxml version = "2.1">
<meta name="maintainer" content="yourEmail@here.com"/>

<property name="maxnbest" value="5"/>
<property name="com.nuance.rec.DoNBest" value="1" />



  <var name="resultLen"/>
  <var name="resultArray"/>
  <var name="myArray"/>

  <form id="F1">
    <field name="F_1" slot="F_1">
      <prompt bargein="true">
        what the heck were the beach boys talking about in
        that song? i bet you dont know either.
      </prompt>
     
      <grammar src="Lyrics.grammar#SONG" type="text/gsl"/>

      <filled>
      </filled>
    </field>

    <block name="B_1">
      <script language="Javascript">
      </script>
    </block>

    <field name="Bool_0" type="boolean">
      <prompt>
      </prompt>
      <filled>
      </filled>
    </field>
  </form>
</vxml>


All of this should seem pretty familiar by now, with the exception of just a few things. We have added two document-level <property> elements: one designating that we do want to enable NBest results, and one setting the maximum number of possible matches for our NBest list to cycle through to '5'. We have also declared a few empty variables for use later on -- these will hold the length of our array and the array itself. We also put in a <block> to hold our post-processing script (it will delete unwanted utterances) and a boolean <field> to prompt the caller to confirm the utterance before moving on. Fear not, we will fill in all the blanks, and have a complete script before the sundown beach-blanket bingo party.


In plain English, what we are doing is finding out how many possible matches we have for an utterance. If we have but one match, then that is just fine as paint -- we won't need any NBest processing at all. Ahhh, but if we have more than one match, we will group them together in an array and return to our regularly scheduled VoiceXML program. But wait, we still have a bit more JavaScript to add within our <block>, remember? Let's skip right on down and see what this is all about:


<block name="B_1">
  <script language="javascript">
    <![CDATA[
      function deleteElement(array, n) {
        // DEFINE THE PARAMETERS OF OUR FUNCTION:
        // 'array' IS THE LIST, 'n' IS THE INDEX TO REMOVE

        var length = array.length;
        // SET THE LENGTH OF THE ARRAY EQUAL
        // TO A VARIABLE CALLED 'length'

        // IF 'n' IS GREATER THAN OR EQUAL TO THE ARRAY
        // LENGTH, OR IF IT IS LESS THAN ZERO, DO NOTHING
        if (n >= length || n < 0)
          return;

        // LOOP FROM 'n' TO THE ARRAY LENGTH MINUS 1,
        // INCREMENTING THE VALUE OF 'i' EACH TIME.
        // THE VALUE OF THE ARRAY ELEMENT [i] BECOMES
        // THE VALUE OF THE ELEMENT AHEAD OF IT
        // EX. array[1] = array[2]
        for (var i = n; i < length - 1; i++)
          array[i] = array[i + 1];

        // WHEN THE LOOP IS DONE, SHORTEN THE ARRAY LENGTH,
        // WHICH REMOVES THE (NOW DUPLICATED) LAST ELEMENT
        array.length--;
      }
     
      // FINALLY, CALL THE FUNCTION ON OUR ARRAY OF RESULTS
      deleteElement(resultArray, 0);
    ]]>
  </script>
</block>


The skinny on the above script is that we set up a conditional to guard against an index that falls outside the array, in which case the function does nothing. Once we have this safeguard in place, we set up the "guts" of our function, deleting the first array element and moving all of the remaining elements up a notch (i.e., array element one becomes array element zero, array element two becomes array element one, etc.). We will then control the application flow so we only hit this block of JavaScript when our caller specifies that a returned utterance is incorrect. Let's complete our boolean field to flesh this out, and plug in our nifty new JavaScript code, sans the wordy comments from the peanut gallery.
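For comparison, here is a sketch of the same removal done with JavaScript's built-in splice() method. It assumes the list is a genuine JavaScript Array; whether the object your platform hands back behaves like one is something to verify, so treat that as an assumption. The deleteWithSplice() name is hypothetical.

```javascript
// The tutorial's hand-rolled removal, repeated so this sketch is self-contained.
function deleteElement(array, n) {
  var length = array.length;
  if (n >= length || n < 0) {
    return;                       // index out of range: do nothing
  }
  for (var i = n; i < length - 1; i++) {
    array[i] = array[i + 1];      // shift each later element down one slot
  }
  array.length--;                 // drop the duplicated last element
}

// Hypothetical alternative: splice() removes one element at index n in place.
function deleteWithSplice(array, n) {
  if (n >= 0 && n < array.length) {
    array.splice(n, 1);
  }
}

var byHand = ["deuce coop", "goose poop", "blue sloop"];
var bySplice = byHand.slice();    // independent copy of the same list

deleteElement(byHand, 0);
deleteWithSplice(bySplice, 0);
// Both lists now hold "goose poop" followed by "blue sloop".
```

Either version leaves the next-best match sitting at index zero, which is all the confirmation prompt cares about.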

Step 3: Catching some (Ar)rays

Okay, so here is our half-finished code, all laid out with our JavaScript nested in the appropriate snuggly places. Of course, we snuck a few extra things in there as well, (just cuz).


<?xml version="1.0" encoding="UTF-8"?>
<vxml version = "2.1">

<meta name="maintainer" content="yourEmail@here.com"/>

<property name="maxnbest" value="5"/>
<property name="com.nuance.rec.DoNBest" value="1" />


  <var name="resultLen"/>
  <var name="resultArray"/>

  <form id="F1">
    <field name="F_1" slot="F_1">
      <prompt bargein="true">
        what the heck were the beach boys talking about in
        that song? i bet you dont know either.
      </prompt>
     
      <grammar src="Lyrics.grammar#SONG" type="text/gsl"/>

      <catch event="nomatch noinput">Sorry, I didn't get that.
        <reprompt/>
      </catch>
     
      <filled>
        <assign name="resultArray" expr="application.lastresult$"/>
        <assign name="resultLen" expr="resultArray.length"/>
        <goto nextitem="Bool_0"/>
      </filled>
    </field>

    <block name="B_1">
      <script language="Javascript">
        <![CDATA[
          function deleteElement(array, n) {
            var length = array.length;
            if (n >= length || n < 0)
              return;

            for (var i = n; i < length - 1; i++)
              array[i] = array[i + 1];

            array.length--;
          }

          deleteElement(resultArray, 0);
        ]]>
      </script>
    </block>

    <field name="Bool_0" type="boolean">
      <prompt>
      </prompt>
      <filled>
        <if cond="(Bool_0 == false) &amp;&amp; (resultArray.length == 1)">
          <clear namelist="Bool_0 F_1"/>
          <prompt>
            i am having trouble getting your response, mushmouth.
            lets start over, shall we?
          </prompt>
          <goto next="#F1"/>
        </if>
        <if cond="Bool_0==true">
          <prompt>
          </prompt>
        <else/>
          <clear namelist="Bool_0"/>
          <goto nextitem="B_1"/>
        </if>
      </filled>
    </field>
  </form>
  <form id="F2">
    <block>
      <prompt>
      </prompt>
      <goto next="#F1"/>
    </block>
  </form>
</vxml>


Notice that we assign the value of our shadow variable "application.lastresult$" to the variable named "resultArray" (you did read Lesson 13 before attempting this code, right?). This is an important step, as this variable now holds the entire array of values filled upon the caller's utterance. We are also going to need to find out the length of the array, so we can determine just how many valid matches we have for a particular utterance; thus, we store "resultArray.length" in the variable "resultLen".
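If you would rather not chop elements out of the platform's shadow variable itself, one option is to copy the bits you need into a plain array of your own first. The sketch below assumes each entry exposes "utterance" and "confidence", as VoiceXML defines for "application.lastresult$"; the copyNBest() helper and the fakeLastResult sample data are hypothetical stand-ins, not part of the tutorial's code.

```javascript
// Hypothetical helper: copy the fields we care about out of an
// array-like result list into a fresh, ordinary JavaScript Array.
function copyNBest(lastResult) {
  var copy = [];
  for (var i = 0; i < lastResult.length; i++) {
    copy.push({
      utterance: lastResult[i].utterance,
      confidence: lastResult[i].confidence
    });
  }
  return copy;
}

// Invented sample data standing in for application.lastresult$.
var fakeLastResult = [
  { utterance: "little deuce coop", confidence: 0.55 },
  { utterance: "little goose poop", confidence: 0.52 }
];

var resultArray = copyNBest(fakeLastResult);
var resultLen = resultArray.length;   // how many candidate matches we have
```

With a private copy, deleting rejected matches only ever touches our own array, which makes the flow a little easier to reason about.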

Our last <field> construct is meant to query the user and confirm the caller's original statement, correct? If our array length of possible matches is equal to one, and the caller replies with a negative answer, we are going to want to send the caller back to the beginning, as this will programmatically indicate that there were no other matches left.

In order to handle this, we have added a conditional statement based on our "resultArray.length", and used <clear> to reset both the boolean field and "F_1" so we may visit them again once a fresh grammar choice is made. Of course, we still need to explicitly tell the interpreter what to do next, so we add a goto statement directing the application flow back to our very first form.
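The confirm-and-retry logic described above can be sketched outside of VoiceXML as a small JavaScript simulation. This is only a model of the flow, not code that runs in the application: confirmFromNBest() is a hypothetical function, and the list of yes/no answers stands in for the caller's replies to the boolean field.

```javascript
// Hypothetical simulation of the confirmation flow.
// nbest:   array of { utterance } objects, best match first.
// answers: the caller's replies in order (true = "yes", false = "no").
// Returns the confirmed utterance, or null when we must start over.
function confirmFromNBest(nbest, answers) {
  var remaining = nbest.slice();          // work on a copy of the list
  for (var turn = 0; turn < answers.length && remaining.length > 0; turn++) {
    if (answers[turn]) {
      return remaining[0].utterance;      // caller said "yes": we are done
    }
    if (remaining.length === 1) {
      return null;                        // "no" to the last candidate: start over
    }
    remaining.shift();                    // "no": discard the top candidate
  }
  return null;                            // no "yes" was ever given
}
```

For example, with three candidates and the answers no, yes, the function returns the second candidate's utterance; with three straight no answers it returns null, which corresponds to sending the caller back to the first field.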

Still with us? Brian Wilson sure isn't, but if you have followed along this far, we still have just a little bit of work left to do, and then it's "fun, fun, fun." Until, of course, your Daddy takes the T-Bird away.


Step 4: <IF> wishes were Beach Boys...

...then no one would wish for anything again, ever. Hokay, that was maybe a bit harsh. Maybe. Someone shake Brian awake, and we can move onto the final stages of our NBest application. And remember, the beach bunnies dig guys with a solid grasp of NBest, so let's not tarry.

So what does our completed code look like, O aspirant of knowledge? Let's take a peek and see:


<?xml version="1.0" encoding="UTF-8" ?>
<vxml version="2.1">
<meta name="maintainer" content="yourEmail@here.com"/>

<property name="maxnbest" value="5"/>
<property name="com.nuance.rec.DoNBest" value="1" />


<var name="resultLen"/>
<var name="resultArray"/>
<var name="myArray"/>

<form id="F1">

  <field name="F_1" slot="F_1">
    <prompt bargein="true">
      what the heck were the beach boys talking about in
        that song? i bet you dont know either.
    </prompt>


    <grammar src="Lyrics.grammar#SONG" type="text/gsl"/>

    <catch event="nomatch noinput">Sorry, I didn't get that.
      <reprompt/>
    </catch>

    <filled>

      <assign name="resultArray" expr="application.lastresult$"/>
      <assign name="resultLen" expr="resultArray.length"/>
      <goto nextitem="Bool_0"/>
    </filled>


  </field>

  <block name="B_1">
    <script>
      <![CDATA[
        function deleteElement(array, n) {
          var length = array.length;
          if (n >= length || n < 0)
            return;

          for (var i = n; i < length - 1; i++)
            array[i] = array[i + 1];

          array.length--;
        }

        deleteElement(resultArray, 0);
      ]]>
    </script>
  </block>

  <field name="Bool_0" type="boolean">
    <prompt>did you say  <value expr="resultArray[0].utterance"/>  ? </prompt>

    <filled>
      <if cond="(Bool_0 == false) &amp;&amp; (resultArray.length == 1)">
        <clear namelist ="Bool_0 F_1"/>
        <prompt>
          i am having trouble getting your response, mushmouth.
          lets start over, shall we?
        </prompt>
        <goto next="#F1"/>
      </if>


      <if cond="Bool_0==true">
        <prompt>
        Excellent, Brian Wilson will be proud.
        When he regains consciousness, that is.
        </prompt>

      <else/>
        <clear namelist="Bool_0"/>
        <goto nextitem="B_1"/>
      </if>

    </filled>
  </field>
</form>

<form id="F2">

  <block>
    <prompt>
      You are now free to turn your thoughts to less confusing
      channels, such as attempting to discern the true gender
      of janet reno.
  </prompt>
    <goto next="#F1"/>
  </block>

</form>
</vxml>




Oh yes, we are now seriously enabled VoiceXML coders. We can use our newly acquired NBest skills to determine accurate matches from just about any grammar. The world is your oyster...


Download the Code!

  Motorola Source Code


What we covered:







  ANNOTATIONS: EXISTING POSTS
bfoster63
7/21/2004 10:06 AM (EDT)
I get an "internal error" message when trying this code over the phone.
MattHenry
7/21/2004 2:33 PM (EDT)
Hi there,

Sorry about that. The problem lies with the fact that I didn't encode some operators within a conditional statement. A raw '&&' is not allowed inside an XML attribute, so each '&' should be written as the '&amp;' entity instead:

<if cond="(Bool_0 == false) &amp;&amp; (resultArray.length == 1)">


I'll see that this is corrected as soon as I can get to it; thanks for the heads-up.


~Matt


henryanelson
3/22/2006 12:23 PM (EST)
I simplified the JavaScript.


<?xml version="1.0" encoding="UTF-8" ?>


<vxml version="2.1">

  <meta name="maintainer" content="YOUREMAILADDRESS@HERE.com"/>

  <property name="maxnbest" value="5"/>
  <property name="com.nuance.rec.DoNBest" value="1" />


  <var name="resultLen"/>
  <var name="resultArray"/>
  <var name="currIndex"/>

<form id="F1">

  <field name="F_1" slot="F_1">
    <prompt bargein="true">
      what the heck were the beach boys talking about in
        that song? i bet you dont know either.
    </prompt>

    <property name="maxnbest" value="5"/>

    <grammar src="Lyrics.grammar#SONG" type="text/gsl"/>

    <catch event="nomatch noinput">Sorry, I didn't get that.
      <reprompt/>
    </catch>

    <filled>

      <assign name="resultArray" expr="application.lastresult$"/>
      <assign name="resultLen" expr="resultArray.length"/>
      <assign name="currIndex" expr="1"/>
      <goto nextitem="Bool_0"/>

    </filled>


  </field>

  <block name="B_1">
    <script>
      <![CDATA[
        currIndex++;
      ]]>
    </script>
  </block>

  <field name="Bool_0" type="boolean">
    <prompt>
    did you say  <value expr="resultArray[currIndex-1].utterance"/>  ?
    </prompt>

    <filled>
      <if cond="(Bool_0 == false) &amp;&amp; (currIndex >= resultArray.length)">
        <clear namelist ="Bool_0 F_1"/>
        <prompt>
          i am having trouble getting your response, mushmouth.
          lets start over, shall we?
        </prompt>
        <goto next="#F1"/>
      </if>


      <if cond="Bool_0==true">
        <prompt>
        Excellent, Brian Wilson will be proud.
        When he regains consciousness, that is.
        </prompt>

      <else/>
        <clear namelist="Bool_0"/>
        <goto nextitem="B_1"/>
      </if>

    </filled>
  </field>
</form>

<form id="F2">

  <block>
    <prompt>
      You are now free to turn your thoughts to less confusing
      channels, such as attempting to discern the true gender
      of janet reno.
  </prompt>
    <goto next="#F1"/>
  </block>

</form>
</vxml>


MattHenry
3/23/2006 10:27 PM (EST)


Hiya Henry,

Thanks man, you saved me some work. I had been meaning to retool the clunky JS in that tutorial for quite awhile, and just never got around to it. When we retool the docs for the Prophecy platform, I'll be sure to include your Code Stylings into the next-gen tutorial.

Cheers!

~Matthew Henry
hertanto
7/1/2008 8:57 PM (EDT)
How do we get 'goose poop', 'doo scoop', 'boot scoop', 'blue sloop'?
I want to be able to say those in the end instead of just "You are now free to turn your thoughts to less confusing channels, such as attempting to discern the true gender of janet reno."
voxeoJeffK
7/1/2008 11:29 PM (EDT)
Hi Hertanto,

I hope I'm understanding your question properly. Do you want to output the utterance that matched? For instance if the caller says 'blue sloop' and it is recognized within the active grammar then that is captured in the variable:

application.lastresult$.utterance

which is available to you for output. If I have misunderstood you please provide me with a few more details on what you are trying to accomplish, and I will do my best to help.

Regards,
Jeff Kustermann
Voxeo Support


© 2003-2008 Voxeo Corporation  |  Voxeo IVR  |  VoiceXML & CCXML IVR Developer Site