VoiceXML 2.1 Development Guide

Tutorial: GSL Subgrammars -- Chariot of the Gods?

This tutorial builds on concepts covered in Tutorials 1, 2, and 3. If you have not completed those tutorials, you will need to go through them first.


Step 1: Creating our initial VoiceXML structure


From toiling through the previous tutorials, we have learned to start every document with the XML and VoiceXML declarations as a matter of course. We have also learned how important it is to provide a <meta name="maintainer"> declaration for debugging purposes:


<?xml version="1.0" encoding="UTF-8"?>
<vxml version = "2.1">
<meta name="maintainer" content="yourEmail@here.com"/>
</vxml>



Step 2: Adding the voice recognition field

If we want to use a grammar, we should probably add a field construct so we can put our snazzy subgrammar knowledge to work. We have seen this sort of thing in our previous tutorials. But wait, we need a theme, don't we? Tutorials just aren't as much fun to work on unless we can keep ourselves amused along the way. So, let's arrange a pact with the B-Movie Actors' Guild, and structure our learning experience around the great (and not-so-great) washed-up B-movie actors from the days of yore. Our application will revolve around a caller voting for a favorite B-movie actor by providing a first, middle, and last name: a tasty, and yet workable, theme for a three-stage subgrammar.


<?xml version="1.0" encoding="UTF-8"?>
<vxml version = "2.1">
<meta name="maintainer" content="yourEmail@here.com"/>

<form id="Form1">
  <field name="F_1">
    <grammar src="MySubGrammar.gsl#FULLNAME" type="text/gsl"/>
    <prompt> Who is the greatest B movie actor of all time? </prompt>
    <!-- a <filled> inside a <field> triggers on that field, so no namelist is needed -->
    <filled>
      <prompt>
        Wow. Who would have thought that <value expr="F_1"/> actually had a fan base?
      </prompt>
    </filled>
  </field>
</form>
</vxml>


Step 3: Constructing the subgrammar modules

As we are notoriously eminent VoiceXML coders by now, we know we can write a complex grammar either inline (using CDATA to enclose the non-parseable characters) or as an external "grammar file." For this example, let's keep it simple and put our grammar in an external file. First, we create separate rulenames for our separate utterance categories (first, middle, and last name):


FIRSTNAME
[
[elvis]  {return("Elvis ")}
[george] {return("George ")}
[elisha] {return("Elisha ")}
[mister] {return("Mr. ")}
]

MIDDLENAME
[
[aaron]  {return("Aaron ")}
[see]     {return("C. ")}
[cook]    {return("Cook ")}
]

LASTNAME
[
[presley]  {return("Presley ")}
[scott]    {return("Scott ")}
[junior]    {return("Jr. ")}
[tee]      {return("T ")}
]
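
Notice that each entry above ends in a return command instead of filling a slot. For contrast, here is a minimal sketch of the two styles side by side. The rule name FLATNAME is hypothetical and is not part of this tutorial's grammar file; F_1 is assumed to be the field name.

```gsl
; Hypothetical flat rule: each match fills the field slot F_1 directly.
FLATNAME [
  [elvis]  {<F_1 "Elvis">}
  [george] {<F_1 "George">}
]

; Subgrammar style used in this step: each match hands a value back
; to whichever rule referenced FIRSTNAME, and fills no slot itself.
FIRSTNAME [
  [elvis]  {return("Elvis ")}
  [george] {return("George ")}
]
```

The second style leaves the decision about which slot to fill to a higher-level rule.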


Step 4: Assigning grammar slots

Okay, so now we have three rulenames defining three separate subgrammars. Of note is the fact that we are not using the field or slot name in any of these rules. Instead, we use the "return" command, a technique huge in Europe. Essentially, this is just a simple way of handing each match back to the rule that referenced it, and ultimately to a single "Top Level Rulename." Nummy. This "Top Level" rule will serve to tie all these independent rules together, and will also define the grammar slot that we want to return to our <field> upon a successful recognition event. Let's dig into that topic now, shall we?


NAME [
(FIRSTNAME:a ?MIDDLENAME:b LASTNAME:c)
{return(strcat($a strcat($b $c)))}
]


We have stacked another grammar construct on top of our existing set of three, and it bears an uncanny resemblance to 6th Century Punic. But wait a second, at least some of it looks familiar: this "NAME" rulename references all of our other subgrammars and seems to tack on some crazy alphabetical jive talk. Let's break it down so we don't have to fake the funk:


NAME  [     
(FIRSTNAME:a  ?MIDDLENAME:b  LASTNAME:c)
..
]


This assigns the results of the FIRSTNAME grammar to the grammar slot "a", MIDDLENAME to the grammar slot "b", and LASTNAME to the grammar slot "c". Note that we put the "?" operator in front of MIDDLENAME, making that subgrammar optional in case our B-movie actor is lacking in the middle-name department. Breaking multiple grammars into multiple slots is necessary, as we will want to concatenate the results into one grammar slot later on. While grammar slot definitions can be named whatever the heck we want, for purposes of simplicity, we are waxing alphabetical.
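
As a sketch of what the "?" operator buys us, the single sequence above should accept the same utterances as the two explicit alternatives below. NAME_LONGHAND is a hypothetical rule name used only for illustration; it is not part of the tutorial's grammar file.

```gsl
; Hypothetical longhand for (FIRSTNAME:a ?MIDDLENAME:b LASTNAME:c)
NAME_LONGHAND [
  (FIRSTNAME:a MIDDLENAME:b LASTNAME:c)  ; e.g. "elvis aaron presley"
  (FIRSTNAME:a LASTNAME:c)               ; e.g. "mister tee"
]
```

Either form also accepts any mix of entries, such as "george aaron tee", since nothing in the rules ties a particular first name to a particular last name.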


{return(strcat($a strcat($b $c)))}


Now we come to the grammar return, where things get a little tricky. Here we break out the "strcat" function, a cute little bit of syntax that lets us (you guessed it) concatenate the return values from slots 'a', 'b', and 'c' into one tidy string. Each strcat call here joins two values, so we nest one call inside the other to join all three. The result is then returned to the Top Level Rulename, formatted and ready for playback via text-to-speech.
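
To make the nesting concrete, here is a hedged trace, written as grammar comments, of the values involved for one utterance. It assumes the FIRSTNAME, MIDDLENAME, and LASTNAME rules from Step 3, with their trailing spaces.

```gsl
; Utterance: "elvis aaron presley"
;   FIRSTNAME  returns "Elvis "    -> $a
;   MIDDLENAME returns "Aaron "    -> $b
;   LASTNAME   returns "Presley "  -> $c
;   strcat($b $c)            gives "Aaron Presley "
;   strcat($a strcat($b $c)) gives "Elvis Aaron Presley "
NAME [
  (FIRSTNAME:a ?MIDDLENAME:b LASTNAME:c)
  {return(strcat($a strcat($b $c)))}
]
```

The trailing space in the result comes from the LASTNAME return strings. When the caller leaves out the middle name, $b is never set; the tutorial relies on the recognizer treating that as an empty value, which is worth confirming on your own platform.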

Step 5: Tying it all together

Now that we have all of our grammar components in place, all that remains is to add a Top Level Rulename to bring them all together. We can then send this single return value back to our VoiceXML code:


FULLNAME [
NAME:d {<F_1 $d>}
]

NAME [
(FIRSTNAME:a ?MIDDLENAME:b LASTNAME:c)
{return(strcat($a strcat($b $c)))}
]

FIRSTNAME [
[elvis] {return("Elvis ")}
[george] {return("George ")}
[elisha] {return("Elisha ")}
[mister] {return("Mr. ")}
]

MIDDLENAME  [
[aaron]  {return("Aaron ")}
[see] {return("C. ")}
[cook]  {return("Cook ")}
]

LASTNAME [
[presley]  {return("Presley ")}
[scott]    {return("Scott ")}
[junior]  {return("Jr. ")}
[tee]      {return("T ")}
]


Our Top Level Rulename, which is the rule that was referenced long ago in our grammar src, encloses our NAME rule, (which, if you remember, encloses all of our subgrammars). But wait, what is the wacky "$d" thing?

This is simply a form of shorthand for grabbing the concatenated string and returning it to our originating code: "NAME:d" stores the value returned by NAME in "d", and "{<F_1 $d>}" assigns that value to the slot F_1. Note: the slot name must match the name of the <field> in our VoiceXML document (F_1 here) in order to link these two entities together. Now all we have to do is upload our VoiceXML code and grammar file in order to give Mr. T the recognition he so rightly deserves.
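
As a sketch of how the slot and the field line up: if the field in the VoiceXML document were renamed, the slot in the top-level rule would have to be renamed with it. The field name "actor" below is hypothetical.

```gsl
; Assumes the VoiceXML field is declared as <field name="actor">
; and the grammar src still points at MySubGrammar.gsl#FULLNAME.
FULLNAME [
  NAME:d {<actor $d>}   ; slot name "actor" matches the field name
]
```

The variable name "d" is arbitrary, as long as the same name appears after the colon and after the dollar sign.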

Download the Code!

  Motorola source code




  ANNOTATIONS: EXISTING POSTS
moshe
6/7/2004 6:22 PM (EDT)
In order for the "name slot to match up with the return slot", the rule has to start like this:

.FULLNAME [
NAME:d {<F1 $d>}
]

and not <MyName $d> as in the example.

(And if I'm wrong, could we get an explanation of how the name slot matches up with the return slot?)
eosmann
11/1/2004 6:52 PM (EST)
It does appear that "NAME:d {<MyName $d>}" should read "NAME:d {<F_1 $d>}".  Correct?
Cheers.
MattHenry
11/2/2004 12:28 PM (EST)
Hi guys,

You both are as right as rain; thanks to all for the catch. The corrected version should be posted within the next few days.

~Matt
henryanelson
3/22/2006 9:27 AM (EST)
Does the F_1 reference the field name?
MattHenry
3/22/2006 2:38 PM (EST)

hello Henry,

I am assuming that you are speaking of the top level rule return in the grammar file:

FULLNAME [
NAME:d {<F_1 $d>}
]

..which does, in fact, reference the field name. You might check our docs on grammar slots for additional information on this topic:

http://docs.voxeo.com/voicexml/2.0/gslslots.htm

~Matt
danielvinson
2/13/2007 5:10 PM (EST)
I am attempting to adapt the code such that it recognises a PIN, i.e. 0-9 for between 5 and 8 digits.  It is proving harder than expected.  It seems to limit the number of options for each entry to 4.  Is this due to the rules?

Regards

Daniel
mikethompson
2/13/2007 5:54 PM (EST)
Hi Daniel,

You can save yourself a lot of trouble by using our builtin digits grammar for your PIN entry system.  You can check out all of our builtin grammars here:

http://docs.voxeo.com/voicexml/2.0/gslbuiltins.htm#start

Here's a quick example of using type="digits" with a maximum length of 8 and a minimum length of 5.

<field name="F_1" type="digits?minlength=5;maxlength=8">
  ...
  <filled>
    <prompt>
      your input was <value expr="F_1"/>.
    </prompt>
  </filled>
</field>

Hope this helps,
Mike Thompson
Voxeo Corporation
danielvinson
2/14/2007 3:34 AM (EST)
Hi Mike

Thanks for your reply.  What about if I wanted to recognise a postcode (ie a combination of letters and numbers) ?  I assume the digits function won't work.  What options do I have?

Daniel
jbassett
2/14/2007 7:53 AM (EST)
Hello,

Alphanumeric is *very* difficult to capture accurately....'B' sounds like 'D' sounds like '3', world without end. I'm not saying that All Hope Is Lost, simply attempting to make it clear that devising such a grammar is going to require some time devoted to testing, tuning, retesting, retuning, rinsing, repeating.


Here are some links you might find useful. If you would like to go to our support forums and open a ticket on this, I can give you an alphanumeric grammar example for Canadian zip codes.

http://docs.voxeo.com/voicexml/2.0/t_14_mot.htm
http://docs.voxeo.com/voicexml/2.0/t_20.htm
http://docs.voxeo.com/voicexml/2.0/w3cprops.htm

Thanks
Jesse Bassett
Voxeo Support
danielvinson
2/14/2007 8:43 AM (EST)
Hi Jesse

Thanks for the reply.  I had read before that alphanumeric is difficult.  I assume that limiting the number of options for each character will improve the accuracy.

Forgetting about the accuracy problems for the moment, do you think it is possible to adapt the above code (first, middle and last names) to ask for a postcode?  If so, can you give me a hint?

Thanks

Daniel
MattHenry
2/14/2007 1:47 PM (EST)


Daniel,


The code that was attached *is* for alphanumeric zip code capture, so I am a bit confused as to your question....please advise as to what I am not understanding.

~Matt
georgelai
6/26/2007 4:22 PM (EDT)
I was just wondering... instead of using subGrammars, could I resend a voice wav file to a grammar xml file to decode different parts of the wav file?

For example, I have a voice wav file saying...

"quick brown fox jumped over the lazy dog"

and I have grammar code:


PARTOFSENTENCE
[
PART:b {<F_1 $b>}
]

PART [
(WORD:a)
{return($a)}
]

WORD
[
[quick] {return("quick ")}
[fox] {return("fox ")}
[brown] {return("brown ")}
[jumped] {return("jumped ")}
[over] {return("over")}
[lazy] {return("lazy")}
[dog] {return("dog")}
]


Is there a way to run the voice wav file through the grammar check so that when the first word "quick" is recognized, some code would "chop" the wav file and remove "quick" from it (or ignore it), and then run the wav through the grammar again to recognize "brown", and so on?

So instead of:

"quick brown fox jumped"

one could potentially also say:

"brown dog jumped over lazy fox"

and both utterances would be converted to text.

Maybe I am asking for an impossible approach, but this way a whole sentence could be parsed into text without having discrete positions for words.  True, it would be slow for long phrases...
voxeojeff
6/26/2007 7:04 PM (EDT)
Hello,

Technically, what you are asking for is possible, and would require a complex grammar file.  However, before we steer you down this path, we would like to know what exactly you are trying to accomplish here, in detail.  If you would be so kind, can you explain the purpose and flow of your application?

Thank you,

Jeff Menkel
Voxeo Corporation
parit
10/11/2007 6:04 AM (EDT)
hi,

I am using n-Best list along with an external grammar file and having a conceptual problem. Grammar file is as follows:

SEARCHSTRING [
STRING:a <id $a>
]

STRING [
(NAMES:b ?NAMES:c) {return(strcat($b $c))}
]

NAMES [
[hit] <return("hit")>
[sit] <return("sit")>
]

Now when I say "hit" the engine also recognises "sit" with some probability (because I have allowed it to return at most 5 results). In that case what should be the slot value of b and ultimately of a? Please share your comments.
MattHenry
10/11/2007 11:35 AM (EDT)


Hi there,

I am not sure that this grammar is going to fill slots b and c at all: it appears as if you are trying to fill multiple sub-slots from the same utterance value, and I don't think that this is going to work at all using the method that you are using.....my gut reaction is that this will throw a sizable grammar error when you attempt to run it. Perhaps you can try and give me an idea of what your end goal is in this regard so that I can better understand what you are trying to achieve here?


~Matt
mikejstein
10/17/2007 6:54 PM (EDT)
I'm trying to put a subgrammar together, that would take both text and numbers.  I've got to handle any number, so I can't use a text grammar for it.  So if somebody called up and said "BUY 25400", or "SELL 193", it'd work just the same.

I haven't seen anything on mixing grammar types in a subgrammar.  Any ideas?
MattHenry
10/17/2007 9:45 PM (EDT)


Hi there,

Assuming that you aren't trying to mix dtmf and voice input, this should be pretty do-able. A good starting point for you would be to take a gander at the downloadable subgrammars at the link below:

http://evolution.voxeo.com/library/grammar/library.jsp

Specifically, eyeball the one titled "numbers2sixteen.grammar", as this contains all the number grammars that you need without having to reinvent the wheel. Assuming that you use the 4 digit string as your low-level rule, you'd simply define your text rule in the file, and then concatenate the results together at the top level rulename, much like we do in this humble tutorial:


MY_TOP_RULE [
MY_SUBRULE:z {<mySlotName $z>}
]

MY_SUBRULE [
(
  MYNUMBER:d1
  MYTEXT:d2
) {return(strcat($d1 $d2))}
]

Hope this helps!

~Matt
mzelik
4/4/2008 8:59 AM (EDT)
I am getting the following error when testing the downloaded code:

00142    bd50    12:49:44 AM
===========================
An error occurred while executing the following dialog.
Initial URL1: http://mike.softrite.com/subG.xml
Initial URL2: null
Initial URL3: null
Current URL: http://mike.softrite.com/subG.xml?session.callerid=9546104367&session.calledid=4079929967&session.sessionid=e6c15423f50779f419c06c1d6ad4bd50&session.parentsessionid=95b6b35aa44a6ff53bfa9d652cd607b0
Calling Number (ANI): 9546104367
Called Number (DNIS): 4079929967
Redirecting Number (RDNIS): ""
State: Form1
VoiceXML Browser Version: 8.0.27086
Date/Time: 2008/4/4 12:49:45.796
VoiceException: error.badfetch.http.404
Failed fetch with code: 404 (Not Found), URL: http://mike.softrite.com/gsl/MySubGrammar.gsl
Dialog stack trace:
State (Dialog)    URL (Document)
--------------    ------------------------------
Form1             http://mike.softrite.com/subG.xml?session.callerid=9546104367&session.calledid=4079929967&session.sessionid=e6c15423f50779f419c06c1d6ad4bd50&session.parentsessionid=95b6b35aa44a6ff53bfa9d652cd607b0

The error message is from the Voxeo application debugger. I am using the downloaded code from Voxeo, except for the location of the GSL file. I changed the location to see if that was the problem, but got the same error. I know the GSL file is on the server.


I am just learning VoiceXML and vxml for that matter. I looked at the code and cannot see any errors. I used a different WEB server and received the same error.

I am stumped. Please advise.

Thanks.


voxeojeremyr
4/4/2008 9:09 AM (EDT)
Hi,

From the error, you are receiving an HTTP error code of 404, which means that the page it is looking for cannot be found.

error.badfetch.http.404 Failed fetch with code: 404 (Not Found), URL: http://mike.softrite.com/gsl/MySubGrammar.gsl

I tried to get to that URL but I was unable to.  It may be behind a firewall which is why you are getting that error.

Thanks,
Jeremy Richmond
Voxeo Support
mzelik
4/4/2008 9:54 AM (EDT)
I have made some headway on the problem.

After trying many things, I found that when I rename the file with an XML extension and change the code accordingly, it works!

I am not sure what is causing the problem, but this at least will allow me to continue on the training material.

Thanks for your assistance.



© 2003-2008 Voxeo Corporation  |  Voxeo IVR  |  VoiceXML & CCXML IVR Developer Site