Sunday, September 23, 2007

Voice Extensible Markup Language (VoiceXML)

Referred by: http://www.w3.org/TR/voicexml20/

Goals of VoiceXML
VoiceXML's main goal is to bring the full power of Web development and content delivery to voice response applications, and to free the authors of such applications from low-level programming and resource management. It enables integration of voice services with data services using the familiar client-server paradigm. A voice service is viewed as a sequence of interaction dialogs between a user and an implementation platform. The dialogs are provided by document servers, which may be external to the implementation platform. Document servers maintain overall service logic, perform database and legacy system operations, and produce dialogs. A VoiceXML document specifies each interaction dialog to be conducted by a VoiceXML interpreter. User input affects dialog interpretation and is collected into requests submitted to a document server. The document server replies with another VoiceXML document to continue the user's session with other dialogs.
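The request/response cycle described above can be sketched as a minimal VoiceXML 2.0 document. This is an illustrative sketch only: the grammar file drink.grxml and the server URL http://www.example.com/drink are hypothetical placeholders, not part of the specification text quoted here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <!-- One interaction dialog: play a prompt and collect the caller's answer. -->
    <field name="drink">
      <prompt>Would you like coffee, tea, or milk?</prompt>
      <!-- Hypothetical external grammar listing the accepted answers. -->
      <grammar src="drink.grxml" type="application/srgs+xml"/>
    </field>
    <!-- Once the field is filled, send the collected input to the document
         server (hypothetical URL). The server is expected to reply with
         another VoiceXML document that continues the session. -->
    <block>
      <submit next="http://www.example.com/drink"/>
    </block>
  </form>
</vxml>
```

The interpreter runs the dialog locally and contacts the document server only at the submit step, which is the client-server split the paragraph describes.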
VoiceXML is a markup language that:
Minimizes client/server interactions by specifying multiple interactions per document.
Shields application authors from low-level and platform-specific details.
Separates user interaction code (in VoiceXML) from service logic (e.g. CGI scripts).
Promotes service portability across implementation platforms. VoiceXML is a common language for content providers, tool providers, and platform providers.
Is easy to use for simple interactions, and yet provides language features to support complex dialogs.
While VoiceXML strives to accommodate the requirements of a majority of voice response services, services with stringent requirements may best be served by dedicated applications that employ a finer level of control.
Scope of VoiceXML
The language describes the human-machine interaction provided by voice response systems, which includes:
Output of synthesized speech (text-to-speech).
Output of audio files.
Recognition of spoken input.
Recognition of DTMF input.
Recording of spoken input.
Control of dialog flow.
Telephony features such as call transfer and disconnect.
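As a rough illustration, the capabilities listed above might appear together in one small document. The audio file welcome.wav and the telephone number are assumptions made up for this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Dialog flow: a menu that accepts either speech or DTMF. -->
  <menu id="main">
    <prompt>
      <!-- Audio file output; the enclosed text is synthesized
           if the (hypothetical) file cannot be played. -->
      <audio src="welcome.wav">Welcome to the message line.</audio>
      Press 1 or say message to leave a message.
      Press 2 or say operator to speak to an operator.
    </prompt>
    <choice dtmf="1" next="#leave_message">message</choice>
    <choice dtmf="2" next="#operator">operator</choice>
  </menu>

  <!-- Recording of spoken input, followed by a disconnect. -->
  <form id="leave_message">
    <record name="msg" beep="true" maxtime="30s" dtmfterm="true">
      <prompt>Please record your message after the beep.</prompt>
    </record>
    <block>
      <prompt>Thank you. Goodbye.</prompt>
      <disconnect/>
    </block>
  </form>

  <!-- Telephony: bridged call transfer to a made-up number. -->
  <form id="operator">
    <transfer name="call" dest="tel:+15555550100" bridge="true">
      <prompt>Connecting you now.</prompt>
    </transfer>
  </form>
</vxml>
```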
The language provides means for collecting character and/or spoken input, assigning the input results to document-defined request variables, and making decisions that affect the interpretation of documents written in the language. A document may be linked to other documents through Uniform Resource Identifiers (URIs).
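A small sketch of these three ideas (collecting input into a variable, branching on it, and linking to other documents by URI) could look like the following. The grammar yesno.grxml and the target documents balance.vxml and goodbye.vxml are hypothetical, and the sketch assumes the grammar returns the string 'yes' or 'no' as the field value.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="confirm">
    <!-- The recognition result is assigned to the field variable "answer". -->
    <field name="answer">
      <prompt>Do you want to hear your balance? Say yes or no.</prompt>
      <grammar src="yesno.grxml" type="application/srgs+xml"/>
      <filled>
        <!-- Decision based on the collected input (assumed to be 'yes' or 'no'). -->
        <if cond="answer == 'yes'">
          <!-- Link to another document through a relative URI. -->
          <goto next="balance.vxml"/>
        <else/>
          <goto next="goodbye.vxml"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```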
Principles of Design
VoiceXML is an XML application [XML].
The language promotes portability of services through abstraction of platform resources.
The language accommodates platform diversity in supported audio file formats, speech grammar formats, and URI schemes. While producers of platforms may support various grammar formats, the language requires a common grammar format, namely the XML Form of the W3C Speech Recognition Grammar Specification [SRGS], to facilitate interoperability. Similarly, while various audio formats for playback and recording may be supported, the audio formats described in Appendix E must be supported.
The language supports ease of authoring for common types of interactions.
The language has well-defined semantics that preserve the author's intent regarding the behavior of interactions with the user. Client heuristics are not required to determine document element interpretation.
The language recognizes semantic interpretations from grammars and makes this information available to the application.
The language has a control flow mechanism.
The language enables a separation of service logic from interaction behavior.
It is not intended for intensive computation, database operations, or legacy system operations. These are assumed to be handled by resources outside the document interpreter, e.g. a document server.
General service logic, state management, dialog generation, and dialog sequencing are assumed to reside outside the document interpreter.
The language provides ways to link documents using URIs, and also to submit data to server scripts using URIs.
VoiceXML provides ways to identify exactly which data to submit to the server, and which HTTP method (GET or POST) to use in the submittal.
The language does not require document authors to explicitly allocate and deallocate dialog resources, or deal with concurrency. Resource allocation and concurrent threads of control are to be handled by the implementation platform.
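Two of the principles above, the common SRGS XML grammar format and explicit control over what is submitted and how, can be sketched in a single form. The server URL http://www.example.com/order and the variable names are assumptions chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <!-- A local variable that is deliberately left out of the namelist below. -->
    <var name="source" expr="'phone'"/>

    <field name="size">
      <prompt>What size would you like: small, medium, or large?</prompt>
      <!-- Inline voice grammar in the XML Form of SRGS. -->
      <grammar mode="voice" xml:lang="en-US" version="1.0" root="size">
        <rule id="size" scope="public">
          <one-of>
            <item>small</item>
            <item>medium</item>
            <item>large</item>
          </one-of>
        </rule>
      </grammar>
    </field>

    <field name="quantity">
      <prompt>Press 1, 2, or 3 for the quantity.</prompt>
      <!-- Inline DTMF grammar, also in the XML Form of SRGS. -->
      <grammar mode="dtmf" version="1.0" root="qty">
        <rule id="qty" scope="public">
          <one-of>
            <item>1</item>
            <item>2</item>
            <item>3</item>
          </one-of>
        </rule>
      </grammar>
    </field>

    <block>
      <!-- namelist names exactly which variables are sent; method selects
           the HTTP method. The URL is a hypothetical placeholder. -->
      <submit next="http://www.example.com/order"
              namelist="size quantity" method="post"/>
    </block>
  </form>
</vxml>
```

Nothing in the document allocates a recognizer, a synthesizer, or a call leg; the platform is expected to manage those resources, in line with the last principle.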