Resource Library

Jargon Support, Concepts, Documentation, and More

Need Additional Support?

We're here for you and happy to answer questions. Don't hesitate to reach out!

Contact Support


A speech component, also referred to as a prompt, is a message that is spoken back to the user. It can be any combination of text-to-speech and audio files. In Jargon, text-to-speech content can be authored and marked up with SSML tags, and audio files can be inserted into a speech component.


A reprompt is a second speech component that is played when the user needs another prompt to take action. For example, if the user does not respond in a timely manner, a reprompt can clarify the initial prompt by giving examples of what the user can say.


A card can be used to enhance the voice experience by delivering value-add content relevant to the voice experience. Jargon supports platform-specific requirements for visuals, such as Alexa Presentation Layer (APL) and Google Directives.


Strings provide additional ways to include content and are suitable for the following use cases:

When your content doesn't fit Jargon's basic format of response components (speech, reprompt, card, etc.). In this use case, the application calls the string directly instead of referencing it within a response.

When your response consists of two or more parts strung together, each pulling randomly from a predefined list of options, analogous to how a slot machine works. For example, the string may consist of the parts [congratulations] and [another] with content variations defined for each, so that the randomized response will be "Well done! Ready for another?" or "Nice job! Would you like another?" or however many combinations can be made from the number of variations provided for each part. It's similar to variants, but without needing to write out each possible combination.
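The slot-machine behavior described above can be sketched in Python. This is only an illustration of the idea, not Jargon's implementation or API: the `PARTS` table, `render_string`, and `count_combinations` are hypothetical names, and the variations are the ones from the example in the text.

```python
import random

# Hypothetical part definitions; in Jargon these would be authored as content.
PARTS = {
    "congratulations": ["Well done!", "Nice job!"],
    "another": ["Ready for another?", "Would you like another?"],
}


def render_string(part_names, parts=PARTS, rng=random):
    """Pick one variation for each part and join the picks with spaces."""
    return " ".join(rng.choice(parts[name]) for name in part_names)


def count_combinations(part_names, parts=PARTS):
    """Number of distinct outputs: the product of the variation counts."""
    total = 1
    for name in part_names:
        total *= len(parts[name])
    return total


if __name__ == "__main__":
    print(render_string(["congratulations", "another"]))
    print(count_combinations(["congratulations", "another"]))
```

With two variations for each of the two parts, four combinations are possible, yet only four short phrases had to be written rather than four full sentences.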


Objects provide a way for an application to access structured data. Much like Strings, they can be inserted into a response or accessed by the application directly. They can serve a number of purposes, but aren't something that every application needs.

A common use for objects is to store Alexa Directives, such as APL documents, that are used in multiple responses and incorporated via cross-reference.

Media Assets

Media assets, such as audio clips or images, that are hosted outside of Jargon can be referenced in voice app responses. Jargon stores the URL paths for these assets so that they are easy to catalog and reference. Assets are organized into collections: each collection has a parent path URL and holds the individual assets, which users can then insert into any component.
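One way to picture a collection is as a parent URL plus a set of named relative paths. The sketch below is a hypothetical model written for illustration; the `Collection` class, its methods, and the `cdn.example.com` URL are assumptions, not part of Jargon.

```python
class Collection:
    """Hypothetical model of a media collection: a parent URL plus named assets."""

    def __init__(self, name, parent_url):
        self.name = name
        # Normalize so the parent URL always ends with exactly one slash.
        self.parent_url = parent_url.rstrip("/") + "/"
        self.assets = {}

    def add_asset(self, key, relative_path):
        """Register an asset by its path relative to the parent URL."""
        self.assets[key] = relative_path.lstrip("/")

    def url_for(self, key):
        """Return the full URL of a registered asset."""
        return self.parent_url + self.assets[key]


if __name__ == "__main__":
    sounds = Collection("sounds", "https://cdn.example.com/audio")
    sounds.add_asset("chime", "effects/chime.mp3")
    print(sounds.url_for("chime"))
```

Storing only the relative path per asset means that moving the hosted files changes a single parent URL rather than every reference.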

How to Assemble Voice App Response Components

A voice app response can include multiple components, including text-to-speech content, audio files, and visuals.

Jargon assembles the various components and delivers them to the application as a single response. This saves the developer from the repetitive work of stitching together multiple response components. It also allows for increasing the complexity of the response over time without added development work. For example, a response might only have one simple speech component to start. Over time, other components and variants (the same message said a different way) can be added without having to change the code.
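The benefit of receiving one assembled response can be sketched as follows. The `Response` class, the `assemble` function, and the field names are hypothetical and chosen only to mirror the components named in this document; Jargon's actual delivery format may differ.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Response:
    """Hypothetical response: speech is required, other components are optional."""
    speech: str
    reprompt: Optional[str] = None
    card: Optional[dict] = None


def assemble(response):
    """Return a single payload containing only the components that are present."""
    return {key: value for key, value in asdict(response).items() if value is not None}


if __name__ == "__main__":
    # Version 1 of the content: a single speech component.
    simple = Response(speech="Welcome back!")
    # A later version adds a reprompt and a card; the calling code is unchanged.
    richer = Response(
        speech="Welcome back!",
        reprompt="You can say 'start' to begin.",
        card={"title": "Welcome"},
    )
    print(assemble(simple))
    print(assemble(richer))
```

The application calls `assemble` the same way in both cases, which is the point made above: content can grow in complexity without a code change.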

How to Add Variety to Voice App Responses

For voice content to be engaging, it is best practice to add variety to voice app responses. Providing multiple variants of the same message, each phrased differently, keeps the voice app from sounding repetitive. In Jargon, each response can contain variants, one of which is selected at random.
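Random variant selection can be sketched in a few lines. The `pick_variant` function is hypothetical, and skipping the most recently used variant is an extra assumption added here for illustration; the text above only states that variants are randomly selected.

```python
import random


def pick_variant(variants, last=None, rng=random):
    """Pick a random variant, avoiding the previous one when an alternative exists."""
    candidates = [v for v in variants if v != last]
    if not candidates:
        # Only one variant is available, so repeating it is unavoidable.
        candidates = list(variants)
    return rng.choice(candidates)


if __name__ == "__main__":
    greetings = ["Welcome back!", "Good to see you again!", "Hello again!"]
    previous = None
    for _ in range(3):
        previous = pick_variant(greetings, last=previous)
        print(previous)
```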

Speech Editor

A voice response can be edited using Jargon's speech editor. Audio clips and dynamic variables can be inserted into text-to-speech content, and SSML tags can be added to direct the pronunciation of the synthetic voice. A simulator can play back the response for an immediate quality check in any of several available synthetic voices. Use the “Compare it” button to compare changes made in the speech editor.

WYSIWYG editor for SSML

Speech Synthesis Markup Language (SSML) can be added to text-to-speech responses to control the sound and delivery of the synthetic voice. For example, pauses, emphasis, prosody, and other speech effects can be tagged using SSML. Jargon simplifies this with its built-in SSML editor and simulator. In addition, Jargon catches and reports SSML errors as you edit. When you make a syntax error, Jargon displays an error message prompting you to fix it. As part of this safeguard, a new release cannot be created until all errors have been resolved, which catches issues early and prevents costly iterations later in the development cycle.
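To make the idea of an SSML syntax check concrete, here is a minimal sketch that tests only whether the markup is well-formed XML, using Python's standard `xml.etree.ElementTree` parser. It is not Jargon's validator and does not check SSML-specific rules; `check_ssml` and `can_release` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def check_ssml(ssml: str) -> Optional[str]:
    """Return an error message if the markup is not well-formed XML, else None."""
    try:
        ET.fromstring(ssml)
    except ET.ParseError as err:
        return f"SSML syntax error: {err}"
    return None


def can_release(responses: dict) -> bool:
    """Allow a release only when every response passes the syntax check."""
    return all(check_ssml(ssml) is None for ssml in responses.values())


if __name__ == "__main__":
    good = '<speak>Hello <break time="500ms"/> <emphasis level="strong">world</emphasis></speak>'
    bad = "<speak><emphasis>Hello world</speak>"  # <emphasis> is never closed
    print(check_ssml(good))
    print(check_ssml(bad))
    print(can_release({"welcome": good, "goodbye": bad}))
```

Blocking the release while any response fails the check mirrors the safeguard described above: the error surfaces at authoring time rather than in a deployed voice app.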


Variables, otherwise known as parameters, are dynamic elements that the application fills. Variable names can be managed in Jargon and configured within a speech component. A variable in Jargon is defined using "{ }" syntax. For example, a response might contain a user’s first name in the following sentence as "Hello {userFirstName}, welcome to Jargon."
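On the application side, filling a variable amounts to substituting each "{name}" placeholder with a supplied value. The sketch below shows one way this could work; `fill_variables` is a hypothetical helper, and raising an error for a missing value is an assumption rather than documented Jargon behavior.

```python
import re

# Matches placeholders such as {userFirstName}.
VARIABLE = re.compile(r"\{(\w+)\}")


def fill_variables(template, values):
    """Replace each {name} placeholder with the matching entry in values."""

    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for variable {name!r}")
        return str(values[name])

    return VARIABLE.sub(replace, template)


if __name__ == "__main__":
    template = "Hello {userFirstName}, welcome to Jargon."
    print(fill_variables(template, {"userFirstName": "Ada"}))
```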


Some pieces of content are reused many times throughout a voice experience. To maintain consistency and simplify the management of these phrases, names, or entities, Jargon has the concept of snippets. A snippet can be any piece of content that is edited once and referenced by many different voice responses. For example, if the synthetic voice always mispronounces the name of your company, a snippet can be created once, marked up with SSML tags, and referenced by many different responses. The same goes for commonly used phrases, like your company’s tagline or product descriptions. A snippet in Jargon is defined using "[ ]" syntax.
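Snippet references can be pictured as a lookup that swaps each "[name]" for shared content. The following is a minimal sketch under stated assumptions: the `SNIPPETS` table, the `expand_snippets` function, and the pronunciation alias are all hypothetical, and only a single level of expansion is handled.

```python
import re

# Hypothetical snippet library: each entry is edited once and reused everywhere.
SNIPPETS = {
    "companyName": '<sub alias="jar gun">Jargon</sub>',
    "tagline": "Voice content made simple.",
}

# Matches references such as [companyName].
SNIPPET = re.compile(r"\[(\w+)\]")


def expand_snippets(text, snippets=SNIPPETS):
    """Replace each [name] reference with the content of the named snippet."""
    return SNIPPET.sub(lambda match: snippets[match.group(1)], text)


if __name__ == "__main__":
    print(expand_snippets("Thanks for using [companyName]."))
    print(expand_snippets("[companyName]: [tagline]"))
```

Because every response refers to the snippet by name, correcting the pronunciation markup in one place updates all responses that use it.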
