How do agents work?

As we mentioned previously, an agent helps you process user input into structured data that you can use to return an appropriate response. You define this behavior in one or more intents, each of which maps user input to a corresponding response.

Let's take a look at an intent to get a better idea of how it works. A basic intent consists of the following components, which are also shown in a short code sketch after this list:

  • Training phrases: Define example phrases of what users might say. Dialogflow uses these training phrases and naturally expands them into many more similar phrases to build a language model that accurately matches user input. Through further training and machine learning, Dialogflow builds a more robust and varied language model to better match user input.

  • Action and Parameters: To improve an intent's language model, you can also annotate your training phrases with entities, or categories of data that you want Dialogflow to match. This lets you tell Dialogflow to match a particular type of input rather than only the literal text of the user's utterance. Dialogflow extracts matched entities as parameters from the training phrases. You can then process these parameters in logic called fulfillment to further customize a response to the user. You'll learn about fulfillment later in this document.

  • Responses: Define a text, speech, or visual response to return to the user, which usually prompts users in a way that lets them know what to say next or that the conversation is ending. To send responses, you can use Dialogflow's built-in response handler or call fulfillment to process the extracted data and return a response back to Dialogflow.
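
To make the three components concrete, here is a minimal sketch of how they map onto the Dialogflow V2 API, using the google-cloud-dialogflow Python client. The project ID, intent name, phrases, and the use of the built-in @sys.color entity are illustrative assumptions, not part of any particular agent:

```python
from google.cloud import dialogflow  # google-cloud-dialogflow (V2 API)


def create_order_shirt_intent(project_id: str):
    """Sketch: create an intent with training phrases, a parameter, and a response.

    All display names and phrases below are examples chosen for illustration.
    """
    intents_client = dialogflow.IntentsClient()
    parent = dialogflow.AgentsClient.agent_path(project_id)

    # Training phrases: one example utterance; the word "red" is annotated
    # with the built-in @sys.color entity so Dialogflow learns to extract it.
    training_phrase = dialogflow.Intent.TrainingPhrase(
        parts=[
            dialogflow.Intent.TrainingPhrase.Part(text="I want to buy a "),
            dialogflow.Intent.TrainingPhrase.Part(
                text="red", entity_type="@sys.color", alias="color"
            ),
            dialogflow.Intent.TrainingPhrase.Part(text=" shirt"),
        ]
    )

    # Action and parameters: the annotated entity becomes a parameter that
    # responses and fulfillment can reference as $color.
    color_parameter = dialogflow.Intent.Parameter(
        display_name="color",
        entity_type_display_name="@sys.color",
        value="$color",
    )

    # Responses: a simple static text response served by the built-in handler.
    message = dialogflow.Intent.Message(
        text=dialogflow.Intent.Message.Text(
            text=["Okay, a $color shirt. Anything else?"]
        )
    )

    intent = dialogflow.Intent(
        display_name="order.shirt",
        training_phrases=[training_phrase],
        parameters=[color_parameter],
        messages=[message],
    )
    return intents_client.create_intent(request={"parent": parent, "intent": intent})
```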

Now let's see how these three major parts of an intent work together.

  1. When a user says something, referred to as an utterance, your agent matches the utterance to an appropriate intent; this matching is known as intent classification. An intent is matched if the language model for that intent can closely or exactly match the user utterance. You define the language model by specifying training phrases, or examples of things users might want to say. Dialogflow takes these training phrases and expands upon them to create the intent's language model.

  2. Once your agent matches an intent, it extracts the parameters you need from the utterance. These can be a color, name, date, or a host of other data categories called entities. Dialogflow defines a large set of entities that categorize extracted parameters, or you can create your own. You also define what to extract in your training phrases by annotating the specific parts that correspond to the parameters you want.

  3. You then send a response that either prompts users for more information to continue the conversation or ends the conversation. If more information is required, this back-and-forth happens again: your agent matches a user utterance with an intent, extracts parameters, and returns a response. Dialogflow includes an easy-to-use response handler to return simple, usually static responses. If you want to return more tailored responses, you can use logic called fulfillment to process any extracted parameters and return a response that is more dynamic or useful. One full turn of this loop is sketched in code after this list.
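
The following sketch shows one turn of that loop from the client side, using the SessionsClient from the same google-cloud-dialogflow Python library. Each call sends a single utterance to your agent, and the returned query result carries the matched intent, the extracted parameters, and the response text. The project ID, session ID, and example utterance are placeholders:

```python
from google.cloud import dialogflow  # google-cloud-dialogflow (V2 API)


def detect_intent_text(project_id: str, session_id: str, text: str,
                       language_code: str = "en"):
    """Sketch: send one user utterance and inspect what the agent matched."""
    session_client = dialogflow.SessionsClient()
    session = session_client.session_path(project_id, session_id)

    query_input = dialogflow.QueryInput(
        text=dialogflow.TextInput(text=text, language_code=language_code)
    )
    response = session_client.detect_intent(
        request={"session": session, "query_input": query_input}
    )

    result = response.query_result
    print("Matched intent:", result.intent.display_name)  # step 1: intent classification
    print("Extracted parameters:", result.parameters)     # step 2: parameter extraction
    print("Response to user:", result.fulfillment_text)   # step 3: response
    return result


# Hypothetical turn: with an intent like the one sketched earlier, this should
# match the shirt-ordering intent and extract "blue" as the color parameter.
# detect_intent_text("my-project-id", "session-123", "I want to buy a blue shirt")
```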

In the next section, you'll learn more about fulfillment and how to use it to do cool things with the parameters that Dialogflow extracted.