Use data to improve agents

Dialogflow's Training feature provides tools to improve the performance of your conversational agent using data collected from both general usage and external sources. For example, you could train your agent using existing logs of customer service interactions between clients and human agents.

Training helps you improve your agent's precision in matching user inputs to the correct intent. It also lets you add new intents based on what customers ask for, which extends your agent's coverage.

Training is often done to improve the performance of an existing Dialogflow agent. However, it can also be used to help bootstrap a newly created agent. To begin with, we'll focus on how to train an existing Dialogflow agent.

Train an existing agent

Unless you have disabled logging via Settings, your Dialogflow agent stores a record of all the conversations it has had. The Training tool allows you to view all of the inbound customer requests, along with the intents that they were matched to. The example below uses our fulfillment-bike-shop-nodejs sample.

Workflow for training an agent

The Training tool shows a list of conversations in chronological order.

Each conversation is a log of inbound requests from a particular customer, grouped by time. The first user query of the conversation is displayed in the table. If the same customer talks with your agent on two separate occasions, the logs will show up as two distinct conversations.

To train your agent, start by selecting one of the conversations from the list. You'll be presented with a list of user requests. These represent all of the user's inputs to the agent. The actual user input is given in the box next to USER SAYS. The agent's responses are not shown in Training, but you can view them on the History page.

For each request in the conversation you can perform multiple tasks:

  • Add the request as a training phrase to an intent.
  • Add the request as a training phrase to the Default Fallback Intent.
  • Add and edit entity annotations for the request.
  • Delete the request from the training view.

Step through the requests in a conversation. For each request, look at the intent that was matched. You can use your understanding of the agent's intents to decide whether a given request was correctly matched.

For example, suppose a request has been matched to the wrong intent. You can click the intent name and assign the request as a training phrase for the appropriate intent, so that similar requests are correctly matched in the future.

If a request was matched to the correct intent, you can choose to add it as a training phrase to that intent. This is recommended to improve the accuracy of your agent's matching. To add the request as a training phrase, click the check mark next to the request.

Sometimes, a request is mistakenly matched to an intent when it should have been treated as an unknown input. You can train Dialogflow to treat these requests as irrelevant by adding them to the Default Fallback Intent, or any other fallback intent that is in context at the time.

You can also use the training view to edit entity annotations. For example, if an entity value was missed, you can annotate the request so that it is captured correctly in the future. If you identify new possible entity values during training, you should also add the word or word sequence to the corresponding entity type.

The actions you take during training will not be applied to your agent until you click APPROVE at the top right of the training view. Once approved, your agent will begin training, and you will be notified when it is complete. In the Training tool, the conversations that have been used for training will be marked with a green check mark.

When to train your agent

Training is critical to building an agent that can handle most requests. You should train your agent at the following times:

After testing your initial design with a limited number of users

Once you've built the first iteration of your agent, test with a small group of users (for example, a QA team) by having them engage in natural conversation with your agent. This process will result in conversation logs that you can use for training.

Since your initial iteration is likely to see a large number of unmatched requests, starting with a small group of users keeps the volume of training work manageable. Once you incorporate the new training phrases, the precision of your agent's matching should improve substantially.

To easily share your agent with testers, you can use Dialogflow's web widget, or use integrations to connect your agent with whatever instant messaging platform is already used by your team.

After your agent has been used in production for a short time

Real-world usage is likely to produce requests that differ significantly from those generated by your initial test users. Training your agent after it has been in production for a short while lets you incorporate these requests and improve its precision under real usage.

On a periodic basis once your agent is working in production

As users communicate with your agent, it may encounter new inputs that are not correctly matched. By periodically visiting the Training tool, you can incorporate these requests and continually improve your agent.

Build an agent from existing data

In many cases, you might have access to pre-existing conversation logs. For example, you may have logs of past conversations with human customer service agents. In this section, we'll explain how to process this data and use it to train your Dialogflow agent.

Why use existing data?

Because the matching precision of a Dialogflow agent depends on the quality of its training data, real-world conversation logs are an effective way to improve performance.

How does Dialogflow use existing data?

Conversation logs can be uploaded into Dialogflow's Training tool, where they are listed in the same way as request logs that are collected from regular agent usage. Once uploaded, you can use the training workflow outlined above to incorporate the logs into your agent.

What kind of data should you collect?

You can collect conversation logs from any source, as long as the interactions you collect are relevant to your agent's intents. Since every log you upload to Dialogflow will appear in the Training view, you should take care to only upload logs that could potentially be incorporated into your agent as training phrases.

Logs should be in the form of plain text, and you should only upload the customer's side of the conversation. Since Training only involves user requests, there is no benefit to uploading logs of what the customer service agent says.

The following can often be useful sources of data:

  • Logs of conversations with human customer service agents.
  • Online customer support requests (email, forums, FAQs).
  • Customer questions on social media.

You should avoid the following types of data:

  • Long-form, non-conversational requests.
  • Multi-line requests.
  • Requests that are not relevant to any of the intents in your agent.
  • Logs of things said by non-customers (for example, responses from customer service agents).
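The first two items in the list above can be screened for automatically before you review the data by hand. The following Python snippet is a minimal sketch, not part of Dialogflow: the function name is_usable_request and the 30-word cutoff for "long-form" are assumptions chosen for illustration, so adjust them to suit your own data.

```python
def is_usable_request(text: str, max_words: int = 30) -> bool:
    """Return True if the text looks like a short, single-line request.

    The max_words cutoff is an arbitrary assumption for this sketch;
    the source guidance does not define "long-form" numerically.
    """
    stripped = text.strip()
    if not stripped:
        return False
    # Reject multi-line requests outright.
    if "\n" in stripped or "\r" in stripped:
        return False
    # Treat anything longer than max_words as long-form.
    return len(stripped.split()) <= max_words


# Hypothetical sample data.
samples = [
    "Can I book a repair for Friday?",
    "Hi,\nMy bike broke down last week.",
]
usable = [s for s in samples if is_usable_request(s)]
```

Relevance to your agent's intents still has to be judged by a person or by intent-specific filters; this check only covers the length and line-break criteria.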

Collect and clean the data

Training with external data is easiest if you train a single intent at a time.

The following is a suggested workflow for collecting the data for an intent:

  1. Save all of your conversation logs to a datastore, such as a database table or spreadsheet.
  2. Filter and remove any requests that did not come from the user (for example, the replies from customer service agents).
  3. Strip out any metadata, leaving only the conversation text. For example, you should remove any dates, names, sources, or private data.
  4. Select a specific Dialogflow intent you wish to train.
  5. Use tools such as keyword search and regular expressions to filter the requests, leaving mostly those that seem to match your chosen intent. It's okay if some of the requests are not relevant; you can skip over these in training.
  6. Export the selected requests to a .txt file with one request per line. Do not include any additional fields beyond the raw request.
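Steps 2 through 6 of the workflow above could be sketched in Python as follows. This is an illustration under stated assumptions rather than a prescribed implementation: the record layout (dictionaries with "speaker", "timestamp", and "text" keys), the "customer" speaker label, and the function names are all hypothetical. The sketch drops metadata fields, but it does not redact private data that appears inside the request text itself; that still needs a separate pass.

```python
import re
from pathlib import Path


def select_requests(records, intent_pattern):
    """Keep customer requests whose text matches a regex for one intent.

    Assumes each record is a dict with hypothetical "speaker" and "text"
    keys; any other fields (dates, sources, and so on) are discarded.
    """
    pattern = re.compile(intent_pattern, re.IGNORECASE)
    selected = []
    for record in records:
        # Step 2: drop anything that did not come from the customer.
        if record.get("speaker") != "customer":
            continue
        # Step 3: keep only the text, collapsed onto a single line.
        text = " ".join(record.get("text", "").split())
        # Step 5: keep requests that seem to match the chosen intent.
        if text and pattern.search(text):
            selected.append(text)
    return selected


def export_requests(requests, path):
    """Step 6: write one raw request per line to a UTF-8 text file."""
    Path(path).write_text(
        "".join(request + "\n" for request in requests), encoding="utf-8"
    )


# Hypothetical sample data for a "repair" intent (step 4).
records = [
    {"speaker": "customer", "timestamp": "2019-03-01T10:00",
     "text": "Can you fix  a flat tire today?"},
    {"speaker": "agent", "timestamp": "2019-03-01T10:01",
     "text": "Sure, we can fix that tire."},
    {"speaker": "customer", "timestamp": "2019-03-02T09:30",
     "text": "What are your opening hours?"},
    {"speaker": "customer", "timestamp": "2019-03-02T11:15",
     "text": "I need my brakes repaired"},
]
repair_requests = select_requests(records, r"\b(fix|repair\w*)\b")
```

Calling export_requests(repair_requests, "repair_requests.txt") would then produce the file for upload. A keyword pattern like this one is deliberately loose; as noted in step 5, a few irrelevant requests are acceptable because you can skip them during training.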

Upload conversation logs

Dialogflow's Training feature allows you to upload text files containing conversation logs. This data is displayed in the Training view in the same format as data that was collected by the agent itself.

The upload facility accepts either a single text file with one log per line or a zip archive containing up to 10 text files. A single text file, or the total unpacked size of all the text files in a zip, should not exceed 3 MB.
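You may want to check a file against these limits before uploading it. The following Python sketch uses only the standard library; the function name check_upload is hypothetical, and interpreting 3 MB as 3 × 1024 × 1024 bytes is an assumption, since the exact byte count is not stated here.

```python
import zipfile
from pathlib import Path

# Assumption: "3 MB" is read as 3 * 1024 * 1024 bytes.
MAX_TOTAL_BYTES = 3 * 1024 * 1024
MAX_FILES_IN_ZIP = 10


def check_upload(path):
    """Return a list of problems found; an empty list means none found."""
    path = Path(path)
    problems = []
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            # Ignore directory entries; count only actual files.
            members = [m for m in archive.infolist() if not m.is_dir()]
        if len(members) > MAX_FILES_IN_ZIP:
            problems.append(
                f"archive contains {len(members)} files; "
                f"the limit is {MAX_FILES_IN_ZIP}"
            )
        # file_size is the uncompressed (unpacked) size of each member.
        total = sum(m.file_size for m in members)
    else:
        total = path.stat().st_size
    if total > MAX_TOTAL_BYTES:
        problems.append(
            f"total size is {total} bytes; the limit is {MAX_TOTAL_BYTES}"
        )
    return problems
```

For example, check_upload("repair_requests.txt") returns an empty list when the file is within the size limit, and a list of human-readable messages otherwise. The sketch does not verify that the contents are plain text with one log per line.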

If you have collected request logs for multiple intents, you should create one text file for each intent. This ensures that the logs are displayed separately in Dialogflow's Training view.

Training workflow

Uploaded request logs appear in the Training view in the same way that production agent logs do. This means that to train your agent, you can follow the same workflow outlined above in Train an existing agent. If each file you upload contains only requests for a specific intent, you should be able to quickly scan through the requests and assign them appropriately.

Design an agent based on existing data

If you're still in the process of designing your agent, you can use the data you've collected to inform your design. You can follow this workflow:

  1. Upload your request logs to the Training view of a newly created agent.
  2. For each request, decide whether the request is representative of a particular intent that should exist within your agent. If the request does not seem appropriate for the agent, discard it by choosing the trash icon.

    If you have not seen a request like this before, but you feel it is representative of an intent, use the Click to assign interface to create a new intent.

    If an intent already exists that fits this request, use the interface to assign the request as a training phrase for that intent.

  3. Follow this process with all of the uploaded request logs until you have created a set of intents based on the logs. You can use these intents as the basis of your new agent.