Measuring the quality of a Virtual Assistant: 3 ways to measure the Assertiveness Rate
Murilo Medeiros May 27, 2022
When a Virtual Assistant goes public, companies face multiple questions that ultimately have to do with quality. How do I measure the quality of my conversational solution?
One way to measure the quality of our Virtual Assistant training is to apply an assertiveness measurement test.
Although the term originally names a social skill, the community now uses it to describe a virtual assistant's ability to give a correct or appropriate response to a user who has phrased a question in a way the chatbot or virtual assistant was not directly trained on. There are several ways to measure assertiveness correctly, but they can be grouped into three main approaches of increasing complexity and cost.

1. Indirect assertiveness rate:

A fallback is the response given when the assistant was not trained for an input and answers with a message like "I didn't understand". This yields the easiest assertiveness indicator: divide the total number of fallbacks by the number of interactions that reached the bot over a period to get a fallback rate; its complement is the assertiveness, so we are talking about an indirect assertiveness rate. It gives a rough idea of how much incoming volume the bot has not been trained for and is answering with "I don't understand".
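The calculation above fits in a few lines. A minimal sketch, assuming each logged interaction carries a hypothetical "is_fallback" flag set when the assistant answered with its fallback message:

```python
def indirect_assertiveness(interactions: list[dict]) -> float:
    """Indirect assertiveness = 1 - (fallbacks / total interactions)."""
    total = len(interactions)
    if total == 0:
        raise ValueError("no interactions in the measured period")
    fallbacks = sum(1 for i in interactions if i["is_fallback"])
    return 1 - fallbacks / total

# Example: 1000 interactions in the period, 120 of which hit the fallback
log = [{"is_fallback": k < 120} for k in range(1000)]
print(f"indirect assertiveness: {indirect_assertiveness(log):.1%}")  # 88.0%
```

Note that this indicator overestimates quality: an input classified into the *wrong* intent is not a fallback, so it still counts as "assertive" here.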

2. Strict assertiveness rate:

At the other extreme, the most complex way to measure assertiveness requires two or more parties to agree on a representative sample of real user inputs against which the system will be measured. Each input is then manually annotated with its output, i.e., the response the system actually gave, identifying whether the sentence belongs to the bot's knowledge domain and whether the classification or response delivered was adequate. Once the annotators have evaluated the same set of data, the degree of agreement among them is measured, because some of them may have judged everything relevant and adequate at random. A simple statistical test resolves that, and the result is an annotated collection of great value for further training improvement. The work is cumbersome and time-consuming and even requires some training for the annotators, so the Strict Assertiveness Rate is recommended only in cases where the indicator is linked to some obligation that requires formal demonstration.
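The text does not name the statistical test; Cohen's kappa is one common choice for measuring agreement between two annotators beyond what chance would produce. A minimal sketch with hypothetical labels:

```python
from collections import Counter

def cohens_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Cohen's kappa: observed agreement between two annotators,
    corrected for the agreement expected by chance alone."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Two annotators judging whether each response was adequate
a = ["adequate", "adequate", "not", "adequate", "not", "adequate"]
b = ["adequate", "not", "not", "adequate", "not", "adequate"]
print(f"kappa = {cohens_kappa(a, b):.2f}")  # kappa = 0.67
```

A kappa near 1 means the annotators genuinely agree; a value near 0 means their agreement is no better than random labeling, which is exactly the failure mode the paragraph warns about.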

3. Semi-Automated Assertiveness Rate:

An intermediate approach is the Semi-Automated Assertiveness Rate, which saves time and is often the ideal formula in agile contexts where the Virtual Assistant's quality must be measured and updated while demonstrating its value. Depending on the type of conversational solution, the calculation starts by identifying all the training data and linking it with the answers to be measured. From this input, a table is generated pairing the actual user sentences with the response each "should" have received; in practice this is usually abbreviated to the intent that should have classified the sentence. Because manual effort is usually required at this step, the indicator earns the "semi" in its name. In some cases the whole flow can be automated from start to finish, but there are often conditions that make this difficult.

Next, a second, external bot "sends" the sentences to the virtual assistant. The assistant responds, and each answer is saved, giving rise to a data collection containing each actual user input, the classification that should have been delivered, and the classification that was delivered. Finally, a matrix of correct and incorrect classification frequencies is built, producing the assertiveness indicator par excellence: a familiar percentage that shows, with good detail and relatively quickly, which knowledge domains the bot does not handle and where the training fails most.

The first insight we have seen these measurements generate is the need to merge some answers together, to avoid confusing the dialog engine that runs the assistant. There are countless ways to combine these measurements; the three levels above are mainly didactic, describing increasing complexity.
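The final step of the procedure above can be sketched as follows, assuming the second bot has already collected (expected intent, delivered intent) pairs; all names here are hypothetical:

```python
from collections import defaultdict

def assertiveness_report(pairs: list[tuple[str, str]]) -> dict:
    """Build the overall and per-intent hit rates from the
    (expected_intent, delivered_intent) pairs collected by the
    external measurement bot."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for expected, delivered in pairs:
        totals[expected] += 1
        hits[expected] += expected == delivered
    per_intent = {intent: hits[intent] / totals[intent] for intent in totals}
    overall = sum(hits.values()) / len(pairs)
    return {"overall": overall, "per_intent": per_intent}

# Hypothetical measurement run
pairs = [
    ("balance", "balance"), ("balance", "balance"),
    ("payments", "balance"),   # misclassified
    ("payments", "payments"),
    ("cancel", "cancel"),
]
report = assertiveness_report(pairs)
print(f"overall: {report['overall']:.0%}")                  # overall: 80%
print(f"payments: {report['per_intent']['payments']:.0%}")  # payments: 50%
```

The per-intent breakdown is what points re-training effort at the knowledge domains where the bot fails most, rather than at the training set as a whole.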
Usually, more steps are added to the measurement as each virtual assistant's own requirements emerge. A proper assertiveness measurement safeguards the bot's quality with an indicator that directly reflects the user experience and the final evaluation of the virtual assistant. Measurement is followed by a re-training process that must be carried out carefully, to avoid diminishing the model's capacity to generalize to new cases it was not trained on.

Another interesting read: A virtual assistant said: I’m sorry, I didn’t understand correctly, I’m still learning, can you write it another way?
Must News
Technology helps us communicate at times when social distancing is mandatory
Murilo Medeiros January 26, 2021

The COVID-19 pandemic has disrupted our lives in different ways, but technology can help us stay connected in these times of isolation. Stores, offices, and call centers are closing, and millions of people are confined to their homes. Working from home is the new standard.

However, the home office does not cover 100% of the operation. Companies are facing millions of consumers who must be assisted, and now they have less capacity to perform this service. Even worse, the COVID-19 pandemic creates a series of new problems that consumers must solve: managing delayed debt payments, canceling and rescheduling trips, booking hotels and airline tickets, and attending medical appointments via telemedicine. Telephone and digital contacts between consumers and businesses continue to grow.

Consumers need to talk to companies; brands need to talk to their audiences.

How can technology help?

Today, thanks to Virtual Assistants (VAs), also called bots, conversations can be automated across several channels: WhatsApp, SMS, phone calls, Apple Business Chat, Google Home, Amazon Echo, web chat, chat apps, totems, and robots. Most commercial use cases can be automated with a Virtual Assistant, such as checking an account balance, paying a bill, buying goods, activating or cancelling a service, transferring money, and scheduling, confirming, or cancelling an appointment via telemedicine. The workload of Virtual Assistants has increased during the COVID-19 pandemic: during these days of isolation, companies report a five-fold growth in the use of Virtual Assistants via WhatsApp.

So, what should I do if my company does not have a Virtual Assistant?

  • Do it yourself: You can find several tools, platforms, and solutions on the market that could meet your needs. The main variables to consider are the size of your company, the complexity of the use cases you want to automate, your current systems and applications, and your budget. You will also need a talented and motivated team.
  • Seek help: Many companies offer their services and have the necessary tools and equipment. The results, prices, and impact on business will be very different, depending on the tools, expertise level of the team, and methodology.

Which use cases should I automate?

First, think of your goal: cutting costs, increasing revenue, or improving the user experience. Then, define your KPIs and make a comparative study of the current situation versus the future situation for each use case. You are now ready to prioritize.


Another important question: how long would it take to develop and implement a solution like this?

The time necessary depends a lot on the channel, the use-case complexity, and your IT resources. Web chat use cases can be solved in a few days, use cases with transactions via WhatsApp take more than a week, and implementing a cognitive contact center takes a few weeks. Remember that selecting the right team is essential to create the virtual assistant that best suits the needs of your business.

Therefore, I recommend that you consult the eva.bot platform.
