Date Published: November 19, 2019
When a customer calls into a contact center, they want a good experience,
and they want to receive accurate information in a timely manner so that
they don’t have to call back. Fulfilling these wants will ensure that they
remain satisfied and loyal.
When it comes to contact centers, time is often equated with money. Metrics
like Average Handle Time (AHT) and First Call Resolution (FCR) are given a
lot of emphasis. These metrics matter to both the contact center and the
customer.
The longer a customer service representative (CSR) stays on the line with a
customer, the more cost the center incurs. A longer AHT also drives down
customer satisfaction. The cost to the center is tangible: it costs more
per contact to have a live representative on the phone than it does to set
up a system in which customers self-serve and find answers to their own
questions.
The higher cost is also intangible. Customers lose faith in CSRs and the
companies they represent when they are kept on the line waiting or on hold
for answers to queries, which results in a negative perception and loss of
trust by the customer.
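To make the two metrics concrete, here is a minimal sketch of how AHT and FCR are commonly calculated from call records. The `Call` fields, the sample calls, and the hourly rate are hypothetical values chosen for illustration, not figures from any real contact center.

```python
from dataclasses import dataclass


@dataclass
class Call:
    talk_seconds: int             # time the CSR spent talking with the customer
    hold_seconds: int             # time the customer spent on hold
    wrap_seconds: int             # after-call work done by the CSR
    resolved_first_contact: bool  # True if no callback was needed


def average_handle_time(calls):
    """Mean of talk + hold + after-call work per call, in seconds."""
    total = sum(c.talk_seconds + c.hold_seconds + c.wrap_seconds for c in calls)
    return total / len(calls)


def first_call_resolution(calls):
    """Share of calls resolved on the first contact (0.0 to 1.0)."""
    return sum(c.resolved_first_contact for c in calls) / len(calls)


def cost_per_contact(aht_seconds, loaded_hourly_rate):
    """Labor cost of one live contact at a given (assumed) hourly rate."""
    return aht_seconds * loaded_hourly_rate / 3600


# Hypothetical sample data.
calls = [
    Call(300, 60, 60, True),
    Call(420, 120, 60, False),
    Call(200, 0, 40, True),
    Call(480, 180, 60, True),
]

aht = average_handle_time(calls)
fcr = first_call_resolution(calls)
print(f"AHT: {aht:.0f} seconds")
print(f"FCR: {fcr:.0%}")
print(f"Cost per contact: ${cost_per_contact(aht, 36.0):.2f}")
```

With these made-up figures, AHT comes to 495 seconds and FCR to 75 percent. Because cost per contact scales directly with AHT, every second trimmed from a call shows up in the labor cost.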
More and more, contact centers are employing voice technology to speed up
the time to resolution. Customers have been comfortable with using
interactive voice response (IVR) technology for some time now. On the other
side of the interaction, contact centers are relying on real-time and
post-call voice transcriptions to gain insight into customer behaviors and
needs, using custom algorithms driven by machine learning. Some of this
information used to be gleaned from the IVR in the form of Intent or Call
Reason, or captured after the call as Call Disposition.
Problems occur when managers try to stretch voice transcription technology
beyond its limits. Voice technologies, including machine-based
transcription and voice biometrics, have a place in the contact center, but
they cannot and should not take the place of live agents and human
touchpoints. Technology should equip agents with precise information to
best serve customers. Managers need to balance technology with human agents
to deliver that service.
Transcription Software: Limitations
Presently, contact centers use voice transcription services to gain insight
into the basic data of a call but not the actual content of the customer
call. This data, such as the identities of the agent and caller, the
caller’s contact information and the AHT, is just not enough. Organizations
should use voice transcription to drive analytics, which lend insight into
big trends that drive call volume. These analytics help agencies perform
root-cause analysis, which then drives operations into a Continuous
Improvement mode. Analytics also help agencies weed out things that
shouldn’t happen, find use cases for robotics and process automation, find
workflow efficiencies, and gain an understanding of which products or
materials the agent used to drive calls to resolution.
But, however accurate voice transcription services purport to be, managers
are lucky if transcriptions attain more than 50 percent accuracy in
capturing what was said during a call. So while using transcription for
higher-level trend identification makes a lot of sense, the reality is the
technology is not quite there. This quality issue exists because calls lack
context.
If we look at a different voice technology, voice recognition, we see how
critical this context can be for supporting analysis. When a customer
encounters an IVR menu and gives an ID number or responds to a prompt, the
system knows what kind of data to expect. It can confirm the information at
the time of interaction (“You said you want to change your address. Is that
correct?”).
Transcriptions lack that context. Add in the imperfections of
recording—background noise, a lack of clarity on the speaker’s part, or bad
cell phone reception—and automated voice transcription software cannot
provide a clear window into the data managers are trying to capture.
While agencies are waiting for transcription accuracy to improve beyond the
50 to 60 percent range, there is still a place for voice transcription
software. Combining the (imperfect) results of transcription with agents’
intelligence and recall can provide insights that give customers the best
experience possible when they call in.
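The article does not say how transcription accuracy is measured. One widely used yardstick is word error rate (WER): the number of word substitutions, insertions, and deletions needed to turn the machine transcript into a human reference transcript, divided by the length of the reference. The sketch below assumes that measure; the two sentences are invented examples.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Invented example: what the caller said vs. what the software heard.
said = "i would like to change my billing address"
heard = "i would like to chain my building dress"

wer = word_error_rate(said, heard)
print(f"Word error rate: {wer:.1%}")
print(f"Word accuracy:   {1 - wer:.1%}")
```

Three wrong words out of eight already push word accuracy down to 62.5 percent, and the three that were lost ("change," "billing," "address") are exactly the ones that carry the reason for the call.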
Transcription Software: A Good Use Case
Voice transcription currently works best when it helps CSRs work smarter.
In the recent past, CSRs relied on keywords to find the answers to customer
queries. Rather than having to parse thousands of different scripts to find
answers, CSRs could search for a keyword, or a combination of keywords, to
find the information they needed.
This approach meant that CSRs had to memorize thousands of keywords, which
is a lot to ask. Fortunately, as contact center knowledge repositories have
improved, agencies have implemented natural language searches. CSRs can
enter words just as people enter search terms on Google and refine their
results as they go.
Voice transcription can further help CSRs in their quest to find answers.
By “listening” to calls, transcription software works in tandem with an
agency’s natural language query system to suggest information that an agent
might be looking for. While experienced agents may not need suggestions,
seasonal agents hired for surges will rely on them to find the answers they
are looking for.
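As a rough illustration of the idea, the sketch below scores knowledge-base articles against a running call transcript and surfaces the best matches for the agent. It uses plain keyword overlap as a stand-in for a real natural language query system, and the article titles, article text, stopword list, and transcript are all hypothetical.

```python
import re
from collections import Counter

# Small, illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "on", "or", "for",
             "i", "my", "hi", "is", "in", "it"}


def tokenize(text):
    """Lowercase the text and keep only meaningful words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def suggest_articles(transcript, articles, top_n=2):
    """Rank articles by how many spoken words appear in their text."""
    spoken = Counter(tokenize(transcript))
    scored = []
    for title, body in articles.items():
        vocabulary = set(tokenize(title + " " + body))
        score = sum(n for word, n in spoken.items() if word in vocabulary)
        if score > 0:
            scored.append((score, title))
    # Highest score first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_n]]


# Hypothetical knowledge base.
articles = {
    "Change of address": "Update the mailing or billing address on a customer account.",
    "Password reset": "Reset a forgotten password and unlock the online account.",
    "Refund policy": "Eligibility and timelines for issuing a refund.",
}

# Imperfect transcript: the software heard "chain" instead of "change".
transcript = "hi i moved last month and need to chain my billing address on the account"

print(suggest_articles(transcript, articles))
```

Even with "change" misheard as "chain," the remaining words are enough to put the address article at the top of the list. The agent, who heard the caller correctly, decides whether the suggestion is the right one.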
The more machine-based voice transcription systems are used, the more
intelligent they become. By coupling them with natural language search
engines, agencies effectively create a natural language cognitive search
engine.
This will be a powerful tool for contact center agents to use when
assisting callers with queries.
Voice Biometrics: Using Voiceprints
While machine-based transcription programs are still in their infancy and
natural language cognitive searching is still a few years away, the use of
voice biometrics in a contact center environment is growing. Using a
customer’s voice to confirm that caller’s identity is much faster than
asking and answering three or four different verification questions (last
name, phone number, the name of your first pet, etc.).
A simple voiceprint, captured as a WAV file when a customer calls for the
first time, is easy to match and nearly impossible for another person to
replicate. Once captured, voiceprints are typically added to a library of
known customers, which helps eliminate the need to authenticate identity
via other means. CSRs can quickly move into the reason for the call, saving
a caller’s time and an agency’s capital. The future is moving away from
knowledge-based authentication toward a hybrid approach of voice biometrics
and real-time technology often used in fraud detection and forensic
analysis.
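A simplified sketch of the matching step might look like the following. It assumes each recording has already been converted into a numeric feature vector by a speaker model, a step not shown here. The vectors, customer IDs, and the 0.85 threshold are invented for illustration, and production systems use far richer features.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two feature vectors; 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_caller(claimed_id, sample, library, threshold=0.85):
    """Return True if the live sample matches the enrolled voiceprint."""
    enrolled = library.get(claimed_id)
    if enrolled is None:
        return False  # no voiceprint on file for this customer
    return cosine_similarity(sample, enrolled) >= threshold


# Hypothetical library of enrolled voiceprints.
library = {
    "cust-1001": [0.90, 0.10, 0.30],
    "cust-1002": [0.10, 0.80, 0.50],
}

# Hypothetical feature vector from the live call.
live_sample = [0.88, 0.12, 0.33]

for customer_id in ("cust-1001", "cust-1002", "cust-9999"):
    matched = verify_caller(customer_id, live_sample, library)
    print(customer_id, "verified" if matched else "fall back to security questions")
```

When the match fails or no voiceprint is on file, the sketch falls back to knowledge-based questions, which is one way to read the hybrid approach described above.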
As the technology catches up to the applications, voice biometrics, a
natural language search engine, and voice transcription software will work
alongside CSRs to create a swift and powerful response to whatever query
a customer might have. Biometrics will verify a customer’s identity so that
agents can move to the service part of the call. Once CSRs understand the
problem, a natural language search engine, made more intelligent by input
from voice transcription, will speed time to resolution. Both caller and
organization will be happy with the outcome.