AUTOMATE EDISCOVERY
AND GAIN INSIGHTS WITH
TRANSCRIPTION
TECHNOLOGY
NICE Compliance
14. April 2020
Housekeeping
Legacy Challenges with Communication Compliance
Understanding The Basics of Transcription
Advanced Technology Best Practices
The NICE Transcription Solution and Demo
Question and Answer
Today’s speakers
For more information on today’s topic please also visit: www.nice.com/compliance
Webinar Series: Transforming Communication Compliance
Anton Kaplan
Product Manager
NICE Communications Compliance Line of Business
Anton.Kaplan@nice.com
Marc-Antoine Denechaud
Product Manager
NICE Actimize Communications Surveillance
Marc-Antoine.Denechaud@niceactimize.com
Legacy Challenges
with Communication
Compliance
Expanding Regulations Drive Surveillance Demand
FX Code of Conduct
Global Guidance
Protect Confidential Information
Complete, Accurate Data
Measure and Monitor
(updated in) 2016
Dodd-Frank
U.S. Legislation for Swaps
2013
Record Keeping (Recording)
Trade Reconstruction
MAR
European Directive
MiFID II
European Directive
2018
Refers to
Surveillance in Place to Detect Market Abuse
Record Keeping (Recording)
Trade Reconstruction
2016
Detecting Market Abuse
Detect and Prove Intent Even if Unsuccessful
APAC
China to create its version of MiFID II
Hong Kong expects Authorized Institutions to adopt a holistic framework
2018
Regulation Best Interest
A new standard of conduct for US broker-dealers when recommending securities to retail customers
July 2020
SM&CR
A new standard of conduct for UK financial institutions' employee governance guidelines, including senior managers and others
Dec 2019
What Firms Need with Communication Compliance
PROVISION
Automatically manage adds, moves
and changes
HOLD
Protect interactions from
being deleted / changed
during investigation
CAPTURE & ARCHIVE
Capture all interactions under regulatory
scope and enable follow up searches and
future retention changes
APPLY POLICY
Apply regulatory policy retention rules and
mark operation for future auditing
PROVE
Understand how many interactions are
expected, confirm the number
recorded, ratify which regulated user
created the content, and ensure quality,
retention and successful storage
SHARE
Export interactions in a
standard format during internal and
external investigations
Common Communication Compliance Challenges
Missing calls and not knowing it
Transcripts of calls are time consuming and costly to retrieve
Lack of the centralized administration, reporting and auditability required by regulators
No automation to reduce manual activities and errors
Reduce compliance recording TCO
Bank is requiring applications to move to the cloud
Creating large reproductions of audio/data files within the limited timeframe provided by regulators
Manual checks to confirm the recording system is working
Understanding The
Basics of
Transcription
What is Voice Transcription?
Voice Recordings
Voice recordings are extracted from the recording system
Voice Transcription
The voice recordings are transcribed to text using Natural Language Processing (NLP)
Text Output
The output text files contain a full transcription of the voice call
Why is Voice Transcription Needed?
Legal Investigations
As the size and scope of regulatory requests increase, the time regulators give to produce information has decreased, requiring financial institutions to spend more to comply
Efficiency Challenges
Include slow manual processes to locate the data, expensive transcription services to convert the data, and manual third-party review to understand the data, all contributing to delays and the risk of not understanding the context of the data being produced
Reducing Manual Tasks
By automating the eDiscovery task, so that data is produced faster and at lower cost
Business Intelligence
By leveraging high-quality transcription to find and apply best practices across the enterprise
Where Can Transcription Be Applied?
Automate manual transcription
Transcribe the calls for internal review
Search and navigate transcribed calls for keywords
Reconstruct the trade conversation timeline
Package transcriptions and audio files for external parties
Identify and separate speakers, internal and external
Transcribe multilingual conversations
The Benefits of Voice Transcription
Decreases Investigation Time
Fast response to legal investigation and regulatory inquiries by providing transcripts of the audio calls as part of the case
Improves Efficiency
Search words and phrases within the transcribed voice calls as part of an eDiscovery process
Lower Costs
Eliminate third-party costs: transcription services and outside counsel review
Key Technology
Aspects of
Transcription
Accurate Automated Speech Recognition (ASR)
What is it?
Automated Speech Recognition (ASR) consists of linguistic models that convert spoken language into text
Why use ASR?
Provides a single query interface across all communication channels
Makes it easy to use synonym dictionaries to increase recall of relevant recordings
In combination with contextual queries, improves result accuracy
Transcription Service approach
Uses a new kind of text algorithm to search word alternates from ASR results
Transcription Workflow
Dictionary and natural language models are created and tuned for each customer
Audio converted into text at a rate of 5-15x real time
N-Level Text is created and output
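To put the 5-15x real-time rate in the workflow above into concrete terms, the short sketch below converts a backlog of audio into processing time. The function name and the 100-hour backlog are illustrative assumptions, and the calculation assumes a single transcription process with no queuing or parallelism.

```python
def processing_hours(audio_hours: float, speed_factor: float) -> float:
    """Wall-clock hours needed to transcribe audio at a multiple of real time."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return audio_hours / speed_factor


audio_hours = 100.0  # hypothetical backlog, not a figure from the slides
slowest = processing_hours(audio_hours, 5)   # lower bound of the quoted rate
fastest = processing_hours(audio_hours, 15)  # upper bound of the quoted rate
print(f"{audio_hours:.0f} h of audio: {fastest:.1f} to {slowest:.1f} h of processing")
```

Under these assumptions, 100 hours of audio would take roughly 6.7 to 20 hours to transcribe.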
Transcription Principle
[Diagram labels: Audio; Dictionary; Processing (Word Level); Natural Language Model; Apply Domain Model; Word Probability; Smart Index; N-Level Transcript Output. Example phrase: “no one will catch me”]
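One way to picture searching word alternates in an N-level output is sketched below. It is only an illustration under stated assumptions: the slot structure, the probabilities and the `find_phrase` helper are hypothetical and do not describe the actual engine. Each spoken position keeps several candidate words with a probability, and a phrase matches when every word appears among the candidates of consecutive positions.

```python
from math import prod

# One slot per spoken word; each slot maps candidate words to a probability.
Slot = dict[str, float]


def find_phrase(slots: list[Slot], phrase: str, min_score: float = 0.0) -> list[tuple[int, float]]:
    """Return (start_index, score) for each place the phrase matches the alternates."""
    words = phrase.lower().split()
    if not words:
        return []
    hits = []
    for start in range(len(slots) - len(words) + 1):
        window = slots[start:start + len(words)]
        if all(word in slot for word, slot in zip(words, window)):
            # Score the match as the product of the per-word probabilities.
            score = prod(slot[word] for word, slot in zip(words, window))
            if score >= min_score:
                hits.append((start, score))
    return hits


# Hypothetical N-level output for the phrase shown on the slide.
slots = [
    {"no": 0.9, "know": 0.1},
    {"one": 0.8, "won": 0.2},
    {"will": 0.95, "well": 0.05},
    {"catch": 0.7, "cash": 0.3},
    {"me": 1.0},
]
hits = find_phrase(slots, "no one will catch me")
alternate_hits = find_phrase(slots, "cash me")
```

A search over the single best transcript would read "catch me" and miss "cash me"; keeping alternates lets the lower-probability reading still be found, at the cost of more candidate matches to score.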
Language Identification
Reduces false positives by applying analytics to the relevant languages
Language ID supports multiple languages simultaneously, simplifying the configuration for speech conversion
[Diagram labels: Audio; Dictionary; Processing; Natural Language Model; Structure (Queries, Taxonomy, Ad-hoc); Smart Index; Text (Word Spotting, Fixed Transcript); Output]
Dari
Korean
Russian
German
Hebrew
Bahasa
Italian
Japanese
Thai
Dutch
Danish
Polish
Tagalog
Spanish (Castilian, Latin American)
French (Canadian, European)
English (International, Australian, N.A., UK)
Chinese (Cantonese, Mandarin)
Telugu
Hindi
Turkish
Brazilian Portuguese
Machine Learning Speech Analytics Process
1. Detect the Language: Language Identification Models learn how languages sound and can predict the most likely language used
2. Convert Audio to Phonemes: Phonetic Models learn the way sounds are pronounced for a given language and break audio into phonemes
3. Convert Phonemes to Words: Language Packs map the sounds to words
4. Create the Transcription: Phrase Packs predict the most probable words and phrases
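The four steps can be read as a chain in which each stage feeds the next. The sketch below shows only that structure: the `Pipeline` class and the toy stages are hypothetical placeholders, and real language identification, phonetic models, language packs and phrase packs are trained statistical models rather than fixed lambdas.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Pipeline:
    detect_language: Callable[[bytes], str]
    to_phonemes: Callable[[bytes, str], Sequence[str]]
    to_words: Callable[[Sequence[str], str], Sequence[str]]
    pick_phrases: Callable[[Sequence[str], str], str]

    def transcribe(self, audio: bytes) -> tuple[str, str]:
        language = self.detect_language(audio)               # step 1
        phonemes = self.to_phonemes(audio, language)         # step 2
        words = self.to_words(phonemes, language)            # step 3
        return language, self.pick_phrases(words, language)  # step 4


# Toy stages that return fixed values, only to show how data flows.
toy = Pipeline(
    detect_language=lambda audio: "en",
    to_phonemes=lambda audio, lang: ["B", "AY", "N", "AW"],
    to_words=lambda phonemes, lang: ["buy", "now"],
    pick_phrases=lambda words, lang: " ".join(words),
)
language, text = toy.transcribe(b"\x00\x01")
```

Passing the detected language to every later stage mirrors the point that phonetic models, language packs and phrase packs are all language-specific.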
Improved Accuracy with NICE Nexidia Speech Analytics Engine
Transcription Accuracy Explained
Accuracy depends primarily on the number of phonemes in the search term or phrase
[Chart: Probability of Detection (0 to 1) by Number of Phonemes (3 to 20)]
Example: North American English, search phrase with 15 phonemes
85% of the true occurrences will be found
Factors in the trading environment that influence accuracy:
Quality of the audio
Volume
Cross talk
Mic Quality
Accent
Speed of talk
Background noise
Mono vs. Stereo Audio
Transcription Accuracy Improvement Process
To increase transcription accuracy, with the goal of meeting a requirement of 70% or greater, we use a Phrase Pack Training process based on machine learning
Learns common phrases to improve the output of the speech analytics
Uses related written content to learn the domain-specific common phrases used by the recorded population and improve transcription results
Simplified iterative process that optimizes the transcription output using sample audio and text
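The slides do not say how the 70% accuracy goal is measured. A common convention, assumed here purely for illustration, is word accuracy defined as one minus the word error rate (WER), where WER is the word-level edit distance between a manual reference transcript and the automatic output, divided by the reference length. The sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


reference = "we need to make sure we are securing our positions"
hypothesis = "we need to make sure we are securing hour position"
accuracy = 1.0 - word_error_rate(reference, hypothesis)
meets_target = accuracy >= 0.70  # the 70% goal quoted on the slide
```

Here two of the ten reference words are wrong, so the sketch reports 80% word accuracy, which would meet the stated goal under this assumed metric.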
[Diagram labels: Audio, Chat, Email, etc.; Model Generation and Update; Customer-specific Neural-Acoustic Model; Model Applied to Audio; Neural Phonetic Speech Analytics Engine and Search Grid; Modeling Tools; 100%; Quantitative Results; Narrative Analysis; Text; Reliable and accurate transcriptions]
NICE Solution
Deep Dive
NICE Compliance Solution Portfolio
Communication
Recording
Compliance
Assurance
Market
Surveillance
Communication
Surveillance
Sales Practices &
Suitability
Self-Development Automation
Integration
Hub
Security
The Industry's Most Unified and Intelligent Compliance Platform
COMMUNICATION COMPLIANCE
HOLISTIC SURVEILLANCE
Transcription
Services
NLP
CASE MANAGEMENT
MACHINE LEARNING
COMPASS Transcription Solutions
COMPASS Export Transcription
Option with the COMPASS Bulk Export module to transcribe the calls as part of the export process
COMPASS Transcription SDK
Transcription service that can be integrated via API with a customer's or partner's applications
COMPASS Discovery
Separate COMPASS module for transcription of all audio files, with the ability to search on the file content
NICE COMPASS Export Transcription
Key Features
Bulk download provides a secure, auditable
solution
1.7 million calls can be downloaded per day on a
single extraction server
Downloads in the background to allow continued
working
Provides a transcription file for each call
processed
Listen to audio that matches queries
Key Benefits
Provide all trade conversations quickly and efficiently
Simplify intensive compliance investigation
processes
Leverage existing infrastructure and business
systems
Multi-lingual support built for financial services
Select Users and/or Groups, set Start and End Date and trigger operation
Current Downloads Progress
Transcription files will be stored in the same package as the audio files, providing a complete list of each WAV file, and metadata
NICE COMPASS SDK
Key Features
Standard API to receive audio file input,
transcribe and output to predetermined
destination
Use input from any recorder (recorder agnostic)
Integration into limitless number of applications
Available with development toolkit
Key Benefits
Provide ALL trade conversations quickly and
efficiently
Simplify intensive compliance investigation
processes
Leverage existing infrastructure and business
systems
Multi-lingual support built for Financial Services
Users can use the sample
transcription engine to
transcribe audio files from any
system
NICE COMPASS Discovery
Search bar to
write queries
Words that match
the query are
highlighted in
every interaction's
transcription
Words are
highlighted in the
audio player
Key Features
Ability to search words and phrases
Ability to narrow down with facets
Provides a transcription for each call
processed
Ability to Download
Download the
audio file
Facets selection
to narrow down
the dataset
Key Benefits
Easily search words and phrases
Automatically mark the section of
the call where the word is spoken
Find the correct calls as part of an
eDiscovery process
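Marking the section of a call where a word is spoken requires a transcript in which each word carries its start and end time. The sketch below assumes such a time-aligned transcript; the `TimedWord` type, the `locate_phrase` helper and the sample timings are hypothetical and are not the COMPASS Discovery implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedWord:
    text: str
    start: float  # seconds from the beginning of the call
    end: float


def locate_phrase(words: list[TimedWord], phrase: str) -> list[tuple[float, float]]:
    """Return the (start, end) time of every occurrence of the phrase."""
    target = phrase.lower().split()
    if not target:
        return []
    spans = []
    for i in range(len(words) - len(target) + 1):
        window = words[i:i + len(target)]
        if [w.text.lower() for w in window] == target:
            spans.append((window[0].start, window[-1].end))
    return spans


# Hypothetical time-aligned transcript of a short call segment.
call = [
    TimedWord("no", 12.0, 12.2),
    TimedWord("one", 12.2, 12.5),
    TimedWord("will", 12.5, 12.7),
    TimedWord("catch", 12.7, 13.1),
    TimedWord("me", 13.1, 13.3),
]
spans = locate_phrase(call, "catch me")
```

The returned time ranges are what an audio player would need in order to highlight or jump to the matching section of the recording.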
COMPASS Transcription Flow
COMPASS
NTR
Customer
Archive
Audio Processed
Discovery Index
Discovery Index
SDK
API
Extract
Location
Extraction
Service
Transcription
Service
EXPORT
SDK
DISCOVERY
How to deploy COMPASS Transcription
Cloud Offering
Fully managed service using secure AWS services
NICE COMPASS Bulk download extracts the audio and delivers it to cloud transcription services
Firms log into the NICE COMPASS Discovery Application in the cloud and can search transcription results
On Premise
Deployed seamlessly onsite within your current NICE COMPASS framework
The audio is automatically processed via an onsite transcription service as part of the Data Extraction Job
COMPASS Bulk download extracts the audio and includes the transcription in the export package
Firms log into the NICE COMPASS Discovery Application and can search transcription results
Language Tuning
and Improvements
FIRST:
LANGUAGE
MODEL
TRAINING
Process of Language Model Improvements
Call Download, Secure Handover, Transcription, Model Training, Model Tuning, Deployment
Call Download / Selection
100 hours of handset audio in the relevant
language
Statistical dialect representation of the trading
floor, from a minimum of 50 traders
Best practice: select audio data from a
2-week window
Files to be downloaded in a WAV format
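The selection criteria above can be checked mechanically before handover. The sketch below is a minimal illustration: the `Recording` fields and the `check_sample` helper are hypothetical, and it interprets the 2-week window as all call dates falling within 14 consecutive days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Recording:
    trader_id: str
    call_date: date
    duration_seconds: float
    filename: str


def check_sample(recordings: list[Recording]) -> list[str]:
    """Return a list of problems; an empty list means the sample meets the criteria."""
    problems = []
    total_hours = sum(r.duration_seconds for r in recordings) / 3600
    if total_hours < 100:
        problems.append(f"only {total_hours:.1f} h of audio, 100 h expected")
    traders = {r.trader_id for r in recordings}
    if len(traders) < 50:
        problems.append(f"only {len(traders)} traders, minimum is 50")
    if recordings:
        dates = [r.call_date for r in recordings]
        # Assumption: a 2-week window means at most 14 consecutive days.
        if max(dates) - min(dates) >= timedelta(days=14):
            problems.append("calls span more than a 2-week window")
    non_wav = [r.filename for r in recordings if not r.filename.lower().endswith(".wav")]
    if non_wav:
        problems.append(f"{len(non_wav)} file(s) are not in WAV format")
    return problems


# Hypothetical sample: 50 traders, 2 hours each, spread over 10 days.
sample = [
    Recording(f"T{i}", date(2020, 3, 2) + timedelta(days=i % 10), 7200, f"call{i}.wav")
    for i in range(50)
]
problems = check_sample(sample)
```

The check does not cover dialect representation, which depends on knowing each trader's dialect and is left out of this sketch.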
Secure Handover
NICE and the customer sign an NDA and Data
Protection Agreement
Customer hands over data to NICE through an
agreed secure channel
Data stored on secured NICE servers (only
registered persons have access)
Data securely deleted from NICE servers after
usage (under the customer's supervision)
Transcription (Manual)
All calls that are handed over need to be
manually transcribed and tagged with events
(see transcription method documentation)
Transcription of these calls happens under
strict supervision of NICE by transcription
provider (option to replace by customer
certified provider)
Transcription is checked to achieve 99%
quality.
Model Training
After transcription is finished, the NICE
Language Model team takes the audio data and
transcription data (leaving 10% for control
testing)
The data is fed into a learning machine that
scans the two sources and creates an acoustic
model (if needed) and a language model
(including a dictionary and NNLP packs)
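Holding back 10% of the transcribed calls for control testing amounts to a train/control split. The sketch below shows one simple way to do it; the `split_for_control` helper, the fixed seed and the call identifiers are assumptions for illustration, not the procedure the NICE team uses.

```python
import random


def split_for_control(call_ids: list[str], control_fraction: float = 0.10,
                      seed: int = 0) -> tuple[list[str], list[str]]:
    """Shuffle reproducibly and hold back a fraction of calls for control testing."""
    ids = sorted(call_ids)              # sort first so the result depends only on the seed
    random.Random(seed).shuffle(ids)
    n_control = max(1, round(len(ids) * control_fraction))
    return ids[n_control:], ids[:n_control]


call_ids = [f"call{i}" for i in range(100)]  # hypothetical identifiers
training, control = split_for_control(call_ids)
```

Keeping the control calls out of training matters because testing on calls the model has already seen would overstate its accuracy.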
Model Tuning
Based on the results of the testing the model
can be further tuned, if the expected results
have not been reached
Acoustic and language models can be tuned
manually or additional data sets can be used
Deployment
When the required quality has been reached
the models are ready for deployment
SECOND:
PHRASE PACK
TRAINING
Phrase Pack Training Overview
Twinkle Twinkle Little Star
Analyze manually transcribed training
data to improve probability scores and
identify new words
Process Used for Phrase Pack Improvement
1. Manual
Transcription
2. Update Default
Phrase Pack
using Text
3. Add New Word
Pronunciation
4. Deploy and Test Model
and Analyze Calls
EXAMPLE:
TRAINING A
NEW WORD
New word: Coronavirus
Speaker #1:
Good morning John, this coronavirus crisis is impressive. We need to make sure we are securing
our positions. Global markets are plunging after the implosion of an alliance between OPEC and
Russia, it caused the worst one-day crash in crude prices in nearly 30 years!
Speaker #2:
Absolutely! Also this coronavirus pandemic is brutal upon the retail industries. However, I have
noticed that the technology companies are doing great short term with the remote working
situation. We also need to bet on airlines next; with all the governments' investments to protect
that industry it will come back up.
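Identifying new words from manually transcribed text can be pictured as comparing the transcript against the existing vocabulary. The sketch below uses two sentences from the example conversation; the `new_words` helper and the tiny vocabulary set are hypothetical stand-ins for a real dictionary, which would be far larger.

```python
import re
from collections import Counter


def new_words(transcript: str, vocabulary: set[str]) -> Counter:
    """Count words in the transcript that the current vocabulary does not contain."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    return Counter(token for token in tokens if token not in vocabulary)


transcript = (
    "Good morning John, this coronavirus crisis is impressive. "
    "Also this coronavirus pandemic is brutal upon the retail industries."
)
# Hypothetical existing vocabulary; it knows every word here except the new one.
vocabulary = {
    "good", "morning", "john", "this", "crisis", "is", "impressive",
    "also", "pandemic", "brutal", "upon", "the", "retail", "industries",
}
candidates = new_words(transcript, vocabulary)
```

Words that surface this way, here "coronavirus" with two occurrences, are the candidates that would then need a pronunciation added before the model is redeployed and tested.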
Phrase Pack Training demo script
THIRD:
ADJUST
SETTINGS
Voice Surveillance Objective
Optimize Relevant Review Items
Find relevant
interactions
False positive alerts
increase the workload
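The trade-off between finding relevant interactions and limiting false positive alerts can be illustrated with a confidence threshold. The sketch below is an assumption about what adjusting settings could look like in practice: the alert scores, the relevance labels and the `review_stats` helper are all hypothetical.

```python
def review_stats(alerts: list[tuple[float, bool]], threshold: float) -> dict[str, float]:
    """Summarise the review workload when only alerts at or above the threshold are raised."""
    flagged = [relevant for score, relevant in alerts if score >= threshold]
    true_positives = sum(flagged)
    total_relevant = sum(relevant for _, relevant in alerts)
    return {
        "flagged": len(flagged),
        "false_positives": len(flagged) - true_positives,
        "precision": true_positives / len(flagged) if flagged else 0.0,
        "recall": true_positives / total_relevant if total_relevant else 0.0,
    }


# Hypothetical alerts: (confidence score, whether a reviewer judged it relevant).
alerts = [(0.95, True), (0.90, True), (0.80, False),
          (0.70, True), (0.60, False), (0.40, False)]
low = review_stats(alerts, 0.5)
high = review_stats(alerts, 0.85)
```

With this sample, the lower threshold finds all three relevant interactions but adds two false positives to the review queue, while the higher threshold removes the false positives and misses one relevant interaction.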
Why NICE for Transcription
NICE Transcription Capabilities for Financial Services
Highest Transcription Accuracy
Transcription accuracy of 70%+, highest in the industry; leverages machine learning and NLP advanced analytics
Flexible Transcription Options
Multiple options available to ensure limited change to business processes
Automated Processes
Reduces the effort needed to transcribe conversations as part of the investigation process
Lower Total Costs
Eliminate third-party costs: transcription services and outside counsel review
Reduced Risk
Quickly export and transcribe communications, ensuring regulatory deadlines or business requirements are met
Transforming Communication Compliance Webinar Series
March 19: Best Practices in Unified Communications Recording
April 15: Unlocking the Potential of Cloud Compliance Recording
May 14: Automate eDiscovery and Gain Insights with Transcription Technology
June 11: Maximize Uptime and Lower TCO with Managed Services