Automate eDiscovery and Gain Insights with Transcription

AUTOMATE EDISCOVERY AND GAIN INSIGHTS WITH TRANSCRIPTION TECHNOLOGY

NICE Compliance

14 April 2020

• Housekeeping

• Legacy Challenges with Communication Compliance

• Understanding The Basics of Transcription

• Advanced Technology Best Practices

• The NICE Transcription Solution and Demo

• Question and Answer

Today’s speakers

For more information on today’s topic please also visit: www.nice.com/compliance

Webinar Series: Transforming Communication Compliance

Anton Kaplan

Product Manager

NICE Communications Compliance Line of Business

Anton.Kaplan@nice.com

Marc-Antoine Denechaud

Product Manager

NICE Actimize Communications Surveillance

Marc-Antoine.Denechaud@niceactimize.com

Legacy Challenges

with Communication

Compliance

Expanding Regulations Drive Surveillance Demand

FX Code of

Conduct

Global Guidance

• Protect Confidential

Information

• Complete, Accurate

Data

• Measure and Monitor

(updated in) 2016

Dodd-Frank

U.S. Legislation

for Swaps

2013

• Record Keeping

(Recording)

• Trade Reconstruction

MAR

European Directive

MiFID II

European Directive

2018

Refers

• Surveillance in Place

to Detect Market Abuse

• Record Keeping

(Recording)

• Trade Reconstruction

2016

• Detecting Market Abuse

• Detect and Prove Intent

Even if Unsuccessful

APAC

• China to create its version of MiFID II

• Hong Kong expects Authorized Institutions to adopt a holistic framework

2018

Regulation

Best

Interest

a new standard of conduct for

US broker-dealers when

recommending securities to

retail customers

July 2020

SM&CR

a new standard of conduct and employee governance guidelines for UK financial institutions, including senior managers and others

Dec 2019

What Firms Need with Communication Compliance

Capture &

Archive

Prove

Hold

Provision

Apply

Policy

PROVISION

Automatically manage adds, moves

and changes

HOLD

Protect interactions from

being deleted / changed

during investigation

CAPTURE & ARCHIVE

Capture all interactions under regulatory

scope and enable follow up searches and

future retention changes

APPLY POLICY

Apply regulatory policy retention rules and

mark operation for future auditing

PROVE

Understand how many interactions are

expected, confirm the number

recorded, ratify which regulated user

created the content, ensure quality,

retention & stored successfully

Export interactions in a

standard format

During internal and

external investigations

Missing calls and not knowing it

Transcripts of calls are time consuming and

costly to retrieve

Lack of centralized administration, and

reporting and auditability required by regulators

No automation to reduce manual activities and

errors

Reduce compliance recording TCO

Bank is requiring applications move to the

cloud

Creating large reproductions of audio/data files

within limited timeframe provided by regulators

Manual checks to confirm recording system is

working

Common Communication Compliance Challenges

Understanding The

Basics of

Transcription

What is Voice Transcription?

Voice Recordings: Voice recordings are extracted from the recording system

Voice Transcription: The voice recordings are transcribed to text using Natural Language Processing (NLP)

Text Output: The output text files contain a full transcription of the voice call

Why is Voice Transcription Needed?

Legal Investigations: As the size and scope of regulatory requests increase, the time regulators give to produce information has decreased, requiring financial institutions to spend more to comply

Efficiency Challenges: Include a slow manual process to locate the data, expensive transcription services to convert the data, and manual third-party review to understand the data, all contributing to delays and the risk of not understanding the context of the data being produced

Reducing Manual Tasks: By automating the eDiscovery task so that data is produced faster and at lower cost

Business Intelligence: By leveraging the high quality transcription to find and apply best practices across the enterprise

Where Can Transcription Be Applied?

Automate manual transcription

Transcribe the calls for internal review

Search and navigate transcribed calls for keywords

Reconstruct the trade conversation timeline

Package transcriptions and audio files for external parties

Identify and separate speakers, internal and external

Transcribe multilingual conversations
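As an illustration of reconstructing a trade conversation timeline, the sketch below merges transcript segments from several calls into one chronological list. The `Segment` fields and the `build_timeline` helper are hypothetical names chosen for this example and are not part of any NICE product; it assumes each segment carries the start time of its call and an offset within that call.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class Segment:
    """One transcribed utterance (hypothetical structure for illustration)."""
    call_id: str
    call_start: datetime      # when the recording began
    offset_seconds: float     # position of the utterance inside the call
    speaker: str
    text: str

    @property
    def spoken_at(self) -> datetime:
        # Absolute time at which the utterance was spoken.
        return self.call_start + timedelta(seconds=self.offset_seconds)


def build_timeline(segments: List[Segment]) -> List[Segment]:
    """Order utterances from all calls by the absolute time they were spoken."""
    return sorted(segments, key=lambda s: s.spoken_at)


if __name__ == "__main__":
    start_a = datetime(2020, 3, 9, 9, 0, 0)
    start_b = datetime(2020, 3, 9, 9, 0, 30)
    demo = [
        Segment("call-b", start_b, 5.0, "Trader 2", "Confirming the order now."),
        Segment("call-a", start_a, 50.0, "Trader 1", "Done, thanks."),
        Segment("call-a", start_a, 2.0, "Trader 1", "Can you quote me a price?"),
    ]
    for seg in build_timeline(demo):
        print(seg.spoken_at.isoformat(), seg.call_id, seg.speaker, seg.text)
```

Sorting on an absolute timestamp rather than on per-call offsets is what lets overlapping calls interleave correctly.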

The Benefits of Voice Transcription

Decreases Investigation Time

• Fast response to legal investigation and regulatory inquiries by providing transcripts of the audio calls as part of the case

Improves Efficiency

• Search words and phrases within the transcribed voice calls as part of an eDiscovery process

Lower Costs

• Eliminate third party costs - transcription services and outside counsel review

Key Technology

Aspects of

Transcription

Accurate Automated Speech Recognition (ASR)

• Automated Speech Recognition (ASR) consists of linguistic models that enable converting

spoken language into text

• Provides a single query interface across all communication channels

• Allows for easily using synonym dictionaries to increase recall of relevant recordings

• In combination with contextual queries improves result accuracy
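The point about synonym dictionaries increasing recall can be sketched as a simple query expansion over transcribed text. This is a minimal illustration under stated assumptions: the `SYNONYMS` table, the sample transcripts, and the function names are hypothetical and do not describe the NICE query interface.

```python
import re
from typing import Dict, List, Set

# Hypothetical synonym dictionary; the entries are illustrative only.
SYNONYMS: Dict[str, Set[str]] = {
    "guarantee": {"promise", "assure"},
}


def expand_query(term: str, synonyms: Dict[str, Set[str]]) -> Set[str]:
    """Return the search term together with any synonyms listed for it."""
    term = term.lower()
    return {term} | synonyms.get(term, set())


def search(transcripts: Dict[str, str], term: str,
           synonyms: Dict[str, Set[str]]) -> List[str]:
    """Return IDs of transcripts that contain the term or one of its synonyms."""
    wanted = expand_query(term, synonyms)
    hits = []
    for call_id, text in transcripts.items():
        words = set(re.findall(r"[a-z']+", text.lower()))
        if words & wanted:
            hits.append(call_id)
    return sorted(hits)


if __name__ == "__main__":
    calls = {
        "call-1": "I guarantee the price will hold",
        "call-2": "I promise you this will move",
        "call-3": "Let us wait for the open",
    }
    print(search(calls, "guarantee", {}))        # without synonyms
    print(search(calls, "guarantee", SYNONYMS))  # with synonyms
```

With the synonym table applied, the second search also returns the call that says "promise", which is the recall gain the bullet describes.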

What is it?

Why use ASR?

• Transcription Service approach

• Automated Speech Recognition (ASR) consists of linguistic models that enable converting spoken language into

text

• Uses a new kind of text algorithm to search word alternates from ASR results.

• Transcription Workflow

• Dictionary and natural language models are created and tuned for each customer

• Audio converted into text at a rate of 5-15x real time

• N-Level Text is created and output
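To make the 5-15x real-time figure concrete, the sketch below estimates how long a batch of audio would take to transcribe. It assumes that "5-15x real time" means audio is processed 5 to 15 times faster than its playback duration on a single engine; the function name is hypothetical.

```python
def processing_hours(audio_hours: float, speed_factor: float) -> float:
    """Estimate transcription time for a given amount of audio.

    speed_factor is how many times faster than real time the engine runs
    (assumed interpretation of the "5-15x real time" figure).
    """
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return audio_hours / speed_factor


if __name__ == "__main__":
    audio = 100.0  # hours of recorded calls
    slowest = processing_hours(audio, 5)
    fastest = processing_hours(audio, 15)
    print(f"{audio:.0f} h of audio: between {fastest:.1f} h and {slowest:.1f} h")
```

Under this reading, 100 hours of audio would take roughly 7 to 20 hours of processing.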

Transcription Principle

[Diagram labels: AUDIO; DICTIONARY; PROCESSING (Word Level); NATURAL LANGUAGE MODEL (APPLY DOMAIN MODEL); TRANSCRIPT OUTPUT (N-LEVEL WORD PROBABILITY); SMART INDEX. Example phrase: “no one will catch me”]
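The idea of an N-level output with word probabilities can be sketched as follows: each spoken position keeps several candidate words with a probability, and a phrase search looks across all alternates instead of only the top choice. The data layout, the scoring by multiplying probabilities, and the names below are assumptions made for illustration, not the actual NICE format.

```python
from typing import List, Tuple

# One slot per spoken word position: a list of (candidate word, probability).
Slot = List[Tuple[str, float]]


def find_phrase(lattice: List[Slot], phrase: str,
                min_score: float = 0.1) -> List[Tuple[int, float]]:
    """Return (start position, score) for each place the phrase may occur.

    The score is the product of the probabilities of the matching alternates.
    """
    words = phrase.lower().split()
    if not words:
        return []
    matches = []
    for start in range(len(lattice) - len(words) + 1):
        score = 1.0
        for offset, word in enumerate(words):
            probs = dict(lattice[start + offset])
            if word not in probs:
                score = 0.0
                break
            score *= probs[word]
        if score > 0 and score >= min_score:
            matches.append((start, score))
    return matches


if __name__ == "__main__":
    # Hypothetical alternates for the phrase shown on the slide.
    demo = [
        [("no", 0.9), ("know", 0.1)],
        [("one", 0.8), ("won", 0.2)],
        [("will", 0.95)],
        [("catch", 0.6), ("cash", 0.4)],
        [("me", 0.9)],
    ]
    print(find_phrase(demo, "catch me"))
    print(find_phrase(demo, "cash me"))
```

A transcript that kept only the most probable word per position would miss "cash me" here; keeping alternates lets the search still find it, at a lower score.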

• Reduction of false positives by applying analytics to relevant languages

• Language ID support for multiple languages simultaneously, simplifying the configuration for speech conversion

Language Identification

[Diagram labels: AUDIO; DICTIONARY; PROCESSING; NATURAL LANGUAGE MODEL; STRUCTURE (QUERIES, TAXONOMY, AD-HOC); TEXT OUTPUT (WORD SPOTTING, FIXED TRANSCRIPT); SMART INDEX]

Dari

German

Hebrew

Bahasa

Italian

Japanese

Korean

Russian

Thai

Dutch

Danish

Polish

Tagalog

Spanish

• Castilian

• Latin American

French

• Canadian

• European

English

• International

• Australian

• N.A.

• UK

Chinese

• Cantonese

• Mandarin

Telugu

Hindi

Turkish

Brazilian

Portuguese

Machine Learning Speech Analytics Process

Detect the

Language

1. Language Identification Models learn how languages sound and

can predict the most likely language used

Convert Audio

to Phonemes

2. Phonetic Models learn the way sounds are pronounced for a

given language and break audio to phonemes

Convert

Phonemes to

Words

3. Language Packs map the sounds to words

Create the

Transcription

4. Phrase Packs predict the most probable words and phrases

• Improved Accuracy with NICE Nexidia Speech Analytics Engine

• Accuracy is primarily based on the number of phonemes in the search term or phrase

Factors in the trading environment that influence accuracy:

Transcription Accuracy Explained

[Chart: Probability of Detection (y-axis, ticks at 0.25, 0.5, 0.75) versus Number of Phonemes (x-axis, 3 to 20)]

Example: North American English, search phrase with 15 phonemes: 85% of the true occurrences will be found

• Quality of the audio

• Volume

• Cross talk

• Mic Quality

• Accent

• Speed of talk

• Background noise

• Mono vs. Stereo Audio
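The detection-probability example from this slide (a 15-phoneme phrase found 85% of the time) can be turned into a quick estimate of how many true occurrences a search would surface and how many it would miss. This is simple arithmetic under the assumption that each occurrence is detected independently with the same probability; the function names and the count of 200 occurrences are hypothetical.

```python
def expected_found(true_occurrences: int, detection_probability: float) -> float:
    """Expected number of true occurrences returned by the search."""
    if not 0.0 <= detection_probability <= 1.0:
        raise ValueError("detection_probability must be between 0 and 1")
    return true_occurrences * detection_probability


def expected_missed(true_occurrences: int, detection_probability: float) -> float:
    """Expected number of true occurrences the search does not return."""
    return true_occurrences - expected_found(true_occurrences, detection_probability)


if __name__ == "__main__":
    # 200 is an illustrative count; 0.85 is the figure quoted for 15 phonemes.
    found = expected_found(200, 0.85)
    missed = expected_missed(200, 0.85)
    print(f"expected found: {found:.0f}, expected missed: {missed:.0f}")
```

The audio factors listed above (volume, cross talk, accent, and so on) would shift the effective detection probability in practice, so the quoted 85% is best read as a reference point.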

• To increase transcription accuracy, with the goal of achieving 70% or greater, we use a Phrase Pack Training process based on machine learning

▪ Learns common phrases to improve the output of the speech analytics

▪ Utilizes related written content to learn domain specific common phrases used by the recorded population to improve transcription results

▪ Simplified Iterative Process that optimizes the output of the transcription using sample audio and text

Transcription Accuracy Improvement Process

[Diagram labels: Audio, Chat, Email, etc.; MODELING TOOLS; Model Generation and Update; Customer-specific Neural-Acoustic Model; Model Applied to Audio; NEURAL PHONETIC SPEECH ANALYTICS ENGINE AND SEARCH GRID; TEXT; Reliable and accurate transcriptions; 100%; Quantitative Results; Narrative Analysis]

NICE Solution

Deep Dive

NICE Compliance Solution Portfolio

Communication

Recording

Compliance

Assurance

Market

Surveillance

Communication

Surveillance

Sales Practices &

Suitability

Self-Development Automation

Integration

Hub

Security

The Industry's Most Unified and Intelligent Compliance Platform

COMMUNICATION COMPLIANCE

HOLISTIC SURVEILLANCE

Transcription

Services

NLP

CASE MANAGEMENT

MACHINE LEARNING

COMPASS Transcription Solutions

COMPASS

Transcription

SDK

COMPASS

Export

Transcription

COMPASS

Discovery

Option with the COMPASS Bulk

Export module to transcribe the

calls as part of the export process

Transcription service that can be

integrated via API with a customer’s

or partner’s applications.

Separate COMPASS module for

transcription of all audio files, with the

ability to search on the file content.

NICE COMPASS Export Transcription

Key Features

• Bulk download provides a secure, auditable

solution

• 1.7 million calls can be downloaded per day on a

single extraction server

• Downloads in the background to allow continued

working

• Provides a transcription file for each call

processed

• Listen to audio that matches queries
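The quoted throughput of 1.7 million calls per day on a single extraction server allows a rough sizing estimate for a regulatory request. The sketch below assumes throughput scales linearly with the number of servers, which the slide does not state; the function name and request size are hypothetical.

```python
CALLS_PER_SERVER_PER_DAY = 1_700_000  # figure quoted for one extraction server


def export_days(total_calls: int, servers: int = 1) -> int:
    """Whole days needed to download total_calls.

    Assumes linear scaling across servers (an assumption, not a stated fact).
    """
    if total_calls < 0 or servers < 1:
        raise ValueError("total_calls must be >= 0 and servers >= 1")
    daily_capacity = CALLS_PER_SERVER_PER_DAY * servers
    # Ceiling division without floating point.
    return -(-total_calls // daily_capacity)


if __name__ == "__main__":
    request = 5_000_000  # illustrative size of a regulatory request
    print("1 server :", export_days(request, 1), "days")
    print("2 servers:", export_days(request, 2), "days")
```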

Key Benefits

• Provide all trade conversations quickly and efficiently

• Simplify intensive compliance investigation

processes

• Leverage existing infrastructure and business

systems

• Multi-lingual support built for financial services

Select Users and/or

Groups, create Start

and End Date and

trigger operation

Current

Downloads

Progress

Transcription files will be

stored in the same

package as the audio

files providing a

complete list of each

wav file, and metadata

NICE COMPASS SDK

Key Features

• Standard API to receive audio file input,

transcribe and output to predetermined

destination

• Use input from any recorder (recorder agnostic)

• Integration into a limitless number of applications

• Available with development toolkit

Key Benefits

• Provide ALL trade conversations quickly and

efficiently

• Simplify intensive compliance investigation

processes

• Leverage existing infrastructure and business

systems

• Multi-lingual support built for Financial Services

Users can use the sample

transcription engine to

transcribe audio files from any

system

NICE COMPASS Discovery

Search bar to

write queries

Words that match the query are highlighted in every interaction’s transcription

Words are

highlighted in the

audio player

Key Features

• Ability to search words and phrases

• Ability to narrow down with facets

• Provides a transcription for each call

processed

• Ability to Download

Download the

audio file

Facets selection

to narrow down

the dataset

Key Benefits

• Easily search words and phrases

• Automatically mark the section of

the call where the word is spoken

• Find the correct calls as part of an

eDiscovery process
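Marking the section of a call where a word is spoken relies on each transcribed word carrying a time offset. The sketch below shows one way this could work, assuming a transcript stored as (word, start time in seconds) pairs; this layout and the function name are assumptions for illustration and not the COMPASS Discovery data model.

```python
from typing import List, Tuple

Word = Tuple[str, float]  # (transcribed word, start time in seconds)


def find_keyword_offsets(words: List[Word], phrase: str) -> List[float]:
    """Return the start time of every occurrence of the phrase in the call."""
    target = phrase.lower().split()
    if not target:
        return []
    # Normalise tokens so punctuation and case do not block a match.
    tokens = [w.lower().strip(".,!?") for w, _ in words]
    hits = []
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            hits.append(words[i][1])
    return hits


if __name__ == "__main__":
    call = [("We", 0.0), ("need", 0.4), ("to", 0.7), ("secure", 0.9),
            ("our", 1.4), ("positions.", 1.7)]
    for offset in find_keyword_offsets(call, "our positions"):
        print(f"match at {offset:.1f} s")
```

An audio player could seek to the returned offsets so a reviewer hears the relevant passage without replaying the whole call.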

COMPASS Transcription Flow

COMPASS

NTR

Customer

Archive

Audio Processed

Discovery Index

SDK

API

Extract

Location

Extraction

Service

Transcription

Service

EXPORT

SDK

DISCOVERY

How to deploy COMPASS Transcription

Cloud Offering

• Fully managed service using secure AWS services

• NICE COMPASS Bulk download extracts the audio and delivers it to cloud transcription services

• Firms log into the NICE COMPASS Discovery Application in the cloud and can search transcription results

On Premise

• Deployed seamlessly onsite within your current NICE COMPASS framework

• The audio is automatically processed via an onsite transcription service as part of the Data Extraction Job

• COMPASS Bulk download extracts the audio and includes the transcription in the export package

• Firms log into the NICE COMPASS Discovery Application and can search transcription results

Language Tuning

and Improvements

FIRST:

LANGUAGE

MODEL

TRAINING

Process of Language Model Improvements

Call Download > Secure Handover > Transcription > Model Training > Model Tuning > Deployment

Call Download / Selection

▪ 100 hours of handset audio in the relevant

language

▪ Statistical dialect representation of the trading floor, from a minimum of 50 traders

▪ Best practice is to select audio data from a 2-week window

▪ Files to be downloaded in a WAV format
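The selection criteria above (100 hours of audio, at least 50 traders, a 2-week window, WAV format) lend themselves to an automated pre-check before handover. The sketch below is a hypothetical validator; the `CallFile` fields and the interpretation of "2-week window" as 14 calendar days are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

MIN_HOURS = 100     # hours of handset audio
MIN_TRADERS = 50    # minimum number of distinct traders
WINDOW_DAYS = 14    # assumed reading of the "2 week window"


@dataclass
class CallFile:
    """Metadata for one candidate recording (hypothetical structure)."""
    filename: str
    trader_id: str
    recorded_on: date
    duration_seconds: float


def check_sample(calls: List[CallFile]) -> List[str]:
    """Return a list of problems; an empty list means the sample qualifies."""
    problems = []
    total_hours = sum(c.duration_seconds for c in calls) / 3600
    if total_hours < MIN_HOURS:
        problems.append(f"only {total_hours:.1f} h of audio, need {MIN_HOURS}")
    traders = {c.trader_id for c in calls}
    if len(traders) < MIN_TRADERS:
        problems.append(f"only {len(traders)} traders, need {MIN_TRADERS}")
    if calls:
        days = [c.recorded_on for c in calls]
        if (max(days) - min(days)).days >= WINDOW_DAYS:
            problems.append(f"recordings span more than {WINDOW_DAYS} days")
    not_wav = [c.filename for c in calls if not c.filename.lower().endswith(".wav")]
    if not_wav:
        problems.append(f"{len(not_wav)} file(s) are not WAV")
    return problems


if __name__ == "__main__":
    sample = [CallFile("a.mp3", "t1", date(2020, 3, 1), 60.0),
              CallFile("b.wav", "t1", date(2020, 3, 20), 60.0)]
    for problem in check_sample(sample):
        print("-", problem)
```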

Secure Handover

▪ NICE and the customer sign an NDA and a Data Protection Agreement

▪ Customer hands over data to NICE through an agreed secure channel

▪ Data stored on secured NICE servers (only registered persons have access)

▪ Data securely deleted from NICE servers after usage (under the customer’s supervision)

Transcription (Manual)

▪ All calls that are handed over need to be manually transcribed and tagged with events (see transcription method documentation)

▪ Transcription of these calls is done by a transcription provider under strict supervision of NICE (option to replace with a customer-certified provider)

▪ Transcription is checked to achieve 99% quality.

Model Training

▪ After transcription is finished, the NICE Language Model team takes the audio data and transcription data (leaving 10% for control testing)

▪ The data is fed into a learning machine that scans the two sources and creates an acoustic model (if needed) and a language model (including a dictionary and NNLP packs)
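Holding back 10% of the transcribed calls for control testing can be sketched as a reproducible random split. This is a generic illustration, not a description of the tooling NICE uses; the function name, the fixed seed, and the call IDs are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_for_control(call_ids: Sequence[str],
                      control_fraction: float = 0.10,
                      seed: int = 0) -> Tuple[List[str], List[str]]:
    """Split calls into (training, control) sets.

    The control set is kept out of training so it can be used to measure
    the quality of the resulting model on unseen audio.
    """
    ids = sorted(call_ids)          # sort first so the split is reproducible
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_control = max(1, round(len(ids) * control_fraction)) if ids else 0
    return ids[n_control:], ids[:n_control]


if __name__ == "__main__":
    calls = [f"call-{i:03d}" for i in range(100)]
    training, control = split_for_control(calls)
    print(len(training), "calls for training,", len(control), "for control testing")
```

Keeping the control calls strictly separate matters because testing on audio the model was trained on would overstate its accuracy.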

Model Tuning

▪ Based on the results of the testing the model

can be further tuned, if the expected results

have not been reached

▪ Acoustic and language models can be tuned

manually or additional data sets can be used
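Deciding whether "the expected results have been reached" requires a measurable score. The slides do not define the accuracy metric, so the sketch below uses a common one as an assumption: word accuracy, computed as 1 minus the word error rate (word-level edit distance divided by the number of reference words). The function names are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the texts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edits needed to turn the reference words seen so far into hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """1 minus the word error rate (assumed definition of accuracy)."""
    return 1.0 - word_error_rate(reference, hypothesis)


if __name__ == "__main__":
    manual = "we need to secure our positions"
    automatic = "we need to cure our positions"
    accuracy = word_accuracy(manual, automatic)
    print(f"word accuracy: {accuracy:.1%}, meets 70% goal: {accuracy >= 0.70}")
```

Here the manual transcript of a control call serves as the reference and the automatic transcript as the hypothesis; a score below the target would send the model back for further tuning.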

Deployment

▪ When the required quality has been reached

the models are ready for deployment

SECOND:

PHRASE PACK

TRAINING

Phrase Pack Training Overview

Twinkle Twinkle Little Star

• Analyze manually transcribed training

data to improve probability scores and

identify new words

Process Used for Phrase Pack Improvement

1. Manual

Transcription

2. Update Default

Phrase Pack

using Text

3. Add New Word

Pronunciation

4. Deploy and Test Model

and Analyze Calls

EXAMPLE:

TRAINING A

NEW WORD

• New word: “Coronavirus”

• Speaker #1:

• Good morning John, this coronavirus crisis is impressive. We need to make sure we are securing

our positions. Global markets are plunging after the implosion of an alliance between OPEC and

Russia, it caused the worst one-day crash in crude prices in nearly 30 years!

• Speaker #2:

• Absolutely! Also this coronavirus pandemic is brutal upon the retail industries… However I have

noticed that the technology companies are doing great short term with the remote working

situation. We also need to bet on airlines next, with all the governments’ investments to protect

that industry it will come back up.

Phrase Pack Training demo script
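Identifying new words from manually transcribed training data, as in the "Coronavirus" example, can be sketched as a comparison of transcript vocabulary against the existing dictionary. The dictionary contents and function name below are hypothetical, and this covers only the detection step, not pronunciation training.

```python
import re
from collections import Counter
from typing import Iterable, Set


def find_new_words(transcripts: Iterable[str], dictionary: Set[str]) -> Counter:
    """Count words in the manual transcripts that the dictionary lacks."""
    known = {word.lower() for word in dictionary}
    counts: Counter = Counter()
    for text in transcripts:
        for word in re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower()):
            if word not in known:
                counts[word] += 1
    return counts


if __name__ == "__main__":
    manual_transcripts = [
        "this coronavirus crisis is impressive",
        "this coronavirus pandemic is brutal",
    ]
    # Hypothetical existing dictionary that predates the new term.
    existing = {"this", "crisis", "is", "impressive", "pandemic", "brutal"}
    for word, count in find_new_words(manual_transcripts, existing).most_common():
        print(word, count)
```

Words that surface repeatedly in this count are candidates for step 3 of the process, adding a new word pronunciation.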

THIRD:

ADJUST

SETTINGS

Voice Surveillance Objective

Optimize Relevant Review Items

Find relevant

interactions

False positive alerts

increase the workload

Why NICE for Transcription

Eliminate third party

costs - transcription

services and

outside counsel

review

Reduces efforts

needed to

transcribe

conversations as

part of the

investigation

process

Transcription

accuracy of 70%+ –

highest in the

industry –

leverages machine

learning and NLP

advanced analytics

Quickly export and

transcribe

communications

ensuring regulatory

deadlines or

business

requirements are

met

Multiple options

available to ensure

limited change to

business processes

NICE Transcription Capabilities for Financial Services

Highest

Transcription

Accuracy

Flexible

Transcription

Options

Automated

Processes

Lower Total

Costs

Reduced Risk

March

Best Practices in Unified Communications Recording

April

Unlocking the Potential of Cloud Compliance Recording

May

Automate eDiscovery and Gain insights with Transcription

Technology

June

Maximize Uptime and Lower TCO with Managed Services

Transforming Communication Compliance Webinar Series