Mobile Dictation
With Automatic Speech Recognition
for Healthcare Purposes
Tuuli Keskinen, Aleksi Melto, Jaakko Hakulinen, Markku Turunen, Santeri Saarinen, Tamás Pallos
TAUCHI research center, School of Information Sciences, University of Tampere, Finland
Riitta Danielsson-Ojala, Sanna Salanterä
Department of Nursing Sciences, Faculty of Medicine, University of Turku, Finland
Kites symposium 2013
Content
• Background & Motivation
• Dictation application
• User evaluation
• Results
• Discussion and conclusion
• Ending words
Background
• First speech recognition systems for medical
reporting were developed over 20 years ago [1]
• Doctors’ dictations are still commonly typed
manually, but utilization of speech recognition
is increasing especially in radiology and
pathology
• Nurses’ use of speech recognition is rare and
often limited to filling the templates
[X] Numbers refer to the actual references in the paper.
Background
• The use of speech recognition in Finnish healthcare
has been studied, e.g., in [2], where radiologists were
followed as they changed from cassette-based recording
to speech recognition-based dictation
• Several studies on speech recognition
in healthcare have been conducted, e.g., [1, 3, 4, 5]
• Previous studies focus mainly on objective
qualities, such as dictation durations and
recognition error rates
Motivation
”Oh, if only we had
the possibility
to dictate!”
- Anonymous YTHS nurse
Motivation for our study
• Limited use of speech recognition in
Finnish healthcare, especially in nursing
• Obvious and unnecessary delays in getting
patient information to the next treatment
steps
• Lack of research focusing on the user
expectations and experiences of dictation
applications utilizing speech recognition in
healthcare
Dictation application
• Based on ”MobiDic” by Turunen et al. [6]
• The mobile client (Android application on a
tablet) includes functionality for recording and
editing dictations, and modifying the dictation
texts
• The server side manages the dictations (audio
and text) and communicates with speech
recognition engines and M-Files document
management system
Dictation application
• Speech recognition is complemented by a variety of
other tools to improve results:
– State-of-the-art natural language processing tools (e.g., spelling and
grammar checking)
– Statistics based on user actions
– Optimized multimodal touch-screen UI
• Distributed application model makes a variety of use cases
possible:
– Real-time distributed assisted dictation
– Workflow management
– Plug-and-play component management (e.g., speech recognizer,
NLP tools, document management)
– UI can be adapted for different usage cases and devices
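The plug-and-play idea above can be illustrated with a minimal sketch. It is not the MobiDic implementation: the class DictationPipeline, the stand-in recognizer, and the two post-processing steps are all hypothetical names invented here, and the recognized text is made up. The sketch only shows how a recognizer and text-processing tools could be swapped without changing the rest of the server code.

```python
from typing import Callable, List


class DictationPipeline:
    """Hypothetical server-side pipeline with swappable components."""

    def __init__(self, recognizer: Callable[[bytes], str]) -> None:
        # Any callable that maps audio bytes to text can be plugged in.
        self.recognizer = recognizer
        self.post_processors: List[Callable[[str], str]] = []

    def add_post_processor(self, step: Callable[[str], str]) -> None:
        # Post-processing steps (e.g., spelling checks) run in insertion order.
        self.post_processors.append(step)

    def transcribe(self, audio: bytes) -> str:
        text = self.recognizer(audio)
        for step in self.post_processors:
            text = step(text)
        return text


def fake_recognizer(audio: bytes) -> str:
    # Stand-in for a real speech recognition engine; the output is invented.
    return "patient  has a wound on the left heel"


def collapse_spaces(text: str) -> str:
    # Replace any run of whitespace with a single space.
    return " ".join(text.split())


def capitalize_first(text: str) -> str:
    # Upper-case the first character only, leaving the rest untouched.
    return text[:1].upper() + text[1:]


pipeline = DictationPipeline(fake_recognizer)
pipeline.add_post_processor(collapse_spaces)
pipeline.add_post_processor(capitalize_first)
result = pipeline.transcribe(b"\x00\x01")
print(result)
```

Replacing fake_recognizer with another callable changes the recognition engine without touching the post-processing steps, which is the property the component model aims for.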
Dictation application – v2.0
User evaluation
• Real-world context, real users and real dictations
• Two wound care nurses in one of the University
Hospitals in Finland
• Lasted three months in total, covering 30 dictations
by one participant and 67 by the other
• Wizard-of-Oz approach
– The medical language model available was based on
medical and nursing documentation, and thus, it was
not sufficient to recognize the language used by the
wound care nurses
Methodology
• Background interview
– Main focus on participants’ normal practices for
making and/or dictating patient entries
• Subjective data gathered with questionnaires
– User expectations and experiences (SUXES [8])
– Usability-related experiences (SUS [9])
– Open questions
• Log data
– All application and server events logged
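For readers unfamiliar with SUS, the sketch below shows the standard scoring rule for the ten-item questionnaire: odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to give a score from 0 to 100. The slides do not say how the SUS responses were processed in this study, so this is only the textbook rule; the function name sus_score and the example responses are invented for illustration.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - response
    return total * 2.5


# Invented example: a neutral answer (3) on every item.
neutral = sus_score([3] * 10)
print(neutral)
```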
SUXES method
• Enables comparison between user expectations before the usage
and user experiences after the usage on a set of statements
• Expectations reported by giving two values
– acceptable level: the lowest acceptable quality level for even using the
system (or property)
– desired level: the uppermost level that can even be expected of the
system (or property)
• Experiences reported by giving a single value on the same
statements
• Expectations form a range within which the experienced level is usually
expected to fall
– If below → something is wrong; if above → success
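The comparison described above can be sketched as a small function. This is an illustration only, not code from the SUXES method itself: the function name and return labels are assumptions, and the values in the usage lines are invented.

```python
def compare_to_expectations(acceptable, desired, experienced):
    """Place an experienced level relative to the acceptable-desired range."""
    if acceptable > desired:
        raise ValueError("acceptable level cannot exceed desired level")
    if experienced < acceptable:
        return "below"   # something is wrong
    if experienced > desired:
        return "above"   # success: expectations exceeded
    return "within"      # the usual case: inside the expectation range


# Invented values for one statement, e.g. "Using the application is fast."
print(compare_to_expectations(acceptable=4, desired=6, experienced=5))
print(compare_to_expectations(acceptable=4, desired=6, experienced=3))
```

Values equal to the acceptable or desired level are treated as being within the range; whether the boundaries should count as inside is a choice made for this sketch.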
SUXES method
• Expectations: ”Using the phone is fast.” (two marks on a scale from Low to High)
• Experiences: ”Using the phone is fast.” (one mark on the same scale)
• Comparison: ”Using the phone is fast.”
Expectations and experiences
• We used the nine original statements of SUXES
– speed, pleasantness, clearness, error free use,
error free function, learning curve, naturalness,
usefulness, and future use
• …and five additional statements comparing
the dictation application to the normally used
entry practice
– faster, more pleasant, more clear, easier, and
prefer in the future
User expectations on the application
Median responses of acceptable – desired levels (grey areas), n=2.
User experiences on the application
Median responses of acceptable – desired levels (grey areas)
and experiences (black circles), n=2. P1 and P2 refer to participant 1 and 2.
User expectations compared to normal
entry practice
Median responses of acceptable – desired levels (grey areas), n=2.
User experiences compared to normal
entry practice
Median responses of acceptable – desired levels (grey areas)
and experiences (black circles), n=2.
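As a rough illustration of how median values like those in the figure captions could be obtained, the sketch below aggregates per-participant responses for one statement. The numbers are invented and are not the study data; the function name summarize_statement is likewise an assumption. With n=2, the median is simply the mean of the two responses.

```python
from statistics import median


def summarize_statement(responses):
    """Median acceptable, desired and experienced levels for one statement.

    `responses` holds one (acceptable, desired, experienced) tuple
    per participant.
    """
    return {
        "acceptable": median([r[0] for r in responses]),
        "desired": median([r[1] for r in responses]),
        "experienced": median([r[2] for r in responses]),
    }


# Invented responses from two participants for a single statement.
summary = summarize_statement([(4, 6, 6), (5, 7, 7)])
print(summary)
```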
Discussion
• The desired level was 6 or 7 on all statements
• The experienced level was at least 6 on all but
one statement
• The usefulness of the dictation application can
clearly be seen in the results
• More importantly, the participants would
prefer using the application in the future, i.e.,
they would be ready to drop their familiar and
safe routines
Conclusion
• Because no sufficiently accurate language model
was available for nurses’ purposes, we used
a Wizard-of-Oz scenario to finalize the speech
recognition results
• The user experience results show true
potential for our dictation application, not
only for smoothing the dictation process but also as a
relevant option for writing nursing entries
Future work
• Finalizing a language model for nurses and
deploying it in Finnish healthcare is crucial for
a fully automatic dictation-to-text process
• We are not developing the language models
ourselves, but will collaborate closely with
our partners on their development and evaluation
• We are also developing our application further to
provide an even more pleasurable user experience
and a seamless process
Future Work
• To make this a reality, we need a proper
process for iterative deployment, not a
standalone product that can be sold to hospitals,
for example
• We have developed all necessary components:
client and backend software, connections to 3rd
party components, tools to support deployment,
and a complete deployment process
• Ready for commercialization – looking for
partners!
Global market
Acknowledgements
• Project ”Mobile and Ubiquitous Dictation and
Communication Application for Medical
Purposes” (”MOBSTER”)
• Funded by the Finnish Agency for Technology
and Innovation (TEKES)
• Lingsoft and M-Files, and other project
partners
