DataCite Commons: Lessons Learned from a Secondary Analysis Using Natural Language Processing and Machine Learning from a Lifestyle Intervention

This poster was presented on April 6–9, 2022, at the 43rd Annual Meeting & Scientific Sessions of the Society of Behavioral Medicine in Baltimore, MD, USA. We provide the poster in several formats, including svg, pptx, png, pdf, and jpg. We also provide two figures: the "iceberg figure," which illustrates the depth of the untapped data from the original LIvES study (iceberg_figure.png), and the QR code that links to additional information such as references (QR - Handout.png).
Submitted Abstract:
Background: Recorded telephone coaching sessions (approximately 24,500) in English and Spanish from 1205 women participating in the Lifestyle Intervention for oVarian cancer Enhanced Survival (LIvES), GOG 0225, study were used for this analysis. The LIvES Study tested whether a lifestyle intervention of increased physical activity and a healthy diet would increase progression-free survival compared to an attention control using trained health coaches and Motivational Interviewing (MI), a directive, patient-centered counseling approach; 323 LIvES Study coaching session recordings were scored for adherence to MI techniques. Here we describe lessons learned from a secondary analysis of LIvES data utilizing machine learning and natural language processing to automate fidelity and predict lifestyle behavioral outcomes.
Methods: Numerous steps were necessary to prepare the call recordings for natural language processing. Data were aligned through a combination of participant phone numbers, coach names and participant names, and entry dates and recording dates. Transcription was performed automatically with wav2vec. An annotation interface was developed using Label Studio, and an annotation guideline was adapted from the existing Motivational Interviewing Treatment Integrity (MITI) 3.0. Finally, a pilot annotation of the call recordings was completed and initial inter-rater reliability was measured.
Results: The process of preparing this secondary analysis resulted in a number of lessons learned. First, data infrastructure for the original LIvES study, due to its long-running nature, evolved in ways that lost data continuity. The data alignment process would have been simplified by establishing a single identifier to link calls, outcomes, and MITI scores, and maintaining that identifier over the course of the project. Second, evaluating the quality of automated transcription systems is difficult and could have bee...

Poster published 2022 in ReDATA

Image

https://doi.org/10.25422/azu.data.19576069

Sarah Freylersythe
Rebecca Sharp
John Culnan
Damian Yukio Romero Diaz	University of Arizona
Yiyun Zhao
Hagan Franks
Remo Nitschke
Steven J. Bethard
Tracy E. Crane
