Research Interests

Machine Learning, Natural Language Processing, Time Series Modeling and Forecasting, Smart Health, Intelligent Assistants

Research Projects

Sensemaking of Online Health Information

Google fields 1 billion health questions daily, and 89% of Americans consult Google before their doctors. Yet accessing and making sense of online health information remains challenging, especially for people managing chronic conditions. Our goal is to develop a web app that acts as an intelligent health assistant to personalize and improve online sensemaking. The app will let users highlight and annotate relevant health information during web searches. It will also support on-the-spot searches and recommendations tailored to health topics, keywords, and information quality. The app is backed by a framework of Natural Language Processing (NLP) models and algorithms.
Our NLP solutions include classifiers to distinguish actionable health insights from general facts, health topic extraction, and the characterization of information similarity and divergence. We've curated a dataset of 25,000+ health advice pieces from certified medical sources, spanning 15 chronic diseases. This data drives the development and evaluation of our novel solutions.

Unlocking the Patient Voice: Understanding Treatment Needs for Opioid Use Disorder (OUD) on Social Media

OUD treatment is rapidly evolving, with maintenance medications such as extended-release naltrexone, buprenorphine, Suboxone, and methadone offering promising options. However, this changing landscape often leaves patients with unanswered questions and knowledge gaps about their treatment options. Because of stigma, limited access, trust issues, or resource constraints, many individuals turn to social media for peer support and information rather than seeking guidance from healthcare providers. Although peer support is critical for recovery, online discourse on OUD treatment is rife with unverified, conflicting, and false information. We address these challenges by systematically characterizing and quantifying self-reported OUD treatment needs on Reddit, using large-scale, real-world patient-reported data to provide new evidence and actionable insights. We are developing novel methods that combine human-LLM interaction, NLP, and qualitative study to surface patient voices, understand their experiences, and uncover the context of their information-seeking.

Improving Decision-Making in the ICU Through Human-AI Interaction

Data-driven innovation (DDI) is the cornerstone of strategic decision-making in various sectors. For instance, the tech industry leverages data to gain profound insights. Online services employ web usage mining to decode user behavior and fuel business intelligence. In healthcare, the widespread adoption of electronic health records (EHRs) has laid the foundation for DDI. AI has predominantly been applied to enhance patient-focused tasks, from disease prediction to personalized treatment. While this improves bedside decisions, it overlooks broader systemic issues within healthcare delivery.
We study these "system-focused" aspects: how clinicians operate within their local environments, and how to offer meaningful feedback on adherence to care standards derived from EHR patterns, which we term local standards of care (LSC). Our research extracts and visualizes LSC using EHR data from intensive care units (ICUs), a data-rich, high-acuity sector of the U.S. healthcare system.

Improving Treatment Adherence Through Personalized Sensing and Prediction

Chronic diseases are responsible for 81% of hospital admissions and 91% of prescribed medications. However, the complexity of managing chronic conditions often leads to poor treatment adherence. In response, we are developing a mobile health system that boosts adherence by delivering context-aware reminders for health-related tasks such as medication intake, exercise, and meals.
The challenge lies in medications that impose specific temporal constraints, such as fasting before or after taking a pill or maintaining intervals between medication and sleep. Violating these constraints can have adverse consequences. Our project aims to create a personalized activity prediction model to anticipate health-related tasks and generate context-aware reminders, preventing temporal constraint violations.
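As a rough illustration of the temporal-constraint idea above, the following Python sketch filters candidate reminder times against predicted activities. It is only a minimal sketch under stated assumptions, not the project's actual model: the names `TemporalConstraint`, `violates`, and `safe_reminder_times` are hypothetical, activities are simplified to single points in time, and the predicted activity times are assumed to come from some upstream activity prediction model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TemporalConstraint:
    """Hypothetical constraint: `activity` must not occur too close to a dose."""
    activity: str            # e.g. "meal" or "sleep"
    gap_before: timedelta    # required activity-free time before the dose
    gap_after: timedelta     # required activity-free time after the dose


def violates(dose_time, activity, activity_time, constraint):
    """Return True if the activity falls inside the forbidden window around the dose."""
    if activity != constraint.activity:
        return False
    window_start = dose_time - constraint.gap_before
    window_end = dose_time + constraint.gap_after
    return window_start < activity_time < window_end


def safe_reminder_times(candidates, predicted_activities, constraints):
    """Keep only candidate dose times that violate no constraint.

    `predicted_activities` is a list of (activity_name, predicted_time) pairs,
    assumed here to be produced by a separate prediction step.
    """
    return [
        t for t in candidates
        if not any(
            violates(t, name, when, c)
            for name, when in predicted_activities
            for c in constraints
        )
    ]


# Illustrative usage: no meal within 2 hours before or 1 hour after the dose.
day = datetime(2024, 1, 1)
fasting = TemporalConstraint("meal", timedelta(hours=2), timedelta(hours=1))
predicted = [("meal", day.replace(hour=8)), ("sleep", day.replace(hour=23))]
candidates = [
    day.replace(hour=6, minute=30),
    day.replace(hour=7, minute=30),
    day.replace(hour=9),
    day.replace(hour=10, minute=30),
]
safe = safe_reminder_times(candidates, predicted, [fasting])
```

In this example, only the 6:30 and 10:30 candidates remain, because the predicted 8:00 meal falls inside the forbidden window of the other two.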

Decoding Patient-Provider Interactions for Telemedical Triage

Patient-generated messages inherently contain keywords and phrases that convey levels of urgency. We believe that harnessing natural language processing can help us stratify patient messages by urgency, streamline triage processes, and even predict critical clinical outcomes such as emergency department visits or hospital admissions. The primary objective of this project is to evaluate the feasibility of implementing machine learning algorithms for patient message triage.
Our specific goals are as follows: (i) identify specific triggers within patient portal messages, (ii) extract and integrate contextual information (e.g., patient age, medical history) from message content and EHR data, and (iii) effectively predict tangible clinical outcomes using machine learning algorithms. We have developed an initial solution based on telemedical queries from three online telemedicine platforms. We have assessed the effectiveness of transfer learning techniques for telemedical triage and conducted a comprehensive error analysis, pinpointing challenging telemedicine queries that strain state-of-the-art NLP systems. Moreover, we have made publicly available a telemedical query dataset labeled for severity classification, specifically for COVID-19 triage.
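To make the idea of urgency-bearing keywords concrete, here is a deliberately simple Python sketch of a keyword-based baseline. It is not the transfer learning approach described above, and the term lists, the three urgency levels, and the function name `stratify` are all illustrative assumptions rather than clinically validated triggers or anything taken from the project's dataset.

```python
import re

# Illustrative, non-clinical term lists (assumptions for this sketch only).
URGENCY_TERMS = {
    "high": ("chest pain", "can't breathe", "shortness of breath"),
    "medium": ("fever", "vomiting", "dizzy"),
}


def stratify(message):
    """Assign a coarse urgency level based on the first matching term list."""
    # Lowercase and collapse whitespace so multi-word terms match reliably.
    text = re.sub(r"\s+", " ", message.lower())
    for level in ("high", "medium"):
        if any(term in text for term in URGENCY_TERMS[level]):
            return level
    return "low"
```

A baseline like this is mainly useful as a point of comparison: messages it misclassifies are candidates for the kind of error analysis that motivates learned models.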

Sarah M. Preum
