“We need AI to measure patient mood.”
This was the directive I received when joining HealthRhythms as VP of Engineering & Data Science. Like many companies, they knew AI could transform their business but weren’t sure exactly how. Within four months, we doubled our usable data, increased our core prediction accuracy by 11%, and cut model iteration time in half.
Here’s how we did it.
The Challenge: Beyond Simple Measurement
In mental health, measurement is broken. The standard tool is the PHQ-9, a nine-question survey about depression symptoms over the past two weeks. It has serious limitations:
- Recall bias: Can you accurately remember how you felt two weeks ago?
- Recency bias: Today’s mood colors your memory of the past
- Intermittent data: Only collected during appointments
- Poor adherence: Patients rarely complete surveys consistently
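For context on what that instrument actually produces: PHQ-9 scoring is simple arithmetic, with nine items each answered 0-3 and a 0-27 total mapped to standard severity bands. A minimal sketch of the standard published scoring (not HealthRhythms code):

```python
# Standard PHQ-9 scoring: nine items, each answered 0-3, summed to a
# 0-27 total and mapped to the published severity bands.
SEVERITY_BANDS = [
    (0, 4, "minimal"),
    (5, 9, "mild"),
    (10, 14, "moderate"),
    (15, 19, "moderately severe"),
    (20, 27, "severe"),
]

def score_phq9(responses: list[int]) -> tuple[int, str]:
    """Sum nine 0-3 item responses into a total score and severity band."""
    if len(responses) != 9 or any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("PHQ-9 requires nine responses, each scored 0-3")
    total = sum(responses)
    band = next(label for lo, hi, label in SEVERITY_BANDS if lo <= total <= hi)
    return total, band

print(score_phq9([1, 2, 0, 1, 3, 0, 1, 2, 1]))  # (11, 'moderate')
```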
The company wanted to use AI to solve this. But “use AI” isn’t a strategy.
Finding the Real Value Proposition
After diving into the data, we discovered something crucial: 80-90% of patients have stable PHQ-9 scores month-to-month. The real opportunity wasn’t in replacing the PHQ-9 entirely—it was in identifying the minority of patients whose conditions were changing significantly.
This realization transformed our approach. Instead of building a general “mood predictor,” we focused on a specific, valuable goal: predicting a patient’s current state more accurately than simply carrying forward their last PHQ-9 score, for unstable patients. Even better, our advantage grew with time since the last survey, exactly when clinicians needed us most.
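A sketch of how that target can be operationalized: score the model against a naive “carry forward the last PHQ-9” baseline, with error reported separately for stable and unstable patients. The 5-point change threshold for “unstable” is an illustrative assumption here, not a definition from our work:

```python
# Evaluate model predictions against the last-observation-carried-forward
# baseline, stratified by whether the patient's PHQ-9 actually moved.
# The 5-point "unstable" threshold is an illustrative assumption.
import numpy as np

UNSTABLE_DELTA = 5  # assumed cutoff for a meaningful PHQ-9 change

def stratified_mae(last_phq9, true_phq9, model_pred):
    last, true, pred = map(np.asarray, (last_phq9, true_phq9, model_pred))
    unstable = np.abs(true - last) >= UNSTABLE_DELTA
    return {
        "baseline_mae_unstable": np.abs(true[unstable] - last[unstable]).mean(),
        "model_mae_unstable": np.abs(true[unstable] - pred[unstable]).mean(),
        "baseline_mae_stable": np.abs(true[~unstable] - last[~unstable]).mean(),
        "model_mae_stable": np.abs(true[~unstable] - pred[~unstable]).mean(),
    }
```

A model that only wins on the stable majority has merely rediscovered the carry-forward baseline; stratifying the error makes that failure mode visible.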
Building the Machine (Learning)
With a clear direction, we faced our next challenge: slow experimentation velocity. Our data scientists and engineers were working in silos with disconnected processes. We needed infrastructure.
We built:
- Standardized model pipelines using Databricks and MLflow
- A configuration system for flexible experimentation
- A feature store for reliable data processing
- Robust serving infrastructure
This foundation let us iterate faster and more reliably.
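To make that concrete, here is a minimal sketch of a config-driven training run logged to MLflow, in the spirit of the pipeline above; the config fields, feature names, and model choice are illustrative assumptions, not our production code:

```python
# Config-driven training run logged to MLflow. Every run records its
# exact config, so experiments stay reproducible and comparable.
# Feature names and model choice are illustrative assumptions.
from dataclasses import dataclass, asdict

import mlflow
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

@dataclass
class ExperimentConfig:
    features: tuple = ("sleep_duration", "screen_time", "mobility_radius")
    n_estimators: int = 200
    learning_rate: float = 0.05
    test_size: float = 0.2

def run_experiment(df, cfg: ExperimentConfig):
    X_train, X_test, y_train, y_test = train_test_split(
        df[list(cfg.features)], df["phq9_total"], test_size=cfg.test_size
    )
    with mlflow.start_run():
        mlflow.log_params(asdict(cfg))  # the full config travels with the run
        model = GradientBoostingRegressor(
            n_estimators=cfg.n_estimators, learning_rate=cfg.learning_rate
        ).fit(X_train, y_train)
        mlflow.log_metric("mae", mean_absolute_error(y_test, model.predict(X_test)))
        mlflow.sklearn.log_model(model, "model")
    return model
```

Because every run records the exact settings that produced it, any result in the experiment tracker can be reproduced or compared against another run.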
The Data Quality Feedback Loop
As we improved our models, we hit another wall: data quality. Our historical data came from various academic studies with different collection methods. We needed consistency.
We created:
- Clear criteria for “usable” data
- Dashboards tracking data quality metrics
- A prioritization framework for data collection
This clarity helped product and leadership teams make better decisions about data collection, creating a virtuous cycle of improvement.
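To give a flavor of what “clear criteria” means in practice, here is a sketch of usability as an explicit, testable predicate; the thresholds and field names are illustrative assumptions, not our actual cutoffs:

```python
# "Usable" as an explicit, testable predicate that dashboards and
# collection priorities can be built on. Thresholds are illustrative.
from dataclasses import dataclass

@dataclass
class PatientSeries:
    days_of_sensor_data: int
    sensor_coverage: float       # fraction of each day with data present
    phq9_surveys: int            # ground-truth labels available
    consented_for_research: bool

MIN_DAYS, MIN_COVERAGE, MIN_SURVEYS = 28, 0.6, 2  # assumed thresholds

def is_usable(s: PatientSeries) -> bool:
    return (
        s.consented_for_research
        and s.days_of_sensor_data >= MIN_DAYS
        and s.sensor_coverage >= MIN_COVERAGE
        and s.phq9_surveys >= MIN_SURVEYS  # need two labels to observe change
    )
```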
Results That Matter
In just four months, we:
- Doubled our usable data volume
- Increased our core prediction accuracy by 11%
- Cut model iteration time in half
More importantly, we transformed from “we need AI” to having clear metrics, processes, and infrastructure for continuous improvement.
Key Lessons
- Start with Value: Don’t build an AI solution looking for a problem. Understand where AI can provide unique value beyond existing solutions.
- Make it Measurable: Transform vague goals (“measure mood”) into specific metrics (“outperform last PHQ-9 for unstable patients”).
- Build for Speed: Invest in infrastructure that lets your team iterate quickly. The faster you can experiment, the faster you’ll find what works.
- Close the Loop: Create clear feedback mechanisms between your AI efforts and the rest of the business. Make it easy for everyone to contribute to improvement.
The path from AI ambitions to actual results isn’t always straight. But with clear goals, good infrastructure, and measurable feedback loops, you can transform vague aspirations into real business value.