Data Methodology
The data stack — documented, not asserted
Healthcare procurement teams evaluate data vendors differently than general marketers do. Before they can approve a vendor, they need to understand its signal sources, its de-identification method, and which PHI protections are actually in place. This page covers all three.
Signal Sources
Three categories of de-identified signals
Salubrum's scoring model draws from three signal categories, each de-identified before entering the processing pipeline.
Behavioral Signals
- De-identified web activity patterns
- Content consumption at condition level
- Search-adjacent behavioral indicators
- Session-depth engagement metrics
- Condition research journey patterns
Contextual Signals
- Page-level health content classification
- Publisher network contextual targeting
- Condition-specific content verticals
- Treatment evaluation content patterns
- Provider comparison research signals
De-identified Claims Data
- Condition-level utilization patterns
- Care-seeking journey longitudinal data
- Treatment cycle timing signals
- Geographic utilization benchmarks
- Specialty referral pattern modeling
Scoring Model
How signals become intent scores
Our model normalizes signals from all three sources onto a common scale, ranks the resulting condition-level intent probabilities, and reports each rank as a decile score.
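To make the normalize-then-rank idea concrete, here is a minimal sketch. It is not Salubrum's production model. The z-score normalization, the equal weighting of sources, the unit being scored (called a "segment" here), and the names `zscore` and `decile_scores` are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one source's raw values to mean 0 and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant source carries no ranking information.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def decile_scores(signals):
    """Combine per-source signals and return one decile per segment.

    signals: dict mapping a source name to a list of raw values,
             one value per segment, in the same segment order.
    Returns a list of deciles, 1 (lowest intent) to 10 (highest).
    """
    lengths = {len(vals) for vals in signals.values()}
    if len(lengths) != 1:
        raise ValueError("every source must cover the same segments")
    n = lengths.pop()

    # Assumption: each source is z-scored, then sources are averaged equally.
    normalized = [zscore(vals) for vals in signals.values()]
    combined = [mean(src[i] for src in normalized) for i in range(n)]

    # Rank segments by combined score and bucket the ranks into deciles.
    order = sorted(range(n), key=lambda i: combined[i])
    deciles = [0] * n
    for rank, idx in enumerate(order):
        deciles[idx] = rank * 10 // n + 1
    return deciles


example = {
    "behavioral": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "contextual": [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
    "claims": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
}
print(decile_scores(example))
```

Standardizing each source before combining keeps a source with large raw values from dominating the ranking, which is the practical point of normalizing across sources.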
De-identification and Privacy
HIPAA-aware — not HIPAA-certified, because that certification doesn't exist
Salubrum's data pipeline applies HIPAA Safe Harbor de-identification as a structural requirement, not as an add-on compliance layer. No PHI enters the scoring model at any point. If a vendor tells you they're "HIPAA certified," that claim has no regulatory basis. The correct question is whether their de-identification meets a method the HIPAA Privacy Rule recognizes, such as Safe Harbor.
Safe Harbor De-identification
All data is de-identified per the HIPAA Safe Harbor method, which means removing or generalizing the 18 categories of identifiers the rule enumerates before any processing takes place. This is a structural requirement, not a toggle.
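As a rough illustration of what "removing or generalizing" can look like, the sketch below handles a few of the 18 categories: it drops direct identifiers, truncates ZIP codes to three digits, reduces dates to the year, and groups ages of 90 and over. It is not Salubrum's pipeline. The field names, the `generalize_record` function, and the contents of `DIRECT_IDENTIFIER_FIELDS` are hypothetical, and the set of sparsely populated three-digit ZIP prefixes must come from current Census data rather than the placeholder used here.

```python
# Hypothetical field names for direct identifiers that are removed outright.
DIRECT_IDENTIFIER_FIELDS = {"name", "email", "phone", "ssn", "mrn", "ip_address"}


def generalize_record(record, sparse_zip3):
    """Return a copy of `record` with a few Safe Harbor rules applied.

    record:      dict of field name -> value (hypothetical schema)
    sparse_zip3: set of three-digit ZIP prefixes covering 20,000 or fewer
                 people; these are replaced with "000"
    """
    # Remove direct identifiers entirely.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIER_FIELDS}

    # Geography: keep only the first three ZIP digits, or "000" if sparse.
    if "zip" in out:
        zip3 = str(out.pop("zip"))[:3]
        out["zip3"] = "000" if zip3 in sparse_zip3 else zip3

    # Dates: keep the year only (assumes an ISO "YYYY-MM-DD" string).
    if "service_date" in out:
        out["service_year"] = str(out.pop("service_date"))[:4]

    # Ages: everyone aged 90 or older falls into a single group.
    if "age" in out:
        age = out.pop("age")
        out["age_group"] = "90+" if age >= 90 else str(age)

    return out


raw = {
    "name": "Example Person",
    "zip": "03601",
    "service_date": "2023-05-17",
    "age": 93,
    "condition": "asthma",
}
# {"036"} is a placeholder for the sparse-prefix set.
print(generalize_record(raw, sparse_zip3={"036"}))
```

A complete implementation would need to cover all 18 categories, and Safe Harbor also requires that the covered entity has no actual knowledge that the remaining information could identify an individual.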
No PHI Storage or Output
Salubrum does not store, process, or output protected health information at any point in the data pipeline. Our audience segments are probabilistic aggregates, not individual health records.
Annual Review Process
We conduct annual reviews of our data handling and de-identification procedures with privacy counsel. We update our methodology documentation when our practices change.