Data Methodology

The data stack — documented, not asserted

Healthcare procurement teams evaluate data vendors differently than general marketers do. Before they can approve a vendor, they need to understand its signal sources, its de-identification method, and the PHI protections that are actually in place. This page covers all three.

Signal Sources

Three categories of de-identified signals

Salubrum's scoring model draws from three signal categories, each de-identified before entering the processing pipeline.

Behavioral Signals

  • De-identified web activity patterns
  • Content consumption at condition level
  • Search-adjacent behavioral indicators
  • Session-depth engagement metrics
  • Condition research journey patterns

Contextual Signals

  • Page-level health content classification
  • Publisher network contextual targeting
  • Condition-specific content verticals
  • Treatment evaluation content patterns
  • Provider comparison research signals

De-identified Claims Data

  • Condition-level utilization patterns
  • Longitudinal care-seeking journey data
  • Treatment cycle timing signals
  • Geographic utilization benchmarks
  • Specialty referral pattern modeling

Scoring Model

How signals become intent scores

Our model normalizes signals across the three source categories, ranks condition-level intent probabilities, and reports each ranking as a decile score.
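To make "normalize, then rank into deciles" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Salubrum's production model: the z-score normalization, the per-source weights, the function names, and the convention that decile 10 means highest intent are all hypothetical choices made for this example.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize one source's raw values to mean 0 and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant signal carries no ranking information.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def decile_scores(signals_by_source, weights):
    """Combine per-source signals and return a decile (1..10) per segment.

    signals_by_source: {source_name: [one value per de-identified segment]}
    weights:           {source_name: weight}  (hypothetical, chosen by the caller)
    Assumption: decile 10 is the highest-intent group.
    """
    sources = list(signals_by_source)
    n = len(signals_by_source[sources[0]])

    # Step 1: put every source on the same scale.
    normalized = {s: zscores(signals_by_source[s]) for s in sources}

    # Step 2: weighted sum per segment.
    combined = [
        sum(weights[s] * normalized[s][i] for s in sources) for i in range(n)
    ]

    # Step 3: rank segments from lowest to highest, then bucket into tenths.
    order = sorted(range(n), key=lambda i: combined[i])
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles
```

Normalizing before combining matters because the three source categories arrive on different scales; without it, whichever source has the largest raw numbers would dominate the ranking.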

Figure: Salubrum data methodology, showing the signal taxonomy and the de-identified scoring model architecture

De-identification and Privacy

HIPAA-aware — not HIPAA-certified, because that certification doesn't exist

Salubrum's data pipeline applies HIPAA Safe Harbor de-identification as a structural requirement — not an add-on compliance layer. No PHI enters the scoring model at any point. If a vendor tells you they're "HIPAA certified," that claim has no regulatory basis. The correct question is whether their de-identification meets the HIPAA Safe Harbor standard.

Safe Harbor De-identification

All data is de-identified per the HIPAA Safe Harbor method (45 CFR 164.514(b)(2)), which requires removing or generalizing 18 categories of identifiers before processing. This is a structural requirement, not a toggle.
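As a rough illustration of what "removing or generalizing" means in practice, the sketch below applies a few of the Safe Harbor rules to a single record. It is not Salubrum's pipeline and covers only a handful of the 18 identifier categories. The field names, the `safe_harbor_generalize` function, and the contents of `LOW_POPULATION_ZIP3` are assumptions made for this example; the service date is assumed to be an ISO 8601 string.

```python
# Hypothetical field names for identifiers that are dropped outright.
DIRECT_IDENTIFIERS = {
    "name", "email", "phone", "ssn",
    "street_address", "ip_address", "device_id",
}

# Safe Harbor allows the first three ZIP digits only when that area holds
# more than 20,000 people; otherwise the value is reported as "000".
# Illustrative placeholders only: derive the real set from current Census data.
LOW_POPULATION_ZIP3 = {"036", "059"}


def safe_harbor_generalize(record, low_population_zip3=LOW_POPULATION_ZIP3):
    """Return a copy of `record` with a subset of Safe Harbor rules applied."""
    # Remove direct identifiers entirely.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Generalize ZIP code to its three-digit prefix (or "000").
    if "zip" in out:
        zip3 = str(out.pop("zip"))[:3]
        out["zip3"] = "000" if zip3 in low_population_zip3 else zip3

    # Keep only the year of any date tied to the individual.
    if "service_date" in out:
        out["service_year"] = int(str(out.pop("service_date"))[:4])

    # Ages over 89 collapse into a single "90 or older" category.
    if "age" in out:
        age = out.pop("age")
        out["age_band"] = "90+" if age >= 90 else str(age)

    return out
```

For example, a record with `zip` "94110", `service_date` "2023-04-17", and `age` 93 would come out with `zip3` "941", `service_year` 2023, and `age_band` "90+", and its name and email fields would be gone.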

No PHI Storage or Output

Salubrum does not store, process, or output protected health information at any point in the data pipeline. Our audience segments are probabilistic aggregates, not individual health records.

Annual Review Process

We conduct annual reviews of our data handling and de-identification procedures with privacy counsel. We update our methodology documentation when our practices change.
