# Integration Pattern: MDM Hub-and-Spoke with Operational Writeback and Analytical Replication
## Pattern Name
**MDM Hub-and-Spoke with Operational Writeback and Analytical Replication**

_Also known as: Consolidation-Coexistence MDM Pattern, Golden Record Publish Pattern_
---
## Intent
Ingest entity records from multiple operational source systems into a central MDM hub, and produce a reconciled golden record through match/merge and survivorship. Publish that master record back to operational consumers via an integration broker while replicating it to an analytical data warehouse, establishing a single authoritative source of truth for key business entities across the enterprise.
---
## Motivation
In enterprises with multiple operational systems (Billing, CRM, ERP, etc.), the same real-world entity — a customer, organization, or product — exists as independent, often inconsistent records in each system. No single system owns the complete, trusted picture. This pattern centralizes identity resolution and data quality in a dedicated MDM platform, then propagates the authoritative record outward so that every consumer — operational or analytical — works from the same master.
---
## Applicability
Use this pattern when:
- The same entity (customer, organization, product) exists in multiple source systems with overlapping but inconsistent representations
- Downstream operational systems need to consume a trusted, deduplicated master record
- Analytics requires a reliable master entity layer for cross-system joins and reporting
- The organization has distinct integration needs for operational writeback (low-latency, API-driven) vs. analytical replication (batch, bulk)
- Data stewardship workflows are required to manage match exceptions and override survivorship decisions
---
## Participants
|Participant|Role|Technology (Example Implementation)|
|---|---|---|
|Source Systems|Contribute raw entity records to MDM hub|Billing system, CRM (e.g., Salesforce)|
|Inbound Integration Layer|Extract and stage source records into MDM|Informatica PowerCenter / IDMC pipelines|
|MDM Hub|Match, merge, survivorship, golden record production|Informatica MDM Hub (deployed on the Enterworks platform)|
|Message Broker|Decouples golden record publish events from downstream consumers|RabbitMQ (hosted on Enterworks platform)|
|Outbound Integration — Operational|Subscribes to golden record events, maps fields, writes back to source systems|Dell Boomi (iPaaS / ESB)|
|Outbound Replication — Analytical|Replicate golden record tables to warehouse|Fivetran|
|Analytical Data Warehouse|Consume and join on master entity records|Snowflake|
|Data Stewardship UI|Human review of match exceptions and merges|Informatica MDM Data Director (or equivalent)|
---
## Structure
```
Billing System CRM System
│ │
│ Batch feed / │ Batch feed /
│ API extract │ API extract
▼ ▼
Informatica Pipelines (Inbound)
│
│ Staged contributor records
▼
┌─────────────────────────────────────┐
│        MDM Hub (Enterworks)         │
│ │
│ Source Records (per system) │
│ │ │
│ ▼ │
│ Match Engine │
│ (probabilistic / deterministic) │
│ │ │
│ ▼ │
│ Merge & Survivorship │
│ (attribute-level source priority) │
│ │ │
│ ▼ │
│ Golden Record (BVT) │
└───────────────┬─────────────────────┘
│
┌───────┴──────────────┐
│ │
▼ ▼
RabbitMQ Fivetran
(on Enterworks) (Analytical
Exchange / Queue Replication)
│ │
│ Event subscribe ▼
▼ Snowflake
Dell Boomi (Master Entity
(Operational Layer)
Writeback)
│
┌────┴────┐
▼ ▼
Billing CRM
System System
```
---
## Flow
### Inbound — Source Contribution
1. **Extract** — Informatica pipelines pull entity records from Billing and CRM on a scheduled basis (or event-triggered for near-real-time coexistence)
2. **Stage** — Records land in MDM Hub staging tables as **source contributor records**, tagged by source system and load timestamp; raw source values preserved
3. **Tokenize / pre-process** — Standardization routines applied (name parsing, address normalization, phone formatting) to improve match accuracy
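The standardization step can be sketched in Python. The field names and abbreviation table below are illustrative only; a production hub would delegate to dedicated address-verification and name-parsing services rather than regex rules:

```python
import re

def standardize_contact(record: dict) -> dict:
    """Normalize name, phone, and address fields before matching.

    A minimal sketch: field names and the abbreviation table are
    illustrative, not a real MDM standardization configuration.
    """
    out = dict(record)
    # Uppercase the name and collapse runs of whitespace
    out["name"] = re.sub(r"\s+", " ", record.get("name", "")).strip().upper()
    # Keep digits only for the phone number
    out["phone"] = re.sub(r"\D", "", record.get("phone", ""))
    # Expand a few common street abbreviations
    abbrevs = {"ST": "STREET", "AVE": "AVENUE", "RD": "ROAD"}
    words = record.get("address", "").upper().split()
    out["address"] = " ".join(
        abbrevs.get(w.rstrip("."), w.rstrip(".")) for w in words
    )
    return out
```

Running the same routine over every contributor record means the match engine compares like with like ("(555) 123-4567" and "555.123.4567" both become "5551234567"), which is what lifts match accuracy.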
### Match & Merge
4. **Match** — MDM Hub runs configured match rules (deterministic on key identifiers, probabilistic on name/address/contact attributes) to identify candidate duplicate pairs across and within source systems
5. **Review** — High-confidence matches auto-merge; low-confidence pairs routed to data stewardship queue for human review via MDM Data Director UI
6. **Merge** — Matched records consolidated into a single **golden record (Best Version of Truth)**
7. **Survivorship** — Attribute-level survivorship rules determine which source system's value populates each golden record field (e.g., CRM wins on contact name, Billing wins on account number and payment terms)
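The survivorship step can be sketched as a per-attribute source-priority lookup. The attribute names and priority order below are illustrative examples (mirroring the CRM-wins-on-name, Billing-wins-on-account example above), not a real configuration:

```python
# Attribute-level survivorship: for each golden-record field, take the value
# from the highest-priority source that actually supplied one.
# Source names and priorities are illustrative, not a real rule set.
SURVIVORSHIP = {
    "contact_name":   ["CRM", "BILLING"],   # CRM wins on contact name
    "account_number": ["BILLING", "CRM"],   # Billing wins on account number
    "payment_terms":  ["BILLING", "CRM"],   # Billing wins on payment terms
}

def build_golden_record(contributors: dict) -> dict:
    """contributors maps source system name -> its contributor record."""
    golden = {}
    for attr, priority in SURVIVORSHIP.items():
        for source in priority:
            value = contributors.get(source, {}).get(attr)
            if value:  # first non-empty value in priority order survives
                golden[attr] = value
                break
    return golden
```

Note the fall-through: if the preferred source left an attribute blank, the next source in the priority list supplies it, so the golden record is more complete than any single contributor.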
### Outbound — Operational Writeback (RabbitMQ → Boomi)
8. **Publish event** — MDM Hub publishes a golden record create/update event to a RabbitMQ exchange hosted on the Enterworks platform; MDM fires and forgets, with no direct coupling to downstream consumers
9. **Queue routing** — RabbitMQ routes the event to the appropriate queue(s); exchange/binding configuration determines which queues receive which event types (e.g., customer create vs. update vs. merge)
10. **Boomi consumption** — Dell Boomi listens as a queue subscriber, consuming events as they arrive; applies per-target field mapping and transformation, and writes updated master records back to Billing and CRM via their respective APIs or file interfaces
11. **Cross-reference sync** — MDM Hub's cross-reference table (source key ↔ golden record ID) updated to maintain bidirectional traceability
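The exchange/binding routing in steps 8-9 can be illustrated with an in-memory stand-in for a topic exchange. Queue names and routing keys here are hypothetical, pattern matching uses Python's `fnmatch` as a rough approximation of RabbitMQ's `*`/`#` wildcard semantics, and a real publisher would use a client library such as `pika` against the broker:

```python
import fnmatch

class TopicExchange:
    """In-memory stand-in for a RabbitMQ topic exchange (illustrative only)."""

    def __init__(self):
        self.bindings = []  # list of (routing-key pattern, queue name)
        self.queues = {}    # queue name -> list of delivered messages

    def bind(self, pattern: str, queue: str):
        self.bindings.append((pattern, queue))
        self.queues.setdefault(queue, [])

    def publish(self, routing_key: str, message: dict):
        # Deliver to every queue whose binding pattern matches the key;
        # the publisher never knows (or cares) who is listening.
        for pattern, queue in self.bindings:
            if fnmatch.fnmatch(routing_key, pattern):
                self.queues[queue].append(message)

ex = TopicExchange()
ex.bind("customer.*", "boomi.writeback")        # Boomi sees all customer events
ex.bind("customer.merge", "stewardship.audit")  # audit queue sees merges only
ex.publish("customer.update", {"golden_id": "G-1"})
ex.publish("customer.merge", {"golden_id": "G-2"})
```

This is the property step 8 relies on: MDM publishes once per event, and the binding topology (not MDM code) decides which consumers receive it.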
### Outbound — Analytical Replication (Fivetran)
12. **Fivetran sync** — Fivetran replicates MDM Hub golden record tables to Snowflake on a scheduled interval (typically daily or near-real-time depending on connector config)
13. **Snowflake master layer** — Golden records land in a dedicated master entity schema in Snowflake, serving as a trusted join key for cross-system analytical queries (e.g., unifying Billing revenue with CRM activity at the customer level)
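Landing golden records with effective date ranges (SCD Type 2, as recommended in the implementation notes) keeps history queryable in the warehouse. A minimal sketch, with an illustrative row layout:

```python
from datetime import date

def apply_golden_update(history: list, golden_id: str, attrs: dict, as_of: date):
    """SCD Type 2 upsert sketch: each change closes the current row and
    opens a new one with an effective date range. Row layout is illustrative,
    not a real warehouse schema."""
    for row in history:
        if row["golden_id"] == golden_id and row["valid_to"] is None:
            if row["attrs"] == attrs:
                return  # no change; leave the current version open
            row["valid_to"] = as_of  # close the superseded version
    history.append({"golden_id": golden_id, "attrs": attrs,
                    "valid_from": as_of, "valid_to": None})

history = []
apply_golden_update(history, "G-1", {"name": "ACME"}, date(2024, 1, 1))
apply_golden_update(history, "G-1", {"name": "ACME CORP"}, date(2024, 6, 1))
```

Analysts can then ask "what did this customer's golden record look like on a given date" by filtering on the `valid_from`/`valid_to` range, rather than only seeing the latest state.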
---
## Tradeoffs
### Advantages
- **Event-driven decoupling** — RabbitMQ between MDM and Boomi means MDM fires and forgets; Boomi consumers can be added, removed, or restarted without touching MDM publish logic
- **Single source of truth** — All consumers — operational and analytical — work from the same reconciled master entity
- **Decoupled identity resolution** — Match/merge logic lives in MDM, not scattered across individual application integrations
- **Bidirectional traceability** — Cross-reference table preserves the link between golden record and every contributing source record
- **Separation of integration concerns** — Boomi handles operational writeback (transformation, routing, API complexity); Fivetran handles bulk analytical replication — each tool doing what it's optimized for
- **Stewardship workflow** — Human oversight of ambiguous matches prevents silent data quality degradation
### Limitations
- **Operational complexity** — Four integration layers (Informatica inbound, RabbitMQ broker, Boomi outbound, Fivetran replication) means four failure surfaces to monitor and maintain
- **Latency in coexistence** — Source systems receive updated master records on a delay; real-time coexistence requires additional architectural investment (event streaming, API-first MDM)
- **Survivorship rigidity** — Static attribute-priority rules can produce counterintuitive golden records when source system data quality shifts; rules require ongoing tuning
- **Match rule fragility** — Probabilistic matching requires calibration; poorly tuned rules generate false merges (undoing them — "unmerge" — is costly) or excessive stewardship queue volume
- **Snowflake lag** — Fivetran replication cadence means golden records in Snowflake may trail the operational MDM state; not suitable for real-time analytical use cases
- **Hub dependency** — All cross-system entity resolution flows through MDM; hub unavailability blocks inbound updates and outbound publishes
---
## MDM Style Classification
This implementation combines two MDM styles:
|Style|Description|Applied Here|
|---|---|---|
|**Consolidation**|Source systems remain authoritative; MDM produces a read-only golden record for analytics|Fivetran → Snowflake path|
|**Coexistence**|MDM golden record is published back to source systems, which may adopt it selectively|Boomi → Billing / CRM path|
A full **Registry** or **Transactional** MDM style would require source systems to route all creates/updates through MDM first — a higher maturity commitment not present in this pattern.
---
## Related Patterns
- **Batch API-to-Warehouse ELT** — Simpler unidirectional pattern without identity resolution; often feeds raw source data that MDM then reconciles
- **Event-Driven MDM Publish** — Streaming alternative to batch Boomi writeback, using Kafka or similar for lower-latency golden record propagation
- **Canonical Data Model Integration** — Complementary pattern; defines the shared entity schema that MDM golden records and Boomi mappings conform to
- **Reverse ETL** — Related concept for pushing warehouse-derived insights back to operational systems; distinct from MDM writeback in that it carries analytical outputs, not master records
---
## Implementation Notes
- Maintain a **cross-reference table** (source system + source key ↔ golden record ID) as a first-class artifact — it is the connective tissue of the entire pattern
- Version golden records with an effective date range to support slowly changing dimension (SCD) semantics in Snowflake
- Instrument Boomi flows with dead-letter queues and alerting — silent writeback failures will cause source systems to drift from MDM truth without anyone noticing
- Monitor RabbitMQ queue depth as an operational health signal — a growing queue means Boomi consumers are falling behind or have stopped processing
- Design RabbitMQ exchange/binding topology to support multiple independent consumers per event type — this allows future consumers (additional target systems) to be added without modifying MDM publish logic
- Design survivorship rules explicitly per attribute, per source — document them; they encode business decisions that will otherwise be rediscovered painfully during audits
- Establish a **data stewardship SLA** for the match review queue — an unworked queue is a backlog of unresolved duplicates accumulating in production
- In Snowflake, expose golden records as views that join the master entity table with the cross-reference table — downstream analysts can then trace any golden record back to its contributing source keys
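The cross-reference table called out above can be sketched as a small bidirectional index. Identifiers are illustrative, and a real hub persists this relationally rather than in memory:

```python
class CrossReference:
    """Sketch of the MDM cross-reference: (source system, source key) <-> golden ID."""

    def __init__(self):
        self._by_source = {}   # (system, key) -> golden_id
        self._by_golden = {}   # golden_id -> set of (system, key)

    def link(self, system: str, key: str, golden_id: str):
        self._by_source[(system, key)] = golden_id
        self._by_golden.setdefault(golden_id, set()).add((system, key))

    def golden_for(self, system: str, key: str):
        """Operational lookup: which golden record does this source row feed?"""
        return self._by_source.get((system, key))

    def sources_for(self, golden_id: str):
        """Traceability lookup: which source rows contributed to this golden record?"""
        return sorted(self._by_golden.get(golden_id, set()))

xref = CrossReference()
xref.link("BILLING", "B-42", "G-1")      # keys are hypothetical examples
xref.link("CRM", "0031x000", "G-1")
```

Both lookup directions matter: `golden_for` drives writeback routing, while `sources_for` is what lets an analyst or auditor trace a golden record back to every contributing source key.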