- Type: Task
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: ADDONS_9.10
- Component/s: Clustering
- Epic Link:
Context
We consider two Nuxeo clusters deployed in two different regions.
For latency reasons, we cannot spread a single cluster across the two regions, so we want to replicate the data between an Active and a Passive cluster.
Principles
We want to leverage the Kafka integration to manage the replication between the 2 Nuxeo clusters:
- put all info/data/operations to be replicated in Kafka
  - nuxeo-stream or Kafka Connect
- replicate Kafka topics between data centers
  - Kafka MirrorMaker
- replay / reintegrate data on the passive cluster
  - Kafka Connect
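The three steps above (publish, mirror, replay) can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `Topic`, `mirror`, `replay`, the topic names, and the record shape (`op`, `id`, `doc`) are all hypothetical and do not correspond to any nuxeo-stream, Kafka Connect, or MirrorMaker API. A plain Python list stands in for a Kafka topic.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Topic:
    """Hypothetical stand-in for a Kafka topic: an append-only log of records."""
    name: str
    records: List[Dict[str, Any]] = field(default_factory=list)

    def append(self, record: Dict[str, Any]) -> int:
        self.records.append(record)
        return len(self.records) - 1  # offset of the record just appended


def mirror(source: Topic, target: Topic, from_offset: int = 0) -> int:
    """Copy records from one data center's topic to the other's.

    Plays the role MirrorMaker has in the list above.
    Returns the next source offset to mirror from.
    """
    for record in source.records[from_offset:]:
        target.append(dict(record))
    return len(source.records)


def replay(topic: Topic, repository: Dict[str, Dict[str, Any]], from_offset: int = 0) -> int:
    """Apply replicated operations, in log order, to the passive repository.

    Returns the next offset to replay from.
    """
    for record in topic.records[from_offset:]:
        if record["op"] == "upsert":
            repository[record["id"]] = record["doc"]
        elif record["op"] == "delete":
            repository.pop(record["id"], None)
    return len(topic.records)


# Step 1: the active cluster publishes every operation to replicate.
active_topic = Topic("replication-active")
active_topic.append({"op": "upsert", "id": "doc-1", "doc": {"title": "Draft"}})
active_topic.append({"op": "upsert", "id": "doc-2", "doc": {"title": "Notes"}})
active_topic.append({"op": "upsert", "id": "doc-1", "doc": {"title": "Final"}})
active_topic.append({"op": "delete", "id": "doc-2"})

# Step 2: the topic is mirrored to the passive data center.
passive_topic = Topic("replication-passive")
mirrored_up_to = mirror(active_topic, passive_topic)

# Step 3: the passive cluster replays the mirrored log.
passive_repository: Dict[str, Dict[str, Any]] = {}
replayed_up_to = replay(passive_topic, passive_repository)
```

Because the log preserves order, replaying it on the passive side yields the same final state as on the active side. Tracking the returned offsets lets mirroring and replay resume incrementally instead of starting from the beginning each time.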
The high-level architecture diagram looks like this: [diagram not included]
Data to replicate
There are several types of data we want to replicate:
- Document Repository
  - here we focus on the DBS implementation on MongoDB
- Audit data
  - MongoDB and ES data
- Indexes and Sequences
  - MongoDB and ES data
- Work
  - we focus on the StreamWorkManager and Kafka data
- Blobs
The blob part is likely to depend on the deployment infrastructure:
- AWS S3 has built-in replication
- SAN / NAS infrastructure in multi-DC deployments usually includes replication
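The inventory above can be summarized as a small lookup table. This is a sketch only: `ReplicatedData`, `PLAN`, and the `via_kafka` flag are hypothetical names. Marking everything except blobs as replicated through Kafka is an assumption drawn from the principles in this ticket, not a confirmed design.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ReplicatedData:
    """One kind of data to replicate (hypothetical structure, for illustration)."""
    name: str
    stores: Tuple[str, ...]   # backing stores named in the ticket
    via_kafka: bool           # assumption: True if replicated through Kafka topics


PLAN: Tuple[ReplicatedData, ...] = (
    ReplicatedData("Document Repository", ("MongoDB",), True),
    ReplicatedData("Audit data", ("MongoDB", "Elasticsearch"), True),
    ReplicatedData("Indexes and Sequences", ("MongoDB", "Elasticsearch"), True),
    ReplicatedData("Work", ("Kafka",), True),
    # Blobs are assumed to rely on infrastructure-level replication
    # (for example S3 or SAN/NAS) rather than on Kafka.
    ReplicatedData("Blobs", ("S3 or SAN/NAS",), False),
)


def names_where(plan: Tuple[ReplicatedData, ...], via_kafka: bool) -> List[str]:
    """Return the names of the data kinds with the given replication path."""
    return [item.name for item in plan if item.via_kafka == via_kafka]


kafka_replicated = names_where(PLAN, via_kafka=True)
infra_replicated = names_where(PLAN, via_kafka=False)
```

Keeping the split explicit makes it easy to see which data kinds need a Kafka-based pipeline and which can be delegated to the storage layer.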