- Type: Epic
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: ADDONS_9.10, 10.1
- Component/s: Audit
- Tags:
Reasons for a Double Audit Backend
We currently have 2 Audit storage backends:
- SQL
  - storage is ACID
  - query options are limited
  - heavy writes can hammer the database
- ES
  - distributed / eventually consistent storage
  - lots of query options
The ES backend is now the default storage, so that we get better queries and better analytics.
However, storing Audit in ES implies some significant constraints at the ES level:
- ES must have a full and consistent backup
- ES Audit index cannot be rebuilt
These constraints are an issue for deployments where ES storage is regarded as "disposable" because "it can be rebuilt".
Double Backend
The idea behind the double Audit backend is to leverage 2 different storages for different usages:
- store in SQL for backup and consistency reasons
- store in ES for fast search
In order not to be tied to a specific repository backend, we can also leverage the Nuxeo Directory abstraction, so that we can use either a SQL or a NoSQL database.
Since this first storage is not built for queries or for speed, we can use a very simple Directory schema, storing the Audit entry payload as a JSON string:
- this is fast for storage
- this is fast for retrieval / reindexing
- we do not need to do any query on payload at this level (ES is here for that)
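As a minimal sketch of such a schema (written in Python only for brevity; the `DirectoryAuditRow` type and the field names in the sample entry are illustrative assumptions, not the actual Nuxeo Directory schema), a row only needs an id and the serialized payload:

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryAuditRow:
    """Hypothetical Directory row: an id plus an opaque JSON payload."""
    entry_id: int
    payload: str  # the whole Audit entry serialized as a JSON string


def to_row(entry_id: int, entry: dict) -> DirectoryAuditRow:
    # Serialize once at write time; sort_keys keeps the output stable.
    return DirectoryAuditRow(entry_id, json.dumps(entry, sort_keys=True))


def from_row(row: DirectoryAuditRow) -> dict:
    # Deserialize only when reindexing; the payload is never queried in place.
    return json.loads(row.payload)


# Sample entry with assumed field names, for illustration only.
entry = {"eventId": "documentCreated", "docUUID": "abc-123", "principalName": "jdoe"}
row = to_row(1, entry)
restored = from_row(row)
```

Because the payload is opaque at this level, adding a field to the Audit entry would not require any change to the Directory schema.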
The double backend must manage writes carefully. Since there is no such thing as an XA (two-phase commit) transaction spanning a Nuxeo Directory and ES, we must emulate the safest option:
- write first to the Directory and commit
- then write to ES
This way:
- we are sure that Audit log is safe, at least at the Directory level
- we can always rebuild the ES index to fix consistency
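The write ordering above can be sketched as follows. This is a sketch under stated assumptions: `InMemoryDirectory` and `FakeIndex` are hypothetical in-memory stand-ins, not Nuxeo or Elasticsearch APIs, and swallowing the ES failure (rather than retrying or raising) is one possible policy, not something this ticket prescribes.

```python
class InMemoryDirectory:
    """Hypothetical stand-in for the durable Directory storage."""

    def __init__(self):
        self.committed = {}
        self._pending = {}

    def create(self, entry_id, payload):
        self._pending[entry_id] = payload

    def commit(self):
        # Only committed rows are considered safe.
        self.committed.update(self._pending)
        self._pending.clear()


class FakeIndex:
    """Hypothetical stand-in for the ES index; can simulate an outage."""

    def __init__(self, fail=False):
        self.docs = {}
        self.fail = fail

    def index(self, entry_id, payload):
        if self.fail:
            raise ConnectionError("index unavailable")
        self.docs[entry_id] = payload


def log_entry(directory, index, entry_id, payload):
    """Write to the Directory first, commit, and only then write to the index.

    Returns True if the index is in sync for this entry, False if the index
    write failed and a later rebuild is needed.
    """
    directory.create(entry_id, payload)
    directory.commit()  # the entry is durable from this point on
    try:
        index.index(entry_id, payload)
        return True
    except ConnectionError:
        # Assumed policy: the entry is already safe, so only report the gap.
        return False
```

With this ordering, a failure of the second step can only leave the index behind the Directory, never the other way round, which is the inconsistency a rebuild can repair.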
Reindexing
We need an operation that can start the Directory => ES rebuild.
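A rough sketch of what such a rebuild could do, assuming the Directory rows are exposed as an id-to-JSON-string mapping and the index as an id-to-document mapping (both are hypothetical in-memory stand-ins, and the batch size is an arbitrary illustrative value):

```python
import json
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def rebuild_index(directory_rows, index, batch_size=2):
    """Drop the index content and refill it from the Directory rows.

    directory_rows: dict mapping entry id to JSON payload (source of truth).
    index: dict mapping entry id to parsed document (stand-in for ES).
    Returns the number of entries indexed.
    """
    index.clear()
    count = 0
    for chunk in batched(sorted(directory_rows.items()), batch_size):
        # One chunk stands in for one bulk indexing request.
        for entry_id, payload in chunk:
            index[entry_id] = json.loads(payload)
            count += 1
    return count
```

Working in batches keeps memory bounded and maps naturally onto bulk indexing requests.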
Integration and Migration
This should be integrated inside nuxeo-elasticsearch-audit.
Ideally, we would like this to become the default setting, but for that we need to be able to handle the migration, which is actually the reverse operation of Reindexing.
This migration could be started automatically and asynchronously:
- activate the new Double backend
- dump ES => SQL
- then flush ES index and do SQL => ES
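The migration steps above could look roughly like this. The sketch uses plain dictionaries as hypothetical stand-ins for the ES index and the Directory; keeping rows already present in the Directory (written by the double backend after activation) instead of overwriting them is an assumption, not a requirement stated in this ticket.

```python
import json


def migrate(index, directory):
    """Dump the index into the Directory, then rebuild the index from it.

    index: dict mapping entry id to document (stand-in for ES).
    directory: dict mapping entry id to JSON payload (stand-in for SQL).
    Returns the number of entries in the rebuilt index.
    """
    # Step 1: dump ES => Directory. Rows already written by the double
    # backend are kept as-is (assumed policy).
    for entry_id, doc in index.items():
        directory.setdefault(entry_id, json.dumps(doc, sort_keys=True))

    # Step 2: flush the index.
    index.clear()

    # Step 3: Directory => ES, the same direction as a regular rebuild.
    for entry_id, payload in directory.items():
        index[entry_id] = json.loads(payload)
    return len(index)
```

Since the double backend is activated first, entries logged while the migration runs land in the Directory and are picked up by the final rebuild.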