- Type: Sub-task
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: 5.9.3
- Component/s: Elasticsearch, Query & PageProvider
- Sprint: Sprint 1 (5.9.3)
The goal is to index Nuxeo Documents in Elasticsearch.
This task covers "on the fly" indexing only: full reindexing will be handled in another task.
Indexing Strategy
We need to (re)index a Document each time it is created, modified, or moved.
transactions handling
Elasticsearch is not transactional, so we must handle error scenarios.
For that, we need to:
- write to Elasticsearch only after the Nuxeo Repository commit
- use a persistent indexing job that can be restarted until the Elasticsearch update succeeds
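The two requirements above can be sketched in plain Java. This is only an illustration under stated assumptions: `DeferredIndexer`, its method names, and the `Consumer<String>` standing in for the Elasticsearch write are hypothetical and are not Nuxeo or Elasticsearch APIs. The buffer here is in memory, whereas the real job would have to be persisted so it can be restarted.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical sketch: buffers indexing requests until the repository commit. */
class DeferredIndexer {
    // Documents flagged during the transaction, in flagging order, without duplicates.
    private final Set<String> pending = new LinkedHashSet<>();
    // Stand-in for the actual Elasticsearch write (assumption, not a real client call).
    private final Consumer<String> indexWriter;
    private final int maxAttempts;

    DeferredIndexer(Consumer<String> indexWriter, int maxAttempts) {
        this.indexWriter = indexWriter;
        this.maxAttempts = maxAttempts;
    }

    /** Called while the transaction is still open: nothing is written yet. */
    void markForIndexing(String docId) {
        pending.add(docId);
    }

    /** The repository rolled back, so the index must not see these changes. */
    void onRollback() {
        pending.clear();
    }

    /**
     * The repository committed: write each document, retrying on failure.
     * Returns the ids that still failed so a persistent job can reschedule them.
     */
    List<String> onCommit() {
        List<String> failed = new ArrayList<>();
        for (String docId : pending) {
            if (!writeWithRetry(docId)) {
                failed.add(docId);
            }
        }
        pending.clear();
        return failed;
    }

    private boolean writeWithRetry(String docId) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                indexWriter.accept(docId);
                return true;
            } catch (RuntimeException e) {
                // Swallow and retry; a real job would also back off and log.
            }
        }
        return false;
    }
}
```

Returning the failed ids rather than throwing keeps the commit path safe: the repository transaction is already committed at this point, so the only sensible reaction to an index failure is to reschedule.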
listeners
We may need to use each type of listener available in the Nuxeo platform.
inline synchronous listener
Catch update events and flag the Document for indexing or recursive indexing.
This listener can also flag whether the Document must be indexed synchronously or asynchronously.
Having this listener is not strictly required, but it can be seen as an optimization:
- can compute "what has been changed"
- can prepare the work for the next listener
- can define strategy for async/sync indexing
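As a rough illustration of the flagging step, the sketch below maps an event to an indexing command. Everything in it is an assumption made for the example: the `IndexingCommand` and `IndexingFlagger` classes are hypothetical, the event names are assumed to match the platform's document event names, and the rules "moving a folder triggers recursive indexing" and "recursive indexing is always asynchronous" are one possible strategy, not a decision taken in this task.

```java
import java.util.Optional;

/** Hypothetical value object describing what the next listener has to do. */
final class IndexingCommand {
    final String docId;
    final boolean recursive;
    final boolean sync;

    IndexingCommand(String docId, boolean recursive, boolean sync) {
        this.docId = docId;
        this.recursive = recursive;
        this.sync = sync;
    }
}

/** Hypothetical inline step: decides what to index and how, without indexing anything. */
final class IndexingFlagger {
    private IndexingFlagger() {
    }

    static Optional<IndexingCommand> flag(String eventName, String docId,
            boolean folderish, boolean userInteraction) {
        switch (eventName) {
            case "documentCreated":
            case "documentModified":
                // Only the document itself changed.
                return Optional.of(new IndexingCommand(docId, false, userInteraction));
            case "documentMoved":
                // Assumption: moving a folder changes the path of its descendants,
                // so they need reindexing too, and that is too costly to do inline.
                boolean recursive = folderish;
                return Optional.of(
                        new IndexingCommand(docId, recursive, userInteraction && !recursive));
            default:
                // Events that do not affect the index are ignored.
                return Optional.empty();
        }
    }
}
```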
post-commit sync listener
In some cases, we want the Elasticsearch index to be updated synchronously, typically when the indexing request comes from a direct user interaction.
- we cannot do it in a sync listener: this would not be efficient and would not handle the rollback case
- we cannot make it fully async: users would not understand why their change is not immediately visible
async listener / worker
The async worker should be persistent and handle:
- simple async indexing tasks
- recursive reindexing tasks
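The difference between the two kinds of task can be sketched as follows. This is a minimal in-memory illustration: `RecursiveIndexingWorker` is a hypothetical name, the map of children stands in for a repository query, and recording ids in a list stands in for the Elasticsearch write. It does not cover persistence, which the real worker needs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a worker handling simple and recursive indexing tasks. */
final class RecursiveIndexingWorker {
    // Stand-in for a repository lookup of a document's children (assumption).
    private final Map<String, List<String>> childrenByParent;
    // Stand-in for the index: records ids in the order they were indexed.
    private final List<String> indexed = new ArrayList<>();

    RecursiveIndexingWorker(Map<String, List<String>> childrenByParent) {
        this.childrenByParent = childrenByParent;
    }

    /** Indexes rootId, and its whole subtree (breadth first) when recursive is true. */
    void run(String rootId, boolean recursive) {
        Deque<String> toVisit = new ArrayDeque<>();
        toVisit.addLast(rootId);
        while (!toVisit.isEmpty()) {
            String docId = toVisit.pollFirst();
            indexed.add(docId);
            if (recursive) {
                List<String> children =
                        childrenByParent.getOrDefault(docId, Collections.<String>emptyList());
                for (String child : children) {
                    toVisit.addLast(child);
                }
            }
        }
    }

    List<String> indexed() {
        return indexed;
    }
}
```

An explicit queue is used instead of method recursion so that a deep folder hierarchy cannot overflow the stack, and so that the remaining work is a plain list that could be saved and resumed.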
Questions
Should we keep the PostCommitSyncListener? Since we already have a Transaction Synchronizer, we could do the sync job in that context and avoid the listener.