[NXP-17180] Fix cluster invalidation management - Nuxeo Issue Tracker

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 6.0
Fix Version/s: 6.0-HF13, 7.3
Component/s: Clustering, Core VCS

Impact type:

Configuration Change
Upgrade notes:

When enabling clustering, it's now highly recommended to set an explicit cluster node id in the configuration using

repository.clustering.id=12345

(the id can be a string on Oracle, but must be an integer on other databases)

Furthermore, if a cluster node crashes, the cluster_nodes and cluster_invals tables must be cleaned by hand, as Nuxeo has no way of knowing which nodes are currently connected.

If a cluster node id is not configured in nuxeo.conf through the repository.clustering.id property, at startup the following WARN will be emitted:

WARN [ClusterNodeHandler] Missing cluster node id configuration, please define it explicitly (usually through repository.clustering.id). Using random cluster node id instead: 26938

The cluster node id should be defined explicitly through repository.clustering.id because this way it is stable, known to the server admin, and has no risk of collision with a randomly-generated one. On the other hand, the server admin has to be careful to set a different value on each cluster node instance.
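The constraints above (id set explicitly, integer unless the database is Oracle, different on every node) can be checked before startup. The following is only a sketch, not part of Nuxeo: the function names, the `oracle` flag, and the idea of passing each node's nuxeo.conf content as a string are all assumptions made for illustration.

```python
import re

# Matches an uncommented "repository.clustering.id=<value>" line in nuxeo.conf text.
_ID_LINE = re.compile(r"^\s*repository\.clustering\.id\s*=\s*(\S+)\s*$", re.MULTILINE)


def read_cluster_node_id(conf_text):
    """Return the configured cluster node id, or None if it is not set."""
    matches = _ID_LINE.findall(conf_text)
    # Assumption: if the property appears several times, the last one wins.
    return matches[-1] if matches else None


def check_cluster_ids(confs, oracle=False):
    """Return a list of problems found across {node_name: nuxeo.conf text}."""
    problems = []
    seen = {}  # id -> first node that uses it
    for node, text in sorted(confs.items()):
        node_id = read_cluster_node_id(text)
        if node_id is None:
            problems.append(f"{node}: repository.clustering.id is not set")
            continue
        # Per the upgrade notes, only Oracle accepts a string id.
        if not oracle and not re.fullmatch(r"[0-9]+", node_id):
            problems.append(f"{node}: id {node_id!r} must be an integer")
        if node_id in seen:
            problems.append(f"{node}: id {node_id!r} is already used by {seen[node_id]}")
        else:
            seen[node_id] = node
    return problems
```

An empty list means every node has its own explicit id; anything else is worth fixing before the nodes are started.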

Description

Cluster invalidation propagation is broken because several database backends rely on the connection id of the cluster node handler (pid/sid depending on the database) staying constant, so that it can act as a stable identifier while the Nuxeo instance is up. This is no longer the case now that connections systematically go through a pool (NXP-14142).

This applies to all databases that support clustering (PostgreSQL, MySQL, SQL Server) except Oracle, which uses a different cluster node id strategy.

The user-visible symptom of this bug is that changes made on one Nuxeo instance sometimes take a long time (minutes or hours) to become visible on one or more other nodes.
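The failure mode can be illustrated with a toy model. This is not Nuxeo code: `FakeConnection`, `FakePool` and the round-robin borrowing policy are hypothetical stand-ins, chosen only to show that an id derived from "whichever pooled connection this call got" is not constant, while a configured id is.

```python
import itertools


class FakeConnection:
    """Stand-in for a database connection; backend_id plays the role of pid/sid."""

    _ids = itertools.count(1000)

    def __init__(self):
        self.backend_id = next(FakeConnection._ids)


class FakePool:
    """Hypothetical pool that hands out its connections in round-robin order."""

    def __init__(self, size):
        self._cycle = itertools.cycle([FakeConnection() for _ in range(size)])

    def borrow(self):
        return next(self._cycle)


def node_ids_from_connection(pool, calls):
    """Distinct node ids seen when the id is derived from the borrowed connection."""
    return {pool.borrow().backend_id for _ in range(calls)}


def node_ids_from_config(configured_id, calls):
    """Distinct node ids seen when the id comes from configuration."""
    return {configured_id for _ in range(calls)}
```

With a pool of four connections, `node_ids_from_connection` yields four different ids for what is really a single node, whereas `node_ids_from_config` always yields exactly one. Under this model, a node that identifies itself differently from one call to the next cannot reliably find the invalidations queued for it, which is consistent with the delays described above.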

Issue Links

depends on

NXP-14142 replace connection proxies logic in single datasource mode

Resolved

People

Assignee:

Florent Guillaume

Reporter:

Florent Guillaume

Participants:

Florent Guillaume, Jenkins

Votes:

0

Watchers:

3

Dates

Created:

2015-06-01 12:32

Updated:

2015-06-10 03:41

Resolved:

2015-06-01 21:10