- Type: Task
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: None
- Component/s: Core MongoDB
- Sprint: nxFG 11.1.13
- Story Points: 3
For Nuxeo MongoDB databases created before NXP-27654, it has been observed that some documents were created with identical document ids (ecm:id in MongoDB).
We need diagnostic scripts to detect and resolve these duplicates, and to add the unique index once that is possible.
The first script, mongodb-create-unique-index.js, must be run first. It will tell you one of the following:
- that everything is ok (the unique index on ecm:id is already present).
- that there was no unique index, but that the script created it successfully.
- that there are duplicate ids that must be further resolved by the second script.
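The three outcomes above can be sketched as follows. This is only an illustration of the decision logic, not the actual mongodb-create-unique-index.js: the function name ensureUniqueIndex and its return strings are hypothetical, and coll is assumed to behave like a mongo shell collection object (getIndexes, aggregate, dropIndex, createIndex).

```javascript
// Hypothetical sketch, not the shipped script.
// `coll` is assumed to be a mongo shell collection, e.g. db.getCollection("default").
function ensureUniqueIndex(coll) {
  var KEY = "ecm:id";

  // Look for an existing single-field index on ecm:id.
  var existing = coll.getIndexes().filter(function (ix) {
    var fields = Object.keys(ix.key);
    return fields.length === 1 && fields[0] === KEY;
  })[0];
  if (existing && existing.unique) {
    return "already-present";
  }

  // Scan for at least one ecm:id value shared by several documents.
  var dups = coll.aggregate([
    { $group: { _id: "$" + KEY, count: { $sum: 1 } } },
    { $match: { count: { $gt: 1 } } },
    { $limit: 1 }
  ], { allowDiskUse: true }).toArray();
  if (dups.length > 0) {
    // dups[0]._id is the first duplicated ecm:id; the index cannot be created yet.
    return "duplicates-found";
  }

  // No duplicates: replace any non-unique index with a unique one.
  if (existing) {
    coll.dropIndex(existing.name);
  }
  var spec = {};
  spec[KEY] = 1;
  coll.createIndex(spec, { unique: true });
  return "index-created";
}
```

The scan runs before any index is dropped, so a collection that still contains duplicates is left untouched.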
The second script, mongodb-find-duplicate-keys.js, finds duplicate keys and removes the duplicates that are safe to remove (documents that are otherwise strictly identical). For the remaining duplicates, it displays by default the full documents sharing a key, so that they can be removed manually after choosing the relevant one. If this output is too verbose, it can be disabled by editing the script to set var SHOW_UNRESOLVED = false.
Note that this second script is in "dry run" mode by default, which can be used for diagnostics. It must be edited to set var DRY_RUN = false if actual changes are to be made.
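To illustrate what "safe to remove" means here, the following is a minimal sketch of the classification step, working on an in-memory array of documents. It is not the actual mongodb-find-duplicate-keys.js: the names sameExceptId and planDuplicateRemoval are hypothetical, and treating a group as unresolved as soon as any one document differs is an assumption about the script's behaviour.

```javascript
// Hypothetical sketch, not the shipped script.

// Compare two documents while ignoring the MongoDB-generated _id.
// JSON.stringify is sensitive to key order, so documents holding the same
// fields in a different order are treated as different (a strict comparison).
function sameExceptId(a, b) {
  function strip(doc) {
    var copy = Object.assign({}, doc);
    delete copy._id;
    return JSON.stringify(copy);
  }
  return strip(a) === strip(b);
}

// Group documents by ecm:id and decide what could be removed safely.
// Returns the _id values to delete and the ecm:id values needing manual action.
function planDuplicateRemoval(docs) {
  var groups = {};
  docs.forEach(function (doc) {
    var id = doc["ecm:id"];
    (groups[id] = groups[id] || []).push(doc);
  });

  var plan = { toRemove: [], unresolved: [] };
  Object.keys(groups).forEach(function (id) {
    var group = groups[id];
    if (group.length < 2) {
      return; // not a duplicate
    }
    var first = group[0];
    var allIdentical = group.every(function (doc) {
      return sameExceptId(first, doc);
    });
    if (allIdentical) {
      // Keep the first copy, mark the others for removal.
      group.slice(1).forEach(function (doc) {
        plan.toRemove.push(doc._id);
      });
    } else {
      // Assumption: any difference leaves the whole group for manual review.
      plan.unresolved.push(id);
    }
  });
  return plan;
}
```

In dry run mode only the plan would be reported. With DRY_RUN = false, the documents listed in toRemove would actually be deleted.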
Example output for the first script:
Using nuxeo.default
Unique index on ecm:id already present

Using nuxeo.default
Starting scan for duplicate ids...
Dropping previous index on ecm:id... Done
Creating unique index on ecm:id... Done

Using nuxeo.default
Starting scan for duplicate ids...
Collection has duplicates, the first one is ecm:id = 76fc611b-120b-454c-91f4-a3aaed5189b2
Unique index not created
Example output for the second script. Note that only documents with unresolved duplicate ids are shown, not all the duplicate ids:
DRY RUN no modifications will be done
Using nuxeo.default
Collection has 10 documents
Starting scan for duplicate ids...
Showing unresolved duplicate ids
ecm:id = 76fc611b-120b-454c-91f4-a3aaed5189b2
{ "_id" : ObjectId("5e83a12e6ff3932209b7a086"), "ecm:id" : "76fc611b-120b-454c-91f4-a3aaed5189b2", "foo" : { "bar" : "baz" } }
{ "_id" : ObjectId("5e83a12e6ff3932209b7a089"), "ecm:id" : "76fc611b-120b-454c-91f4-a3aaed5189b2", "foo" : { "bar" : "moo" } }
Collection has 3 duplicate ids
Collection has 10 documents impacted by duplicates
Collection has 6 identical documents that were removed
Collection has 2 resolved duplicate ids
Collection has 1 unresolved duplicate ids
Once this script reports "Collection has 0 unresolved duplicate ids", the database is in a state where the first script can be run again and will be able to create the unique index.
- is related to: NXP-27654 Add a unique index on ecm:id in MongoDB (Resolved)