  1. Nuxeo Platform
  2. NXP-30000

Provide a Mongo script to correct non-existing referenced proxies

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.10
    • Fix Version/s: ADDONS_10.10
    • Component/s: Core MongoDB

      Description

      What
      Due to concurrency issues (see NXP-30001), the ecm:proxyIds property of a document can still list the ids of proxies that have since been deleted.
      The request is to provide a Mongo script that corrects the data so that the database is consistent again.

      How to reproduce

      • Take a machine with many CPU cores (e.g. at least 12)
      • Start a MongoDB instance and allocate several cores to it (e.g. 8) to maximize the chances of concurrent operations:
          docker run --name mongo36 --cpus 8 -v /<DIRECTORY>/<PATH>/d/mongodb-data:/data/db -p 27017:27017 -d mongo:3.6
          
      • Start a hotfixed Nuxeo instance with WebUI, JSF, DAM and SHOWCASE-content installed, using MongoDB as its database
      • Create 50 sections
      • Publish the PDF /default-domain/workspaces/Sample%20Content/PDF%20and%20Office%20Documents/Sample%20PDF%20File.pdf in the 50 sections
        This creates 50 proxies that all target a single document
      • Run the following command to delete the 50 proxies as fast and as concurrently as possible
          # list the ids of all proxies (jq -r prints raw strings, so no quote stripping is needed)
          curl -u Administrator:Administrator -X GET "http://<NUXEO>:8080/nuxeo/api/v1/search/lang/NXQL/execute?query=SELECT%20*%20FROM%20Document%20WHERE%20ecm%3AisProxy%3D1" -H "properties: common" -H "Content-Type: application/json" -H "Accept: application/json" | jq -r '.entries[].uid' | while read -r i ; do
                # send each DELETE in the background so that the requests run concurrently
                curl -u Administrator:Administrator -X DELETE "http://<NUXEO>:8080/nuxeo/api/v1/id/$i" &
          done
          
      • You will get many ConcurrentUpdateException errors, and only a few proxies will actually be deleted
      • Retrieve the ID of the original version document that the proxies target (here 20b85394-168c-4cc8-8ba3-395b53879937), then:
        • First attempt to delete it:
          	 curl -u Administrator:Administrator -X DELETE "http://<NUXEO>:8080/nuxeo/api/v1/id/20b85394-168c-4cc8-8ba3-395b53879937"
          	 

          you will get a 403 because a proxy is still recorded as targeting the document (this is the MAIN consequence of the issue to solve)

        • Then look at the target document in MongoDB:
               db=db.getSiblingDB("nuxeo")
               coll=db.getCollection("default")
               coll2 = coll.find({"ecm:id":"20b85394-168c-4cc8-8ba3-395b53879937"}).toArray()
          	 

          Once the output is adapted to be "jq-compatible", it looks like the attached file mongo_output_for_20b85394-168c-4cc8-8ba3-395b53879937

        • When analyzing this file with the script:
          # jq -r prints one raw proxy id per line, which avoids leftover quotes from sed
          jq -r '."ecm:proxyIds"[]' mongo_output_for_20b85394-168c-4cc8-8ba3-395b53879937 | while read -r i ; do
                  if [ -n "$i" ]; then
                          printf 'Proxy Id : %s ' "$i"
                          # print the REST response only when the proxy does not exist (404)
                          curl -s -u Administrator:Administrator -X GET "http://<NUXEO>:8080/nuxeo/api/v1/id/$i" | grep "status.:404"
                          echo
                  fi
          done
          

          you will see that some referenced proxy ids do not match any existing proxy, e.g.

          	 [...]
          	 Proxy Id : a44ede3f-7bdc-46e1-8663-4d7e7182915b
                   Proxy Id : f2d0c454-79e7-4a6a-8481-e68d59c946a3 {"entity-type":"exception","status":404,"message":"f2d0c454-79e7-4a6a-8481-e68d59c946a3"}
          	 [...]
               

          Here the first proxy exists and the second does not.

      Consequence
      As at least one proxy is still referenced in the target document, the target document cannot be deleted.
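The reproduction steps mention adapting the mongo shell output to be "jq-compatible" without showing how. One possible way, given here only as a sketch, is to rewrite the shell-only literals into plain JSON values. The function name shellOutputToJson is hypothetical, and the set of literals handled (ObjectId, ISODate, NumberLong) is an assumption about what the collection contains.

```javascript
// Sketch under assumptions: converts mongo shell output into text that
// JSON parsers such as jq accept. Only three shell literals are handled.
function shellOutputToJson(text) {
  return text
    // ObjectId("5f...") becomes the plain string "5f..."
    .replace(/ObjectId\("([0-9a-fA-F]{24})"\)/g, '"$1"')
    // ISODate("2020-01-01T00:00:00Z") becomes the plain string
    .replace(/ISODate\("([^"]*)"\)/g, '"$1"')
    // NumberLong(5) or NumberLong("5") becomes the bare number 5
    .replace(/NumberLong\("?(-?\d+)"?\)/g, "$1");
}
```

Because find(...).toArray() returns an array, the converted text describes an array, so the document itself is its first element. String values that happen to contain one of these literals would also be rewritten, which is acceptable for a one-off inspection but not for anything more.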

      REQUEST
      Provide a script able to restore consistency in the database:

      • retrieve all documents having proxies referenced
      • for each, check that every referenced proxy exists and, where a referenced proxy no longer exists, fix the data
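The two steps of the request can be sketched as a mongo shell script. This is a minimal sketch under assumptions, not the script delivered for this issue: it assumes a referenced proxy "exists" when some document in the same collection has that ecm:id, and that removing the dangling ids from ecm:proxyIds is a sufficient fix. The names danglingProxyIds and dryRun are hypothetical, and the code sticks to ES5 so that an older shell can run it.

```javascript
// Return the ids listed in doc["ecm:proxyIds"] that are absent from
// existingIds, a plain object used as a set: { "<id>": true }.
function danglingProxyIds(doc, existingIds) {
  var ids = doc["ecm:proxyIds"] || [];
  return ids.filter(function (id) {
    return !Object.prototype.hasOwnProperty.call(existingIds, id);
  });
}

// This part only runs inside the mongo shell, where the global `db` exists.
if (typeof db !== "undefined") {
  var dryRun = true; // assumption: report first, set to false to write
  var coll = db.getSiblingDB("nuxeo").getCollection("default");

  // Step 1: retrieve all documents that reference proxies.
  coll.find({ "ecm:proxyIds": { $exists: true } }).forEach(function (doc) {
    var referenced = doc["ecm:proxyIds"] || [];

    // Step 2: find out which of the referenced ids still exist.
    var existing = {};
    coll.find({ "ecm:id": { $in: referenced } }, { "ecm:id": 1 })
      .forEach(function (p) { existing[p["ecm:id"]] = true; });

    var dangling = danglingProxyIds(doc, existing);
    if (dangling.length === 0) { return; }

    print(doc["ecm:id"] + " references missing proxies: " + dangling.join(", "));
    if (!dryRun) {
      // Remove only the dangling ids and keep the valid ones.
      coll.updateOne({ _id: doc._id },
                     { $pull: { "ecm:proxyIds": { $in: dangling } } });
    }
  });
}
```

Keeping the comparison in a pure function lets it be checked outside the shell, and the dry-run default means a first pass only prints what would change. Whether other fields need to be updated along with ecm:proxyIds is not covered by this sketch.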


              People

              • Assignee: Patrick Abgrall (pabgrall)
              • Reporter: Patrick Abgrall (pabgrall)

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0m
                  Logged: 2d
