Uploaded image for project: 'Nuxeo Platform'
  1. Nuxeo Platform
  2. NXP-17276

Optimize task-related SQL queries

    XMLWordPrintable

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.0-HF15, 7.4
    • Component/s: Core

      Description

      A number of task-related queries on multi-valued properties use the old-style NXQL syntax like nt:actors = 'foo' (which uses an EXISTS at the SQL level) instead of the more efficient nt:actors/* = 'foo' (which uses a JOIN).

      These queries should be converted to the more efficient new style.

      Original report:

      Many NXQL queries are translated to SQL queries using keyword EXISTS.
      In big databases it can work slow, and it depends on planners.
      It is recomended to use JOIN instead.
      For instance in postgres, the "user_tasks" page providers is translated to:

      SELECT "hierarchy"."id" AS "_C1", "hierarchy"."primarytype", "hierarchy"."mixintypes", ARRAY['Task']::varchar[] <@ "hierarchy"."mixintypes"
      FROM "hierarchy" 
           left join "misc" "_F1" ON "hierarchy"."id" = "_F1"."id" 
      WHERE (
        ( "hierarchy"."primarytype" IN ('SectionRoot', 'Workspace', 'SocialSection', 'Msds', 'TemplateSource', 'RelationSearch', (...) ) ) 
        AND 
        ( "_F1"."lifecyclestate" NOT IN ('ended', 'cancelled') ) 
        AND 
        ( "hierarchy"."primarytype" IN ('TaskDoc') OR (ARRAY['Task']::varchar[] <@ "hierarchy"."mixintypes") ) 
        AND 
        (
           (EXISTS (SELECT 1 FROM "nt_actors" "_F2" WHERE "hierarchy"."id" = "_F2"."id" AND "_F2"."item" IN ('smith','user:smith','ABC','group:ABC','CDE','group:CDE','administrators','group:administrators','members','group:members',(...) ))) 
           OR 
           (EXISTS (SELECT 1 FROM "nt_delegatedactors" "_F3" WHERE "hierarchy"."id" = "_F3"."id" AND "_F3"."item" IN ('smith','user:smith','ABC','group:ABC','CDE','group:CDE','administrators','group:administrators','members','group:members',(...) )))
        )
      )	 
      

      The timing for this query is showed here http://explain.depesz.com/s/uEiL
      and it takes about 10 seconds.

      EXISTS can be easly converted to JOIN:

      SELECT "hierarchy"."id" AS "_C1", "hierarchy"."primarytype", "hierarchy"."mixintypes", ARRAY['Task']::varchar[] <@ "hierarchy"."mixintypes"
      FROM "hierarchy" 
           left join "misc" "_F1" ON "hierarchy"."id" = "_F1"."id" 
           LEFT JOIN "nt_actors" "_F2" ON "hierarchy"."id" = "_F2"."id"  
           LEFT JOIN "nt_delegatedactors" "_F3" ON "hierarchy"."id" = "_F3"."id"
      WHERE (
        ( "hierarchy"."primarytype" IN ('SectionRoot', 'Workspace', 'SocialSection', 'Msds', 'TemplateSource', 'RelationSearch', (...) ) ) 
        AND 
        ( "_F1"."lifecyclestate" NOT IN ('ended', 'cancelled') ) 
        AND 
        ( "hierarchy"."primarytype" IN ('TaskDoc') OR (ARRAY['Task']::varchar[] <@ "hierarchy"."mixintypes") ) 
        AND 
        (
           ("_F2"."item" IN ('smith','user:smith','ABC','group:ABC','CDE','group:CDE','administrators','group:administrators','members','group:members',(...) ) 
           OR 
           ("_F3"."item" IN ('smith','user:smith','ABC','group:ABC','CDE','group:CDE','administrators','group:administrators','members','group:members',(...) )
        )
      )	 
      

      In this case, the query takes 500ms.
      http://explain.depesz.com/s/7iA

      The problem is related to databases planners.

        Attachments

          Issue Links

            Activity

              People

              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: