Uploaded image for project: 'Nuxeo Platform'
  1. Nuxeo Platform
  2. NXP-17198

speedup postgres function

    XMLWordPrintable

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Postponed
    • Component/s: Core

      Description

      The current code of the getInTreeSql method in the DialectPostgreSQL class is very slow.
      For instance in my database analysis looks like this:

      Sort (cost=8677.93..8677.96 rows=10 width=59) (actual time=11759.726..11759.823 rows=2098 loops=1)
      Sort Key: "_F2".title
      Sort Method: quicksort Memory: 336kB
      -> Nested Loop (cost=2604.52..8677.77 rows=10 width=59) (actual time=1043.033..11752.453 rows=2098 loops=1)
      -> Nested Loop (cost=2604.52..8675.90 rows=1 width=92) (actual time=1042.989..11739.154 rows=747 loops=1)
      -> Nested Loop Left Join (cost=2604.52..8667.06 rows=1 width=133) (actual time=1042.959..11725.676 rows=747 loops=1)
      -> Nested Loop (cost=2604.52..8661.97 rows=1 width=111) (actual time=1042.922..11712.643 rows=747 loops=1)
      -> Nested Loop (cost=2604.52..8636.52 rows=3 width=74) (actual time=1042.889..11696.640 rows=859 loops=1)
      -> HashAggregate (cost=2604.52..2611.35 rows=683 width=37) (actual time=1040.523..1313.283 rows=684007 loops=1)
      -> Bitmap Heap Scan on ancestors (cost=141.29..2602.82 rows=683 width=37) (actual time=211.411..487.830 rows=684007 loops=1)
      Recheck Cond: ('

      Unknown macro: {3e7799f9-1eb9-46d8-945c-25f2f96f40fe}

      '::character varying[] <@ ancestors)
      -> Bitmap Index Scan on ancestors_ancestors_idx (cost=0.00..141.12 rows=683 width=0) (actual time=196.167..196.167 rows=684007 loops=1)
      Index Cond: ('

      '::character varying[] <@ ancestors)
      -> Index Scan using hierarchy_pk on hierarchy (cost=0.00..8.81 rows=1 width=37) (actual time=0.015..0.015 rows=0 loops=684007)
      Index Cond: ((id)::text = (ancestors.id)::text)
      Filter: ((primarytype)::text = ANY ('

      Unknown macro: {SocialDomain,Domain,Workspace,SocialWorkspace,Department}

      '::text[]))
      -> Index Scan using misc_pk on misc "_F1" (cost=0.00..8.47 rows=1 width=37) (actual time=0.017..0.017 rows=1 loops=859)
      Index Cond: ((id)::text = (hierarchy.id)::text)
      Filter: ((lifecyclestate)::text <> 'deleted'::text)
      -> Index Scan using dublincore_pk on dublincore "_F2" (cost=0.00..5.07 rows=1 width=59) (actual time=0.016..0.016 rows=1 loops=747)
      Index Cond: ((hierarchy.id)::text = (id)::text)
      -> Index Scan using hierarchy_read_acl_id_idx on hierarchy_read_acl "_RACL" (cost=0.00..8.83 rows=1 width=70) (actual time=0.016..0.017 rows=1 loops=747)
      Index Cond: ((id)::text = (hierarchy.id)::text)
      -> Index Scan using aclr_user_map_acl_id_idx on aclr_user_map "_ACLRUSERMAP" (cost=0.00..1.24 rows=50 width=33) (actual time=0.013..0.016 rows=3 loops=747)
      Index Cond: ((acl_id)::text = ("_RACL".acl_id)::text)
      Total runtime: 11760.063 ms

      It can be improved by relacing in the getInTreeSql method the following line:
      "EXISTS(SELECT 1 FROM ancestors WHERE id = %s AND ARRAY[?]%s <@ ancestors)"
      by
      "EXISTS(SELECT 1 FROM ancestors WHERE id = %s AND ARRAY[?]%s <@ (VALUES(ancestors)))"

      The difference of the execution time is impressive:

      Sort (cost=62628.16..62628.19 rows=10 width=59) (actual time=418.301..418.446 rows=2098 loops=1)
      Sort Key: "_F2".title
      Sort Method: quicksort Memory: 336kB
      -> Nested Loop (cost=25433.32..62628.00 rows=10 width=59) (actual time=4.026..406.497 rows=2098 loops=1)
      -> Nested Loop (cost=25433.32..62626.13 rows=1 width=92) (actual time=3.749..386.964 rows=747 loops=1)
      -> Nested Loop Left Join (cost=25433.32..62617.29 rows=1 width=133) (actual time=3.713..367.067 rows=747 loops=1)
      -> Nested Loop Semi Join (cost=25433.32..62612.21 rows=1 width=111) (actual time=3.686..347.317 rows=747 loops=1)
      -> Hash Join (cost=25433.32..45773.65 rows=2206 width=74) (actual time=3.632..323.081 rows=747 loops=1)
      Hash Cond: (("_F1".id)::text = (hierarchy.id)::text)
      -> Seq Scan on misc "_F1" (cost=0.00..16331.58 rows=455622 width=37) (actual time=0.022..220.485 rows=455162 loops=1)
      Filter: ((lifecyclestate)::text <> 'deleted'::text)
      -> Hash (cost=25294.36..25294.36 rows=11117 width=37) (actual time=3.596..3.596 rows=859 loops=1)
      Buckets: 2048 Batches: 1 Memory Usage: 58kB
      -> Bitmap Heap Scan on hierarchy (cost=227.66..25294.36 rows=11117 width=37) (actual time=0.486..3.117 rows=859 loops=1)
      Recheck Cond: ((primarytype)::text = ANY ('

      Unknown macro: {SocialDomain,Domain,Workspace,SocialWorkspace,Department}

      '::text[]))
      -> Bitmap Index Scan on hierarchy_primarytype_idx (cost=0.00..224.88 rows=11117 width=0) (actual time=0.379..0.379 rows=859 loops=1)
      Index Cond: ((primarytype)::text = ANY ('

      '::text[]))
      -> Index Scan using ancestors_id_idx on ancestors (cost=0.00..7.63 rows=1 width=37) (actual time=0.031..0.031 rows=1 loops=747)
      Index Cond: ((id)::text = (hierarchy.id)::text)
      Filter: ('

      Unknown macro: {3e7799f9-1eb9-46d8-945c-25f2f96f40fe}

      '::character varying[] <@ (SubPlan 1))
      SubPlan 1
      -> Values Scan on "VALUES" (cost=0.00..0.01 rows=1 width=32) (actual time=0.001..0.001 rows=1 loops=747)
      -> Index Scan using dublincore_pk on dublincore "_F2" (cost=0.00..5.07 rows=1 width=59) (actual time=0.025..0.025 rows=1 loops=747)
      Index Cond: ((hierarchy.id)::text = (id)::text)
      -> Index Scan using hierarchy_read_acl_id_idx on hierarchy_read_acl "_RACL" (cost=0.00..8.83 rows=1 width=70) (actual time=0.025..0.025 rows=1 loops=747)
      Index Cond: ((id)::text = (hierarchy.id)::text)
      -> Index Scan using aclr_user_map_acl_id_idx on aclr_user_map "_ACLRUSERMAP" (cost=0.00..1.24 rows=50 width=33) (actual time=0.019..0.025 rows=3 loops=747)
      Index Cond: ((acl_id)::text = ("_RACL".acl_id)::text)
      Total runtime: 418.818 ms

      This optimization is based on: https://wwwji.datadoghq.com/2013/08/100x-faster-postgres-performance-by-changing-1-line/

        Attachments

          Activity

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated: