Uploaded image for project: 'Nuxeo Platform'
  1. Nuxeo Platform
  2. NXP-32811

Scroll on PostgreSQL doesn't use a cursor based ResultSet

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2025.x, 2021.58, 2023.16
    • Component/s: Core VCS

      Description

      To be able to retrieve large result set from SQL backend we need a cursor based ResultSet, this approach allows data retrieval in chunks, preventing the loading of all rows into memory at once.

       To do this we need to follow these requirements as explain in the PostgreSQL documentation:

      • the connection must be using the V3 protocol (this is the default >= pg 7.4 and seen in stack trace org.postgresql.core.v3)
      • the connection must not be in autocommit mode (Nuxeo raises an exception when in autocommit mode)
      • the statement must be TYPE_FORWARD_ONLY (this flag is set by Nuxeo)

      Nuxeo is doing the following:

      PreparedStatement ps = connection.prepareStatement(q.selectInfo.sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY, ResultSet.HOLD_CURSORS_OVER_COMMIT);
      ps.setFetchSize(batchSize);
      

      Unfortunately it seems that the first executeQuery call is returning all rows (tuples) at the PostgreSQL driver level,
      even if the call returns only the expected fetch size portion to Nuxeo.

      bulk/scrollerPool-00,in:0,inCheckpoint:0,out:0,lastRead:1721667132192,lastTimer:0,wm:0,loop:55110,rebalance assigned
        at org.postgresql.core.PGStream.receiveTupleV3()Lorg/postgresql/core/Tuple; (PGStream.java:619)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(Lorg/postgresql/core/ResultHandler;IZ)V (QueryExecutorImpl.java:2379)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(Lorg/postgresql/core/Query;Lorg/postgresql/core/ParameterList;Lorg/postgresql/core/ResultHandler;IIIZ)V (QueryExecutorImpl.java:371)
        at org.postgresql.jdbc.PgStatement.executeInternal(Lorg/postgresql/core/CachedQuery;Lorg/postgresql/core/ParameterList;I)V (PgStatement.java:502)
        at org.postgresql.jdbc.PgStatement.execute(Lorg/postgresql/core/CachedQuery;Lorg/postgresql/core/ParameterList;I)V (PgStatement.java:419)
        at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(I)Z (PgPreparedStatement.java:194)
        at org.postgresql.jdbc.PgPreparedStatement.executeQuery()Ljava/sql/ResultSet; (PgPreparedStatement.java:137)
        at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery()Ljava/sql/ResultSet; (DelegatingPreparedStatement.java:122)
        at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery()Ljava/sql/ResultSet; (DelegatingPreparedStatement.java:122)
        at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCMapper.scrollSearch(Ljava/lang/String;Lorg/nuxeo/ecm/core/query/QueryFilter;II)Lorg/nuxeo/ecm/core/api/ScrollResult; (JDBCMapper.java:656)
        at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCMapper.scroll(Ljava/lang/String;II)Lorg/nuxeo/ecm/core/api/ScrollResult; (JDBCMapper.java:612)
        at org.nuxeo.ecm.core.storage.sql.SoftRefCachingMapper.scroll(Ljava/lang/String;II)Lorg/nuxeo/ecm/core/api/ScrollResult; (SoftRefCachingMapper.java:53)
        at org.nuxeo.ecm.core.storage.sql.SessionImpl.scroll(Ljava/lang/String;II)Lorg/nuxeo/ecm/core/api/ScrollResult; (SessionImpl.java:658)
        at org.nuxeo.ecm.core.storage.sql.coremodel.SQLSession.scroll(Ljava/lang/String;II)Lorg/nuxeo/ecm/core/api/ScrollResult; (SQLSession.java:150)
        at org.nuxeo.ecm.core.api.AbstractSession.scroll(Ljava/lang/String;II)Lorg/nuxeo/ecm/core/api/ScrollResult; (AbstractSession.java:1434)
        at org.nuxeo.ecm.core.scroll.RepositoryScroll.fetch()Z (RepositoryScroll.java:88)
        at org.nuxeo.ecm.core.scroll.RepositoryScroll.hasNext()Z (RepositoryScroll.java:81)
        at org.nuxeo.ecm.core.bulk.computation.BulkScrollerComputation.processRecord(Lorg/nuxeo/lib/stream/computation/ComputationContext;Lorg/nuxeo/lib/stream/computation/Record;)V (BulkScrollerComputation.java:189)

      then the code iterate on the cursor by page of fetchSize rows, but there is no more db roundtrip.

      This can lead to OOM (12GB of tuples for 115m of rows).

      It looks like a regression since benchmark of 1b of docs have been successful on PostgreSQL years ago.

        Attachments

          Issue Links

            Activity

              People

              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: