  1. Nuxeo Platform
  2. NXP-22597

Support Avro serialization for Nuxeo Log/Stream

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.2
    • Component/s: Streams
    • Release Notes Description:
      Nuxeo Stream can now encode records with different codecs:

      • legacy: the original format, based on Java Externalizable
      • avro: Avro message with a schema fingerprint header (Nuxeo provides an Avro SchemaStore service to retrieve schemas)
      • avroBinary: Avro message without the schema header, and therefore more compact
      • avroJson: Avro encoded as JSON, for debugging purposes only

      You can choose the encoding for each service using nuxeo.conf options:

      nuxeo.stream.work.log.codec=legacy
      nuxeo.stream.audit.log.codec=legacy
      nuxeo.stream.pubsub.log.codec=avroBinary
      

      Note that you should not change the codec of an existing stream (Kafka topic or Chronicle file); set it only on a new stream.
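
As an illustration of the options above, here is a minimal sketch that reads the codec configured for a service from nuxeo.conf-style properties and rejects unknown values. The class and its helpers (`CodecConfigSketch`, `parse`, `codecFor`) are hypothetical and not part of Nuxeo, and falling back to `legacy` when an option is missing is an assumption of this sketch, not something the ticket states.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.Set;

public class CodecConfigSketch {

    // Codec names listed in the release notes.
    public static final Set<String> KNOWN_CODECS =
            Set.of("legacy", "avro", "avroBinary", "avroJson");

    // Hypothetical helper: parses nuxeo.conf-style "key=value" text.
    public static Properties parse(String text) {
        Properties conf = new Properties();
        try {
            conf.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return conf;
    }

    // Hypothetical helper: returns the codec for a service such as
    // "work", "audit" or "pubsub". Defaulting to "legacy" is an assumption.
    public static String codecFor(Properties conf, String service) {
        String key = "nuxeo.stream." + service + ".log.codec";
        String codec = conf.getProperty(key, "legacy").trim();
        if (!KNOWN_CODECS.contains(codec)) {
            throw new IllegalArgumentException(
                    "Unknown codec '" + codec + "' for " + key);
        }
        return codec;
    }

    public static void main(String[] args) {
        Properties conf = parse(
                "nuxeo.stream.work.log.codec=legacy\n"
                + "nuxeo.stream.pubsub.log.codec=avroBinary\n");
        System.out.println("work   -> " + codecFor(conf, "work"));
        System.out.println("pubsub -> " + codecFor(conf, "pubsub"));
        System.out.println("audit  -> " + codecFor(conf, "audit")); // not set, falls back
    }
}
```

Validating the value up front matters here because, per the note above, a codec should be chosen before a stream is created rather than corrected afterwards.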

    • Sprint:
      nxcore 10.1.1, nxcore 10.1.2, nxcore 10.1.4, nxcore 10.1.5, nxcore 10.2.2, nxcore 10.2.1, nxcore 10.2.6, nxcore 10.2.7
    • Story Points:
      3

      Description

      Log relies on Externalizable record serialization. This is much better than Serializable, but it is still slow, and a lot of extra data (such as the class name and the serialVersionUID) is written for each object.

      Avro should be more compact and maintainable. The gain for small records (such as invalidations or consumer offsets) can be huge.
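
To make the size argument concrete, the sketch below serializes one small record twice: once through Java's `ObjectOutputStream` using `Externalizable`, and once with a hand-written encoder that follows the Avro specification's binary rules for `long` (zig-zag varint) and `string` (length prefix plus UTF-8 bytes). It does not use the Avro library or Nuxeo's codecs. `OffsetRecord` and its fields are hypothetical stand-ins for a small record such as a consumer offset.

```java
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class SerializationSizeSketch {

    // Hypothetical small record, standing in for something like a consumer offset.
    public static class OffsetRecord implements Externalizable {
        private static final long serialVersionUID = 1L;
        String group;
        long offset;

        public OffsetRecord() { } // public no-arg constructor required by Externalizable

        public OffsetRecord(String group, long offset) {
            this.group = group;
            this.offset = offset;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(group);
            out.writeLong(offset);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            group = in.readUTF();
            offset = in.readLong();
        }
    }

    // Java serialization: the stream carries a header, the class name and the
    // serialVersionUID in addition to the payload written by writeExternal.
    public static byte[] javaSerialized(OffsetRecord record) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(record);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Zig-zag varint, the rule the Avro specification uses for int and long.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Hand-written, Avro-style binary body: fields in schema order, no field
    // names, no class metadata. A reader needs the schema to decode it.
    public static byte[] avroLikeBinary(OffsetRecord record) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] utf8 = record.group.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);      // string length prefix
        out.write(utf8, 0, utf8.length);  // string bytes
        writeLong(out, record.offset);    // long field
        return out.toByteArray();
    }

    public static void main(String[] args) {
        OffsetRecord record = new OffsetRecord("group-1", 42L);
        System.out.println("Externalizable: " + javaSerialized(record).length + " bytes");
        // 1 (length) + 7 (UTF-8) + 1 (varint) = 9 bytes for this record
        System.out.println("Avro-style:     " + avroLikeBinary(record).length + " bytes");
    }
}
```

The compact form is only decodable by a reader that knows the schema, which is the trade-off behind the schema fingerprint header of the avro codec versus the header-less avroBinary codec.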

      A first step is to see how to use it at the record level.

      Avro also comes with tools to read it, so it will be easier to dump a Chronicle queue, for instance.

      This will also make nuxeo-stream interoperable with non-Java processors.

              People

              • Assignee: Benoit Delbosc (bdelbosc)
              • Reporter: Benoit Delbosc (bdelbosc)
              • Reviewers: Florent Guillaume, Pierre Gautier
              • Votes: 0
              • Watchers: 2

                  Time Tracking

                  • Estimated: 0m
                  • Remaining: 0m
                  • Logged: 2w 1d 3h