  1. Nuxeo Platform
  2. NXP-18200

Handle DocumentModel Serialization in a consistent way

    Details

    • Type: Task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: QualifiedToSchedule
    • Component/s: Core, Core IO

      Description

      ShallowDocumentModel

      Use case

      The idea behind ShallowDocumentModel (and ShallowEvent) is to avoid keeping in memory, or serializing, Document-related data:

      • that may use too much memory
      • that may be out of date when the Document Model is actually used

      The ShallowDocumentModel approach only works if all data inside the DocumentModel has been flushed, so it may not work in all cases.

      Problem

      The problem is that the ShallowDocumentModel still contains the ContextMap, and this ContextMap may contain non-serializable objects.

      Basically, the ShallowDocumentModel should be built recursively, i.e. any DocumentModel stored in the ContextMap should itself be replaced with a ShallowDocumentModel.
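
      A minimal sketch of that recursive replacement, not the actual Nuxeo code. `Doc` and `ShallowDoc` are hypothetical stand-ins for DocumentModel and ShallowDocumentModel, and dropping non-serializable context values is an assumption made here for illustration. Documents that reference each other through their context maps are handled by remembering what was already converted.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class ShallowContextSketch {

    /** Hypothetical stand-in for a full DocumentModel with its context map. */
    static class Doc {
        final String id;
        final String path;
        final Map<String, Object> context = new HashMap<>();

        Doc(String id, String path) {
            this.id = id;
            this.path = path;
        }
    }

    /** Hypothetical stand-in for ShallowDocumentModel: identity data only. */
    static class ShallowDoc implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;
        final String path;
        final HashMap<String, Serializable> context = new HashMap<>();

        ShallowDoc(String id, String path) {
            this.id = id;
            this.path = path;
        }
    }

    /** Builds a shallow copy whose context map contains no full documents. */
    static ShallowDoc shallow(Doc doc) {
        return shallow(doc, new IdentityHashMap<>());
    }

    private static ShallowDoc shallow(Doc doc, Map<Doc, ShallowDoc> seen) {
        ShallowDoc existing = seen.get(doc);
        if (existing != null) {
            return existing; // already converted: breaks reference cycles
        }
        ShallowDoc result = new ShallowDoc(doc.id, doc.path);
        seen.put(doc, result);
        for (Map.Entry<String, Object> entry : doc.context.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof Doc) {
                // Nested document: replace it with its shallow form.
                result.context.put(entry.getKey(), shallow((Doc) value, seen));
            } else if (value instanceof Serializable) {
                // Note: this only checks the top-level type, not nested content.
                result.context.put(entry.getKey(), (Serializable) value);
            }
            // Assumption: any other value is dropped rather than failing later.
        }
        return result;
    }

    public static void main(String[] args) {
        Doc parent = new Doc("1", "/workspace/report");
        parent.context.put("source", new Doc("2", "/workspace/template"));
        parent.context.put("comment", "copied");
        parent.context.put("listener", new Object()); // not Serializable
        ShallowDoc result = shallow(parent);
        System.out.println(result.context.keySet());
    }
}
```

      Whether non-serializable values should be dropped silently, logged, or rejected is a design decision this sketch does not settle.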

      DocumentModel serialization

      Java Serialization

      The DocumentModel is marked as serializable, but actually it is not.

      We need to fix that and there are several ways to do it.

      We already have a serialization that we test and maintain: the JSON one.

      So, even if technically we need to have a Java Serialization, if we take over the Java Serialization (by implementing the `readObject` and `writeObject` methods), we should then rely on a simple JSON string for the main DocumentModel attributes.

      I see several advantages to this approach :

      • maintenance: we cannot forget to handle a new field that we add to the DocumentModel
      • introspection: reading a JSON string (even from within a Java serialized object) is probably easier in Redis
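
      A minimal sketch of taking over Java Serialization with a JSON payload. `JsonBackedDoc` is a hypothetical stand-in for DocumentModel, and `toJson`/`fromJson` are deliberately naive placeholders for the existing, tested JSON marshalling (they assume values contain no quotes or backslashes). Only the `writeObject`/`readObject` hooks are the point here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JsonBackedDoc implements Serializable {
    private static final long serialVersionUID = 1L;

    // Placeholder parser for the two fields of this sketch only.
    private static final Pattern JSON =
            Pattern.compile("\\{\"id\":\"([^\"]*)\",\"title\":\"([^\"]*)\"\\}");

    // transient: default Java serialization never writes these fields,
    // so the JSON string is the single source of truth for the state.
    private transient String id;
    private transient String title;

    public JsonBackedDoc(String id, String title) {
        this.id = id;
        this.title = title;
    }

    /** Placeholder for the real JSON writer. */
    String toJson() {
        return "{\"id\":\"" + id + "\",\"title\":\"" + title + "\"}";
    }

    /** Placeholder for the real JSON reader. */
    private void fromJson(String json) {
        Matcher m = JSON.matcher(json);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected payload: " + json);
        }
        id = m.group(1);
        title = m.group(2);
    }

    // Called by Java serialization instead of the default field-by-field write.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeObject(toJson());
    }

    // Called by Java serialization when the object is read back.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        fromJson((String) in.readObject());
    }

    /** Serializes and deserializes a document in memory. */
    static JsonBackedDoc roundTrip(JsonBackedDoc doc) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(doc);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (JsonBackedDoc) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        JsonBackedDoc copy = roundTrip(new JsonBackedDoc("doc-1", "Hello"));
        System.out.println(copy.toJson());
    }
}
```

      Because every field goes through the JSON string, a field added later is covered as soon as the JSON marshalling knows about it, which is the maintenance argument above.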

      Connection

      I guess the real problem is properly reconnecting the DocumentModel to the session, i.e. to the right session and with the right security context.

      When we are within the scope of a transaction, we already have a thread-local that points to a CoreSession: so we should be able to reconnect transparently.

      If we want to allow deserialization in other cases, I guess the only option is to have an explicit deserialization method that takes the CoreSession as a parameter.
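
      The two reconnection paths could look roughly like the sketch below. Everything here is hypothetical: `Session` stands in for CoreSession, `DetachedDoc` for a freshly deserialized DocumentModel, and `bind`/`unbind` mimic whatever sets the thread-local when a transaction starts and ends.

```java
import java.util.Objects;

public class ReconnectSketch {

    /** Hypothetical stand-in for CoreSession: carries the security context. */
    static class Session {
        final String principal;

        Session(String principal) {
            this.principal = principal;
        }
    }

    /** Hypothetical stand-in for a deserialized, not yet attached document. */
    static class DetachedDoc {
        final String id;
        Session session; // null until the document is reconnected

        DetachedDoc(String id) {
            this.id = id;
        }
    }

    // Assumption: set when a transaction starts, cleared when it ends.
    private static final ThreadLocal<Session> CURRENT = new ThreadLocal<>();

    static void bind(Session session) {
        CURRENT.set(session);
    }

    static void unbind() {
        CURRENT.remove();
    }

    /** Transparent path: uses the session bound to the current thread. */
    static DetachedDoc reconnect(DetachedDoc doc) {
        Session session = CURRENT.get();
        if (session == null) {
            throw new IllegalStateException(
                    "No session bound to this thread; pass one explicitly");
        }
        return reconnect(doc, session);
    }

    /** Explicit path: the caller chooses the session and security context. */
    static DetachedDoc reconnect(DetachedDoc doc, Session session) {
        doc.session = Objects.requireNonNull(session, "session");
        return doc;
    }

    public static void main(String[] args) {
        DetachedDoc doc = new DetachedDoc("doc-1");
        bind(new Session("alice"));
        try {
            System.out.println(reconnect(doc).session.principal);
        } finally {
            unbind();
        }
        System.out.println(reconnect(doc, new Session("bob")).session.principal);
    }
}
```

      Failing loudly when no session is bound, rather than leaving the document detached, is a choice made for this sketch.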

      Do we need both use cases?

      I would say that both use cases are interesting:

      • ShallowDocumentModel:
        • for asynchronous jobs that fetch their data from the repository anyway when they run
        • for asynchronous jobs that need access to some info (like the path) even if the Document no longer exists
      • Serialization: because the DocumentModel is supposed to be serializable!
