
    Details

    • Type: Sub-task
    • Status: In Progress
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: QualifiedToSchedule
    • Component/s: Importer

      Description

      Directory Tree and Threading

      The default importer targets a simple use case: importing a complete filesystem tree into a Nuxeo repository.

      Most machines have several CPUs or cores, so you can import more documents per second by using several threads.

      However, when importing a tree, threading must be considered carefully:

      • Each thread will be associated with a Transaction (remember we import several documents before doing a commit),
      • Each transaction is isolated from others (MVCC mode).

      This means that a new thread should be created only when a new branch becomes accessible in the source filesystem, since a thread cannot see documents that another thread's transaction has not yet committed. At least, that is what the default ImporterThreadingPolicy (DefaultMultiThreadingPolicy) does.

      As a result, if you import a big folder with a flat structure, you will have only one importer thread, even if the configuration allows more.

      To be sure to leverage multi-threading, you can either:

      • Ensure the source filesystem is a tree with at least two levels,
      • Change the importer threading policy.

      Flat folder importer

      To make an efficient flat folder importer, we need to change the way the importer walks the filesystem and allocates threads.

      The target structure should be something like:

      • 1 reader thread
        • does the getChildren in a lazy way (the default File.listFiles won't work, since it returns the whole listing at once)
        • maybe use the Java 7 NIO tree walker
        • pushes files (or just paths) to be imported into a queue
      • 1 queue
        • stores the paths of the files to be imported
      • a ThreadPool with n threads that
        • consume the queue
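
      The structure above could be sketched roughly as follows. This is only an illustration in plain Java, not the importer's actual code: the class name FlatFolderImportSketch, the method importFlatFolder, the queue size and the Consumer<Path> callback standing in for the real document creation are all assumptions. The calling thread plays the role of the single reader and lists the folder lazily with a DirectoryStream; a sentinel path tells each worker when to stop.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

public class FlatFolderImportSketch {

    // Sentinel (compared by identity) telling a worker that the reader is done.
    private static final Path POISON = Paths.get("");

    /**
     * Imports every regular file found directly in 'folder'.
     * 'importer' is a hypothetical stand-in for the real per-file import.
     * Returns the number of files handed to the importer without error.
     */
    public static long importFlatFolder(Path folder, int workers, Consumer<Path> importer)
            throws IOException, InterruptedException {
        // Bounded queue: the reader blocks when the workers fall behind.
        BlockingQueue<Path> queue = new ArrayBlockingQueue<>(1024);
        AtomicLong imported = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        // n consumer threads draining the queue.
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    for (Path p = queue.take(); p != POISON; p = queue.take()) {
                        try {
                            importer.accept(p);
                            imported.incrementAndGet();
                        } catch (RuntimeException e) {
                            // Keep the worker alive; a real importer would log or retry.
                            System.err.println("Import failed for " + p + ": " + e);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Single reader: DirectoryStream iterates lazily instead of
        // materializing the whole listing like File.listFiles does.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(folder)) {
            for (Path entry : stream) {
                if (Files.isRegularFile(entry)) {
                    queue.put(entry);
                }
            }
        } finally {
            // One sentinel per worker so that every worker terminates.
            for (int i = 0; i < workers; i++) {
                queue.put(POISON);
            }
            pool.shutdown();
        }
        pool.awaitTermination(1, TimeUnit.HOURS);
        return imported.get();
    }
}
```

      In a real importer each worker would presumably also own its transaction and commit every few documents, as the default importer threads do; the sketch leaves that out.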

            People

            • Assignee:
              tdelprat Thierry Delprat
              Reporter:
              tdelprat Thierry Delprat