  1. Nuxeo Platform
  2. NXP-24605

Create the Framework for Document Enrichment Service Integration

    Details

    • Type: Epic
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 10.3
    • Component/s: Core, Nuxeo Vision
    • Release Notes Summary:
      Defines an infrastructure for Document Enrichment with service integration. The new framework is architected to be resilient, scalable and efficient, as well as flexible at the functional level.
    • Tags:

      Description

      AI Service integration Framework
      The idea is to have an AI integration framework that makes it possible to:

      • extract Data from Nuxeo Document
        • picture, text content, filename, mime type, filing plan …
        • extract audio track or frames from images …
      • call an external AI service to compute external meta-data
        • classification, entities extraction, text transcription …
      • store the additional meta-data
        • facet/schema
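
      The three steps above (extract, call, store) can be sketched as three small interfaces chained by a pipeline. This is only an illustration: every type name here (SketchDocument, DataExtractor, EnrichmentService, MetadataWriter, EnrichmentPipeline) is hypothetical and not part of any existing Nuxeo API, and the "AI service" is a local stub that makes no network call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Nuxeo document: only the fields this sketch needs.
class SketchDocument {
    final String filename;
    final String mimeType;
    // Plays the role of the facet/schema that holds the additional metadata.
    final Map<String, String> enrichment = new HashMap<>();

    SketchDocument(String filename, String mimeType) {
        this.filename = filename;
        this.mimeType = mimeType;
    }
}

// Step 1: extract the data to send to the service.
interface DataExtractor {
    Map<String, String> extract(SketchDocument doc);
}

// Step 2: call the external AI service (stubbed in this sketch).
interface EnrichmentService {
    Map<String, String> enrich(Map<String, String> input);
}

// Step 3: store the returned metadata on the document.
interface MetadataWriter {
    void store(SketchDocument doc, Map<String, String> metadata);
}

class EnrichmentPipeline {
    private final DataExtractor extractor;
    private final EnrichmentService service;
    private final MetadataWriter writer;

    EnrichmentPipeline(DataExtractor extractor, EnrichmentService service, MetadataWriter writer) {
        this.extractor = extractor;
        this.service = service;
        this.writer = writer;
    }

    void process(SketchDocument doc) {
        writer.store(doc, service.enrich(extractor.extract(doc)));
    }
}

class PipelineDemo {
    static SketchDocument run() {
        DataExtractor extractor = d -> {
            Map<String, String> data = new HashMap<>();
            data.put("filename", d.filename);
            data.put("mimeType", d.mimeType);
            return data;
        };
        // Fake classifier: derives a label from the mime type only.
        EnrichmentService service = input -> {
            Map<String, String> result = new HashMap<>();
            result.put("label", input.get("mimeType").startsWith("image/") ? "picture" : "other");
            return result;
        };
        MetadataWriter writer = (d, metadata) -> d.enrichment.putAll(metadata);

        SketchDocument sample = new SketchDocument("cat.jpg", "image/jpeg");
        new EnrichmentPipeline(extractor, service, writer).process(sample);
        return sample;
    }

    public static void main(String[] args) {
        System.out.println(run().enrichment); // prints {label=picture}
    }
}
```

      Keeping the three steps behind separate interfaces is one way to let each of them be swapped independently, which matches the pluggability goal stated further down.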

      This system needs to support 2 processing modes:

      • on the fly / event based
        • create/modify event => call AI enrichment
      • batch
        • batch call AI enrichment based on a query
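
      A minimal sketch of the two processing modes, assuming both should funnel into the same enrichment entry point. The class and method names are hypothetical, the event names are illustrative strings, and the "query" is modelled as a predicate over an in-memory list of document ids rather than a real repository query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

class ProcessingModes {
    // Illustrative create/modify event names; which events trigger enrichment is an assumption.
    private static final Set<String> TRIGGERS =
            new HashSet<>(Arrays.asList("documentCreated", "documentModified"));

    // Records which documents were sent for enrichment, in order.
    final List<String> enriched = new ArrayList<>();

    // Shared entry point: both modes end up here.
    void enrich(String docId) {
        enriched.add(docId);
    }

    // On-the-fly mode: react to a single event, ignore events that are not triggers.
    boolean onEvent(String eventName, String docId) {
        if (!TRIGGERS.contains(eventName)) {
            return false;
        }
        enrich(docId);
        return true;
    }

    // Batch mode: enrich every document selected by the query, return how many were processed.
    int runBatch(List<String> repository, Predicate<String> query) {
        int count = 0;
        for (String docId : repository) {
            if (query.test(docId)) {
                enrich(docId);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        ProcessingModes modes = new ProcessingModes();
        modes.onEvent("documentCreated", "doc-1");
        modes.onEvent("documentRemoved", "doc-2"); // ignored: not a trigger event
        modes.runBatch(Arrays.asList("img-1", "txt-1", "img-2"), id -> id.startsWith("img-"));
        System.out.println(modes.enriched); // prints [doc-1, img-1, img-2]
    }
}
```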

      In addition, we will need to be able to run the AI calls in 2 modes:

      • execute: call the AI services to get the result
      • learn mode: call AI service to make it learn
        • event-based for user-initiated update of the inferred schemas
        • batch mode for initial training (may include data export and staging)
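
      The execute/learn distinction could be expressed as an explicit mode passed along with each call. The sketch below is an assumption about how that might look, not a description of an actual service: CallMode and LearningServiceStub are hypothetical names, and "learning" is reduced to memorising a corrected label for a given content string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// The two invocation modes described above.
enum CallMode { EXECUTE, LEARN }

// Local stub standing in for an external AI service.
class LearningServiceStub {
    private final Map<String, String> learned = new HashMap<>();

    // In LEARN mode the corrected label is memorised and nothing is returned.
    // In EXECUTE mode the label learned for this content is returned, if any;
    // correctedLabel is ignored in that case.
    Optional<String> call(CallMode mode, String content, String correctedLabel) {
        if (mode == CallMode.LEARN) {
            learned.put(content, correctedLabel);
            return Optional.empty();
        }
        return Optional.ofNullable(learned.get(content));
    }

    public static void main(String[] args) {
        LearningServiceStub service = new LearningServiceStub();
        System.out.println(service.call(CallMode.EXECUTE, "invoice text", null).isPresent()); // prints false
        service.call(CallMode.LEARN, "invoice text", "Invoice"); // e.g. a user corrected the inferred value
        System.out.println(service.call(CallMode.EXECUTE, "invoice text", null).get()); // prints Invoice
    }
}
```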

      The existing Google-Vision/AWS-Rekognition could be used as startup code:

      • it already contains part of the infrastructure
      • GoogleVision and Rekognition are 2 valid services to integrate

      We may want to start the work as a clone/fork of the nuxeo-vision repository in order to avoid any confusion (the goal here is to reuse existing code to speed up the bootstrap process, but the target scope is much wider than nuxeo-vision).

      We want to build a generic infrastructure so that:

      • we can easily add new services
        • build a pluggability model to
          • extract data from the document
          • call AI service in classify or learn mode
          • store result
      • we can route/dispatch between different AI services
        • depending on event, doc type and mime-type
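
      Routing between services on event, doc type and mime-type could be sketched as an ordered list of rules where the first match wins. RoutingRule and ServiceRouter are hypothetical names, the service names ("vision", "transcribe") are placeholders, and matching the mime-type by prefix is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// One routing rule: exact match on event and doc type, prefix match on mime-type.
class RoutingRule {
    final String event;
    final String docType;
    final String mimePrefix;
    final String serviceName;

    RoutingRule(String event, String docType, String mimePrefix, String serviceName) {
        this.event = event;
        this.docType = docType;
        this.mimePrefix = mimePrefix;
        this.serviceName = serviceName;
    }

    boolean matches(String event, String docType, String mimeType) {
        return this.event.equals(event)
                && this.docType.equals(docType)
                && mimeType.startsWith(mimePrefix);
    }
}

class ServiceRouter {
    private final List<RoutingRule> rules = new ArrayList<>();

    // Adding a new service only requires registering another rule.
    ServiceRouter register(RoutingRule rule) {
        rules.add(rule);
        return this;
    }

    // Returns the name of the first matching service, or empty if none applies.
    Optional<String> route(String event, String docType, String mimeType) {
        for (RoutingRule rule : rules) {
            if (rule.matches(event, docType, mimeType)) {
                return Optional.of(rule.serviceName);
            }
        }
        return Optional.empty();
    }

    // Sample configuration used by main; all values are illustrative.
    static ServiceRouter sample() {
        return new ServiceRouter()
                .register(new RoutingRule("documentCreated", "Picture", "image/", "vision"))
                .register(new RoutingRule("documentModified", "Audio", "audio/", "transcribe"));
    }

    public static void main(String[] args) {
        ServiceRouter router = sample();
        System.out.println(router.route("documentCreated", "Picture", "image/png").get()); // prints vision
        System.out.println(router.route("documentCreated", "File", "text/plain").isPresent()); // prints false
    }
}
```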

      Services to integrate

      • Standard Image recognition
        • Google Vision and Rekognition
      • Objects detection, Faces recognition
        • here we could also leverage Google and AWS services
      • Specialized Image recognition
        • ProductAI (we have a 2-week trial)
      • Speech to Text
        • AWS Transcribe?
      • Text based categorization
        • use AWS Comprehend or build our own service based on AWS SageMaker

      As a short-term target, improving the GoogleVision/AWS Rekognition and ProductAI integrations seems like the easy option.

            People

            • Assignee:
              Unassigned
              Reporter:
              pcardoso Pedro Cardoso