[NXP-24605] Create the Framework for Document Enrichment Service Integration - Nuxeo Issue Tracker

Details

Type: Epic
Status: Resolved
Priority: Major
Resolution: Done
Affects Version/s: None
Fix Version/s: 10.3
Component/s: Core, Nuxeo Vision

Release Notes Summary:
Defines an infrastructure for Document Enrichment with service integration. The new framework is architected to be resilient, scalable, and efficient, as well as flexible at the functional level.
Tags:
- nxAI

Description

AI Service integration Framework
The idea is to have an AI integration framework that makes it possible to:

extract Data from Nuxeo Document
- picture, text content, filename, mime type, filing plan …
- extract an audio track or frames from videos …
call an external AI service to compute external meta-data
- classification, entities extraction, text transcription …
store the additional meta-data
- facet/schema
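
The three steps above (extract, call, store) can be sketched as a small pipeline. This is only an illustration under stated assumptions: `Document`, `extract`, `keyword_service`, `store`, and the `ai_enrichment` facet name are all hypothetical and are not part of the Nuxeo API, and the "AI service" is a local stand-in rather than a real external call.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

@dataclass
class Document:
    """Hypothetical stand-in for a Nuxeo document (not the real API)."""
    filename: str
    mime_type: str
    text: str = ""
    facets: Dict[str, Dict[str, Any]] = field(default_factory=dict)

def extract(doc: Document) -> Dict[str, Any]:
    """Step 1: pull the data the AI service needs out of the document."""
    return {"filename": doc.filename, "mime_type": doc.mime_type, "text": doc.text}

def keyword_service(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Step 2: placeholder for an external AI call (assumed, purely local)."""
    label = "invoice" if "invoice" in payload["text"].lower() else "other"
    return {"category": label}

def store(doc: Document, facet: str, metadata: Dict[str, Any]) -> None:
    """Step 3: attach the computed metadata to the document under a facet."""
    doc.facets.setdefault(facet, {}).update(metadata)

def enrich(doc: Document,
           service: Callable[[Dict[str, Any]], Dict[str, Any]],
           facet: str = "ai_enrichment") -> Document:
    """Run extract -> call -> store for one document."""
    store(doc, facet, service(extract(doc)))
    return doc
```

Keeping each step as a separate function is what would let a different extractor, service, or storage target be swapped in without touching the other two.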

This system needs to support 2 processing modes:

on the fly / event based
- create/modify event => call AI enrichment
batch
- batch call AI enrichment based on a query
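
As a rough illustration of the two processing modes, the sketch below drives the same enrichment callable either from an event or from a query over a set of documents. The event names, the dict-based documents, and the predicate-style "query" are assumptions made for the example; a real implementation would rely on the platform's own event and query mechanisms.

```python
from typing import Callable, Dict, Iterable

Doc = Dict[str, str]  # hypothetical minimal document representation

# Assumed event names that should trigger enrichment.
ENRICH_EVENTS = {"created", "modified"}

def on_event(event: str, doc: Doc, enrich: Callable[[Doc], None]) -> bool:
    """On-the-fly mode: enrich a single document when a relevant event fires."""
    if event in ENRICH_EVENTS:
        enrich(doc)
        return True
    return False

def run_batch(docs: Iterable[Doc],
              query: Callable[[Doc], bool],
              enrich: Callable[[Doc], None]) -> int:
    """Batch mode: enrich every document selected by the query."""
    count = 0
    for doc in docs:
        if query(doc):
            enrich(doc)
            count += 1
    return count
```

Both entry points take the enrichment step as a parameter, so the event-based and batch paths share one implementation of the actual AI call.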

In addition, we will need to be able to run the AI calls in 2 modes:

execute mode: call the AI service to get the result
learn mode: call the AI service so that it learns from the data
- event-based for user-initiated update of the inferred schemas
- batch mode for initial training (may include data export and staging)
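
One way to picture the execute/learn split is a service object that exposes both operations behind a single call site. The sketch below is hypothetical: `Mode`, `ToyTextService`, and `call` are invented for illustration, and the word-count "model" only stands in for whatever training a real AI service would perform.

```python
from collections import Counter, defaultdict
from enum import Enum
from typing import Optional

class Mode(Enum):
    EXECUTE = "execute"  # ask the service for a result
    LEARN = "learn"      # feed the service a labeled example

class ToyTextService:
    """Assumed stand-in for an AI service: counts which labels each word saw."""

    def __init__(self) -> None:
        self._counts = defaultdict(Counter)

    def learn(self, text: str, label: str) -> None:
        for word in text.lower().split():
            self._counts[word][label] += 1

    def execute(self, text: str) -> Optional[str]:
        votes = Counter()
        for word in text.lower().split():
            # .get avoids creating entries for unseen words
            votes.update(self._counts.get(word, {}))
        return votes.most_common(1)[0][0] if votes else None

def call(service: ToyTextService, mode: Mode, text: str,
         label: Optional[str] = None) -> Optional[str]:
    """Single entry point that dispatches on the requested mode."""
    if mode is Mode.LEARN:
        if label is None:
            raise ValueError("learn mode requires a label")
        service.learn(text, label)
        return None
    return service.execute(text)
```

In this picture, a user correcting an inferred value would trigger a single `LEARN` call, while initial training would loop over an exported data set and issue the same call for each item.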

The existing Google-Vision/AWS-Rekognition integration could be used as starting code:

it already contains part of the infrastructure
GoogleVision and Rekognition are 2 valid services to integrate

We may want to start the work as a clone/fork of the nuxeo-vision repository in order to avoid any confusion (here the goal is to reuse existing code to speed up the bootstrap process, but the target scope is much wider than nuxeo-vision)

We want to build a generic infrastructure so that:

we can easily add new services
- build a pluggability model to
  - extract data from the document
  - call AI service in classify or learn mode
  - store result
we can route/dispatch between different AI services
- depending on event, doc type and mime-type
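
The pluggability and routing ideas above could look roughly like the following registry, where services are registered by name and routes select them by event, document type, and MIME type. Everything here (`Route`, `Dispatcher`, the wildcard matching, the example names) is an assumption made for illustration, not an existing Nuxeo extension point.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Callable, Dict, List

@dataclass(frozen=True)
class Route:
    """Hypothetical routing rule; "*" matches anything."""
    service: str
    event: str = "*"
    doc_type: str = "*"
    mime_type: str = "*"

    def matches(self, event: str, doc_type: str, mime_type: str) -> bool:
        return (fnmatchcase(event, self.event)
                and fnmatchcase(doc_type, self.doc_type)
                and fnmatchcase(mime_type, self.mime_type))

class Dispatcher:
    """Registry of pluggable services plus the routes that select them."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable] = {}
        self._routes: List[Route] = []

    def register(self, name: str, service: Callable) -> None:
        self._services[name] = service

    def add_route(self, route: Route) -> None:
        if route.service not in self._services:
            raise KeyError(f"unknown service: {route.service}")
        self._routes.append(route)

    def resolve(self, event: str, doc_type: str, mime_type: str) -> List[str]:
        """Return the names of all services whose route matches, in order."""
        return [r.service for r in self._routes
                if r.matches(event, doc_type, mime_type)]
```

Adding a new service then means one `register` call and one or more `add_route` calls, with no change to the dispatch logic itself.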

Services to integrate

Standard Image recognition
- Google Vision and Rekognition
Objects detection, Faces recognition
- here we could also leverage Google and AWS services
Specialized Image recognition
- ProductAI (we have a 2-week trial)
Speech to Text
- AWS Transcribe?
Text based categorization
- use AWS Comprehend or build our own service based on AWS SageMaker

As a short-term target, improving GoogleVision/AWS Rekognition and ProductAI seem like the easy options.

People

Assignee:

Unassigned

Reporter:

Pedro Cardoso

Participants:

Pedro Cardoso

Dates

Created:

2018-03-12 15:23

Updated:

2018-12-11 12:56

Resolved:

2018-12-11 09:40