Review the architecture of how document processing is performed, so that the platform becomes more robust to repository actions regardless of the number of documents involved, and so that such actions can be designed asynchronously. Some of the challenges are:
- Ability to provide a completion status
- Manage the “eventually consistent” aspect of such a design, in particular to be able to alert a user that a given document may be affected by an in-flight asynchronous processing action
- Handle errors and error reporting
One constraint is that we want all this to happen at the repository level, i.e. we want to make sure that anything happening on any document correctly fires a related repository event; we do not want things happening silently at the database level.
One of the expected gains is the ability to parallelize computation for these bulk changes and, longer term, to provide the elasticity required to reach a goal, depending on how many documents are concerned.
Typical processing that would benefit from such architectural change:
- ACL Updates
- Lifecycle changes
- Path changes (Move)
- Deletion of a large number of documents (>100k docs)
- Applying an automation chain (or a single operation) robustly across a large set of documents, e.g. setting a property (inheritance)
User stories mostly focus on REST API interaction, but the Java API should of course provide the relevant signatures to manage this in all related services.
- Create the set of documents to put in a stream. You get a key, and you know when the set is built. It is then also possible to know the offsets of the records that need to be processed.
- Then you define the action (so that it is known whether it can run concurrently)
- Running a processor: a set of computations handling batching, reading documents, creating exports, etc. Done as a stream processor, it can handle failover, redistribute work to other nodes, and run concurrent processors in parallel.
- Provide computation statuses
- When the job is done, it can provide a specific status, under a specific ID stored in a cluster-wide key-value store, and this status is persistent.
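The status lifecycle described above might be sketched as follows. This is a minimal, hypothetical model (the class, state names, and fields are assumptions, not the final API): a command id is returned on submission, and a persistent, cluster-visible status record is updated as the document set is built and then processed. A `ConcurrentHashMap` stands in for the cluster-wide key-value store.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class BulkStatusSketch {
    enum State { SCHEDULED, BUILDING_SET, RUNNING, COMPLETED }

    // stand-in for the cluster-wide, persistent key-value store
    static final Map<String, BulkStatusSketch> store = new ConcurrentHashMap<>();

    final String commandId = UUID.randomUUID().toString();
    State state = State.SCHEDULED;
    long processed = 0;
    long total = -1; // unknown until the document set is fully built

    static String submit() {
        BulkStatusSketch s = new BulkStatusSketch();
        store.put(s.commandId, s);
        return s.commandId; // the id handed back to the caller for polling
    }

    static BulkStatusSketch status(String id) {
        return store.get(id);
    }

    public static void main(String[] args) {
        String id = submit();
        BulkStatusSketch s = status(id);
        s.state = State.BUILDING_SET;
        s.total = 100_000;             // set is built: record offsets are now known
        s.state = State.RUNNING;
        s.processed = 100_000;
        s.state = State.COMPLETED;
        System.out.println(s.state + " " + s.processed + "/" + s.total);
    }
}
```

The key design point carried over from the list above: the status record outlives the computation, so a client can query completeness at any time, including after the job is done.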
Part 1, BAF 1: Generate a set of documents for bulk processing
As a user, I would like to be able to create a document set so that I can run a bulk action on it.
Definition of Done:
- Creation of BulkService
- Creation of REST API on top of service
- Service can consume NXQL queries
- A bulkActionId can be retrieved on bulk action (instance) creation
- The document set creation status can be checked (obtained from the bulkActionId)
- I can make a REST request to create/initiate my DocumentSet
- I can make a REST request to check my DocumentSet initialisation status
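An in-memory sketch of what the BulkService surface for BAF 1 might look like, covering the Definition of Done above: submit an NXQL query, get a bulkActionId back immediately, then poll the initialisation status. All names and the synchronous set-building are assumptions; a real implementation would materialize the set asynchronously into a stream.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class BulkServiceSketch {
    // built=false would correspond to a set still being materialized
    record DocumentSet(String bulkActionId, List<String> docIds, boolean built) {}

    private final Map<String, DocumentSet> sets = new ConcurrentHashMap<>();

    // would back e.g. POST /bulk with {"query": "..."} -> {"bulkActionId": "..."}
    public String createDocumentSet(String nxqlQuery) {
        String id = UUID.randomUUID().toString();
        // a real service would evaluate the NXQL query asynchronously and
        // report BUILDING until done; here the set is built synchronously
        sets.put(id, new DocumentSet(id, List.of("doc-1", "doc-2"), true));
        return id;
    }

    // would back e.g. GET /bulk/{bulkActionId}/status
    public String setStatus(String bulkActionId) {
        DocumentSet s = sets.get(bulkActionId);
        if (s == null) return "UNKNOWN";
        return s.built() ? "BUILT(" + s.docIds().size() + " docs)" : "BUILDING";
    }

    public static void main(String[] args) {
        BulkServiceSketch svc = new BulkServiceSketch();
        String id = svc.createDocumentSet("SELECT * FROM Document WHERE ecm:isVersion = 0");
        System.out.println(svc.setStatus(id));
    }
}
```

The REST layer in the Definition of Done would be a thin wrapper over these two calls: one request to create/initiate the set, one to poll its initialisation status by bulkActionId.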
Part 2, BAF 2: Execute a Bulk action on a document set
As a user, I would like to be able to execute a Bulk action on a document set.
Definition of Done:
- Bulk service consumes a BulkCommand containing the information necessary to build the document set and the action to run
- The Bulk action status and progression can be checked
- I can make a REST request to run a Bulk action
- I can make a REST request to check a Bulk action status and progression
- I can make a REST request to pause or resume an existing/running Bulk action.
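A hypothetical sketch of the BAF 2 contract (names and fields are illustrative, not the final API): a BulkCommand carries both the document-set definition (an NXQL query) and the action to run, and the resulting execution can be queried for progression, paused, and resumed.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class BulkExecutionSketch {
    // query defines the document set; action names the operation to apply
    record BulkCommand(String query, String action) {}

    enum State { RUNNING, PAUSED, COMPLETED }

    static class Execution {
        final String id = UUID.randomUUID().toString();
        State state = State.RUNNING;
        long processed = 0;
        final long total;
        Execution(long total) { this.total = total; }
        String progression() { return processed + "/" + total + " (" + state + ")"; }
    }

    private final Map<String, Execution> executions = new ConcurrentHashMap<>();

    // would back POST /bulk/{action}; a real service would dispatch cmd.action()
    // to a stream processor instead of just recording the execution
    public String submit(BulkCommand cmd, long setSize) {
        Execution e = new Execution(setSize);
        executions.put(e.id, e);
        return e.id;
    }

    public void pause(String id)    { executions.get(id).state = State.PAUSED; }
    public void resume(String id)   { executions.get(id).state = State.RUNNING; }
    public String status(String id) { return executions.get(id).progression(); }

    public static void main(String[] args) {
        BulkExecutionSketch svc = new BulkExecutionSketch();
        String id = svc.submit(new BulkCommand("SELECT * FROM Document", "setProperty"), 1000);
        svc.pause(id);
        System.out.println(svc.status(id)); // 0/1000 (PAUSED)
        svc.resume(id);
    }
}
```

Each of the three REST requests in the Definition of Done maps to one call here: run (submit), check status/progression (status), and pause/resume.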