- Type: Improvement
- Status: Resolved
- Priority: Critical
- Resolution: Won't Fix
- Affects Version/s: None
- Fix Version/s: None
- Component/s: Git Access
- Roadmap Milestone: NOS.infra.LTS2019
- Tags:
- Team: NOS
In the current deployment, each node has its own workspace (project git clone) cache.
Each node also has its own invalidation mechanism, based on a poll, that maintains access and cleans up workspaces depending on usage.
This works well when all the work is done on the same node, but as we move toward a more services-based architecture, more requests will hit several nodes, which increases the risk of hitting a node with an outdated cache.
Observations:
- Git projects can contain assets, which makes the clone costly.
- The current workspace pool has a "dirty" state.
Proposed Solutions:
- Cluster-wide event/cache invalidation (codebase side): only relies on pushed changes
To propagate workspace pool invalidation, we could rely on our PubSub events: an invalidation event would be sent to all the nodes.
This would still have drawbacks for any commits that have not been pushed.
- Shared invalidation state with Hazelcast: relies on an infra update
Instead of propagating invalidations through PubSub, we could keep track of invalidations in a cluster-shared object using Hazelcast.
But it would require an infra upgrade to have Hazelcast set up.
- Request routing (network): hard to implement
Provide some kind of session affinity for workspace access that keeps track of which node holds the workspace clone currently in use.
- Shared in-memory repository clone (codebase side): way too experimental IMHO, and repos in memory don't sound good
JGit has a special in-memory repository implementation (https://github.com/eclipse/jgit/blob/master/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/dfs/InMemoryRepository.java#L28) that would allow us to keep a repository only in memory. With the help of Hazelcast to share objects in a cluster, we could easily have a cluster-wide workspace pool, and all the RW access would be done on the same object.
It might be a solution worth evaluating, combined with a more aggressive cleanup strategy and some tests to see how it impacts performance and what the memory footprint would be under our load with concurrent users.
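For discussion, the first proposal (invalidation events sent to all nodes) could look roughly like the sketch below. It is an in-process stand-in, not our actual PubSub or workspace pool: `InvalidationBus`, `NodeWorkspaceCache` and the project/revision strings are all hypothetical names introduced for illustration. It also shows the stated drawback, since only an explicit publish (for example after a push) evicts anything.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical sketch: none of these types exist in the codebase.
final class InvalidationSketch {

    /** Stand-in for the cluster PubSub: delivers each event to every subscriber. */
    static final class InvalidationBus {
        private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

        void subscribe(Consumer<String> listener) {
            subscribers.add(listener);
        }

        /** Publishes an invalidation for one project, e.g. right after a push. */
        void publish(String projectId) {
            for (Consumer<String> subscriber : subscribers) {
                subscriber.accept(projectId);
            }
        }
    }

    /** Per-node cache: project id mapped to the revision of the local clone. */
    static final class NodeWorkspaceCache {
        private final Map<String, String> revisionByProject = new HashMap<>();

        synchronized void put(String projectId, String revision) {
            revisionByProject.put(projectId, revision);
        }

        synchronized Optional<String> get(String projectId) {
            return Optional.ofNullable(revisionByProject.get(projectId));
        }

        /** Drops the local clone so that the next access has to re-clone. */
        synchronized void invalidate(String projectId) {
            revisionByProject.remove(projectId);
        }
    }

    public static void main(String[] args) {
        InvalidationBus bus = new InvalidationBus();
        NodeWorkspaceCache nodeA = new NodeWorkspaceCache();
        NodeWorkspaceCache nodeB = new NodeWorkspaceCache();
        bus.subscribe(nodeA::invalidate);
        bus.subscribe(nodeB::invalidate);

        nodeA.put("project-1", "rev-1");
        nodeB.put("project-1", "rev-1");

        // A push happened somewhere in the cluster: every node evicts its clone.
        bus.publish("project-1");

        System.out.println("node A cached: " + nodeA.get("project-1").isPresent());
        System.out.println("node B cached: " + nodeB.get("project-1").isPresent());
    }
}
```

In this sketch the publishing node evicts its own clone as well, which keeps the event payload down to a project id. Given how costly clones are for projects with assets, a real implementation would probably want to skip the originating node or refresh its clone in place.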
- is related to: NXS-4558 NOS: Migration to Nuxeo 10.10 (Resolved)