-
Type: Improvement
-
Status: Resolved
-
Priority: Major
-
Resolution: Fixed
-
Affects Version/s: 7.4
-
Fix Version/s: 7.10
-
Component/s: Redis, TransientStore
-
Tags:
-
Sprint: drive-7.10-1, drive-7.10-2
-
Story Points: 3
StorageEntry storage: previous state
Previously, the StorageEntry was stored inside 2 caches, depending on its lifecycle.
Because it went through the Cache API, the StorageEntry was stored as a whole, as a single serialized Java object.
So we had something like:
cache:storageId1: <serializeddata>
cache:storageId2: <serializeddata>
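As an illustration only, here is a minimal sketch of that whole-object approach, using plain java.io serialization and a HashMap standing in for the Cache API; the Entry class and key names are hypothetical, not the actual Nuxeo classes:

import java.io.*;
import java.util.HashMap;
import java.util.Map;

public class WholeEntrySerializationSketch {

    // Hypothetical minimal entry; the real StorageEntry also carries blobs, params, size...
    static class Entry implements Serializable {
        Map<String, String> params = new HashMap<>();
        boolean completed;
        long size;
    }

    public static void main(String[] args) throws Exception {
        // A plain map standing in for the Cache API: one key, one opaque serialized value.
        Map<String, byte[]> cache = new HashMap<>();

        Entry entry = new Entry();
        entry.size = 123;

        // Every update serializes the whole entry...
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(entry);
        }
        cache.put("cache:storageId1", bytes.toByteArray());

        // ...and every read deserializes it, even to look at a single field.
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(cache.get("cache:storageId1")))) {
            Entry read = (Entry) in.readObject();
            System.out.println(read.size); // 123
        }
    }
}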
This was creating several issues:
- Double initialization of an entry and dirty updates
- Back-and-forth serialization
- Poor readability
Improvements
We have reworked the RedisTransientStore to bypass the Cache layer.
The idea was to use a native Redis data structure that:
- Manages concurrency and atomicity.
- Limits serialization usage.
This fixes the concurrency issue faced in the BatchManager, which was previously handled with synchronized blocks.
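As a sketch of why a native Redis hash helps here, the following assumes a Jedis client and an illustrative key name; it is not the actual RedisTransientStore code, only an example of server-side atomic field updates:

import redis.clients.jedis.Jedis;

public class AtomicEntryUpdateSketch {

    public static void main(String[] args) {
        // Illustrative client and key name, not necessarily what the store uses internally.
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            String key = "ts:storeId:entryKey";

            // HINCRBY is atomic on the Redis server: two threads adding blob sizes
            // concurrently cannot lose an update, so no synchronized block is needed
            // on the Java side.
            long newSize = jedis.hincrBy(key, "size", 42);

            // A single field can be flipped without rewriting or reserializing the whole entry.
            jedis.hset(key, "completed", "true");

            System.out.println("size is now " + newSize);
        }
    }
}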
From a JSON perspective a storage entry can be seen as:
{
  "blobs" : [{…}, {…}],
  "params" : {…},
  "completed" : false,
  "size" : 123
}
We wanted to keep this structure inside Redis, but since Redis hashes cannot be nested, we needed to flatten it.
# Store the main map
> hmset ts:storeId:entryKey blobCount 2 completed false size 123

ts:storeId:entryKey
{
  "blobCount" : 2,
  "completed" : false,
  "size" : 123
}

# Store the params map
> hmset ts:storeId:entryKey:params p1 "value1" p2 "java:serializeddata"

ts:storeId:entryKey:params
{
  "p1" : "value1",
  "p2" : "java:serializeddata"
}

# Store the first blob
> hmset ts:storeId:entryKey:blobs:0 file "relative/path/to/file" filename "xxx" mimetype "yyy" encoding "zzz" digest "ddd"

ts:storeId:entryKey:blobs:0
{
  "file" : "relative/path/to/file",
  "filename" : "xxx",
  "mimetype" : "yyy",
  "encoding" : "zzz",
  "digest" : "ddd"
}

# Store the second blob
...
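To tie this layout to the Java side, here is a hedged sketch, again assuming Jedis and the illustrative key and field names from the example above; the real RedisTransientStore implementation may differ:

import java.util.HashMap;
import java.util.Map;

import redis.clients.jedis.Jedis;

public class FlattenedEntrySketch {

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            String base = "ts:storeId:entryKey";

            // Main map: the scalar fields of the entry.
            Map<String, String> main = new HashMap<>();
            main.put("blobCount", "2");
            main.put("completed", "false");
            main.put("size", "123");
            jedis.hmset(base, main);

            // Params map: each nested map is flattened under a suffixed key.
            Map<String, String> params = new HashMap<>();
            params.put("p1", "value1");
            jedis.hmset(base + ":params", params);

            // One hash per blob, indexed by position.
            Map<String, String> blob0 = new HashMap<>();
            blob0.put("file", "relative/path/to/file");
            blob0.put("filename", "xxx");
            blob0.put("mimetype", "yyy");
            jedis.hmset(base + ":blobs:0", blob0);

            // Reading an entry back is a handful of HGETALLs, no Java deserialization.
            Map<String, String> entry = jedis.hgetAll(base);
            System.out.println(entry); // e.g. {size=123, completed=false, blobCount=2}
        }
    }
}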