Commit 6bdb3768, authored 9 years ago by Giorgos Korfiatis
describe algorithm
parent 1551511b
Showing 2 changed files with 105 additions and 60 deletions:

* docs/algorithm.rst: +105, -0
* docs/algorithm.txt: +0, -60
docs/algorithm.rst  (+105, -0)  view file @ 6bdb3768
@@ -71,3 +71,108 @@ object is unstaged and, if syncing has succeeded, the target client calls
back the syncer to acknowledge the new sync state. The target and SYNC
archive entries for the object are updated to contain the new serial and
object metadata.
Avoiding overwriting of unsynced objects
----------------------------------------
Object upload and download are facilitated by the kamaki library, which
implements the object-storage API. When uploading an object or deleting an
upstream one, the action is conditional on the last known upstream version
(etag) of the object. If a newer version is found upstream, the action
fails, so that the new, unsynced upstream version is not overwritten.
Likewise, when applying a downloaded object to the file system, a
combination of file system operations is used (first moving the existing
file to a temporary 'hide' location and then linking the downloaded file
back to the original location), so that the existing file is preserved
rather than overwritten in place.
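
A minimal sketch of the conditional-upload idea follows; the client object,
the if_etag_match keyword and the exception name are illustrative
assumptions, not necessarily the actual kamaki interface::

    class ObjectChangedUpstream(Exception):
        """Raised (hypothetically) when the upstream etag does not match."""

    def safe_upload(client, obj_name, local_path, last_known_etag):
        # Sketch only: 'client' and its call signature are assumptions.
        with open(local_path, 'rb') as f:
            try:
                # Make the upload conditional on the last known upstream
                # version; it fails if the object changed in the meantime.
                client.upload_object(obj_name, f,
                                     if_etag_match=last_known_etag)
            except ObjectChangedUpstream:
                # A newer, unsynced version exists upstream; leave it alone.
                return False
        return True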
Handling conflicts
------------------
If an object has changed in both archives since the last syncing, there is
a conflict. The conflict is resolved by syncing the upstream version, while
keeping the local copy under a new name that indicates the conflict.
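
For illustration, the local copy of a conflicted object could be renamed
along these lines; the naming scheme shown is an assumption, not
necessarily the one agkyra uses::

    import os
    import time

    def conflict_name(path):
        # Append a conflict marker and a timestamp, so the local copy
        # survives next to the synced upstream version (format assumed).
        base, ext = os.path.splitext(path)
        stamp = time.strftime('%Y%m%d%H%M%S')
        return '%s_conflict_%s%s' % (base, stamp, ext)

    def keep_local_copy(local_path):
        # Move the local version aside; the upstream version can then be
        # synced to the original location.
        os.rename(local_path, conflict_name(local_path))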
Recovering from errors
----------------------
A syncing may fail for various reasons, for instance due to a connection or
a file system error. In that case, the stored DECISION serial will differ
from the SYNC serial, indicating that syncing has not completed. The
decision process will detect this and resume the syncing. However, since
the decision process runs frequently, it will also pick up syncings that
have not completed simply because they are still in progress. In order to
distinguish and exclude these, we keep an in-memory 'heartbeat' entry for
each active syncing: while an object is being transferred, the transferring
client is responsible for keeping the heartbeat up to date.
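
A heartbeat can be as simple as an in-memory mapping from object name to
the syncing client's id and the time of the last beat; a rough sketch
(the names and the grace period are assumptions)::

    import time

    HEARTBEAT_GRACE = 60  # seconds; assumed threshold for 'recent'

    heartbeat = {}  # object name -> (client id, time of last beat)

    def beat(objname, client_id):
        # Called periodically by the transferring client.
        heartbeat[objname] = (client_id, time.time())

    def recently_active(objname):
        # A probe or a new sync decision for this object is skipped
        # while a recent beat exists.
        entry = heartbeat.get(objname)
        return entry is not None and time.time() - entry[1] < HEARTBEAT_GRACE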
When syncing an upstream object to the local file system, there is a risk
of losing the local object if syncing fails after the object has been
hidden, as explained above. To address this, we record the file move in the
database, so that we can recover the file when syncing resumes.
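
The idea is to persist the intended move before performing it, so that an
interrupted syncing can be undone on resume; sketched below with an assumed
sqlite-style db handle and a hypothetical hidden_moves table::

    import os

    def hide_local_file(db, path, hidden_path):
        # Record the move first; if syncing fails after the rename, the
        # recorded entry tells the resumed syncer where the file is hidden.
        db.execute('INSERT INTO hidden_moves (orig, hidden) VALUES (?, ?)',
                   (path, hidden_path))
        db.commit()
        os.rename(path, hidden_path)

    def recover_hidden_moves(db):
        # On startup or resume, restore files whose syncing never finished.
        rows = db.execute('SELECT orig, hidden FROM hidden_moves').fetchall()
        for orig, hidden in rows:
            if os.path.exists(hidden) and not os.path.exists(orig):
                os.rename(hidden, orig)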
Algorithm sketch
================
::

    archive serial: last recorded version of an object in an archive
    sync serial: the last synced version
    decision serial: specifies which version is being / should be synced
    failed serials: marks failed syncs to avoid replay (eg collision in upload)
                    (in runtime -- in a new process replay will fail again)
    heartbeat: blocks probe while syncing, but candidates are kept for later
               blocks new sync action while syncing

    probe_file:
        if recent heartbeat for object found:
            abort (object is being synced)
        if archive serial != sync serial:
            abort (already probed)
        if object changed:
            get new info
            update object state
        commit

    decide_file_sync:
        if recent heartbeat found with different id:
            abort (already syncing)
        if previous decision serial found:
            use decision unless serial marked as failed
        make decision with priority to master
        add object/current id in heartbeat
        commit
        sync_file (in new thread)

    sync_file:
        source handle <- stage source object
        target pull file from source handle
        call back ack_file_sync (or mark_as_failed on failure)

    ack_file_sync (synced source state, synced target state):
        update source state
        update target state using source serial
        update sync state (merging source & target info) using source serial
        set decision state equal to sync state
        commit
        remove object from heartbeat

    mark_as_failed:
        remove object from heartbeat
        include (serial, file) in failed serials

    main loop:
        for every archive, probe candidate files
        for every file with updated serial, decide sync
docs/algorithm.txt  deleted (100644 → 0)  (+0, -60)  view file @ 1551511b
archive serial: last found version of a file
sync serial: the last synced version
decision serial: specifies which archive is being / should be synced
failed serials: marks failed syncs to avoid replay (eg collision in upload)
                in runtime -- in a new process replay will fail again
heartbeat: blocks probe while syncing, but candidates are kept for later
           blocks new sync action while syncing

probe_file:
    if active heartbeat for file found:
        abort (file is being synced)
    if archive serial != sync serial:
        abort (already probed)
    if file changed:
        get new info
        update file state
    commit

decide_file_sync:
    if active heartbeat found with different id:
        abort
    if previous decision serial found:
        use decision unless serial marked as failed
    make decision with priority to master
    add file/current id in heartbeat
    commit
    sync file (in new thread)

sync_file:
    source handle <- stage source file
    target pull file from source handle
    call back ack_file_sync

ack_file_sync (synced source state, synced target state):
    update source state
    update target state using source serial
    update sync state (merging source & target info) using source serial
    set decision state equal to sync state
    commit
    remove file from heartbeat

mark_as_failed:
    remove file from heartbeat
    include (serial, file) in failed serials

main loop:
    for every archive probe candidate files
    for every file with updated serial, decide sync