File system based semaphore + rsync.
copy_guardian implements a file system based semaphore to limit the execution of
critical operations. In the default mode the semaphore is bound to the script which
owns it, so that operations can also be limited across parallel executions within
an HPC batch system.
To guard a code block so that it is executed no more than 3 times simultaneously using copy_guardian:
import copy_guardian

with copy_guardian.BoundedSemaphore(3):
    print("im active")
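The idea behind a file system based semaphore can be illustrated with a minimal sketch. This is not copy_guardian's actual implementation: the helper names try_acquire_slot and release_slot and the slot_N.lock naming are hypothetical, and the sketch assumes the file system honours exclusive file creation. It also ignores stale lock files left behind by crashed processes.

```python
import os
import tempfile


def try_acquire_slot(lock_directory, max_slots):
    """Try to claim one of max_slots slot files; return its path or None."""
    os.makedirs(lock_directory, exist_ok=True)
    for slot in range(max_slots):
        path = os.path.join(lock_directory, f"slot_{slot}.lock")
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only
            # one process can successfully create a given slot file.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue
        with os.fdopen(fd, "w") as fh:
            fh.write(str(os.getpid()))  # record the owner for debugging
        return path
    return None


def release_slot(slot_path):
    """Free a slot by deleting its lock file."""
    os.remove(slot_path)


# Demo: with two slots, a third acquisition fails until a slot is released.
with tempfile.TemporaryDirectory() as lock_dir:
    first = try_acquire_slot(lock_dir, 2)
    second = try_acquire_slot(lock_dir, 2)
    third = try_acquire_slot(lock_dir, 2)   # both slots are taken
    release_slot(first)
    fourth = try_acquire_slot(lock_dir, 2)  # a slot is free again
    demo_result = (first is not None, second is not None,
                   third is None, fourth is not None)
```

Because the slots live in a shared folder, every process that can see this folder competes for the same slots, which is what makes the approach usable across batch jobs.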
By default, copy_guardian stores its lock files in the sub-folder .copy_guard_locks of your home folder.
You can change this folder as follows:
with copy_guardian.BoundedSemaphore(3, lock_directory="/shared/my_locks"):
    print("im active")
The default timeout for acquiring a lock to enter the guarded code segment is 300 s. For longer running operations you can override this default value:
with copy_guardian.BoundedSemaphore(3, timeout=600):
    print("im active")
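How a timeout like this typically works can be sketched as a polling loop with a deadline. This is a sketch under assumptions and not taken from copy_guardian: the function acquire_with_timeout, the poll_interval parameter, and the use of TimeoutError are hypothetical. The text above does not say what copy_guardian does when the timeout expires.

```python
import time


def acquire_with_timeout(try_acquire, timeout=300.0, poll_interval=0.5):
    """Call try_acquire() until it returns a truthy value.

    Raises TimeoutError if no attempt succeeds within `timeout` seconds.
    """
    # time.monotonic() is not affected by system clock adjustments.
    deadline = time.monotonic() + timeout
    while True:
        result = try_acquire()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"could not acquire lock within {timeout} s")
        time.sleep(poll_interval)


# Demo: a fake acquisition function that succeeds on the third attempt.
attempts = []


def fake_try_acquire():
    attempts.append(1)
    return "slot" if len(attempts) >= 3 else None


acquired = acquire_with_timeout(fake_try_acquire, timeout=5.0, poll_interval=0.01)
```

A longer timeout makes sense when the guarded operations themselves run for a long time, since waiting callers can only proceed once a running one has finished.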
copy_guardian wraps the rsync tool to copy files between servers. To use this functionality, rsync must be
installed on all machines involved.
Furthermore, passwordless authentication must be set up using public/private key pairs. The
following example copies a file my_data.txt to a folder /remote_data on a
remote machine ssh-server.mycompany.com, using the same name on the target
machine:
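# Assumption: Connection is importable from the package top level,
# like copy_local_folder.
from copy_guardian import Connection

c = Connection(
    host="ssh-server.mycompany.com",
    user="me",
    private_key="./id_ed25519",
)
c.rsync_to("my_data.txt", "/remote_data")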
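To give an idea of what wrapping rsync involves, the following sketch builds an rsync command line for such a transfer. It is an illustration only, not the command copy_guardian actually runs: the function build_rsync_command is hypothetical, and the choice of the archive flag -a is an assumption. The options used (rsync -a and -e, ssh -i and -p) are standard.

```python
import shlex


def build_rsync_command(source, target_dir, host, user, private_key, port=22):
    """Return an rsync argument list that copies source to user@host:target_dir."""
    # rsync's -e option takes the remote shell command as a single argument;
    # ssh -i selects the private key and -p the port.
    ssh_command = f"ssh -i {shlex.quote(private_key)} -p {port}"
    return [
        "rsync",
        "-a",  # archive mode: recurse and preserve permissions and timestamps
        "-e", ssh_command,
        source,
        f"{user}@{host}:{target_dir}",
    ]


command = build_rsync_command(
    "my_data.txt",
    "/remote_data",
    host="ssh-server.mycompany.com",
    user="me",
    private_key="./id_ed25519",
)
```

The resulting list could be passed to subprocess.run(command, check=True). Passing a list instead of a shell string avoids quoting problems with unusual file names.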
You can also use a different port and copy multiple files using wildcards:
c = Connection(
    host="ssh-server.mycompany.com",
    user="me",
    private_key="./id_ed25519",
    port=2222,
)
c.rsync_to("./local_data/*.txt", "/remote_data")
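Before starting a transfer it can be useful to check which local files a wildcard pattern matches. The sketch below uses Python's glob module for such a preview. How copy_guardian itself expands wildcards is not described here, so the helper expand_local_pattern is hypothetical and independent of the package.

```python
import glob
import os
import tempfile


def expand_local_pattern(pattern):
    """Return the sorted list of local paths matching a wildcard pattern."""
    return sorted(glob.glob(pattern))


# Demo: only the .txt files in the folder match the pattern.
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.txt", "b.txt", "notes.md"):
        open(os.path.join(folder, name), "w").close()
    matches = expand_local_pattern(os.path.join(folder, "*.txt"))
    matched_names = [os.path.basename(path) for path in matches]
```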
You can also copy folders:
c = Connection(
    host="ssh-server.mycompany.com",
    user="me",
    private_key="./id_ed25519",
    port=2222,
)
c.rsync_to("./local_data/", "/remote_data")
To copy from a remote server to the local machine, use the method rsync_from:
c = Connection(
    host="ssh-server.mycompany.com",
    user="me",
    private_key="./id_ed25519",
    port=2222,
)
c.rsync_from("/remote_data/*.txt", "/local_data")
To speed up copying many files on a parallel file system, you can use copy_local_folder:
from copy_guardian import copy_local_folder
copy_local_folder("/remote_data/results", "/local_data")
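Why this can be faster: a parallel file system can serve many requests at once, so copying files concurrently tends to use it better than copying one file after another. The following sketch shows the general technique with a thread pool. It is not the implementation of copy_local_folder: the function parallel_copy_folder and its max_workers parameter are hypothetical, and whether the real function copies the folder contents or the folder itself into the target is not stated here.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor


def parallel_copy_folder(source_folder, target_folder, max_workers=8):
    """Copy all files below source_folder into target_folder using threads."""
    jobs = []
    for root, _dirs, files in os.walk(source_folder):
        # Recreate the directory layout below the target folder.
        relative = os.path.relpath(root, source_folder)
        target_root = os.path.normpath(os.path.join(target_folder, relative))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            jobs.append((os.path.join(root, name), os.path.join(target_root, name)))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces evaluation so that exceptions from workers propagate.
        list(pool.map(lambda job: shutil.copy2(*job), jobs))
    return len(jobs)


# Demo: copy a small nested folder and read one file back.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "results")
    os.makedirs(os.path.join(source, "run_1"))
    with open(os.path.join(source, "run_1", "out.txt"), "w") as fh:
        fh.write("42")
    copied = parallel_copy_folder(source, os.path.join(tmp, "local_data"))
    with open(os.path.join(tmp, "local_data", "run_1", "out.txt")) as fh:
        copied_content = fh.read()
```

Threads are sufficient here because file copies are I/O bound. On a local disk the gain is usually small, and the benefit depends on how many concurrent requests the file system can handle.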
If you have any suggestions or questions about copy_guardian feel free to email me at uwe.schmitt@id.ethz.ch.
If you encounter any errors or problems with copy_guardian, please let me know!