Quick Start

Archiving from ARC clusters directly to Data Den

Archivetar has reasonable defaults when used on ARC systems. Specifically, --source defaults to the current cluster, and --destination defaults to Non-Sensitive Data Den, or to Sensitive Data Den on a PHI system. Either option can be set explicitly to override its default.

module load archivetar
cd /nfs/turbo/<volume>/<toarchive>/
archivetar --prefix my-prefix --destination-dir /datadenvolume/<target-dataden-folder>/

Options used to address common issues are:

  • --rm-at-files   Delete tars on the cluster after uploading to Data Den. Useful to avoid filling local storage.
  • --bundle-dir <path>  Create the local tars in an alternative location. Useful if the space being archived is full; commonly pointed at /scratch/.
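
The two options above can be combined. The following is a minimal sketch, not part of archivetar itself: the function name archive_with_staging is hypothetical, and the check on the staging directory is an added precaution rather than something archivetar requires. It uses only the options shown in this section.

```shell
# Sketch: stage tars in an alternative location and delete them after upload.
# Arguments: $1 = prefix, $2 = Data Den destination directory,
#            $3 = directory for the local tars (for example under /scratch/).
archive_with_staging() {
    # Refuse to start if the staging directory is missing or not writable.
    if [ ! -d "$3" ] || [ ! -w "$3" ]; then
        echo "bundle dir '$3' is not a writable directory" >&2
        return 1
    fi
    archivetar --prefix "$1" --destination-dir "$2" --bundle-dir "$3" --rm-at-files
}
```

Run it from the directory being archived, passing the prefix, the Data Den folder, and a staging directory of your choice as the three arguments.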

Specify the small-file cutoff and the tar size before a new tar is started

Setting a file size cutoff with --size limits the tars to files below that cutoff. This makes archives slightly messier but often saves significant time.

# --tar-size is a minimum, so tars may be much larger than the value given here.
# It is also measured before compression.
archivetar --prefix myarchive --size 20G --tar-size 10G
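
Before choosing a cutoff, it can help to see which files would sit above it. The following is a rough preview using find, not archivetar's own scan, so the two may classify borderline files differently (find rounds sizes up to whole units). The function name list_over_cutoff is hypothetical.

```shell
# Sketch: list regular files under a directory that are larger than a cutoff.
# Arguments: $1 = directory to scan, $2 = cutoff in find(1) syntax (e.g. 20G).
list_over_cutoff() {
    find "$1" -type f -size +"$2"
}
```

For example, list_over_cutoff . 20G prints the files under the current directory that exceed the 20G cutoff used above.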

Expand archived directory

unarchivetar --prefix project1

Upload via Globus to Archive

archivetar --prefix project1 --source <globus UUID> \
  --destination <globus UUID> --destination-path <path on archive>
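
When the UUIDs come from shell variables, an unset variable would pass an empty value to the command. The following is a minimal sketch of a guard against that; the function name upload_via_globus is hypothetical, and the check is an added precaution, not archivetar behavior.

```shell
# Sketch: run the Globus upload only when every endpoint value is non-empty.
# Arguments: $1 = prefix, $2 = source UUID, $3 = destination UUID,
#            $4 = path on the archive.
upload_via_globus() {
    for value in "$2" "$3" "$4"; do
        if [ -z "$value" ]; then
            echo "source, destination, and destination path must all be set" >&2
            return 1
        fi
    done
    archivetar --prefix "$1" --source "$2" --destination "$3" --destination-path "$4"
}
```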