borg-create

BORG-CREATE(1) borg backup tool BORG-CREATE(1)

NAME

   borg-create - Create new archive

SYNOPSIS

   borg [common options] create [options] ARCHIVE [PATH...]

DESCRIPTION

   This command creates a backup archive containing all files found while recursively traversing all paths specified. Paths are added to the archive as they are given. This means that if
   relative paths are desired, the command has to be run from the correct directory.

   The slashdot hack in paths (recursion roots) is triggered by using /./: /this/gets/stripped/./this/gets/archived means to process that fs object, but strip the prefix on the left side
   of ./ from the archived items (in this case, this/gets/archived will be the path in the archived item).
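
   The effect of the slashdot hack on the stored path can be mimicked with plain shell parameter expansion. This is only an illustration of the documented result, not borg's
   implementation; treating the first /./ as the marker is an assumption of this sketch.

```shell
# Illustration only: reproduces the documented result of the slashdot hack.
path='/this/gets/stripped/./this/gets/archived'

case "$path" in
    */./*) archived=${path#*/./} ;;   # drop everything up to and including the first "/./"
    *)     archived=$path ;;          # no marker: the hack strips nothing
esac

printf '%s\n' "$archived"             # prints: this/gets/archived
```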

   When giving '-' as path, borg will read data from standard input and create a file 'stdin' in the created archive from that data. In some cases it's more appropriate to use
   --content-from-command, however. See section Reading backup data from stdin below for details.

   The archive will consume almost no disk space for files or parts of files that have already been stored in other archives.

   The archive name needs to be unique. It must not end in '.checkpoint' or '.checkpoint.N' (with N being a number), because these names are used for checkpoints and treated in special
   ways.

   In the archive name, you may use the following placeholders: {now}, {utcnow}, {fqdn}, {hostname}, {user} and some others.

   Backup speed is increased by not reprocessing files that are already part of existing archives and weren't modified. The detection of unmodified files is done by comparing multiple
   file metadata values with previous values kept in the files cache.

   This comparison can operate in different modes as given by --files-cache:

   • ctime,size,inode (default)

   • mtime,size,inode (default behaviour of borg versions older than 1.1.0rc4)

   • ctime,size (ignore the inode number)

   • mtime,size (ignore the inode number)

   • rechunk,ctime (all files are considered modified - rechunk, cache ctime)

   • rechunk,mtime (all files are considered modified - rechunk, cache mtime)

   • disabled (disable the files cache, all files considered modified - rechunk)

   inode number: better safety, but often unstable on network filesystems

   Normally, detecting file modifications will take inode information into consideration to improve the reliability of file change detection. This is problematic for files located on
   sshfs and similar network file systems which do not provide stable inode numbers: such files will always be considered modified. You can use modes without inode in this case to
   improve performance, but reliability of change detection might be reduced.
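
   As a rough sketch of that choice, a wrapper script could select the --files-cache mode from the filesystem type of the source. The helper name files_cache_mode and the list of
   filesystem types treated as having unstable inode numbers are assumptions made for illustration; adjust them to your environment.

```shell
# Hypothetical helper: map a filesystem type name to a --files-cache mode.
# Which types count as "unstable inode numbers" is an assumption of this sketch.
files_cache_mode() {
    case "$1" in
        fuse.sshfs) echo 'ctime,size' ;;        # leave the inode number out of the comparison
        *)          echo 'ctime,size,inode' ;;  # the documented default mode
    esac
}

files_cache_mode fuse.sshfs
files_cache_mode ext4

# Possible use on Linux, assuming util-linux findmnt is available:
#   mode=$(files_cache_mode "$(findmnt -n -o FSTYPE -T sshfs-mount)")
#   borg create --files-cache "$mode" /path/to/repo::ARCHIVE sshfs-mount
```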

   ctime vs. mtime: safety vs. speed

   • ctime is a rather safe way to detect changes to a file (metadata and contents) as it can not be set from userspace. But a metadata-only change will already update the ctime (e.g.
     doing a chown or chmod to a file will change its ctime), so there might be some unnecessary chunking/hashing even without content changes. Some filesystems do not support ctime
     (change time).

   • mtime usually works and only updates if file contents were changed. But mtime can be arbitrarily set from userspace, e.g. to set mtime back to the same value it had before a content
     change happened. This can be done maliciously as well as with good intentions, but in both cases mtime-based cache modes can be problematic.
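
   The difference between the two timestamps can be observed directly. The following sketch assumes GNU coreutils stat (the -c option); BSD and macOS stat use different flags.

```shell
# Assumes GNU coreutils stat: %Y = mtime, %Z = ctime, both in seconds since the epoch.
f=$(mktemp)
echo 'content' > "$f"
mtime_before=$(stat -c %Y "$f")
ctime_before=$(stat -c %Z "$f")

sleep 2            # make the change visible even at one-second timestamp resolution
chmod 644 "$f"     # metadata-only change, file contents are untouched

mtime_after=$(stat -c %Y "$f")
ctime_after=$(stat -c %Z "$f")

[ "$mtime_before" -eq "$mtime_after" ] && echo 'mtime unchanged'
[ "$ctime_after" -gt "$ctime_before" ] && echo 'ctime advanced'
rm -f "$f"
```

   With a ctime-based mode such a file would be read and chunked again; with an mtime-based mode it would be treated as unmodified.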

   The mount points of filesystems or filesystem snapshots should be the same for every creation of a new archive to ensure fast operation. This is because the files cache that is used to
   determine changed files quickly uses absolute filenames. If this is not possible, consider creating a bind mount to a stable location.

   The --progress option shows (from left to right) the Original, Compressed and Deduplicated sizes (O, C and D, respectively), then the Number of files (N) processed so far, followed by
   the currently processed path.

   When using --stats, you will get some statistics about how much data was added. The "This Archive" deduplicated size is the most interesting figure, as that is how much your repository
   will grow. Please note that the "All archives" stats refer to the state after creation. Also, the --stats and --dry-run options are mutually exclusive because the data is not actually
   compressed and deduplicated during a dry run.

   For more help on include/exclude patterns, see the output of the "borg help patterns" command.

   For more help on placeholders, see the output of the "borg help placeholders" command.

OPTIONS

   See borg-common(1) for common options of Borg commands.

arguments

   ARCHIVE
          name of archive to create (must also be a valid directory name)

   PATH   paths to archive

optional arguments

   -n, --dry-run
          do not create a backup archive

   -s, --stats
          print statistics for the created archive

   --list output verbose list of items (files, dirs, ...)

   --filter STATUSCHARS
          only display items with the given status characters (see description)

   --json output stats as JSON. Implies --stats.

   --no-cache-sync
          experimental: do not synchronize the cache. Implies not using the files cache.

   --stdin-name NAME
          use NAME in archive for stdin data (default: 'stdin')

   --stdin-user USER
          set user USER in archive for stdin data (default: 'root')

   --stdin-group GROUP
          set group GROUP in archive for stdin data (default: 'wheel')

   --stdin-mode M
          set mode to M in archive for stdin data (default: 0660)

   --content-from-command
          interpret PATH as command and store its stdout. See also section Reading from stdin below.

   --paths-from-stdin
          read DELIM-separated list of paths to backup from stdin. All control is external: it will back up all files given - no more, no less.

   --paths-from-command
          interpret PATH as command and treat its output as --paths-from-stdin

   --paths-delimiter DELIM
          set path delimiter for --paths-from-stdin and --paths-from-command (default: \n)

Include/Exclude options

   -e PATTERN, --exclude PATTERN
          exclude paths matching PATTERN

   --exclude-from EXCLUDEFILE
          read exclude patterns from EXCLUDEFILE, one per line

   --pattern PATTERN
          include/exclude paths matching PATTERN

   --patterns-from PATTERNFILE
          read include/exclude patterns from PATTERNFILE, one per line

   --exclude-caches
          exclude directories that contain a CACHEDIR.TAG file (http://www.bford.info/cachedir/spec.html)

   --exclude-if-present NAME
          exclude directories that are tagged by containing a filesystem object with the given NAME

   --keep-exclude-tags
          if tag objects are specified with --exclude-if-present, don't omit the tag objects themselves from the backup archive

   --exclude-nodump
          exclude files flagged NODUMP

Filesystem options

   -x, --one-file-system
          stay in the same file system and do not store mount points of other file systems - this might behave differently from your expectations, see the description below.

   --numeric-owner
          deprecated, use --numeric-ids instead

   --numeric-ids
          only store numeric user and group identifiers

   --noatime
          do not store atime into archive

   --atime
          do store atime into archive

   --noctime
          do not store ctime into archive

   --nobirthtime
          do not store birthtime (creation date) into archive

   --nobsdflags
          deprecated, use --noflags instead

   --noflags
          do not read and store flags (e.g. NODUMP, IMMUTABLE) into archive

   --noacls
          do not read and store ACLs into archive

   --noxattrs
          do not read and store xattrs into archive

   --sparse
          detect sparse holes in input (supported only by fixed chunker)

   --files-cache MODE
          operate files cache in MODE. default: ctime,size,inode

   --read-special
          open and read block and char device files as well as FIFOs as if they were regular files. Also follows symlinks pointing to these kinds of files.

Archive options

   --comment COMMENT
          add a comment text to the archive

   --timestamp TIMESTAMP
          manually specify the archive creation date/time (UTC, yyyy-mm-ddThh:mm:ss format). Alternatively, give a reference file/directory.

   -c SECONDS, --checkpoint-interval SECONDS
          write checkpoint every SECONDS seconds (Default: 1800)

   --chunker-params PARAMS
          specify the chunker parameters (ALGO, CHUNK_MIN_EXP, CHUNK_MAX_EXP, HASH_MASK_BITS, HASH_WINDOW_SIZE). default: buzhash,19,23,21,4095

   -C COMPRESSION, --compression COMPRESSION
          select compression algorithm, see the output of the "borg help compression" command for details.

EXAMPLES

      # Backup ~/Documents into an archive named "my-documents"
      $ borg create /path/to/repo::my-documents ~/Documents

      # same, but list all files as we process them
      $ borg create --list /path/to/repo::my-documents ~/Documents

      # Backup /mnt/disk/docs, but strip path prefix using the slashdot hack
      $ borg create /path/to/repo::docs /mnt/disk/./docs

      # Backup ~/Documents and ~/src but exclude pyc files
      $ borg create /path/to/repo::my-files \
          ~/Documents                       \
          ~/src                             \
          --exclude '*.pyc'

      # Backup home directories excluding image thumbnails (i.e. only
      # /home/<one directory>/.thumbnails is excluded, not /home/*/*/.thumbnails etc.)
      $ borg create /path/to/repo::my-files /home \
          --exclude 'sh:home/*/.thumbnails'

      # Backup the root filesystem into an archive named "root-YYYY-MM-DD"
      # use zlib compression (good, but slow) - default is lz4 (fast, low compression ratio)
      $ borg create -C zlib,6 --one-file-system /path/to/repo::root-{now:%Y-%m-%d} /

      # Backup onto a remote host ("push" style) via ssh to port 2222,
      # logging in as user "borg" and storing into /path/to/repo
      $ borg create ssh://borg@backup.example.org:2222/path/to/repo::{fqdn}-root-{now} /

      # Backup a remote host locally ("pull" style) using sshfs
      $ mkdir sshfs-mount
      $ sshfs root@example.com:/ sshfs-mount
      $ cd sshfs-mount
      $ borg create /path/to/repo::example.com-root-{now:%Y-%m-%d} .
      $ cd ..
      $ fusermount -u sshfs-mount

      # Make a big effort in fine granular deduplication (big chunk management
      # overhead, needs a lot of RAM and disk space, see formula in internals
      # docs - same parameters as borg < 1.0 or attic):
      $ borg create --chunker-params buzhash,10,23,16,4095 /path/to/repo::small /smallstuff

      # Backup a raw device (must not be active/in use/mounted at that time)
      $ borg create --read-special --chunker-params fixed,4194304 /path/to/repo::my-sdx /dev/sdX

      # Backup a sparse disk image (must not be active/in use/mounted at that time)
      $ borg create --sparse --chunker-params fixed,4194304 /path/to/repo::my-disk my-disk.raw

      # No compression (none)
      $ borg create --compression none /path/to/repo::arch ~

      # Super fast, low compression (lz4, default)
      $ borg create /path/to/repo::arch ~

      # Less fast, higher compression (zlib, N = 0..9)
      $ borg create --compression zlib,N /path/to/repo::arch ~

      # Even slower, even higher compression (lzma, N = 0..9)
      $ borg create --compression lzma,N /path/to/repo::arch ~

      # Only compress compressible data with lzma,N (N = 0..9)
      $ borg create --compression auto,lzma,N /path/to/repo::arch ~

      # Use short hostname, user name and current time in archive name
      $ borg create /path/to/repo::{hostname}-{user}-{now} ~
      # Similar, use the same datetime format that is default as of borg 1.1
      $ borg create /path/to/repo::{hostname}-{user}-{now:%Y-%m-%dT%H:%M:%S} ~
      # As above, but add microseconds
      $ borg create /path/to/repo::{hostname}-{user}-{now:%Y-%m-%dT%H:%M:%S.%f} ~

      # Backing up relative paths by moving into the correct directory first
      $ cd /home/user/Documents
      # The root directory of the archive will be "projectA"
      $ borg create /path/to/repo::daily-projectA-{now:%Y-%m-%d} projectA

      # Use external command to determine files to archive
      # Use --paths-from-stdin with find to only backup files less than 1MB in size
      $ find ~ -size -1000k | borg create --paths-from-stdin /path/to/repo::small-files-only
      # Use --paths-from-command with find to only backup files from a given user
      $ borg create --paths-from-command /path/to/repo::joes-files -- find /srv/samba/shared -user joe
      # Use --paths-from-stdin with --paths-delimiter (for example, for filenames with newlines in them)
      $ find ~ -size -1000k -print0 | borg create \
          --paths-from-stdin \
          --paths-delimiter "\0" \
          /path/to/repo::smallfiles-handle-newline

NOTES

   The --exclude patterns are not like tar. In tar, --exclude .bundler/gems will exclude foo/.bundler/gems. In borg it will not; you need to use --exclude '*/.bundler/gems' to get the
   same effect.

   In addition to using --exclude patterns, it is possible to use --exclude-if-present to specify the name of a filesystem object (e.g. a file or folder name) which, when contained
   within another folder, will prevent the containing folder from being backed up. By default, the containing folder and all of its contents will be omitted from the backup. If,
   however, you wish to only include the objects specified by --exclude-if-present in your backup, and not include any other contents of the containing folder, this can be enabled by
   using the --keep-exclude-tags option.

   The -x or --one-file-system option excludes directories that are mountpoints (and everything in them). It detects mountpoints by comparing the device number from the output of
   stat() of the directory and its parent directory. Specifically, it excludes directories for which stat() reports a device number different from the device number of their parent. In
   general: be aware that there are directories with a device number different from their parent which the kernel does not consider a mountpoint, and also the other way around. Linux
   examples for this are bind mounts (possibly same device number, but always a mountpoint) and ALL subvolumes of a btrfs (different device number from parent but not necessarily a
   mountpoint). macOS examples are the apfs mounts of a typical macOS installation. Therefore, when using --one-file-system, you should double-check that the backup works as intended.
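
   To check in advance which directories such a device number comparison would flag, the comparison can be reproduced in the shell. This is a sketch of the heuristic as described
   above, not borg's code; the function name on_other_device is made up for illustration, and GNU coreutils stat is assumed.

```shell
# Succeeds if DIR reports a different device number than its parent directory.
# Assumes GNU coreutils stat, where -c %d prints the device number.
on_other_device() {
    dir=$1
    [ "$(stat -c %d "$dir")" != "$(stat -c %d "$dir/..")" ]
}

top=$(mktemp -d)
mkdir "$top/sub"
if on_other_device "$top/sub"; then
    echo "$top/sub: different device number than its parent"
else
    echo "$top/sub: same device number as its parent"
fi
rm -rf "$top"
```

   Running the function over the top-level directories of the tree to be backed up shows which ones -x would leave out.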

Item flags

   --list outputs a list of all files, directories and other file system items it considered (no matter whether they had content changes or not). For each item, it prefixes a
   single-letter flag that indicates type and/or status of the item.

   If you are interested only in a subset of that output, you can give e.g.  --filter=AME and it will only show regular files with A, M or E status (see below).

   An uppercase character represents the status of a regular file relative to the "files" cache (not relative to the repo -- this is an issue if the files cache is not used). Metadata is
   stored in any case and for 'A' and 'M' also new data chunks are stored. For 'U' all data chunks refer to already existing chunks.

   • 'A' = regular file, added (see also the FAQ entry about 'A' status being shown for unchanged files)

   • 'M' = regular file, modified

   • 'U' = regular file, unchanged

   • 'C' = regular file, it changed while we backed it up

   • 'E' = regular file, an error happened while accessing/reading this file

   A lowercase character means a file type other than a regular file; borg usually just stores their metadata:

   • 'd' = directory

   • 'b' = block device

   • 'c' = char device

   • 'h' = regular file, hardlink (to already seen inodes)

   • 's' = symlink

   • 'f' = fifo

   Other flags used include:

   • 'i' = backup data was read from standard input (stdin)

   • '-' = dry run, item was not backed up

   • 'x' = excluded, item was not backed up

   • '?' = missing status code (if you see this, please file a bug report!)

Reading backup data from stdin

   There are two methods to read from stdin. Either specify - as path and pipe directly to borg:

      backup-vm --id myvm --stdout | borg create REPO::ARCHIVE -

   Or use --content-from-command to have Borg manage the execution of the command and piping. If you do so, the first PATH argument is interpreted as command to execute and any further
   arguments are treated as arguments to the command:

      borg create --content-from-command REPO::ARCHIVE -- backup-vm --id myvm --stdout

   -- is used to ensure --id and --stdout are not considered arguments to borg but rather backup-vm.

   The difference between the two approaches is that piping to borg creates an archive even if the command piping to borg exits with a failure. In this case, one can end up with
   truncated output being backed up. Using --content-from-command, in contrast, borg is guaranteed to fail without creating an archive should the command fail. The command is considered
   failed if it returns a non-zero exit code.
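
   The reason a failing producer goes unnoticed in a pipe is that a shell pipeline normally reports only the exit status of its last command. The sketch below uses stand-ins (false
   for a failing backup-vm, cat for borg) and assumes bash is installed for the pipefail variant.

```shell
# Stand-ins: "false" plays a failing producer, "cat > /dev/null" plays the consumer.
false | cat > /dev/null
plain_status=$?
echo "plain pipeline status: $plain_status"        # the producer's failure is hidden

# With bash's pipefail option the calling script at least sees the failure.
# The consumer has still processed whatever it received by that point.
pipefail_status=0
bash -c 'set -o pipefail; false | cat > /dev/null' || pipefail_status=$?
echo "pipefail pipeline status: $pipefail_status"
```

   Note that pipefail only makes the failure visible to the script; it does not stop the consumer from storing truncated input, which is what --content-from-command avoids.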

   Reading from stdin yields just a stream of data without file metadata associated with it, and the files cache is not needed at all. So it is safe to disable it via
   --files-cache disabled and speed up backup creation a bit.

   By default, the content read from stdin is stored in a file called 'stdin'.  Use --stdin-name to change the name.

Feeding all file paths externally

   Usually, you give a starting path (recursion root) to borg and then borg automatically recurses, finds and backs up all fs objects contained in there (optionally considering
   include/exclude rules).

   If you need more control and you want to give every single fs object path to borg (maybe implementing your own recursion or your own rules), you can use --paths-from-stdin or
   --paths-from-command (with the latter, borg will fail to create an archive should the command fail).
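
   When generating such a path list, keep in mind that the default delimiter is a newline (see --paths-delimiter). The following sketch shows why a NUL delimiter is safer for
   unusual file names; it assumes a find that supports -print0.

```shell
dir=$(mktemp -d)
# Create one file whose name contains a newline character.
touch "$dir/$(printf 'odd\nname')"

# Newline-delimited output splits the single file name into two "paths".
newline_count=$(find "$dir" -type f -print | wc -l)
# NUL-delimited output contains exactly one terminator for the one file.
nul_count=$(find "$dir" -type f -print0 | tr -cd '\0' | wc -c)

echo "newline-delimited entries: $newline_count"
echo "NUL-delimited entries: $nul_count"
rm -rf "$dir"
```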

   Borg supports paths with the slashdot hack to strip path prefixes here also.  So, be careful not to unintentionally trigger that.
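
   One way to guard against that is to scan the generated list for the /./ marker before handing it to borg. The path list below is hypothetical and only serves as sample input.

```shell
# Hypothetical path list; in practice it would come from find or your own script.
paths=$(printf '%s\n' '/srv/data/report.txt' '/srv/./data/notes.txt')

# grep -F matches a fixed string, so the dots are taken literally.
flagged=$(printf '%s\n' "$paths" | grep -F '/./' || true)

if [ -n "$flagged" ]; then
    printf 'paths containing the slashdot marker:\n%s\n' "$flagged"
fi
```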