How to use snapshots, clones and replication in ZFS on Linux

In the previous tutorial, we learned how to create a zpool and a ZFS filesystem or dataset. In this tutorial, I will show you step by step how to work with ZFS snapshots, clones, and replication. Snapshots, clones, and replication are among the most powerful features of the ZFS filesystem.

ZFS Snapshots - an overview

Snapshots are one of the most powerful features of ZFS. A snapshot provides a read-only, point-in-time copy of a file system or volume that initially consumes no extra space in the ZFS pool. It only starts to use space once blocks in the live dataset are changed or deleted, because the snapshot keeps referencing the old blocks. Snapshots therefore save disk space by recording only the differences between the current dataset and a previous version.

A typical example use for a snapshot is to have a quick way of backing up the current state of the file system when a risky action like a software installation or a system upgrade is performed.

Creating and Destroying a ZFS Snapshot

Snapshots of volumes cannot be accessed directly, but they can be cloned, backed up, and rolled back to. Creating and destroying a ZFS snapshot is very easy: we use the zfs snapshot and zfs destroy commands.

Create a pool called datapool.

# zpool create datapool mirror /dev/sdb /dev/sdc 
# zpool list
NAME       SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
datapool  1.98G    65K  1.98G         -     0%     0%  1.00x  ONLINE  -

Now that we have a pool called datapool, we create a ZFS filesystem on it to try out the snapshot feature.

# zfs create datapool/docs -o mountpoint=/docs
# zfs list -r datapool
NAME            USED  AVAIL  REFER  MOUNTPOINT
datapool       93.5K  1.92G    19K  /datapool
datapool/docs    19K  1.92G    19K  /docs

To create a snapshot of the file system, we use the zfs snapshot command and specify the dataset and the snapshot name, separated by an @ sign. We can add the -r option to create snapshots recursively for all descendant datasets. The snapshot name must have one of the following forms:

filesystem@snapname
volume@snapname
# zfs snapshot datapool/[email protected]
# zfs list -t snapshot
NAME                     USED  AVAIL  REFER  MOUNTPOINT
datapool/[email protected]      0      -  19.5K  -

A snapshot for datapool/docs is created.

To destroy the snapshot, we can use the zfs destroy command as usual.

# zfs destroy datapool/[email protected]
# zfs list -t snapshot
no datasets available
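
The create step above can also be wrapped in small helper functions, for example to label each snapshot with the current date and time. This is only a sketch: the function names and the timestamp format are assumptions made for this example, and the only ZFS commands used are zfs snapshot and its -r option.

```shell
#!/bin/sh
# Sketch only: the helper names and the timestamp format are assumptions
# made for this example, not part of ZFS.

# Build a snapshot name of the form dataset@label.
snapshot_name() {
    printf '%s@%s\n' "$1" "$2"
}

# Snapshot one dataset, labelled with the current date and time.
snapshot_now() {
    zfs snapshot "$(snapshot_name "$1" "$(date +%Y-%m-%d_%H%M)")"
}

# Same, but -r also snapshots every descendant dataset under the same label.
snapshot_now_recursive() {
    zfs snapshot -r "$(snapshot_name "$1" "$(date +%Y-%m-%d_%H%M)")"
}
```

After loading these functions into a root shell, calling snapshot_now with datapool/docs as its argument would create a snapshot of that dataset, and snapshot_now_recursive with datapool would cover every dataset in the pool.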

Rolling back a snapshot

To demonstrate a rollback, we first create a test file in the /docs directory and take a snapshot of the file system.

# echo "version 1" > /docs/data.txt
# cat /docs/data.txt
version 1
# zfs snapshot datapool/[email protected]
# zfs list -t snapshot
NAME                     USED  AVAIL  REFER  MOUNTPOINT
datapool/[email protected]     9K      -  19.5K  -

Now we change the content of /docs/data.txt:

# echo "version 2" > /docs/data.txt
# cat /docs/data.txt
version 2

We can roll back completely to an older snapshot, which returns the file system to the state it was in when the snapshot was taken.

# zfs list -t snapshot
NAME                     USED  AVAIL  REFER  MOUNTPOINT
datapool/[email protected]  9.50K      -  19.5K  -
# zfs rollback datapool/[email protected]
# cat /docs/data.txt
version 1

As we can see, the content of data.txt is back to the previous content.
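
A rollback discards everything written after the snapshot, so it can help to look at the changes first, or to copy back a single file instead. The sketch below assumes hypothetical helper names. It relies on the zfs diff command, on the -r option of zfs rollback, and on the hidden .zfs/snapshot directory that ZFS exposes at the mount point of each file system.

```shell
#!/bin/sh
# Sketch only: the helper names are assumptions made for this example.

# List what changed in the live file system since the snapshot was taken.
# ${1%@*} strips the @label part, leaving the dataset name.
changes_since() {
    zfs diff "$1" "${1%@*}"
}

# Roll back to a snapshot that is not the most recent one.
# -r destroys every snapshot taken after the target, which cannot be undone.
rollback_older() {
    zfs rollback -r "$1"
}

# Copy one file out of a snapshot instead of rolling back the whole dataset.
# Arguments: mount point, snapshot label, path relative to the mount point.
restore_file() {
    cp -p "$1/.zfs/snapshot/$2/$3" "$1/$3"
}
```

With the file system from this tutorial, restore_file would take /docs, the snapshot label, and data.txt as its three arguments, and would leave all other files in /docs untouched.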

If we want to rename the snapshot, we can use the zfs rename command.

# zfs rename datapool/[email protected] datapool/[email protected]
# zfs list -t snapshot
NAME                     USED  AVAIL  REFER  MOUNTPOINT
datapool/[email protected]  9.50K      -  19.5K  -

Note: a dataset cannot be destroyed while snapshots of it exist, but the -r option destroys the dataset together with all of its snapshots.

# zfs destroy datapool/docs
cannot destroy 'datapool/docs': filesystem has children
use '-r' to destroy the following datasets:
datapool/[email protected]
# zfs destroy -r datapool/docs
# zfs list -t snapshot
no datasets available
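
Because a recursive destroy removes the dataset and all of its snapshots in one step, it can be worth previewing it first. The following sketch uses the -n (dry run) and -v (verbose) options of zfs destroy. The two function names are assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: the helper names are assumptions made for this example.

# Show what a recursive destroy would remove without deleting anything.
# -n performs a dry run, -v prints the datasets that would be destroyed.
preview_destroy() {
    zfs destroy -r -n -v "$1"
}

# Destroy the dataset together with all of its snapshots and children.
destroy_recursive() {
    zfs destroy -r "$1"
}
```

Running preview_destroy on datapool/docs before destroy_recursive gives a chance to check the list before any data is removed.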

Overview of ZFS Clones

A clone is a writable volume or file system whose initial contents are the same as the dataset from which it was created.

Creating and Destroying a ZFS Clone

Clones can only be created from a snapshot, and a snapshot cannot be deleted until the clones based on it have been deleted. To create a clone, use the zfs clone command.

# zfs create datapool/docs -o mountpoint=/docs
# zfs list -r datapool
NAME            USED  AVAIL  REFER  MOUNTPOINT
datapool       93.5K  1.92G    19K  /datapool
datapool/docs    19K  1.92G    19K  /docs
# mkdir /docs/folder{1..5}
# ls /docs/
folder1  folder2  folder3  folder4  folder5
# zfs snapshot datapool/[email protected]
# zfs list -t snapshot
NAME                  USED  AVAIL  REFER  MOUNTPOINT
datapool/[email protected]      0      -    19K  -

Now we create a clone from the snapshot datapool/[email protected]:

# zfs clone datapool/[email protected] datapool/pict
# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
datapool        166K  1.92G    19K  /datapool
datapool/docs    19K  1.92G    19K  /docs
datapool/pict     1K  1.92G    19K  /datapool/pict

The cloning process is finished: the snapshot datapool/[email protected] has been cloned to the dataset datapool/pict, which is mounted at /datapool/pict. When we check the content of the /datapool/pict directory, it should be the same as the content of /docs.

# ls /datapool/pict
folder1  folder2  folder3  folder4  folder5

After a snapshot has been cloned, it cannot be deleted until the clone has been deleted.

# zfs destroy datapool/[email protected]
cannot destroy 'datapool/[email protected]': snapshot has dependent clones
use '-R' to destroy the following datasets:
datapool/pict
# zfs destroy datapool/pict

Finally we can destroy the snapshot.

# zfs destroy datapool/[email protected]
# zfs list -t snapshot
no datasets available
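
If the clone should be kept and the original dataset removed instead, the dependency can be reversed. The sketch below assumes hypothetical helper names. It uses the origin property, which records the snapshot a clone was created from, and the zfs promote command.

```shell
#!/bin/sh
# Sketch only: the helper names are assumptions made for this example.

# Print the snapshot a clone was created from.
# -H omits the header line, -o value prints only the property value.
clone_origin() {
    zfs get -H -o value origin "$1"
}

# Promote a clone. The origin snapshot moves to the promoted clone, and the
# dataset it was cloned from becomes the dependent one, so that dataset can
# then be destroyed while the clone stays.
promote_clone() {
    zfs promote "$1"
}
```

For the example above, clone_origin with datapool/pict would print the snapshot of datapool/docs that the clone was created from, and promote_clone with datapool/pict would make datapool/docs depend on datapool/pict rather than the other way round.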

Overview of ZFS Replication

ZFS replication is based on snapshots. We can create a snapshot at any time, and we can create as many snapshots as we like. By continually creating, transferring, and restoring snapshots, we can keep one or more machines synchronized. ZFS provides a built-in serialization feature that sends a stream representation of the data to standard output.

Configure ZFS Replication

In this section, I will show you how to replicate a dataset from datapool to backuppool. The data does not have to stay on a pool connected to the local system; it can also be sent over a network to another system. The commands used for replicating data are zfs send and zfs receive.

Create another pool called backuppool.

# zpool create backuppool mirror sde sdf
# zpool list
NAME         SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
backuppool  1.98G    50K  1.98G         -     0%     0%  1.00x  ONLINE  -
datapool    1.98G   568K  1.98G         -     0%     0%  1.00x  ONLINE  -

Check the pool status:

# zpool status
  pool: datapool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        datapool    ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0

errors: No known data errors

  pool: backuppool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        backuppool    ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sde     ONLINE       0     0     0
            sdf     ONLINE       0     0     0

errors: No known data errors

Create a snapshot of the dataset that we'll replicate.

# zfs snapshot datapool/[email protected]
# zfs list -t snapshot
NAME                  USED  AVAIL  REFER  MOUNTPOINT
datapool/[email protected]      0      -    19K  -
# ls /docs/
folder1  folder2  folder3  folder4  folder5

It's time to do the replication.

# zfs send datapool/[email protected] | zfs receive backuppool/backup
# zfs list
NAME                USED  AVAIL  REFER  MOUNTPOINT
backuppool           83K  1.92G    19K  /backuppool
backuppool/backup    19K  1.92G    19K  /backuppool/backup
datapool            527K  1.92G    19K  /datapool
datapool/docs        19K  1.92G    19K  /docs
# ls /backuppool/backup
folder1  folder2  folder3  folder4  folder5

The snapshot datapool/[email protected] has been successfully replicated to backuppool/backup.

To replicate a dataset to another machine, we can use the command below:

# zfs send datapool/[email protected] | ssh otherserver zfs recv backuppool/backup

Done.
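
Once the first full copy exists on the target, later runs only need to transfer the changes between two snapshots. The sketch below shows this with the -i option of zfs send, plus a variant with -R that sends a dataset together with its descendants and all of their snapshots. The function names and argument order are assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: the helper names and argument order are assumptions made
# for this example.

# Incremental send: transfer only the changes between an older and a newer
# snapshot. The target must already hold the older snapshot. If the target
# was modified in the meantime, zfs receive -F rolls it back first.
# Arguments: older snapshot, newer snapshot, target dataset.
send_incremental() {
    zfs send -i "$1" "$2" | zfs receive "$3"
}

# The same incremental transfer to another machine over ssh.
# Arguments: older snapshot, newer snapshot, target dataset, remote host.
send_incremental_ssh() {
    zfs send -i "$1" "$2" | ssh "$4" zfs receive "$3"
}

# Replication stream: -R includes descendant datasets, their snapshots and
# their properties. -u leaves the received file systems unmounted, which
# avoids mount point clashes when both pools are on the same machine.
# Arguments: snapshot, target dataset.
send_replicated() {
    zfs send -R "$1" | zfs receive -u -F "$2"
}
```

Restoring after a failure works the same way in the opposite direction: the snapshot held on the backup pool or backup server is sent back and received into the rebuilt pool.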

Conclusion

Snapshots, clones, and replication are among the most powerful features of ZFS. Snapshots create point-in-time copies of file systems or volumes, clones create writable datasets from a snapshot, and replication copies a dataset from one pool to another pool on the same machine or on a different machine.

Comments

By: Name

This article ........ Didn't bother reading it beyond the first two lines. Linux doesn't come with ZFS.

By: till

ZFS exists for Linux of course, not in the kernel but as a separate package. If you had read the tutorial, you would have seen the link in the first line to the first part of this series, which describes how to install ZFS on Linux: https://www.howtoforge.com/tutorial/how-to-install-and-configure-zfs-on-debian-8-jessie/

By: Ramadoni

http://zfsonlinux.org

By: kees

Nice articles, thanks.

I think you forgot to make "pool: backuppool" etc a code block.

Found some links for Ubuntu:

https://wiki.ubuntu.com/ZFS

https://github.com/zfsonlinux/pkg-zfs/wiki/HOWTO-install-Ubuntu-to-a-Native-ZFS-Root-Filesystem

I will give it a try when I have more time.

By: jhon

How safe is it to run a server on ZFS and take snapshots or clones while it is running?

Will no SQL / mail data be missed?

By: Ramadoni

A snapshot is a point-in-time copy: it preserves the state of the filesystem at the time you took the snapshot.

For a database, you can put the database into backup state before taking the snapshot.

By: Tony

Hi,

I'm not sure how old this post is at this point, but I was wondering - if you've done a push of the zfs snapshot to a remote server, is there a straightforward method to pull the data back?  For example, you've pushed the [email protected], a day later the main drive fails - you rebuild the system and the ZFS devices and you're ready to restore the data - do you just copy from the snapshots or is there a way to recover from snapshots on a different drive (or preferably from a remote/different server)?

By: sjau

I've also been using ZFS for a while now on linux. You can push or pull:

[code]

zfs send pool/[email protected] | ssh [email protected] "zfs receive otherPool/Dataset"

[/code]

or

[code]

ssh [email protected] "zfs send pool/[email protected]" | zfs receive otherPool/Dataset

[/code]

 

It works both ways.

By: Zachary Dea

Would love to be proven wrong, but I think the last step, zfs sending to a remote backup server is not functional in ZOL at this time. Perhaps in a 0.7.0 package via unstable package but see https://github.com/zfsonlinux/zfs/issues/434 

I'm trying to do this very thing and receiving the same problems as here: https://superuser.com/questions/877186/zfs-send-recieve-over-ssh-on-linux-without-allowing-root-login

A shame.

By: Tangeek

Hi,

 

Didn't find any info about the date of the article, I'm assuming it's quite old but I just wanted to add something. You CAN browse a snapshot in read-only, even with ZFS on Linux. At the mounting point of each dataset, there's a .zfs folder which will contain a snapshot subfolder for each of them. They are read-only, of course.

Note that you WON'T see this folder, even with ls -a. You can cd into it though. Yup, you have to know it's there, but it is. :D

I think this saves a lot of headache when recovering files.

By: JC

This tutorial is not very understandable

What does

"Now, we have a pool called datapool, next we have to create one ZFS filesystem to simulate the snapshot feature."

mean?

What is

# zfs create datapool/docs -o mountpoint=/docs

doing exactly?

How am I doing this when I already have pools?

Do I need to create new directories for that?

Can the directories be put in another pool?

etc.

 

By: Maluta

ZFS pools are used to store datasets. It's the datasets that get mounted - think of a dataset like an LVM volume, or a partition with a traditional filesystem on it.

So with ZFS first you create a pool which consists of one or more vdevs (a vdev can be a single disk, a mirror, or a raid of disks) - personally I have several machines running multiple mirror vdevs in a single pool. Then you create datasets on the pool, giving each dataset a mountpoint.

The command "zfs create datapool/docs -o mountpoint=/docs" will create a dataset called "docs" under the "datapool" pool, and set its mount point to be "/docs". If the mountpoint doesn't exist, it will be created.

By: maarten

Sorry to say, but you have not made a backup of your data; you have made a backup of a snapshot. A snapshot is only the data that has changed in the original set since the snapshot.

If you lose your original data, a snapshot will not save you!! A replication of a snapshot will also not save you.

Look at the amount of data you are saving: DataPool 527k vs Backup Pool 83k

This mistake is so widespread that I can't find a good tutorial on how to properly back up the original dataset anywhere on the internet; even Oracle thinks the original dataset will never die...

Please spread the word about this dangerous pitfall.

 

By: henning

Exactly. I'm still digging through man pages to find out how to replicate the entire pool, not just the snapshot, on a different drive/machine.

By: Andrea

What if you create a datapool and then, before putting it to use, you create a snapshot? Would this be enough to have a 1:1 copy of your data?

I know it’s not feasible for existing pools, but in theory it should work...

By: edwin eefting

Take a look at ZFS autobackup if you need a complete solution that also can display the executed shell commands: https://github.com/psy0rz/zfs_autobackup