There is a new version of this tutorial available for Ubuntu 22.04 (Jammy Jellyfish).

High-Availability Storage with GlusterFS on Ubuntu 18.04 LTS

GlusterFS is a scalable network filesystem capable of scaling to several petabytes and handling thousands of clients. It is an open-source, distributed file system that aggregates disk storage resources from multiple servers into a single namespace. It is suitable for data-intensive tasks such as cloud storage and media streaming.

In this tutorial, I will show how to set up a high-availability storage server with GlusterFS on Ubuntu 18.04 LTS (Bionic Beaver). We will use three Ubuntu servers: one as a client and the other two as storage. Each storage server will be a mirror of the other, and files will be replicated across both storage servers.

Prerequisites

  • 3 Ubuntu 18.04 Servers
    • 10.0.15.10 - gfs01
    • 10.0.15.11 - gfs02
    • 10.0.15.12 - client01
  • Root Privileges

What will we do?

  1. GlusterFS Pre-Installation
  2. Install GlusterFS Server
  3. Configure GlusterFS Servers
  4. Setup GlusterFS Client
  5. Testing Replicate/Mirroring

Step 1 - GlusterFS Pre-Installation

Before installing glusterfs, we need to configure the hosts file and add the GlusterFS repository on every server.

Configure Hosts File

Log in to each server, get root access with the 'sudo su' command, then edit the '/etc/hosts' file.

vim /etc/hosts

Paste hosts configuration below.

10.0.15.10 gfs01
10.0.15.11 gfs02
10.0.15.12 client01

Save and exit.
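
If you prefer to script this step instead of editing the file by hand, here is a minimal sketch that appends each entry only when it is missing. The add_host_entry function and the HOSTS_FILE variable are names made up for this example. By default it writes to a scratch file, so nothing on the system is touched until HOSTS_FILE is set to /etc/hosts and the script is run as root.

```shell
#!/usr/bin/env bash
# Append "IP NAME" to a hosts file only if that exact line is not there yet.
# add_host_entry is a helper invented for this sketch.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    grep -qxF "$ip $name" "$file" || printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Assumption: work on a scratch file unless HOSTS_FILE is set by the caller.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host_entry "$HOSTS_FILE" 10.0.15.10 gfs01
add_host_entry "$HOSTS_FILE" 10.0.15.11 gfs02
add_host_entry "$HOSTS_FILE" 10.0.15.12 client01
add_host_entry "$HOSTS_FILE" 10.0.15.10 gfs01   # repeated call adds nothing

cat "$HOSTS_FILE"
```

Because the check matches the whole line, the script can be re-run safely on a server that already has some of the entries.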

Now ping each server using the hostname as below.

ping -c 3 gfs01
ping -c 3 gfs02
ping -c 3 client01

Each hostname should resolve to the corresponding server's IP address.

Add GlusterFS Repository

Install the software-properties-common package to the system.

sudo apt install software-properties-common -y

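Add the glusterfs key and PPA repository by running the commands below. The PPA packages are signed with the Launchpad key, which add-apt-repository imports on its own, so the first command is only needed for the Debian packages from download.gluster.org.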

wget -O- https://download.gluster.org/pub/gluster/glusterfs/3.12/rsa.pub | apt-key add -
sudo add-apt-repository ppa:gluster/glusterfs-3.12

The add-apt-repository command also updates the package lists. Run it on all three systems so that each one has the glusterfs repository.

Step 2 - Install GlusterFS Server

In this step, we will install the glusterfs server on 'gfs01' and 'gfs02' servers.

Install glusterfs-server using the apt command.

sudo apt install glusterfs-server -y

Now start the glusterd service and enable it to launch at every system boot.

sudo systemctl start glusterd
sudo systemctl enable glusterd

Glusterfs server is now up and running on the 'gfs01' and 'gfs02' servers.

Check the services and the installed software version.

systemctl status glusterd
glusterfsd --version

Step 3 - Configure GlusterFS Servers

The glusterd services are now up and running. The next step is to configure the servers by creating a trusted storage pool and then creating the replicated glusterfs volume.

Create a Trusted Storage Pool

From the 'gfs01' server, we need to add the 'gfs02' server to the glusterfs storage pool.

Run the command below.

gluster peer probe gfs02

We should see the result 'peer probe: success', which means the 'gfs02' server has been added to the trusted storage pool.

Check the storage pool status and list using commands below.

gluster peer status
gluster pool list

And you will see the 'gfs02' server is connected to the peer cluster, and it's on the pool list.

Setup Replicated GlusterFS Volume

After creating the trusted storage pool, we will create a new replicated glusterfs volume. The bricks of the new volume will be plain directories on the system partition.

Note:

  • For production servers, it's recommended to create the glusterfs volume on a dedicated partition rather than on a directory of the system partition.

Create a new directory '/glusterfs/distributed' on both the 'gfs01' and 'gfs02' servers.

mkdir -p /glusterfs/distributed

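From the 'gfs01' server, create the replicated glusterfs volume named 'vol01' with 2 replicas, one brick on 'gfs01' and one on 'gfs02'. The trailing 'force' is needed here because the bricks are directories on the system partition.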

gluster volume create vol01 replica 2 transport tcp \
gfs01:/glusterfs/distributed \
gfs02:/glusterfs/distributed \
force
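
The create command follows a fixed pattern: volume name, replica count, transport, one host:directory brick per server, then 'force'. As a rough sketch of that pattern, the helper below only prints the command for a given list of hosts so it can be reviewed before running it. The build_volume_create function is a name invented for this example, and it assumes one brick per host with the replica count equal to the number of hosts.

```shell
#!/usr/bin/env bash
# Print (but do not run) a 'gluster volume create' command for a replicated
# volume with one brick per host. build_volume_create is a hypothetical helper.
build_volume_create() {
    local volume="$1" brick_dir="$2"
    shift 2
    # After the shift, $# is the number of hosts, used as the replica count.
    local cmd="gluster volume create ${volume} replica $# transport tcp"
    local host
    for host in "$@"; do
        cmd="${cmd} ${host}:${brick_dir}"
    done
    # 'force' mirrors the tutorial's command (bricks on the system partition).
    printf '%s force\n' "$cmd"
}

build_volume_create vol01 /glusterfs/distributed gfs01 gfs02
```

Remember that the brick directory has to exist on every listed host before the printed command is run.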

Now that the replicated volume 'vol01' has been created, start it and check the volume info.

gluster volume start vol01
gluster volume info vol01


At this stage, we have created the 'vol01' volume with the type 'Replicate' and 2 bricks, one on each of the 'gfs01' and 'gfs02' servers. All data will be replicated automatically to both servers, and we're ready to mount the volume.

Running 'gluster volume info vol01' on the 'gfs02' server shows the same volume information.

Step 4 - Setup GlusterFS Client

In this step, we will mount the glusterfs volume 'vol01' on the Ubuntu client. To do that, we first need to install the glusterfs-client package on the client server.

Install glusterfs-client to the Ubuntu system using the apt command.

sudo apt install glusterfs-client -y

Once the glusterfs-client installation is complete, create a new directory '/mnt/glusterfs'.

mkdir -p /mnt/glusterfs

And mount the replicated glusterfs volume 'vol01' to the '/mnt/glusterfs' directory.

sudo mount -t glusterfs gfs01:/vol01 /mnt/glusterfs

Now check the available volume on the system.

df -h /mnt/glusterfs

And we will get the glusterfs volume mounted to the '/mnt/glusterfs' directory.

Additional:

To mount glusterfs permanently to the Ubuntu client system, we can add the volume to the '/etc/fstab'.

Edit the '/etc/fstab' configuration file.

vim /etc/fstab

And paste configuration below.

gfs01:/vol01 /mnt/glusterfs glusterfs defaults,_netdev 0 0

Save and exit.

Now reboot the server and when it's online, we will get the glusterfs volume 'vol01' mounted automatically through the fstab.
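
A note on availability: the server named in the mount source ('gfs01') is only contacted to fetch the volume description at mount time. After that, the native client talks to both bricks directly. If 'gfs01' happens to be down at the moment of mounting (at boot, for example), the mount fails. As a sketch, assuming the mount helper in your GlusterFS release accepts the 'backup-volfile-servers' option (older releases spelled it differently, so check 'man mount.glusterfs' first), the fstab entry could name 'gfs02' as a fallback. FSTAB_LINE is just a variable name chosen for this example.

```shell
#!/usr/bin/env bash
# Hypothetical variant of the fstab entry above: gfs02 is listed as a fallback
# server for fetching the volume description at mount time.
# Assumption: the installed mount.glusterfs supports 'backup-volfile-servers'.
FSTAB_LINE='gfs01:/vol01 /mnt/glusterfs glusterfs defaults,_netdev,backup-volfile-servers=gfs02 0 0'

# Only print the line for review; add it to /etc/fstab by hand once checked.
echo "$FSTAB_LINE"
```

Use this line in place of the plain entry, not in addition to it.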

Step 5 - Testing Replicate/Mirroring

In this step, we will test the data mirroring on each server node.

Mount the glusterfs volume 'vol01' on each glusterfs server.

On 'gfs01' server.

mount -t glusterfs gfs01:/vol01 /mnt

On 'gfs02' server.

mount -t glusterfs gfs02:/vol01 /mnt

Now back to the Ubuntu client and go to the '/mnt/glusterfs' directory.

cd /mnt/glusterfs

Create some files using touch command.

touch file01 file02 file03

Now check on each server, 'gfs01' and 'gfs02', and we will see all the files that we created from the client machine.

cd /mnt/
ls -lah

(Screenshots: result from server 1 ('gfs01') and result from server 2 ('gfs02').)

All files that we created from the client machine are replicated to all the glusterfs volume node servers.
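
Comparing 'ls' listings by eye only shows that the file names match. As a sketch of a stricter check, the helper below prints a sorted list of checksums for the regular files in a directory. Run it against the same directory on each server and compare the output. The dir_manifest function is a name invented for this example, and the demo uses two scratch directories as stand-ins for the two servers.

```shell
#!/usr/bin/env bash
# Print "checksum  ./name" for every regular file directly inside a directory,
# sorted by name. Sub-directories, including hidden ones, are skipped.
# dir_manifest is a hypothetical helper for this sketch.
dir_manifest() {
    ( cd "$1" && find . -maxdepth 1 -type f -exec sha256sum {} + | sort -k2 )
}

# Demo: two scratch directories stand in for the views of gfs01 and gfs02.
a="$(mktemp -d)"
b="$(mktemp -d)"
touch "$a/file01" "$a/file02" "$a/file03"
touch "$b/file01" "$b/file02" "$b/file03"

if [ "$(dir_manifest "$a")" = "$(dir_manifest "$b")" ]; then
    echo "manifests match"
else
    echo "manifests differ"
fi
```

On the real servers, you could run 'dir_manifest /mnt' on 'gfs01' and 'gfs02' and diff the two outputs.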


Comments

By: afelixpatton

Hi, many thanks for the article. It's very clear.

I'd like to know about the performance. If I have more servers working as bricks with the configuration you taught us, can performance be affected over time as more files accumulate? Can all of this become slower? And the type of the volume in this case is replicate. Are there any others?

By: Kaleb KEITHLEY

For Ubuntu, using the Launchpad PPAs, this step:

  wget -O- https://download.gluster.org/pub/gluster/glusterfs/3.12/rsa.pub | apt-key add -

is completely unnecessary. The Ubuntu .debs in the gluster PPA are signed by Launchpad with their key. Follow the Launchpad instructions for installing their key.

You only need the key from download.gluster.org to install the Debian (jessie, stretch, buster) .debs from the apt repos on download.gluster.org.

By: mohammad

Follow the instruction from the official documentation.

https://docs.gluster.org/en/latest/Install-Guide/Install/

By: Julian

Thank you so much for your article :)

By: Maycon

I followed these steps, but it does not work for me when I try to mount:

Command: mount -t glusterfs 192.168.100.104:/mnt/glusterfs /mnt/glusterfs

Log:

[2018-09-21 21:04:38.142685] I [glusterfsd.c:1910:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.4.2 (/usr/sbin/glusterfs --volfile-id=/mnt/glusterfs --volfile-server=192.168.100.104 /mnt/glusterfs)

[2018-09-21 21:04:38.147454] I [socket.c:3480:socket_init] 0-glusterfs: SSL support is NOT enabled

[2018-09-21 21:04:38.147498] I [socket.c:3495:socket_init] 0-glusterfs: using system polling thread

[2018-09-21 21:04:38.150022] E [glusterfsd-mgmt.c:1574:mgmt_getspec_cbk] 0-glusterfs: failed to get the 'volume file' from server

[2018-09-21 21:04:38.150046] E [glusterfsd-mgmt.c:1674:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:/mnt/glusterfs)

[2018-09-21 21:04:38.151085] W [glusterfsd.c:1002:cleanup_and_exit] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x105) [0x7f8381904e15] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_handle_reply+0x90) [0x7f8381904c10] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x449) [0x7f8381fbbd09]))) 0-: received signum (0), shutting down

 

[2018-09-21 21:04:38.151104] I [fuse-bridge.c:5260:fini] 0-fuse: Unmounting '/mnt/glusterfs'.

By: Nguyen Truong Giang

Thank you so much for your article!

But what is your opinion on using GlusterFS as data storage for big data? And can we create a RAID with the hard disks of the 2 servers?

By: Shaun

If you mount the volume on the client using:

gfs01:/vol01 /mnt/glusterfs glusterfs defaults,_netdev 0 0

Then how is that High-Availability if gfs01 goes down?

By: Munir

when i run this command: 

gluster volume create vol01 replica 2 transport tcp \
gfsnode01:/glusterfs/distributed \
gfsnode02:/glusterfs/distributed \
force

i get this error:

volume create: vol01: failed: Staging failed on gfsnode02. Error: Failed to create brick directory for brick gfsnode02:/glusterfs/distributed. Reason : No such file or directory 

By: David Hednry

Thanks for a clear article.  I am having trouble accessing gluster URL's, most giving a 404 error (July 2020). Do you have a current address?

I need to implement a mirrored file server over 3 remote sites and gluster seems to fit the bill. Do you have experience of remote servers and potential reliability/latency problems?

By: Caoyang

I tried this on Nov 23 2020 with gluster-7 and it worked flawlessly. I was able to get gluster running in 10 minutes and it paired nicely with my SLURM cluster. If you are a linux admin (or doing admin stuff), you will know very few tutorial out there "just work". This happens to be one of them and it stood the test of time. Really nice article!

By: Jeff Baker

Hi,

What would happen if one of the servers died / went off line for a while  then came back on-line again. 

i.e server 2 may have changed whilst server 1 was down.

Would the 2 servers sync themselves?

Thanks

Jeff.