How To Build A Low Cost SAN - Page 4


12.1 About XEN-AoE

The Xen-AoE system combines high-performance commodity computing hardware with efficient SAN technologies to deliver a virtualization solution that offers high availability and performance at low cost. Xen-AoE is a cluster server architecture that uses CPU, disk, and memory resources more efficiently than traditional servers and avoids single points of failure, which improves availability and server maintainability. It does this by utilizing:

  • CPU virtualization
  • Storage virtualization
  • SAN technologies
  • Decoupling of disk from CPU
  • Commodity 64 bit x86 CPUs
  • Commodity disk storage
  • Gigabit Ethernet infrastructure



We have three machines booted with the Xen kernel of FC7. If your PC does not run a Xen kernel, select the virtualization option during installation so that the Xen kernel is installed. The objective of this lab is to export block devices from node1 to node2 and create a DomU (guest OS) on node2. This DomU is built on the block devices exported from node1 and contains a minimal Debian Linux system. It can then be live-migrated from the second node (node2) to the third node (node3).


12.3 SETUP

The first step of XEN-AoE setup is to export two block devices of 4GB from node1 to node2:


Creation of two block devices of 4GB each

[root@node1 Desktop]# dd if=/dev/zero of=newfile1.img bs=1M count=4000
[root@node1 Desktop]# dd if=/dev/zero of=newfile2.img bs=1M count=4000
[root@node1 Desktop]# losetup /dev/loop0 newfile1.img
[root@node1 Desktop]# losetup /dev/loop1 newfile2.img
[root@node1 Desktop]# losetup -a
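
With bs=1M and count=4000, dd writes 4000 MiB, which is slightly under 4 GiB. It can help to confirm the size of an image before attaching it to a loop device. The following is a minimal sketch of that check; the helper names make_image and image_size_mib and the temporary path are assumptions for illustration, and the demo uses a 2 MiB file so it runs quickly and without root:

```shell
# Create a zero-filled image of $2 MiB at path $1 (same dd call as in the steps above).
make_image() {
    dd if=/dev/zero of="$1" bs=1M count="$2" 2>/dev/null
}

# Print the size of file $1 in whole MiB.
image_size_mib() {
    echo $(( $(wc -c < "$1") / 1048576 ))
}

# Demo with a small file in a temporary directory (hypothetical path).
tmpdir=$(mktemp -d)
make_image "$tmpdir/demo.img" 2
image_size_mib "$tmpdir/demo.img"
rm -rf "$tmpdir"
```

For the real images, image_size_mib newfile1.img should report 4000 before losetup is run.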


Creation of LVM on these two devices

[root@node1 Desktop]# pvcreate /dev/loop0
[root@node1 Desktop]# pvcreate /dev/loop1
[root@node1 Desktop]# pvs
[root@node1 Desktop]# vgcreate vgnode1.0 /dev/loop0
[root@node1 Desktop]# vgcreate vgnode1.1 /dev/loop1
[root@node1 Desktop]# vgs
[root@node1 Desktop]# lvcreate -L3G -n lvnode1.0 vgnode1.0
[root@node1 Desktop]# lvcreate -L3G -n lvnode1.1 vgnode1.1
[root@node1 Desktop]# lvs


Export these devices

[root@node1 vblade]# ./vbladed 0 0 eth0 /dev/vgnode1.0/lvnode1.0
[root@node1 vblade]# ./vbladed 0 1 eth0 /dev/vgnode1.1/lvnode1.1
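
The first two arguments to vbladed are the shelf and slot numbers, and the Linux aoe driver on the initiator names each discovered device /dev/etherd/e<shelf>.<slot>. This is why the exports above appear on node2 as e0.0 and e0.1. A small sketch of that mapping follows; the function name aoe_device_path is an assumption for illustration:

```shell
# Map an AoE shelf ($1) and slot ($2) to the device node created by the aoe driver.
aoe_device_path() {
    printf '/dev/etherd/e%d.%d\n' "$1" "$2"
}

aoe_device_path 0 0   # prints /dev/etherd/e0.0
aoe_device_path 0 1   # prints /dev/etherd/e0.1
```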


Access these two block devices on node2 and create a RAID array

[root@node2 Desktop]# modprobe aoe
[root@node2 Desktop]# aoe-stat
[root@node2 Desktop]# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/etherd/e0.0 /dev/etherd/e0.1

[root@node2 Desktop]# cat /proc/mdstat
[root@node2 Desktop]# mdadm --detail /dev/md0
[root@node2 Desktop]# mkfs.ext3 /dev/md0
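
The output of cat /proc/mdstat can also be checked from a script. The sketch below reports whether every member of an array is marked up (for a two-disk mirror, the [UU] marker). The function name md_all_up is an assumption, and the sample text is illustrative rather than captured from this setup. Note that [UU] only says both members are active; an initial resync may still be running.

```shell
# Succeed only if array $1 shows all members up (e.g. [UU]) in the mdstat file $2.
# $2 defaults to /proc/mdstat.
md_all_up() {
    awk -v dev="$1" '
        $1 == dev         { found = 1; next }
        found && /blocks/ { ok = ($0 ~ /\[U+\]/); exit }
        END               { exit (ok ? 0 : 1) }
    ' "${2:-/proc/mdstat}"
}

# Demo on illustrative sample text.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Personalities : [raid1]
md0 : active raid1 etherd/e0.1[1] etherd/e0.0[0]
      3145600 blocks [2/2] [UU]
EOF
if md_all_up md0 "$sample"; then echo "md0: all members up"; else echo "md0: degraded or missing"; fi
rm -f "$sample"
```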


Creation of a DomU (containing Debian Linux) on the RAID array

Debootstrap is used to create a Debian base system from scratch, without requiring the availability of dpkg or apt. It does this by downloading .deb files from a mirror site, and carefully unpacking them into a directory which can eventually be chrooted into.

[root@node2 Desktop]# yum install debootstrap
[root@node2 Desktop]# mkdir /debian
[root@node2 Desktop]# mount /dev/md0 /debian
[root@node2 Desktop]# debootstrap --arch i386 etch /debian


Getting a console on the DomU

A few extra steps are needed to get a console on your DomU. Without them you may not get a terminal: mingetty keeps respawning and no login screen appears. If the steps below are not enough on your system, other solutions can be found online:

[root@node2 Desktop]# cp /etc/passwd /etc/shadow /debian/etc
[root@node2 Desktop]# echo '/dev/sda1 / ext3 defaults 1 1' > /debian/etc/fstab
[root@node2 Desktop]# sed -i -e 's/^[2-6]:/#&/' /debian/etc/inittab
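
The sed step is meant to comment out the inittab entries whose id is 2 to 6, so that only the getty on the first console keeps running. The sketch below applies one working form of that substitution to sample input. The function name disable_extra_gettys is an assumption, and the inittab lines are illustrative, not copied from a real system:

```shell
# Comment out inittab entries whose id is 2-6; reads stdin, writes stdout.
disable_extra_gettys() {
    sed -e 's/^[2-6]:/#&/'
}

# Illustrative inittab lines: the tty1 entry is kept, the tty2 entry is commented out.
printf '%s\n' \
    '1:2345:respawn:/sbin/getty 38400 tty1' \
    '2:23:respawn:/sbin/getty 38400 tty2' \
    | disable_extra_gettys
```

Running the substitution on a copy of the file first, as here, makes it easy to inspect the result before editing /debian/etc/inittab in place.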


Making of a new initrd

Create a new initrd (initial RAM disk) without SCSI modules and use it to boot the guest Linux operating system. If you skip this step, booting may fail; in my case I got a "kernel panic" message. The initrd can be created with the following command:

[root@node2 Desktop]# mkinitrd --omit-scsi-modules --with=xennet --with=xenblk --preload=xenblk /boot/initrd-$(uname -r)-no-scsi.img $(uname -r)


Set up the Xen configuration file for the Debian guest, /etc/xen/debian:

kernel = '/boot/vmlinuz-2.6.21-7.fc7xen'
ramdisk = '/boot/initrd-2.6.21-7.fc7xen-no-scsi.img'
memory = '238'
name = 'debian'
root = '/dev/sda1 ro'
extra = 'console=tty1 xencons=tty1'
dhcp = 'dhcp'
vif = [ '' ]
disk = [ 'phy:md0,sda1,w' ]
on_poweroff = 'destroy'
on_reboot = 'restart'
on_crash = 'restart'
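
Before starting the guest, it can be worth confirming that the kernel and ramdisk named in the configuration actually exist. The following is a rough sketch, assuming the values are written in plain single quotes; the helper name xm_cfg_value is hypothetical and it only handles simple key = 'value' lines, not lists such as disk or vif:

```shell
# Print the single-quoted value assigned to key $1 in xm config file $2.
xm_cfg_value() {
    sed -n "s/^$1[[:space:]]*=[[:space:]]*'\([^']*\)'.*/\1/p" "$2"
}

# Demo on a temporary copy of two lines from the configuration.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
kernel = '/boot/vmlinuz-2.6.21-7.fc7xen'
ramdisk = '/boot/initrd-2.6.21-7.fc7xen-no-scsi.img'
EOF
for key in kernel ramdisk; do
    path=$(xm_cfg_value "$key" "$cfg")
    if [ -f "$path" ]; then echo "$key ok: $path"; else echo "$key missing: $path"; fi
done
rm -f "$cfg"
```

On node2 the same loop can be pointed at /etc/xen/debian instead of the temporary file.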

Everything is now in place to create the new guest OS with Debian Linux.


Start your DomU

[root@node2 /]# umount /dev/md0
[root@node2 /]# xm create -c debian
[root@node2 /]# xm list

Your DomU has now been created with Debian Linux. If everything went well you will get a terminal; if not, search online for the specific error you see.



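Now it is time to migrate this virtual machine to node3. For a live migration to succeed, node3 must be able to access the same disk the DomU uses and must accept relocation requests from node2. Run the following command: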

[root@node2 ]# xm migrate --live debian node3
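
If the migration is refused, one thing to check on node3 is whether xend's relocation server is enabled, which is controlled by the (xend-relocation-server yes) setting in /etc/xen/xend-config.sxp. The sketch below tests for that line. The function name relocation_enabled is an assumption, and the demo runs against a temporary sample file rather than the real configuration:

```shell
# Succeed if the xend config file $1 enables the relocation server.
relocation_enabled() {
    grep -Eq '^[[:space:]]*\(xend-relocation-server[[:space:]]+yes\)' "$1"
}

# Demo on an illustrative sample file.
sxp=$(mktemp)
printf '%s\n' '(xend-relocation-server yes)' '(xend-relocation-port 8002)' > "$sxp"
if relocation_enabled "$sxp"; then echo "relocation enabled"; else echo "relocation disabled"; fi
rm -f "$sxp"
```

On node3 the check would be run as relocation_enabled /etc/xen/xend-config.sxp; xend has to be restarted after the setting is changed.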


13 Conclusion

Fibre Channel and iSCSI customers are often looking for more than just storage. AoE is a simple network protocol whose description is only eleven pages long, yet it provides enough structure to build flexible, simple, and scalable storage solutions from inexpensive hardware such as commodity disks and Gigabit Ethernet switches. So if you want a small SAN with no extra features on a low budget, AoE is a better choice than iSCSI or Fibre Channel. There has always been a huge demand for low-cost, flexible storage solutions, yet it is not easy to increase the number of drives in either a Fibre Channel or an iSCSI based SAN. The ATA over Ethernet (AoE) protocol solves this issue to a large extent. However, Fibre Channel solutions are the fastest of the three, while iSCSI solutions are still the most reliable.


14 Terms and Terminology

  • Target : In AoE, the server that exports storage is called the target.
  • Initiator : In AoE, the clients are known as initiators.
  • Shelf : Comparable to a device's major number. An AoE target is bound to a particular shelf.
  • Slot : Comparable to a device's minor number. An AoE target is bound to a particular slot.
  • Interface : The Ethernet device on which the AoE target exports a device. The default value is eth0 if no other value is set.
  • Mac-filtering : Restricts Ethernet frames to particular MAC addresses.
  • ACL : An access control list, used to set the access rights (such as read/write) for a particular exported device. It is also used to restrict a block device to particular MAC addresses. A combination of more than one MAC address is known as an ACL list.
  • Buffers : Length of the request queue for the interface. The buffer count specifies the number of 64 KB buffers used for receiving packets.
  • Sectors : Maximum number of sectors per request.
  • Queue-length : Number of active I/O requests.
  • Mtu : Forces the specified MTU rather than the auto-detected value.
  • Path : Path of the block device or file to export.
  • Uuid : UUID stands for Universally Unique Identifier; it gives each filesystem a unique identifier. You can specify a device either by its path or by its UUID.
  • Accept : A comma-separated list of ACL entries from which the device is allowed to receive commands.
  • Deny : A comma-separated list of ACL entries from which the device is not allowed to receive commands.
  • Direct-IO : Direct I/O is a feature of the file system whereby file reads and writes go directly from the application to the storage device, bypassing the operating system's read and write caches.
  • Buffered-IO : Most file system I/O is buffered by the operating system in its file system buffer cache. The idea of buffering is that if a process attempts to read data that is already in the cache, that data can be returned immediately without waiting for a physical I/O operation.
  • Write Cache : Turns the write cache on or off; the default is on. With the write-back setting, the operating system sends data to the controller to write to a storage device, and the controller confirms the write to the operating system before actually writing the data to the device. This increases performance but carries some risk: if there is a power failure, the data currently in the controller cache is lost. The risk does not apply to a controller with a battery-backed cache, because the battery preserves the data in the controller cache during a power failure.
  • Trace-io : If set to true, all I/O requests received by the daemon are logged. This may produce a huge volume of messages on a heavily used server, so the option should only be enabled for debugging purposes. Valid values are true and false. The default value is false.
  • Policy : There are two types of policy, accept and reject, based on the MAC address and the corresponding access list.
  • losetup : A command that attaches a regular file to a loop device so that it can be used as a block device.

2 Comment(s)



From: rasker

Hi, fascinating article. Thanks very much for writing it.

I was wondering what, in your opinion, is the most interesting target? This is with respect to performance and simplicity of configuration. Did you reach any conclusions in your testing?


I would go for vblade, ggaoed or qaoed depending on my needs.