While major ISPs like Orange, Vodafone, Telefonica, ... have a number of independent cabinets to compartmentalize their architecture, this is not an option for small non-profit or SOHO organizations, which can neither justify such complex architectures nor afford them.
This note describes the "Fridu-in-Xen" virtual ISP architecture. It explains how to simulate a fully compartmentalized mini ISP (Internet Service Provider) running on a single Linux box hosted on a cheap remote site (OVH), leveraging Xen, VPN and QoS. It is built in such a way that anyone with an acceptable level of network and Linux knowledge should be able to replicate the architecture on their own hardware in a few hours. Then, if you like it, you may start contributing to the improvement process.
I use this architecture in production for fridu.org to support a number of non-profit organizations. Obviously it still lacks some nice features like redundancy, load balancing, supervision, ... It's not that we could not extend "Fridu-in-Xen" to support them, but I have neither the time nor the requirement for it. Nevertheless, in its current version it already allows you to provide, for a very reduced administration cost, every typical service we expect from a good service provider (portal, messaging, voice over IP, ...), with an acceptable level of quality of service including security, backup, QoS, ...
Disclaimer
Anything I wrote here was done outside of my professional work context, and none of my current/past employers/customers have participated or even been consulted for this work. Fridu is 100% part of my free time, and everything including hosting is funded from our pocket money and used to support non-commercial friend organisations. While I think I have the technical background to design a smart architecture (cf: my profile), I nevertheless do not guarantee that it will work for you, or even that you will agree with me. I still hope it may help some of you, and I would be more than happy to incorporate improvements if you ever have some.
Introduction
For professional reasons I have been lead architect for a few major Internet service providers in Europe. Obviously most of us do not require this level of architectural complexity: leaving aside major GSM operators and worldwide content providers that count users in tens or hundreds of millions, most of us will never have to deal with even a single million users.
That being said, even if you support only a few tens or hundreds of users, it is very nice to have a compartmentalized architecture, where you can start a new instance of Linux for a test server in a few minutes and/or do a disaster recovery without sweating for hours.
The Fridu-In-Xen architecture was designed with the same security constraints and administration concerns as big ISPs, only replacing physical elements with virtual equivalents. The fundamental requirement was to build an architecture that feels, for the administrator, as close as possible to a traditional physical architecture. It still lacks some features like fault tolerance or load balancing. We could obviously add them, and I have no doubt that the same paradigm would work, but within the Fridu usage context I have no justification for such an effort.
When should you use a Fridu-In-Xen type of architecture? Obviously whenever you want :) That being said, it was designed after a few major crashes of the single box we used for hosting Fridu in the past. Each crash cost at best one night of work and at worst a full weekend. We also had trouble upgrading some services because of the interdependency of all the applications running together in a shared cabinet, and whenever we had to reboot after a security update it was a nightmare. Last but not least, because Fridu is 100% based on volunteer work, everyone tends to cook their own food before eating it. Unfortunately this led to far too many people with far too many access rights, and when a problem arises you never know who caused it.
Conclusion: if you want a small and simple architecture that allows you to:
- Add/recreate an instance of Linux in less than 5 minutes
- Provide untrusted users with root access to their VM without sweating
- Keep complete control of your infrastructure without dealing with the details of each VM
- Have access to a simple administrative network that makes transparent the fact that you share a physical cabinet with others
- Have control over QoS (Quality of Service) at network, RAM and disk level
- Handle multiple external NAT IP addresses smartly
- ...
Then you should probably check Fridu-In-Xen. Alternatively you may want to check Fridu OpenVZ (here)
ISP typical architecture
Most serious Internet service providers are organized in zones, even if they may use different names like clouds, classes, ... Each zone groups a number of machines that share both a common level of security and a class of service. In order to guarantee high availability a zone must have at least two physical boxes, but depending on the provided services and load a given zone may have ten or more physical/virtual boxes. Within a zone the load is balanced. Today load balancing is mostly achieved by intelligent hardware switches/routers like Alteon, Cisco, ... All this modern hardware provides a very secure way of isolating ports by VLAN, and apart from some very paranoid users (banks, governments, ...) most people will accept sharing the same physical routers even across different security levels. Administration is typically done through a completely independent network with a multi-leg firewall, in order to make sure that even if a hacker takes control of one zone, he cannot use the admin network to move from one zone to another. The following graphic gives a typical view of current major ISP architectures; depending on the number of physical boxes, it can scale from a few thousand to a few million users.
Fridu Xen Architecture
As explained in the introduction, the goal of the Fridu Xen architecture is to provide, for a minimal fraction of the cost, the same level of administration and security facilities that we have with a conventional ISP architecture. Nevertheless we have to accept some limits. In particular, high availability is reduced as we only have one physical box. Obviously we could have two of them and use some software load-balancing method, but as of today this is not part of the Fridu-in-Xen reference model (maybe later :)). Nevertheless, given the stability of today's hardware, combined with the facilities offered by virtualization for backup, restore and relocation of virtual machines, this architecture model, paired with a good backup strategy and disaster recovery plan, should be more than enough for a medium-size operation, especially for non-profit organizations that, whatever their needs, do not have the funding for a conventional ISP type of model.
Setting up basic Xen infrastructure
This guide is not a Xen quick start, and I expect at this level that you already have Xen/domU up and running. If you're testing on a local workstation, installing Xen is quite straightforward with any current major distribution. With OpenSuSE it is as simple as going into YaST and clicking on the Xen tab; for Kubuntu and other Debian flavors apt-get will do the job for you. Nevertheless, if you are in a remote hosting environment, it might unfortunately be trickier. I posted a note on how to do this within the OVH remote hosting environment, available from here, which hopefully should be reusable within most other hosting facilities.
The default Xen distribution provides three network scripts. Unfortunately none of them does what we want, so we had to build one, downloadable from here: Fridu-Script. Assuming that you install it at the default /etc/xen/scripts location, you then need to update your xend-config file to call it. Note that the normal xend script creates only one bridge; this one will create as many bridges as needed.
| # |
| # Xend configuration sample for Fridu-In-Xen |
| # |
| # The hosts allowed to talk to the relocation port. |
| (xend-relocation-hosts-allow '^localhost$ ^localhost\\.localdomain$') |
| # Start Fridu-In-Xen script to build 3 zones |
| (network-script 'Fridu-network.script xen-br1:10.10.1.1:255.255.255.0 xen-br2:10.10.2.1:255.255.255.0 xen-br3:10.10.3.1:255.255.255.0') |
| # We use the standard Xen bridge script |
| (vif-script 'vif-bridge bridge="xen-br1"') |
| # dom0-min-mem is the lowest memory level (in MB) dom0 will get down to. |
| (dom0-min-mem 196) |
| # If dom0-cpus = 0, dom0 will take all cpus available |
| (dom0-cpus 0) |
You need one bridge per zone, but it is not mandatory to build them from xend. For testing purposes it is often easier to create/destroy them manually with Fridu-network.script, as shown in the next table, and check the result with the ifconfig command.
| /etc/xen/scripts/fridu-in-xen.sh start xen-br1:10.10.13.1:255.255.255.0 |
| ifconfig xen-br1 |
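The zone arguments used above all follow the same "name:ip:netmask" shape. The sketch below shows one way such a spec could be validated and split in plain shell. It is only an illustration: parse_zone_spec is a hypothetical helper of mine, not a function taken from the Fridu script, and it creates no bridge.

```shell
# Hypothetical helper (not part of the Fridu scripts): validate and split
# one "name:ip:netmask" zone spec, printing the three fields on one line.
parse_zone_spec() (
    spec=$1
    case $spec in
        *:*:*:*|:*|*::*|*:)            # too many fields or an empty field
            echo "bad zone spec: $spec" >&2; return 1 ;;
        *:*:*) ;;                      # exactly three non-empty fields
        *)  echo "bad zone spec: $spec" >&2; return 1 ;;
    esac
    name=${spec%%:*}                   # text before the first colon
    rest=${spec#*:}                    # text after the first colon
    echo "$name ${rest%%:*} ${rest#*:}"
)

# Walk over the three zones of the xend sample configuration.
for spec in xen-br1:10.10.1.1:255.255.255.0 \
            xen-br2:10.10.2.1:255.255.255.0 \
            xen-br3:10.10.3.1:255.255.255.0; do
    parse_zone_spec "$spec"
done
```

Rejecting malformed specs early is a design choice of this sketch; it avoids ending up with a half-configured bridge when a colon is forgotten.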
Note: when stopping, xend does not remove unused bridges. This is not a problem, but it may lead to confusion when testing. You don't have to reboot to remove bridges: the Fridu script does not touch your main NIC, so as soon as xend is stopped you can safely remove all unnecessary bridges. Note that the default Xen network-bridge script renames and attaches your main NIC, making deletion of bridges more sensitive. To clean up a bridge, use the following commands:
- /etc/init.d/xend stop ;# make sure xend is stopped
- brctl show xen-br1 ;# make sure there is no interface attached
- ifconfig xen-br1 down ;# stop the bridge interface named xen-br1
- brctl delbr xen-br1 ;# delete the old bridge
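The "no interface attached" check in the list above can be automated before deleting anything. The following is a minimal sketch, assuming the usual four-column layout of "brctl show"; bridge_is_empty is a hypothetical helper name, and the demo feeds it canned text instead of calling brctl.

```shell
# Hypothetical helper: read `brctl show` output on stdin and succeed only
# if the named bridge is listed and its row carries no interface column.
bridge_is_empty() {
    awk -v br="$1" '
        $1 == br { found = 1; if (NF >= 4) busy = 1 }
        END { if (found && !busy) exit 0; exit 1 }'
}

# Demo on canned output; on a real dom0 you would pipe `brctl show` instead.
printf '%s\n' \
    'bridge name bridge id STP enabled interfaces' \
    'xen-br1 8000.000000000000 no' |
    bridge_is_empty xen-br1 && echo "xen-br1 has no interface attached"
```

On a real box this could guard the deletion, for example: brctl show | bridge_is_empty xen-br1 && brctl delbr xen-br1.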
Assuming that Xen is now ready and working, check with "xm list" that Domain-0 is running. If everything is OK you're ready for the next phase.
Installing a new VM
This chapter explains step by step how to implement a Xen VM on LVM. Obviously the Fridu-LVM script will do all of this automatically for you, but if you want (or have) to be in a position to debug when things go wrong, you may want to know what is under the hood. The automatic mode is explained later in the Quickstart chapter.
While this is not mandatory, at least for production I strongly recommend implementing each Xen VM with three logical volumes: the first one for root, the second one for swap+tmpfs and the last one for logs. Keep Xen sparse image files for tests only; in production sparse images have only disadvantages. Some may argue that the LVM model is more complex to set up, but as the Fridu scripts do the job automatically, who cares.
Building your VM root image
As the goal of this post is not to explain how to run a basic Xen, in order to save time I propose that you download a pre-built Xen root file system. This image is nothing special: it is an out-of-the-box YaST2 "in-directory" OpenSuSE-10.2 install, and contains a basic English/text-mode OpenSuSE-10.2 root. Nevertheless it is ~350 MB, so depending on your DSL link you may want to rebuild it yourself; the following explanation assumes you have something equivalent (Download Xen-OpenSuSE-VM)
Creating your LVM virtual disk
Apart from performance reasons, I see significant advantages of LVM over sparse images:
- we can separate tmp, swap and log very easily, which makes the image to save much smaller and may save a lot of time during a disaster recovery.
- in case of trouble, mounting a logical volume is much simpler than mounting a sparse image through a loopback device.
- we can leverage the LVM extend/reduce capability, including online extend for reiserfs (hard to believe, but ext3 does not support this !!!)
- you can run fsck directly from the main domain in case of trouble.
- ....
In order to create an LV (logical volume) you need an active VG (volume group) on your system; if this is not the case you need to dedicate one or more physical partitions to LVM. This guide follows a go-to-production strategy and assumes that you have a working LVM with 6G free; if this is not the case, please build one before moving forward.
| # First check that you have an active volume group. |
| vgscan |
| > Reading all physical volumes. This may take a while... |
| > Found volume group "SATA-160GA" using metadata type lvm2 |
| # We have a volume group named "SATA-160GA" |
| # If LVM is running and you have no volume group but have a free partition |
| vgcreate volumeGroupName /dev/YourPhysicalPartition-1 |
| vgscan ;# should now find your VG. |
| # Note: on OpenSuSE YaST will create VG and LVM for you. |
| # to make our life simpler let's create a few variables |
| VGname=/dev/xxxxx ;# xxxxx = whatever you got from vgscan |
| MY_SERVICE=yyyy ;# whatever you want: test, mail, opensuse, ... |
| # Create two LVs: one for root in ext3 and one for swap+tmpfs files. |
| LVroot=$MY_SERVICE-root ;# (your LVM root name) |
| lvcreate -L 5G --name $LVroot $VGname ;# (root size=5G) |
| # create swap+tmpfs zone |
| LVswap=$MY_SERVICE-swap ;# (your LVM swap name) |
| lvcreate -L 1G --name $LVswap $VGname ;# (swap size=1G) |
| # create swap-zone |
| mkswap $VGname/$LVswap |
| # Create $MY_SERVICE-root filesystem and mount it on /mnt |
| mkfs.ext3 $VGname/$LVroot |
| mount $VGname/$LVroot /mnt |
| # restore VM root filesystem |
| cd /mnt |
| tar -xzf /export/space/vm/opensuse-102-lvm.tgz |
| # make sure we won't need the root password to connect (VERY important when you don't know it) |
| cp ~/.ssh/id_rsa.pub /mnt/root/.ssh/authorized_keys |
| chroot /mnt /bin/bash ;# change root to our future VM |
| mount /dev ;# need this for random numbers |
| /etc/init.d/random start ;# start random number generation |
| passwd ;# change VM root password |
| umount /dev ;# do not forget this or umount /mnt will fail |
| exit ;# leave the chroot |
| ** if you do not have an id_rsa.pub build it with "ssh-keygen -t rsa -b 1024" |
| ** alternatively copy the root line of /etc/shadow and replace it in /mnt/etc/shadow |
| # do anything else you may want to do before booting |
| cd / ;# leave /mnt before unmounting |
| umount $VGname/$LVroot |
Building your VM config file and starting your VM
Your VM is now almost ready to boot; we still need to check a couple of things:
- VM config file (sample here )
- VM Xen kernel+initrd (sample here )
Kernel+initrd can be placed anywhere on your disk; the configuration file needs a tiny adjustment to reflect your configuration. While it is possible to mount the image and make a copy before launching the VM, I found that it is ultimately simpler to keep a copy of your VM kernel+initrd somewhere on disk.
- disk = should point to your root+swap logical volumes; if you do not have a CD image, remove hdd
- kernel = "/mypath/vm/boot/vmlinuz-xen"
- ramdisk = "/mypath/vm/boot/initrd-xen"
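To make the three settings above more concrete, here is a hedged sketch that prints a minimal xm-style config for an LVM-backed VM. It is not the sample file linked above: make_vm_config is a hypothetical helper, and the hda1/hda2 device names, the 256 MB memory value and the root= line are assumptions of mine that you should adapt to your own kernel and layout.

```shell
# Hypothetical helper: print a minimal xm-style config for an LVM-backed VM.
# Arguments: vm name, volume group, kernel directory, bridge, MAC address.
# Device names (hda1/hda2) and memory size are assumptions, not Fridu defaults.
make_vm_config() (
    name=$1 vg=$2 kdir=$3 bridge=$4 mac=$5
    cat <<EOF
name    = "$name"
memory  = 256
kernel  = "$kdir/vmlinuz-xen"
ramdisk = "$kdir/initrd-xen"
root    = "/dev/hda1 ro"
disk    = [ 'phy:/dev/$vg/$name-root,hda1,w', 'phy:/dev/$vg/$name-swap,hda2,w' ]
vif     = [ 'mac=$mac, bridge=$bridge' ]
EOF
)

# Print a config for a test VM; redirect to a file under /etc/xen/vm to use it.
make_vm_config xen-test SATA-160GA /mypath/vm/boot xen-br1 aa:bb:cc:dd:ee:03
```

Generating the file from a handful of parameters keeps the logical volume names and the config in step, which is also the convention (name-root, name-swap) used throughout this guide.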
We are now ready to start our new VM. If you did not mess up your ssh config, you should be able to connect with the same password as the one from your Dom0.
| # Create your VM |
| xm create /etc/xen/vm/fridu-in-XEN-sample.conf |
| > Using config file "/etc/xen/vm/fridu-in-XEN-sample.conf". |
| > Started domain Fridu-In-Xen |
| # Connect to the VM console |
| xm console Fridu-In-Xen ;# or whatever your domain name is |
| > .... depending on how fast you connect to the console |
| > Starting mail service (Postfix) done |
| > Starting CRON daemon done |
| > Master Resource Control: runlevel 3 has been reached |
| > Failed services in runlevel 3: network |
| > Skipped services in runlevel 3: irq_balancer nfs splash |
| > |
| > Welcome to openSUSE 10.2 (i586) - Kernel 2.6.18.2-34-xen (tty1). |
| > xen-test login: root |
| > Password: XXXX ;# whatever you entered when the VM was off |
| ** Ctrl-] to quit console |
You now have a fully working Xen VM but, as you have probably noticed, the network failed to start. This is normal, as we have not yet set up our network infrastructure.
Network Infrastructure
This chapter describes the steps necessary for your VM to receive a valid IP address from DHCP. All the described steps are obviously done automatically by the Fridu script.
If you have run "/etc/init.d/xend start" after updating your config with the Fridu-network script, you should have a xen-br0 bridge, or whatever you chose as default bridge name. This bridge should be active with a valid local IP address (ex: 10.10.1.1). What we want now is for our VM to automatically receive a valid IP address within this bridge's IP/netmask range.
I've tried many options to provide the VM IP address from dom0, but apart from hacking the startup scripts inside the VM, I did not find any smart mechanism to do the job. As adding dhcp=dhcp seems not to work (at least on OpenSuSE), the only option is to provide a valid DHCP config for the Xen VM's virtual NIC.
Make your VM DHCP aware
At this level you have to make sure that within /etc/xen/vm/MyVMconfig you provided a MAC address in your "vif" interface definition. If you forgot to do so, your NIC MAC address will change at every boot, forcing Linux to rename eth0 to something different. You can stop this with the "FORCE_PERSISTENT_NAMES" value inside /etc/sysconfig/network/config on the VM, or by preventing udev from running on the VM.
| # The easiest network config is to provide a fixed MAC to your VM vif |
| grep vif /etc/xen/vm/MyVM.conf |
| > vif = [ 'mac=aa:bb:cc:dd:ee:03, bridge=xen-br0, vifname=xen-com' ] |
| # Then you need eth0, or whatever your network interface is named on the VM, to be DHCP enabled |
| cat /etc/sysconfig/network/ifcfg-eth0 |
| > BOOTPROTO='dhcp' |
| > NAME='Xen Virtual Ethernet card 0' |
| > STARTMODE='auto' |
| > USERCONTROL='no' |
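Since every VM needs its own fixed MAC, it helps to derive it from something you already manage, such as the VM IP address. The sketch below assumes a convention of my own for illustration (a fixed aa:bb:cc:dd:ee prefix plus the last IP byte in hex); mac_for_ip is a hypothetical helper and is not how the Fridu script allocates MAC addresses.

```shell
# Hypothetical convention: fixed prefix + last byte of the VM IP in hex.
# Usage: mac_for_ip IP [PREFIX]
mac_for_ip() (
    prefix=${2:-aa:bb:cc:dd:ee}        # locally administered, unicast prefix
    last=${1##*.}                      # last dotted-quad field of the IP
    printf '%s:%02x\n' "$prefix" "$last"
)

mac_for_ip 10.10.1.3                   # prints aa:bb:cc:dd:ee:03
```

The mapping is only unique inside a single /24 zone, so two zones would need two different prefixes.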
Install a local DHCP server.
As the best option is to use DHCP for our VM network, we need a local DHCP server. As we do not need anything big, dnsmasq is the perfect product for our needs: it is very small and does not require any special configuration. It takes its config values directly from /etc/hosts and /etc/ethers, and also acts as a local DNS server and DNS cache. You probably won't even have to recompile it, and while it is not part of the default OpenSuSE DVD, it is nevertheless very easy to find on any good rpm repository like rpmfind.net. You can download my sample config for dnsmasq from Here.
| # install binary version from rpm or apt-get |
| > rpm --install dnsmasq-2.35-8.i586.rpm |
| # If you start from my basic config the only line you should check is |
| > dhcp-range=10.10.12.100,10.10.12.150,255.255.255.0,12h |
| # make sure your VM is in both /etc/hosts and /etc/ethers |
| > cat /etc/hosts |
| > ... |
| > 10.10.1.1 BR-01 ntp dhcp xen |
| > 10.10.1.2 VM-02 |
| > 10.10.1.3 VM-03 |
| > cat /etc/ethers |
| > ... |
| > AA:BB:CC:DD:EE:02 VM-02 |
| > AA:BB:CC:DD:EE:03 VM-03 |
| # start dnsmasq |
| > /etc/init.d/dnsmasq start |
| # check your DNS server is listening on your bridge interface |
| > netstat -na | grep 53 | grep udp |
| > udp 0 0 0.0.0.0:53 0.0.0.0:* |
| # connect to your VM and check for DHCP |
| > xm console My-VM-Domain |
| > login: root |
| > password: xxxxx |
| > dhcpcd-test eth0 |
| > dhcpcd: MAC address = aa:bb:cc:dd:ee:03 |
| > IPADDR=10.10.1.3 |
| > NETMASK=255.255.255.0 |
| > NETWORK=10.10.1.0 |
| > BROADCAST=10.10.1.255 |
| > GATEWAY=10.10.1.1 |
| > HOSTNAME='common' |
| > .... |
| # if not working !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
| # check that your DHCP request is moving from your VM to Dom-0 |
| tcpdump -i xen-vm02 ;# whatever name you chose for the VM interface |
| > tcpdump: WARNING: xen-tst: no IPv4 address assigned |
| > listening on xen-tst, link-type EN10MB (Ethernet), capture size 96 bytes |
| > 01:33:46.532650 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, |
| > 01:33:46.533101 IP br0.bootps > common.bootpc: BOOTP/DHCP, Reply, length: 310 |
| > 01:33:46.533292 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP |
| > 01:33:46.583977 IP br0.bootps > common.bootpc: BOOTP/DHCP, Reply, length: 319 |
| > 01:33:46.584365 arp reply common is-at common |
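Because the setup relies on each VM appearing in both /etc/hosts and /etc/ethers, a quick consistency check can save a debugging session. The following is a minimal sketch: missing_in_hosts is a hypothetical helper of mine, and it only assumes the two-column layout shown in the table above (MAC then name in ethers, IP then names in hosts).

```shell
# Hypothetical check: print every name listed in an ethers file that has no
# matching entry in a hosts file. Usage: missing_in_hosts ETHERS HOSTS
missing_in_hosts() {
    awk -v hosts="$2" '
        /^[ \t]*(#|$)/ { next }                      # skip comments, blanks
        FILENAME == hosts {                          # first file: hosts
            for (i = 2; i <= NF; i++) known[$i] = 1
            next
        }
        !($2 in known) { print $2 }                  # second file: ethers
    ' "$2" "$1"
}
```

On dom0 you would run: missing_in_hosts /etc/ethers /etc/hosts. An empty output means every MAC entry has a name that can be resolved to an IP address.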
Your VM should now have a working internal IP address, and you should be able to ping from your VM to Dom-0 and vice versa. As you still have no NAT (Network Address Translation) and no routing, you are confined to your box and cannot reach the Internet. Nevertheless you should be able to ping any local address, including any other VM if you have some.
Notes:
** If dhcpcd-test is working but /etc/init.d/network restart does not provide you with any address, then you need to disable the checksum verification in the DHCP client. On the VM, set DHCLIENT_UDP_CHECKSUM="no" in /etc/sysconfig/network/dhcp.
** When testing, I had a few freezes of my VM vif interface, with the result that DHCP does not work. When tcpdumping the corresponding interface for that VM on dom-0, we do not see any packet. Ifconfig up/down will not solve the problem, but restarting the same VM with a different vif name should work :(
Security
This chapter describes the Fridu reference architecture security model. Unfortunately the way iptables works makes a manual step-by-step operation guide useless, so I assume that the user will generate iptables rules automatically. Obviously the script can dump the iptables commands, allowing anyone to double check what is going on.
The Fridu-in-Xen security model is designed to be very simple to administer. Firewall iptables rules are generated automatically through a small parser script, and the administrator only has 3 rules to handle. This describes how security is implemented before you reach a given VM. Obviously each virtual machine may later have its own set of firewall rules, but this is out of the scope of this guide. You should look at the Fridu-in-Xen security model as the equivalent of what a network/infrastructure team provides inside a traditional telco.
Note: the firewall has been extended to support other virtualization environments like OpenVZ or VirtualBox and now has its own page (here)
QuickStart
Most of us love to try things without any learning curve :) Assuming that you have acceptable administration skills and the root password of your box, starting the Fridu-in-Xen architecture should take you from less than one hour to probably almost one day. Once in place, creation of a new VM takes only a few minutes, for a basic but fully YaST-compliant OpenSuSE-10.2.
Creating a VM automatically
Two components will be used for this quickstart: Fridu-scripts and a minimal OpenSuSE-10.2 distribution. Within the scripts archive, lvm-script is mandatory: it is the one that automatically performs every step described in the chapter "Installing a new VM". The second one, firewall-script, is mandatory only if you use the Fridu security model. Nevertheless, whatever you choose to implement later, for a real quickstart I recommend using the default configuration in order to maximize "cut and paste". Once your first automatically created VM runs, it will be time to tailor this to your own distribution and/or network strategy, starting by trying an Ubuntu in a domU virtual machine. On Dom0 I tested Fridu-Script with OpenSuSE-10.2, Fedora Core 5 and Ubuntu-6.1/server, and I would expect it to work out of the box on almost any distribution.
Action:
- put OpenSuSE-102.tgz somewhere on a disk visible from your Dom0
- untar the Fridu-scripts archive, edit ./install.sh to reflect your configuration, then run it to install the scripts and samples.
- edit /etc/sysconfig/Fridu-lvm.config to reflect your site preferences.
Assuming that you have a basic Xen configuration in place and a free LVM volume group, getting Fridu-lvm.script running should take you less than 15 minutes. Creating a new VM with a basic OpenSuSE-10.2 then takes 180s on my system, from hitting "return" on the script create command to the point where my VM answers a Dom-0 ping.
Start xend with Fridu Network Script
| # if using Fridu security model you need to update xend network script. |
| File: /etc/xen/xend-config.sxp |
| Line: (network-script 'Fridu-network.script xen-br0:10.10.1.1:255.255.255.0') |
| # start xen service |
| /etc/init.d/xend start |
| > Starting xend 'done' |
| # check your network bridge is working |
| ip link show xen-br0 |
| > 12: xen-br0: <BROADCAST,UP,10000> mtu 1500 qdisc noqueue |
| > link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff |
| brctl show |
| > bridge name bridge id STP enabled interfaces |
| > xen-br0 8000.000000000000 no |
Note: the Fridu script for VM creation will work with the standard Xen bridge script (/etc/xen/scripts/network-bridge); nevertheless the Fridu security model, which leverages zones, requires Fridu-Xen-network.script. The main difference between the two models is that Fridu bridges VMs by zone and later routes between zones, whereas the default Xen script bridges all your VMs to your external NIC interface.
Check you have enough free space on your volume group disk.
You MUST have a working LVM environment to use the Fridu VM creation script. The default configuration will create two logical volumes per VM: root=5G and swap/tmpfs=1G. Note that while automatic creation of VMs only works with LVM, the zone security model is independent of disk layering and works perfectly with sparse image files.
| vgscan |
| > Reading all physical volumes. This may take a while... |
| > Found volume group "SATA-160GA" using metadata type lvm2 |
| vgdisplay "SATA-160GA" | grep -i free |
| > Free PE / Size 9641 / 37,66 GB |
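If you want to script this free space check rather than read it by eye, the "Free PE / Size" line can be parsed. This is only a sketch under stated assumptions: vg_has_free_gb is a hypothetical helper, it understands only sizes reported in GB (or GiB), and it accepts the comma decimal separator shown in the output above.

```shell
# Hypothetical helper: read `vgdisplay` output on stdin and succeed if the
# free size is at least NEED gigabytes (default 6: root=5G + swap=1G).
vg_has_free_gb() {
    awk -v need="${1:-6}" '
        /Free +PE/ {
            size = $(NF - 1); unit = $NF
            sub(/,/, ".", size)            # some locales print 37,66
            if (unit ~ /^Gi?B$/ && size + 0 >= need + 0) ok = 1
        }
        END { if (ok) exit 0; exit 1 }'
}

# Demo on the sample line above; on dom0 pipe `vgdisplay SATA-160GA` instead.
printf 'Free PE / Size 9641 / 37,66 GB\n' | vg_has_free_gb 6 &&
    echo "enough room for one more VM"
```

A volume group reporting its free space in MB or TB would be rejected by this sketch, which errs on the safe side.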
Create your site config file
While every single parameter of the VM creation configuration can be defined/overridden on the command line, it is usually good practice to provide adequate site default values, especially for almost static elements such as kernel location, bridge name, ...
| # Minimal site config file |
| # --------------------------------- |
| lvmvg=SATA-160GA ;# my volume group name without /dev |
| xenkerneldir=/export/space/xen/boot ;# my xen VM kernel+initrd |
| linuxroot=/export/space/xen/vm/opensuse-102-lvm.tgz ;# my linux distrib root backup |
| vmbr=xen-br0 ;# default bridge name |
| vmram=258 ;# default memory allocated to VM |
| # |
| # Check the Fridu-lvm.config samples shipped with the script for the complete list of possible parameters. |
Creating your first VM
If you have a correctly customized config file and stick to its default name, then in order to create a new virtual machine you only have to provide "name + IP address". Obviously you may want to customize more things such as network MAC address, root filesystem, size, ... but for a basic test those two parameters should be more than enough.
| Fridu-lvm.script create verbose=1 vmname=xen-test vmip=10.10.12.4 |
| > Logical volume "xen-test-root" created |
| > Logical volume "xen-test-swap" created |
| > .. CreateFileSystem check if /dev/SATA-160GA/xen-test-root is not formated |
| > .. CreateFileSystem building root dir=/dev/SATA-160GA/xen-test-root fs=ext3 |
| > mke2fs 1.39 (29-May-2006) |
| > .. CreateFileSystem check if /dev/SATA-160GA/xen-test-swap is not formated |
| > .. CreateFileSystem building swap dir=/dev/SATA-160GA/xen-test-swap |
| > Setting up swapspace version 1, size = 5368705 kB |
| > .. InstallRoot mounting /dev/SATA-160GA/xen-test-root on /tmp/fridu-mnt |
| > /dev/SATA-160GA/xen-test-root /tmp/fridu-mnt ext3 rw,data=ordered 0 0 |
| > .. InstallRoot untar of /export/space/xen/vm/opensuse-102-lvm.tgz in /dev/SATA-160GA/xen-test-root |
| > .. TuneRoot mounting /SATA-160GA/xen-test-root on /tmp/fridu-mnt |
| > /dev/SATA-160GA/xen-test-root /tmp/fridu-mnt ext3 rw,data=ordered 0 0 |
| > .. TuneRoot make VM root/passwd == to current dom-0 |
| > .. TuneRoot make sure dom-0 can ssh root@vm without any passwd |
| > .. TuneRoot remove udev persistend mode to keep eth0 when vif/MAC change |
| > .. TuneRoot remove checksump check from DHCP client |
| > .. TuneRoot make sure VM will not try to update hardware clock |
| > .. TuneRoot eth0 to support DHCP |
| > .. CreateMac reusing existing vmmac=aa:bb:cc:dd:80:11 for vmname=xen-test |
| > -- |
| > Congratulation xen-test is now ready to start |
When using verbose=1 as in the previous example, you get a trace of each step the script goes through. If needed you can restart the process step by step. Bypassing a step is done with the option "stepname=no" on the command line. The default value is "yes" for every step, unless obviously you changed this in your config file.
- Create the VM's root + tmpfs/swap logical volumes (lvmcreate=yes)
- Format the root LV with a filesystem (default=ext3) (fscreate=yes)
- Untar your root distribution into the root LV (installroot=yes)
- Customize the root distribution to run a DHCP client, and a few other things (tuneroot=yes)
- Update your host /etc/hosts and /etc/ethers automatically (updateether=yes)
- Build and place the VM config file in /etc/xen/vm (createvmconf=yes)
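The "stepname=no" mechanism listed above is easy to picture with a few lines of shell. The sketch below is my own illustrative re-implementation, not the code of Fridu-lvm.script: resolve_steps is a hypothetical function that starts every step at "yes" and lets key=value arguments override it.

```shell
# Hypothetical re-implementation of the step switches, for illustration only.
STEPS="lvmcreate fscreate installroot tuneroot updateether createvmconf"

# Print "step=value" for each step, applying key=value overrides from "$@".
resolve_steps() (
    for step in $STEPS; do
        value=yes                          # every step runs by default
        for arg in "$@"; do
            case $arg in
                "$step"=*) value=${arg#*=} ;;   # last override wins
            esac
        done
        echo "$step=$value"
    done
)

# Restart a creation while skipping the two steps that already succeeded.
resolve_steps vmname=xen-test lvmcreate=no fscreate=no
```

Arguments that are not step names, such as vmname=xen-test, are simply ignored by this sketch.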
Check your newly generated VM config file
At least for the first VM creation you should check the generated config file and see whether it is compliant with your site requirements. As everything is done automatically it should be OK, but shiz.. happens !!! and I did not test on enough different configurations to guarantee that it will work everywhere and/or under all conditions.
| fulup@logoden:~> cat /etc/xen/vm/xen-test (extract) |
| fulup@logoden:~> grep xen-test /etc/ethers |
| fulup@logoden:~> grep xen-test /etc/hosts |
You should at least check what my script does not check:
- The selected bridge exists and is the one you want to use for that VM
- The IP address you gave for the VM sits within the selected bridge's network mask
- The name you gave was not already present in /etc/hosts, in which case your "vmip" param has been overridden automatically.
- Kernel+initrd: the script checks that the files exist, but not that they are valid.
- Disk should be OK, unless you bypassed the LVM/filesystem creation step.
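The second check in the list above (VM address inside the bridge network) can be done with shell arithmetic instead of by eye. This is a minimal sketch for IPv4 only; ip_to_int and same_network are hypothetical helper names, not functions shipped with the Fridu scripts.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1                              # split the address on dots
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed when IP and BRIDGE_IP are in the same network under NETMASK.
# Usage: same_network IP BRIDGE_IP NETMASK
same_network() (
    ip=$(ip_to_int "$1")
    br=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( ip & mask )) -eq $(( br & mask )) ]
)

if same_network 10.10.12.4 10.10.12.1 255.255.255.0; then
    echo "10.10.12.4 is reachable through a bridge at 10.10.12.1"
fi
```

With a bridge at 10.10.1.1 and the same netmask, the check fails for 10.10.12.4, which is exactly the kind of mismatch that leaves a VM without a DHCP answer.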
Start your VM
If everything looks OK, you're ready to try your newly built VM with:
| fulup@logoden:~> xm create -c /etc/xen/vm/xen-test |
| Processing config file "/etc/xen/vm/xen-test". |
| Started domain xen-test |
| Linux version 2.6.18.2-34-xen (geeko@buildhost) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006 |
| BIOS-provided physical RAM map: |
| Xen: 0000000000000000 - 0000000010a00000 (usable) |
| 0MB HIGHMEM available. |
| 266MB LOWMEM available. |
| .... Normal Linux Boot Process ......... |
| Setting up network interfaces: |
| lo IP address: 127.0.0.1/8 done |
| eth0 (DHCP) . NET: Registered protocol family 17 |
| IP/Netmask: 10.10.12.4 / 255.255.255.0 ('xen-test') done |
| ............. |
| Master Resource Control: runlevel 3 has been reached |
| Skipped services in runlevel 3: irq_balancer nfs splash |
| Welcome to openSUSE 10.2 (i586) - Kernel 2.6.18.2-34-xen (tty1). |
| xen-test login: root |
| passwd: (same password as on dom-0) |
This is a typical Linux boot process, and it should run smoothly, at least in console mode. If you have trouble, it will probably come from your network config, especially if you have many zones and messed up your config.
Distribution Dependencies
There should not be any dependencies on your preferred distribution, and this script should work with any distribution. Worst case, "tuneroot" will fail to understand your VM administration tree and you will have to log in through "xm console" to activate DHCP on the right virtual network interface.
I noted that Ubuntu is not shipped with domUloader.py; as a result, the "xm create" command cannot extract the kernel directly from the VM at boot time and you have to store a local copy of the VM kernel inside dom0. In any case, when in trouble it is generally easier to boot with a kernel from dom0. You can pre-extract the kernel from the virtual machine with Fridu-kernel-extract.script and then rebuild the Xen config with Fridu-lvm.script.
| # Extract kernel from VM and place a copy on dom0 |
| Fridu-kernel-extract.script vmname=xen-test |
| /dev/XEN-LVM/vm-test2-root /tmp/fridu-root reiserfs rw 0 0 |
| /dev/XEN-LVM/xen-test-root /mnt ext3 rw,data=ordered 0 0 |
| /dev/XEN-LVM/xen-test-root /tmp/fridu-mnt ext3 rw,data=ordered 0 0 |
| ----Sucess: Kernel & Initrd copied in /dev/XEN-LVM/xen-test-root---- |
| # Update /etc/xen/vm/xen-test config to boot with a kernel stored on dom0 |
| Fridu-lvm.script create vmname=xen-test fscreate=no installroot=no tuneroot=no bootfrom=dom0-dir xenkerneldir=/root/boot/xen-test |
How to delete a VM
Nothing simpler: stop your VM and remove the attached logical volumes.
| # Verify your logical volume names for this VM |
| lvscan | grep xen-test |
| > ACTIVE '/dev/SATA-160GA/xen-test-root' [5,00 GB] inherit |
| > ACTIVE '/dev/SATA-160GA/xen-test-swap' [5,00 GB] inherit |
| # Remove the logical volumes (WARNING: this obviously deletes everything) |
| lvremove /dev/SATA-160GA/xen-test-* |
| > Do you really want to remove active logical volume "xen-test-root"? [y/n]: y |
| > Logical volume "xen-test-root" successfully removed |
| > Do you really want to remove active logical volume "xen-test-swap"? [y/n]: y |
| > Logical volume "xen-test-swap" successfully removed |
How to transfer a VM to another host/environment
Stop your VM, mount its root logical volume in Dom-0, build a backup.tgz and send it wherever you have to. At the destination, rerun the Fridu-in-Xen create script on the received root backup.tgz.
| # stop your VM
Dom-0:~ # xm shutdown xen-test   (or better, log in as root and halt it)
# mount your VM inside Dom-0
Dom-0:~ # mount /dev/SATA-160GA/xen-test-root /mnt/
# backup your VM root. WARNING: the backup must be done from the VM root or the create script won't work at destination
Dom-0:~ # cd /mnt
Dom-0:~ # tar -czf /tmp/xen-test-backup.tgz .
tar: ./var/run/zmd/zmd-remoting.socket: socket ignored
tar: ./var/run/zmd/zmd-web.socket: socket ignored
..... ignore errors on socket files
# unmount your VM root logical volume
Dom-0:~ # cd /tmp
Dom-0:~ # umount /mnt |
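The warning about archiving from the VM root can be handled without changing directory by hand. The following is a small sketch, not part of the Fridu scripts: `backup_vm_root` is a hypothetical helper that relies on the standard `-C` option of tar to enter the mount point before archiving, so paths are stored relative to the VM root.

```shell
# backup_vm_root is a hypothetical helper, not part of the Fridu scripts.
# $1 = directory where the VM root is mounted, $2 = archive to create
backup_vm_root() {
    # -C changes into the mount point first, so paths are stored as
    # ./etc, ./var, ... exactly as with "cd /mnt ; tar -czf ... ."
    tar -czf "$2" -C "$1" .
}

# Example:
#   backup_vm_root /mnt /tmp/xen-test-backup.tgz
```

Keep the archive outside the mounted tree (as with /tmp above), otherwise tar would try to include the archive in itself.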
Oops!!! It's not working
There are obviously many ways to mess up. First, I do not pretend that my script is perfect: it is written in bash and is very far from checking everything it could. Second, you are not perfect either and you may have done something wrong :)
The first class of errors breaks the script at creation time. Potential errors are almost unlimited and range from "not enough space" to "already used IP address", "distribution not found", ... In this case you typically restart the creation script, changing one parameter and bypassing the steps that already succeeded. Here are a few samples of typical errors.
| # trying to rebuild an existing Logical Volume
Dom-0:~ # Fridu-lvm.script create vmname=xen-test
  Logical volume "xen-test-root" already exists in volume group "SATA-160GA"
ERROR: fail to create volume group /dev/SATA-160GA/xen-test-root
FATAL fail to create logical volume group [must be fixed]
Note: create your root+swap LV manually and use 'lvmcreate=no' option
# trying to install on an already formatted Logical Volume
Dom-0:~ # Fridu-lvm.script create vmname=xen-test lvmcreate=no
/dev/SATA-160GA/xen-test-root /tmp/fridu-mnt ext3 rw,data=ordered 0 0
ERROR /dev/SATA-160GA/xen-test-root already formated user 'fscreate=force'
FATAL fail to create root and/or swap filesystem [must be fixed]
Note: create rootfs/swapfs or force reformating with fscreate=no|force' option
# trying to install on an existing distribution (/etc exists in Logical Volume)
Dom-0:~ # Fridu-lvm.script create vmname=xen-test lvmcreate=no fscreate=no
/dev/SATA-160GA/xen-test-root /tmp/fridu-mnt ext3 rw,data=ordered 0 0
FATAL: Logical dev=/dev/SATA-160GA/xen-test-root contains a distribution
Note: Bypass root install or force reinstallation with option installroot=no|force
# trying to install with an already assigned IP address (same IP for a different vmname in /etc/hosts)
Dom-0:~ # Fridu-lvm.script create vmname=xen-test vmip=10.10.12.1
/dev/SATA-160GA/xen-test-root /tmp/fridu-mnt ext3 rw,data=ordered 0 0
ERROR: IP=10.10.12.1 already used by xen-br0 in /etc/hosts
Note: restart with a different vmip=xx.xx.xx.xx or fix /etc/hosts+ethers
Use: ./Fridu-Xen-lvm.script vmname=xen-test vmimp=xx.xx.xx.xx lvmcreate=no fscreate=no installroot=no tuneroot=no
# forgetting the VM IP address when vmname is not present in /etc/hosts
Dom-0:~ # Fridu-lvm.script create vmname=xen-test lvmcreate=no fscreate=no installroot=no
-- missing vmip
FATAL madatory comamnd line option missing
Bug: even if a value is not used you MUST provide a dummy value
********** Retry ./Fridu-Xen-lvm.script vmip=xxxx |
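The "already used IP" case can be checked before launching the create script at all. Here is a minimal sketch, assuming the usual hosts file layout of one address followed by names per line; `ip_owner` is a hypothetical helper and not part of the Fridu scripts.

```shell
# ip_owner is a hypothetical helper, not part of the Fridu scripts.
# It prints the first name bound to an IP in a hosts file, or nothing.
# $1 = IP address, $2 = hosts file (defaults to /etc/hosts)
ip_owner() {
    awk -v ip="$1" '$1 == ip { print $2; exit }' "${2:-/etc/hosts}"
}

# Example:
#   owner=$(ip_owner 10.10.12.1)
#   [ -n "$owner" ] && echo "IP already used by $owner"
```

An empty result means the address is free as far as the hosts file is concerned; it says nothing about /etc/ethers or about machines that are not listed.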
Second class of errors: the create command succeeds, but "xm create $vmname" fails. In almost every case this comes from your newly generated VM config file. You need to check "/etc/xen/vm/$vmname" and try to understand what went wrong. You may have an invalid config, for example in your disk definition; this is especially common when you generate a config on an existing distribution while bypassing the previous installation steps.
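When checking the generated config, it helps to look only at the entries that decide how the VM boots. A small sketch, assuming the standard xm config keys (`bootloader`, `kernel`, `ramdisk`, `root`, `disk`, `vif`) each start on their own line; `show_boot_settings` is a hypothetical helper, not part of the Fridu scripts.

```shell
# show_boot_settings is a hypothetical helper, not part of the Fridu scripts.
# It prints the lines of an xm config file that decide how the VM boots.
# $1 = path to the VM config file
show_boot_settings() {
    grep -E '^[[:space:]]*(bootloader|kernel|ramdisk|root|disk|vif)[[:space:]]*=' "$1"
}

# Example:
#   show_boot_settings /etc/xen/vm/xen-test
```

If a `disk` or `vif` list is spread over several lines, only its first line is shown, so open the file itself when something looks truncated.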
Third class of errors: your VM starts, you can connect to it with "xm console", but it does not do what you want. In most cases the problem comes from the network, and you can still connect through "xm console"; in rarer situations the kernel fails to boot.
- Cannot find mandatory device. After a relatively long wait, you land in a "mini shell". What happened is that your kernel failed to mount your root filesystem and you are sitting in the initrd ramdisk. This probably comes from an invalid "root" value in one of your boot attributes, but it may also come from a wrong initrd. At this level you have almost no commands, but you can still use the "/sys" pseudo filesystem to browse around. For example, "ls /sys/block" lets you check which disks are visible; then make sure that "root=/dev/xxx" inside the VM config points to the right one.
- Kernel panic. This is a little more complex to handle, as you do not have any running system. In most cases it comes from a wrong initrd; a typical example is trying to mount a reiserfs-formatted logical volume with an initrd that only contains ext3. The best option is to read the messages, and possibly to mount your VM LV from Dom-0 and look inside for what could be wrong.
- Your VM starts, you can connect from the console, but your network is not working. This is by far the most typical case. The first thing to do is to check the basic network infrastructure: is your DNS running? Does it listen on the Dom-0 bridge port? Is your VM IP within the bridge IP range? ... If you do not find any obvious error, the best approach is to use "tcpdump" in conjunction with "dhcpcd-test" and ping. Connect to the console to force a request, and check from Dom-0 that packets are coming from your VM. When this is done, run tcpdump inside the VM to see whether packets pass from Dom-0 to your VM.
| # From VM: Force DHCP requests
Xen-test:~# dhcpcd-test eth0
dhcpcd: MAC address = aa:bb:cc:dd:80:11
(dhcpcd will lock at this point)
# From Dom-0: check that packets come from the VM
Dom-0:~ # tcpdump -i xen-br0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on xen-br0, link-type EN10MB (Ethernet), capture size 96 bytes
01:25:24.441978 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from xen-test, length:
01:25:24.563211 IP xen-br0.bootps > xen-test.bootpc: BOOTP/DHCP, Reply, length: 322
01:25:24.563572 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from xen-test, length:
01:25:24.508451 IP xen-br0.bootps > xen-test.bootpc: BOOTP/DHCP, Reply, length: 32
# Alternatively you can hard assign the VM IP from inside and ping
Xen-test:~# ifconfig eth0 10.10.12.4 netmask 255.255.255.0
Xen-test:~# ping 10.10.12.1 (try to ping my local dom-0 bridge)
# Check from Dom-0 directly on the VM VIF interface
Dom-0:~ # ifconfig xen-test
Dom-0:~ # tcpdump -i xen-test
Note:
|
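The question "is your VM IP within the bridge IP range?" is easy to get wrong by eye when two octets are swapped. Here is a minimal sketch of that check in plain shell arithmetic; `ip_to_int` and `ip_in_subnet` are hypothetical helpers, not part of the Fridu scripts, and they assume well-formed dotted IPv4 addresses.

```shell
# Hypothetical helpers, not part of the Fridu scripts.

# Convert a dotted IPv4 address into a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address on "." into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed when the VM address and the bridge address share the same network.
# $1 = VM IP, $2 = bridge IP, $3 = netmask
ip_in_subnet() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Example:
#   ip_in_subnet 10.10.12.4 10.10.12.1 255.255.255.0 && echo "same network"
```

With a 255.255.255.0 mask, 10.10.12.4 shares a network with the 10.10.12.1 bridge, while an address such as 10.10.21.4 does not and would never receive a ping reply from it.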
Creating your Firewall and Security Zones
The distribution ships with three samples. Whatever your target, I advise you to draw your target network first and build your rules from the drawing.
Let's start with a simple case. We only have one zone, with one external IP and two Xen virtual machines. Access to the virtual machines is done by remapping the ssh port from external to internal, and each VM should have full access to the Internet.
- ssh -p 22 root@MyInternetIP (connect dom0)
- ssh -p 23 root@MyInternetIP (connect Xen-One )
- ssh -p 24 root@MyInternetIP (connect Xen-Two)
| ----- Internet --------------
              |
   Your ISP FIX Internet IP
      (dom0 10.10.11.5)
              |
---------------------------------
   (zone-0)            (Future zone)
xen-br0(10.10.12.1)
+---------------------------+
|                           |
10.10.12.4             10.10.12.5
Xen-One                Xen-Two |
We have only one zone. Each zone is defined with:
- Name (whatever you want, as long as it is unique)
- Network interface, which is used to allow incoming traffic
- External IP address, which is used to allow incoming traffic and to NAT outgoing traffic.
- Bridge, which defines the entry point for the given zone
- Zone IP address and the network mask that will contain all VM IPs
We have three applications. Each of them is defined with:
- Name (whatever you want, as long as it is unique)
- Name of the zone the application sits in, as defined previously. Note that "none" is a special zone pointing to Dom0
- External port/protocol that Internet users will see. Note that the external interface and IP are inherited from the zone definition.
- Internal VM target IP and port, in case we remap the application.
| # My first simple sample of Firewall
# ---------------------------------------------------
CreateZone NAME=ZA NIC=eth0 EXT=MyExternalIP BR=xen-br0 INT=10.10.12.1 MASK=255.255.255.0
CreateApp NAME=DOM_SSH ZONE=none EXT=tcp:22 INT=eth0 ;# special builtin for Dom0
CreateApp NAME=VM1_SSH ZONE=ZA EXT=tcp:23 INT=10.10.12.4:22
CreateApp NAME=VM2_SSH ZONE=ZA EXT=tcp:24 INT=10.10.12.5:22 |
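To make the port remapping less abstract, here is a sketch of the kind of plain iptables rule that an application line such as VM1_SSH roughly corresponds to. This is an illustration under assumptions: the rules and chain names actually generated by the Fridu firewall script may differ, `dnat_rule` is a hypothetical helper, and 192.0.2.10 is a documentation address standing in for MyExternalIP.

```shell
# dnat_rule is a hypothetical helper, not part of the Fridu scripts.
# It prints (does not apply) a destination NAT rule for one application.
# $1 = external IP, $2 = proto:port (e.g. tcp:23), $3 = internal ip:port
dnat_rule() {
    proto=${2%%:*}       # text before the first ":"  -> tcp
    port=${2#*:}         # text after the first ":"   -> 23
    echo "iptables -t nat -A PREROUTING -d $1 -p $proto --dport $port -j DNAT --to-destination $3"
}

# Example:
#   dnat_rule 192.0.2.10 tcp:23 10.10.12.4:22
```

Printing the rule rather than applying it keeps the sketch safe to run on any machine, including one without root access.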
An advanced sample; this is more or less the one we run for Fridu in production.
- One network interface
- Two Internet IPs
- One VPN for admin and internal services.
- Two zones (one per admin). The first one supports two VMs while the second one only holds one.
- VM-Common services (LDAP, backup, IMAP, ...)
- VM-Fulup (WebMail, Photos, Video Streaming, ...)
- VM-Domi (Joomla)
- VMs can see each other within and outside their containing zone.
- VMs can route to any VPN endpoint and vice versa
|  Admin Users                    Internet Users
 (10.10.95.40)                         |
     [VPN]                     [Operator Network]
-----------------------------------------------------------------------
                            - internet -
-----------------------------------------------------------------------
     [VPN]               [Nat+Dnat]                  [Nat+Dnat]
       |                     |                           |
(tun0 10.10.95.38)   (eth0=91.121.65.140)    (eth0:fail=87.98.130.56)
  Admin Network                  Fridu Fix IPs at OVH
       |                                  |
---------+--------------Routing----------------+-------------
         |                                     |
(snat:91.121.65.140)                  (snat:87.98.130.56)
         |                                     |
     (zFulup)                               (zDomi)
xen-br2(10.10.1.1)                    xen-br3(10.10.2.1)
+---------------------+         +---------------------------+
|                     |                       |
10.10.1.2        10.10.12.5               10.10.2.2
Common            WEB-Ful                  WEB-Dom |
In this example we have two zones. While both are mapped onto the same physical network interface (eth0), they use different external IP addresses.
zFulup is mapped onto 91.121.65.140
- Accepts external traffic on port 25 and maps it locally onto vm(Common) port 2525
- Accepts external traffic on port 80 and maps it onto vm(Web-Fulup) port 8080
- Accepts unlimited internal traffic
zDomi is mapped onto 87.98.130.56
- Accepts external traffic on port 80 and maps it onto vm(Web-Domi) port 8080
- Accepts external traffic on port 22 and maps it onto port 22 (ssh)
Dom0 ("none" zone) accepts port 22 (ssh) and OpenVPN (udp 44096 and tcp 563)
- No special restriction on zones
- VPN (tun0) can talk to any VM and vice versa.
- VMs can talk inside and outside their own zone.
| # Zones definition
# -----------------
IP_ONE=91.121.65.140
IP_TWO=87.98.130.56
CreateZone NAME=zFulup NIC=eth0 EXT=$IP_ONE BR=xen-br1 INT=10.10.1.1 MASK=255.255.255.0
CreateZone NAME=zDominig NIC=eth0 EXT=$IP_TWO BR=xen-br2 INT=10.10.2.1 MASK=255.255.255.0
# Application Port & Forwarding (default ACCEPT = none)
# ------------------------------------------------------
CreateApp NAME=DOM0_SSH ZONE=none EXT=tcp:22 INT=eth0 ;# special builtin for Dom0
CreateApp NAME=DOM0_VPNt ZONE=none EXT=tcp:563 INT=eth0 ;# OpenVPN in TCP
CreateApp NAME=DOM0_VPNu ZONE=none EXT=udp:44096 INT=eth0 ;# check Custom Rules later
# incoming mail and web are redirected to 2525 and 8080 in order to allow a different VPN mapping
CreateApp NAME=FulupSMTP ZONE=zFulup EXT=tcp:25 INT=10.10.1.2:2525
CreateApp NAME=FulupWEB ZONE=zFulup EXT=tcp:80 INT=10.10.1.3:8080
CreateApp NAME=DomiSMTP ZONE=zDominig EXT=tcp:25 INT=10.10.1.2:2525
CreateApp NAME=DomiWEB ZONE=zDominig EXT=tcp:80 INT=10.10.2.2:8080
CreateApp NAME=DomiSSH ZONE=zDominig EXT=tcp:22 INT=10.10.2.2:22
# Zone Restriction (default input/out ACCEPT all)
# -----------------------------------------------
# TuneZone NAME=BR2_NOSMPT ZONE=ZA DIR=out ACTION=DROP PORT=tcp:25
# TuneZone NAME=BR3_NOSMPT ZONE=ZA DIR=out ACTION=DROP PORT=tcp:25
# User Before/After Zone Custom Tables (before-input|output|forwarding, after-input|...)
# ----------------------------------------------------------------------------------
if test "$ACTION" = "start" ; then
  DoIt iptables -A after-forwarding -i xen-br+ -o xen-br+ -j ACCEPT # allow VM to talk together
  DoIt iptables -A after-forwarding -i tun+ -o xen-br+ -j ACCEPT # allow VPN talk to zones
  DoIt iptables -A after-input -i tun+ -j ACCEPT # allow VPN talk to dom0
  DoIt iptables -A after-forwarding -i xen-br+ -o tun+ -j ACCEPT # allow Zones talk to VPN
fi |
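Each zone leaves the box through its own external IP (the "snat" labels in the drawing). As a rough illustration of what that means in plain iptables terms, the sketch below derives the zone network from the zone IP and mask and prints a source NAT rule. This is an assumption about the general technique, not the output of the Fridu firewall script; `ip_to_int`, `int_to_ip` and `snat_rule` are hypothetical helpers.

```shell
# Hypothetical helpers, not part of the Fridu scripts.

# Convert a dotted IPv4 address into a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address on "." into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Convert an integer back into a dotted IPv4 address.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print (do not apply) a source NAT rule for one zone.
# $1 = NIC, $2 = external IP, $3 = zone IP, $4 = netmask
snat_rule() {
    net=$(int_to_ip $(( $(ip_to_int "$3") & $(ip_to_int "$4") )))
    echo "iptables -t nat -A POSTROUTING -s $net/$4 -o $1 -j SNAT --to-source $2"
}

# Example, with the zFulup values from the sample above:
#   snat_rule eth0 91.121.65.140 10.10.1.1 255.255.255.0
```

Deriving the network from INT and MASK means the rule follows the zone definition automatically if you later renumber a zone.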
Guru sample: even if you do not need it today, it is important to make sure that your architecture can scale to support your future requirements. The next case has been tested in my lab; today I have neither the requirement nor the money for such a configuration, but I wanted to make sure it was possible.
- Three zones: two mapped onto the Internet with a dedicated NIC, one isolated from the Internet.
- Dom0 supports SSH and OpenVPN.
- The first zone has three VMs and the second has two.
- The first zone handles mail, web and SIP.
- The second zone handles only web.
- A tuning rule prevents the second zone from sending mail.
- Custom rules allow unlimited VPN traffic and inter-zone communication.
|  Admin Users                              Internet Users
 (10.10.95.40)                                   |
     [VPN]                               [Operator Network]
------------------------------------------------------------------------------------------------
                                      - internet -
------------------------------------------------------------------------------------------------
     [VPN]                         [Nat+Dnat]                             [Nat+Dnat]
       |                               |                                      |
(tun0 10.10.95.38)             (eth0=91.121.65.140)               (eth0:fail=87.98.130.56)
  Admin Network                              Fridu Fix IPs at OVH
       |                                               |
---------+--------------Routing----------------------------------------------------------------+-------------
    |                                  |                                      |
(no Internet)                 (snat:91.121.65.140)                  (snat:87.98.130.56)
    |                                  |                                      |
 (zAdmin)                          (zFulup)                               (zDomi)
xen-br1(10.10.1.1)            xen-br2(10.10.2.1)                    xen-br3(10.10.3.1)
+-------+          +---------------------------------------+       +--------------------+
    |              |                   |                   |       |                    |
10.10.1.2      10.10.1.3          10.10.12.4          10.10.12.5  10.10.13.2        10.10.13.3
VM-admin          SIP                Mail               WEB-Ful    WEB-Dom            Brendan |
Firewall rules for this sample are defined hereafter. Note that the Internet-isolated zone does not have any application defined. This is logical, as applications only define authorization for external traffic; we could nevertheless add a special custom rule to prevent outgoing traffic to the external world as well. Check the tune rule for the zDominig zone, which prevents every VM in this zone from sending mail, both externally and internally.
| # Zones definition
# -----------------
IP_ONE=91.121.65.140
IP_TWO=87.98.130.56
CreateZone NAME=zAdmin NIC=none EXT=none BR=xen-br1 INT=10.10.1.1 MASK=255.255.255.0
CreateZone NAME=zFulup NIC=eth0 EXT=$IP_ONE BR=xen-br2 INT=10.10.2.1 MASK=255.255.255.0
CreateZone NAME=zDominig NIC=eth0 EXT=$IP_TWO BR=xen-br3 INT=10.10.3.1 MASK=255.255.255.0
# Application Port & Forwarding (default ACCEPT = none)
# ------------------------------------------------------
CreateApp NAME=DOM0_SSH ZONE=none EXT=tcp:22 INT=eth0 ;# special builtin for Dom0
CreateApp NAME=DOM0_VPN ZONE=none EXT=udp:44096 INT=eth0 ;# check User Custom Rules later
CreateApp NAME=VMA1_SIP ZONE=zFulup EXT=udp:5600 INT=10.10.12.3:5600
CreateApp NAME=VMA1_SMTP ZONE=zFulup EXT=tcp:25 INT=10.10.12.4:25
CreateApp NAME=VMA1_WEB ZONE=zFulup EXT=tcp:80 INT=10.10.12.5:8080
CreateApp NAME=VMB1_WEB1 ZONE=zDominig EXT=tcp:80 INT=10.10.13.2:8080
CreateApp NAME=VMB1_WEB2 ZONE=zDominig EXT=tcp:8080 INT=10.10.13.3:8080
# Zone Restriction (default input/out ACCEPT all)
# -----------------------------------------------
TuneZone NAME=BR3_NOSMPT ZONE=zDominig DIR=out ACTION=DROP PORT=tcp:25 # Prevent ZB to send mail
# User Before/After Zone Custom Tables (before-input|output|forwarding, after-input|...)
# ----------------------------------------------------------------------------------
if test "$ACTION" = "start" ; then
  DoIt iptables -A after-forwarding -i xen-br+ -o xen-br+ -j ACCEPT # allow VM to talk together
  DoIt iptables -A after-forwarding -i tun+ -o xen-br+ -j ACCEPT # allow VPN talk to zones
  DoIt iptables -A after-input -i tun+ -j ACCEPT # allow VPN talk to dom0
  DoIt iptables -A after-forwarding -i xen-br+ -o tun+ -j ACCEPT # allow Zones talk to VPN
fi |
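As the number of applications grows, it becomes easy to map the same external port twice within one zone. Here is a small sketch of a sanity check, assuming the rules live in a file with one `CreateApp` per line and `KEY=VALUE` arguments as in the samples above; `find_duplicate_ports` is a hypothetical helper, not part of the Fridu scripts.

```shell
# find_duplicate_ports is a hypothetical helper, not part of the Fridu scripts.
# It prints every "zone proto:port" pair that is declared more than once.
# $1 = firewall rules file
find_duplicate_ports() {
    awk '$1 == "CreateApp" {
            zone = ""; ext = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^ZONE=/) zone = substr($i, 6)   # text after "ZONE="
                if ($i ~ /^EXT=/)  ext  = substr($i, 5)   # text after "EXT="
            }
            key = zone " " ext
            if (seen[key]++) print key                    # second or later use
         }' "$1"
}

# Example (the file name is an assumption):
#   find_duplicate_ports ./my-firewall.rules
```

The same port in two different zones is not reported, since each zone has its own external IP; commented-out lines are ignored because their first field is not `CreateApp`.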
If you find bugs or have made improvements, please let me know.





