Keeping CentOS 5 OpenVZ images up to Date with Yum

Now that I’ve been using OpenVZ for several months I’d gotten to the point where I wanted/needed to “yum update” all my VEs. I currently have 11 images running on my OpenVZ Server. I thought I could just vzctl exec … yum -y update all the VEs, but quickly ran into some issues with this brute force approach. Doing the yum -y update broke several of my VEs so I opted to restore the unrecoverable ones from backups.

For my second attempt, I opted to do each VE independently, to get a better understanding of what the best approach would be for doing mass upgrades like this now, and in the future. The first hurdle to overcome had to do with some of the VEs running out of memory (RAM) during the upgrade process.

NOTE: A VE = Virtual Environment (aka. a virtual host), while HN = Host Node

Issue #1, not enough RAM for yum to run

# RAM budgets for VEs
 
# NOTE: values are in # of pages (1 pg. = 4K)
 
% (printf "vm feature held maxheld barrier limit failcnt\n"; grep privvm /proc/bc/1*/resources)|column -t
vm                       feature      held    maxheld  barrier  limit   failcnt
/proc/bc/101/resources:  privvmpages  31904   82215    65536    69632   2
/proc/bc/102/resources:  privvmpages  119803  196620   166400   179200  9517
/proc/bc/103/resources:  privvmpages  27125   35974    65536    69632   0
/proc/bc/104/resources:  privvmpages  56251   107250   104960   115200  0
/proc/bc/105/resources:  privvmpages  73559   82926    98304    103304  0
/proc/bc/106/resources:  privvmpages  30219   68097    65536    69632   0
/proc/bc/108/resources:  privvmpages  30081   84291    65536    69632   1
/proc/bc/109/resources:  privvmpages  32790   74199    98304    103304  0
/proc/bc/110/resources:  privvmpages  40497   69408    65536    69632   1
/proc/bc/111/resources:  privvmpages  26990   35371    65536    67840   0
 
# NOTE: converted the columns to megabytes (MB), it's just easier to read
 
(printf "vm feature held maxheld barrier limit failcnt\n"; grep privvm /proc/bc/1*/resources|awk '{sub($3,$3*4096/2^20) sub($4,$4*4096/2^20) sub($5,$5*4096/2^20) sub($6,$6*4096/2^20)}1')|column -t
vm                       feature      held     maxheld  barrier  limit    failcnt
/proc/bc/101/resources:  privvmpages  124.625  321.152  256      272      2
/proc/bc/102/resources:  privvmpages  467.98   768.047  650      700      9517
/proc/bc/103/resources:  privvmpages  105.957  140.523  256      272      0
/proc/bc/104/resources:  privvmpages  219.73   418.945  410      450      0
/proc/bc/105/resources:  privvmpages  287.34   323.93   384      403.531  0
/proc/bc/106/resources:  privvmpages  118.043  266.004  256      272      0
/proc/bc/108/resources:  privvmpages  117.504  329.262  256      272      1
/proc/bc/109/resources:  privvmpages  128.086  289.84   384      403.531  0
/proc/bc/110/resources:  privvmpages  158.191  271.125  256      272      1
/proc/bc/111/resources:  privvmpages  105.43   138.168  256      265      0
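
The same page-to-megabyte conversion can also be done by assigning to the fields directly instead of calling sub(), which treats the field's value as a pattern and replaces its first match anywhere in the line. This is only a minimal sketch under the 4K page size noted above; the function name privvm_pages_to_mb is made up for illustration.

```shell
# Hypothetical helper: convert fields 3-6 (held, maxheld, barrier, limit)
# from 4 KiB pages to MB. Expects lines in the format printed by
# "grep privvm /proc/bc/1*/resources" on stdin.
privvm_pages_to_mb() {
    awk '{
        for (i = 3; i <= 6; i++)
            $i = $i * 4096 / 2^20   # pages * 4096 bytes / 1 MiB
        print
    }'
}

# Example on the HN (not run here):
#   grep privvm /proc/bc/1*/resources | privvm_pages_to_mb | column -t
```

Assigning to a field makes awk rebuild the line with single spaces, so piping through column -t is still needed to line the columns up.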

According to this data, 4 of the 11 images had gone over their allocation of RAM (VEs 101, 102, 108 and 110, the ones with a non-zero failcnt). So I tried restarting these and re-running yum update within the problem VEs. Again the update failed, so I needed to increase their allocation of memory. I didn’t want to devote more memory permanently, just enough temporarily to do the upgrade. So I used this trick to temporarily bump up a VE’s allocated memory.

# increase the RAM by 100MB
vzctl set 101 --privvmpages $((256+100))m:$((272+100))m --save
 
# ...
# do the upgrade (yum update)
# ...
 
# decrease the RAM back to the original value
vzctl set 101 --privvmpages 256m:272m --save

For the remaining 3 VEs that needed additional memory, I used these commands to increase their allocations of RAM:

# cmds. to increase RAM by 100MB
vzctl set 102 --privvmpages $((650+100))m:$((700+100))m --save
vzctl set 108 --privvmpages $((256+100))m:$((272+100))m --save
vzctl set 110 --privvmpages $((256+100))m:$((272+100))m --save
 
# cmds. to decrease
vzctl set 102 --privvmpages 650m:700m --save
vzctl set 108 --privvmpages 256m:272m --save
vzctl set 110 --privvmpages 256m:272m --save
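
The bump, update, restore sequence above could be wrapped in a small function so the original values always get put back. This is a sketch rather than something from the original procedure: the function name, the VZCTL override variable and the default of 100MB extra are all assumptions made for illustration.

```shell
# Hypothetical wrapper around the bump / update / restore steps.
# VZCTL can be overridden (e.g. VZCTL=echo) for a dry run.
VZCTL=${VZCTL:-vzctl}

# yum_update_with_extra_ram CTID BARRIER_MB LIMIT_MB [EXTRA_MB]
yum_update_with_extra_ram() {
    ctid=$1 barrier=$2 limit=$3 extra=${4:-100}

    # temporarily raise privvmpages by EXTRA_MB
    $VZCTL set "$ctid" --privvmpages "$((barrier + extra))m:$((limit + extra))m" --save || return 1

    # run the update inside the VE and remember how it went
    $VZCTL exec "$ctid" yum -y update
    status=$?

    # always put the original values back, even if yum failed
    $VZCTL set "$ctid" --privvmpages "${barrier}m:${limit}m" --save

    return $status
}

# Example (not run here):
#   yum_update_with_extra_ram 102 650 700
```

Setting VZCTL=echo before calling the function prints the three vzctl invocations instead of running them, which is a cheap way to check the numbers first.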

Issue #2, not enough diskspace for yum to run

The next snag I ran into had to do with a couple of the VEs running out of diskspace. Here are the commands I used to give them more diskspace.


…. Continue reading → Keeping CentOS 5 OpenVZ images up to Date with Yum »»

Managing OpenVZ Instance Descriptions

I recently discovered that you can assign descriptions to your OpenVZ instances. This isn’t really that surprising, but I never took the time until now to scan through the vzctl and vzlist man pages. Prior to discovering these more esoteric features of vzctl and vzlist, I’d normally just run this command to see what’s what with my OpenVZ instances:

# list of all VE instances
% vzlist -a
      CTID      NPROC STATUS  IP_ADDR         HOSTNAME                        
       101         32 running 10.1.1.101      flanders.mydom.net              
       102         49 running 10.1.1.102      lisa.mydom.net                  
       103         36 running -               bart.mydom.net                  
       104         38 running 10.1.1.104      marge.mydom.net                 
       105         34 running 10.1.1.105      homer.mydom.net                 
       106         31 running 10.1.1.106      kang.mydom.net                  
       107         27 running 10.1.1.107      kodos.mydom.net                 
       108         32 running 10.1.1.108      maude.mydom.net                 
       109         30 running 10.1.1.109      nelson.mydom.net                
       110         32 running 10.1.1.110      ralphie.mydom.net               
       111         30 running 10.1.1.111      martin.mydom.net

This had worked fine, but sometimes I’d draw a blank about what’s running in each instance, hence my need for the description column. You can use the following command to see the description of each instance.

# initial description columns for VEs
% vzlist -o ctid,numproc,status,ip,hostname,description
      CTID      NPROC STATUS  IP_ADDR         HOSTNAME                         DESCRIPTION                     
       101         34 running 10.1.1.101      flanders.mydom.net               -                               
       102         49 running 10.1.1.102      lisa.mydom.net                   -                               
       103         35 running -               bart.mydom.net                   -                               
       104         38 running 10.1.1.104      marge.mydom.net                  -                               
       105         34 running 10.1.1.105      homer.mydom.net                  -                               
       106         31 running 10.1.1.106      kang.mydom.net                   -                               
       107         27 running 10.1.1.107      kodos.mydom.net                  -                               
       108         32 running 10.1.1.108      maude.mydom.net                  -                               
       109         30 running 10.1.1.109      nelson.mydom.net                 -                               
       110         32 running 10.1.1.110      ralphie.mydom.net                -                               
       111         30 running 10.1.1.111      martin.mydom.net                 -

I used the following commands to set the description for each VE.

vzctl set 101 --description "nis server" --save
vzctl set 102 --description "mail,imap,smtp server" --save
vzctl set 103 --description "samba server" --save
vzctl set 104 --description "mysql server" --save
vzctl set 105 --description "www,blog,webmail server" --save
vzctl set 106 --description "trac,git,svn server" --save
vzctl set 107 --description "trac,git,svn server (dev)" --save
vzctl set 108 --description "ldap server (dev)" --save
vzctl set 109 --description "cacti server" --save
vzctl set 110 --description "tracks server" --save
vzctl set 111 --description "wiki server" --save
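
With this many VEs, the same vzctl set calls could be driven from a list of "CTID description" lines. The following is only a sketch; the function name set_descriptions and the VZCTL override variable are made up for illustration.

```shell
# VZCTL can be overridden (e.g. VZCTL=echo) for a dry run.
VZCTL=${VZCTL:-vzctl}

# Hypothetical helper: read "CTID description..." lines on stdin
# and apply each description with vzctl set.
set_descriptions() {
    while read -r ctid desc; do
        [ -n "$ctid" ] || continue            # skip blank lines
        # keep vzctl from swallowing the rest of the list on stdin
        $VZCTL set "$ctid" --description "$desc" --save < /dev/null
    done
}

# Example (not run here):
#   set_descriptions <<'EOF'
#   101 nis server
#   102 mail,imap,smtp server
#   EOF
```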

Re-running the vzlist command from before now shows the newly added descriptions.

# filled out description columns for VEs
% vzlist -o ctid,numproc,status,ip,hostname,description
      CTID      NPROC STATUS  IP_ADDR         HOSTNAME                         DESCRIPTION                     
       101         32 running 10.1.1.101      flanders.mydom.net               nis server                      
       102         49 running 10.1.1.102      lisa.mydom.net                   mail,imap,smtp server           
       103         36 running -               bart.mydom.net                   samba server                    
       104         38 running 10.1.1.104      marge.mydom.net                  mysql server                    
       105         34 running 10.1.1.105      homer.mydom.net                  www,blog,webmail server         
       106         31 running 10.1.1.106      kang.mydom.net                   trac,git,svn server             
       107         27 running 10.1.1.107      kodos.mydom.net                  trac,git,svn server (dev)       
       108         32 running 10.1.1.108      maude.mydom.net                  ldap server (dev)               
       109         30 running 10.1.1.109      nelson.mydom.net                 cacti server                    
       110         32 running 10.1.1.110      ralphie.mydom.net                tracks server                   
       111         30 running 10.1.1.111      martin.mydom.net                 wiki server

Troubleshooting a Restore of an OpenVZ Image from one Host Node to Another

I hadn’t tried this before, but I figured I could use vzdump to save a Virtual Environment (VE) from one Host Node (HN) and restore it to another Host Node. It turns out that you can do this, but there is one gotcha that I wasted a couple of hours on, so this post is my attempt to save someone else a couple of hours, and a beacon for myself the next time I run into this and forget how to work around it.

Firstly I have 2 OpenVZ servers. One has 4 GB of RAM and the 2nd has 1 GB. I started off using vzdump to create a backup of one of my VEs running on the 4 GB HN. I then restored the backed up VE using vzdump on the 1 GB HN. Everything went smoothly up to this point, until I ran the vzlist -a command on the 1 GB HN and was presented with this message:

% vzlist -a
Warning: too large value for PHYSPAGES=0:9223372036854775807 was truncated
Warning: too large value for VMGUARPAGES=33792:9223372036854775807 was truncated
Warning: too large value for OOMGUARPAGES=26112:9223372036854775807 was truncated
      CTID      NPROC STATUS  IP_ADDR         HOSTNAME                        
       201          - stopped 192.168.1.201   flanders.bubba.net

I had never seen this message before and it appeared to be benign in terms of the VE being able to start/stop, but messages like this really get under my skin, so of course I had to burn a few hours to figure out why. There wasn’t really much in the way of blog posts or forum posts either, which kind of surprised me.

To start I researched what these 3 parameters actually controlled.

PARAMETER     Type      Definition
oomguarpages  (system)  The guaranteed amount of memory for the case the memory is “over-booked” (out-of-memory kill guarantee).
vmguarpages   (system)  Memory allocation guarantee.
physpages     (system)  Total number of RAM pages used by processes.

I then poked around and noticed that the /etc/vz/conf/0.conf file on the 1 GB HN looked like this:

% more /etc/vz/conf/0.conf 
# This is configuration file for VE0.
# Only UB parameters are processed
#
# Copyright (C) 2006-2008, Parallels, Inc. All rights reserved.
 
ONBOOT="no"
 
# UBC parameters (in form of barrier:limit)
OOMGUARPAGES="2147483647:2147483647"

While on the 4 GB HN system it looked like this:

% more /etc/vz/conf/0.conf 
# This is configuration file for VE0.
# Only UB parameters are processed
#
# Copyright (C) 2006-2008, Parallels, Inc. All rights reserved.
 
ONBOOT="no"
 
# UBC parameters (in form of barrier:limit)
OOMGUARPAGES="9223372036854775807:9223372036854775807"

It then dawned on me that the limit portion of these parameters was the problem. The values are in the form barrier:limit, so the limit is the number that comes after the colon in these lines:

Warning: too large value for PHYSPAGES=0:9223372036854775807 was truncated
Warning: too large value for VMGUARPAGES=33792:9223372036854775807 was truncated
Warning: too large value for OOMGUARPAGES=26112:9223372036854775807 was truncated

So a quick change of all the values from 9223372036854775807 (2^63 - 1, the largest signed 64-bit integer) to 2147483647 (2^31 - 1, the largest signed 32-bit integer) fixed the problem.
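
That substitution could be scripted roughly as follows. This is a sketch only: the function name clamp_ubc_max is made up, it prints the adjusted config to stdout rather than editing in place, and the file to feed it is an assumption (presumably the restored VE's own config, which for VE 201 would be /etc/vz/conf/201.conf).

```shell
# Hypothetical helper: print FILE with every 64-bit maximum
# (9223372036854775807) replaced by the 32-bit maximum (2147483647).
clamp_ubc_max() {
    sed 's/9223372036854775807/2147483647/g' "$1"
}

# Example (not run here; the path is an assumption):
#   cp /etc/vz/conf/201.conf /etc/vz/conf/201.conf.bak
#   clamp_ubc_max /etc/vz/conf/201.conf.bak > /etc/vz/conf/201.conf
```

Working from a backup copy keeps the original values around in case the restored VE ever moves back to the larger HN.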

This post doesn’t even scratch the surface of memory management within OpenVZ. For more info about OpenVZ memory management you might want to check out these useful links:

Howto Stop Clock Drift Issues on a CentOS 5 OpenVZ Host Node

I’ve been dealing with a nagging problem for several months. The problem? I can’t get my OpenVZ Host Node, aka. HN, to keep consistent time. I’ve even been running ntpd, and the time would still invariably drift so that the HN and the Virtual Environments, aka. VEs, running on the HN were all several hours behind. Simply restarting ntpd would temporarily fix the situation, but I shouldn’t have to do that.

So I finally hunkered down and figured out how to solve this problem. These 2 threads proved extremely helpful in determining the fix.

These threads offered 2 fixes to try. The first was to modify the ntp.conf file so that any server definitions included the burst & iburst switches, like this:

# /etc/ntp.conf
...
server 192.168.1.6 burst iburst
...

Adding these switches seemed to help, but then my /var/log/messages log started getting filled with synch attempts by ntpd like these:

Jul 10 03:50:18 mulder ntpd[15032]: synchronized to LOCAL(0), stratum 10
Jul 10 03:51:24 mulder ntpd[15032]: synchronized to 192.168.1.6, stratum 3
Jul 10 03:59:59 mulder ntpd[15032]: synchronized to LOCAL(0), stratum 10
Jul 10 04:01:21 mulder ntpd[15032]: synchronized to 192.168.1.6, stratum 3
Jul 10 04:11:52 mulder ntpd[15032]: synchronized to LOCAL(0), stratum 10
Jul 10 04:15:05 mulder ntpd[15032]: synchronized to 192.168.1.6, stratum 3
Jul 10 04:27:56 mulder ntpd[15032]: synchronized to LOCAL(0), stratum 10
Jul 10 04:30:06 mulder ntpd[15032]: synchronized to 192.168.1.6, stratum 3
Jul 10 04:41:53 mulder ntpd[15032]: synchronized to LOCAL(0), stratum 10
Jul 10 04:47:17 mulder ntpd[15032]: synchronized to 192.168.1.6, stratum 3
Jul 10 05:03:28 mulder ntpd[15032]: synchronized to LOCAL(0), stratum 10
Jul 10 05:12:05 mulder ntpd[15032]: synchronized to 192.168.1.6, stratum 3

The second suggestion was the key tip. It recommended that you add a kernel switch called clock=pmtmr to the grub.conf file. Once I tried this I received a warning message in the dmesg log stating that this switch had been deprecated, and the new switch was now clocksource=acpi_pm.

% dmesg | grep -i clock
Command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet clock=pmtmr
Kernel command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet clock=pmtmr
Warning: clock=pmtmr is deprecated. Use clocksource=acpi_pm.
Real Time Clock Driver v1.12ac

So I modified the grub.conf to incorporate the clocksource switch, rebooted the system, and was rewarded with a cleaner dmesg.

% dmesg | grep -i clock
Command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet clocksource=acpi_pm
Kernel command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet clocksource=acpi_pm
Real Time Clock Driver v1.12ac
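
To confirm which clock switch the running kernel was actually booted with, the kernel command line can be filtered for it. A small sketch, with a made-up function name, that picks out any clock= or clocksource= token from a command line read on stdin:

```shell
# Hypothetical helper: print any clock= or clocksource= tokens
# found on a kernel command line read from stdin, one per line.
clock_param() {
    tr ' ' '\n' | grep -E '^(clock|clocksource)='
}

# Example (not run here):
#   clock_param < /proc/cmdline
```

The function prints nothing and returns non-zero if neither switch is present, which makes it easy to use in an if test.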

A quick check of /var/log/messages looked good. After the initial synchronization at startup there were no more ntpd synch messages:

Jul 11 00:30:49 mulder ntpd[24578]: ntpd 4.2.2p1@1.1570-o Thu Apr  9 12:53:31 UTC 2009 (1)
Jul 11 00:30:49 mulder ntpd[24579]: precision = 1.000 usec
Jul 11 00:30:49 mulder ntpd[24579]: Listening on interface wildcard, 0.0.0.0#123 Disabled
Jul 11 00:30:49 mulder ntpd[24579]: Listening on interface wildcard, ::#123 Disabled
Jul 11 00:30:49 mulder ntpd[24579]: Listening on interface lo, ::1#123 Enabled
Jul 11 00:30:49 mulder ntpd[24579]: Listening on interface veth103.0, fe80::218:51ff:fe43:8487#123 Enabled
Jul 11 00:30:49 mulder ntpd[24579]: Listening on interface vmbr0, fe80::222:15ff:fe91:c12d#123 Enabled
Jul 11 00:30:55 mulder ntpd[24579]: kernel time sync status 0040
Jul 11 00:30:56 mulder ntpd[24579]: frequency initialized 25.118 PPM from /var/lib/ntp/drift
Jul 11 00:42:51 mulder ntpd[24579]: synchronized to LOCAL(0), stratum 10
Jul 11 00:43:54 mulder ntpd[24579]: synchronized to 192.168.1.6, stratum 3

Just to be on the safe side, I opted to leave the burst & iburst switches on the server line in the /etc/ntp.conf file. For completeness, I researched what the burst/iburst switches do. Here are those descriptions right out of the ntp.conf man page:

burst – When the server is reachable, send a burst of eight packets instead of the usual one. The packet spacing is normally 2 s; however, the spacing between the first and second packets can be changed with the calldelay command to allow additional time for a modem or ISDN call to complete. This option is valid with only the server command and is a recommended option with this command when the maxpoll option is 11 or greater.

iburst – When the server is unreachable, send a burst of eight packets instead of the usual one. The packet spacing is normally 2 s; however, the spacing between the first and second packets can be changed with the calldelay command to allow additional time for a modem or ISDN call to complete. This option is valid with only the server command and is a recommended option with this command.

After waiting for ~2 hours, I still did not see any more of the ntp synch messages in /var/log/messages. Since the burst/iburst switches don’t seem to be causing any additional problems, I’m going to leave them on for now, though I may remove them at a later date once I’m sure that the time remains stable.
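
Rather than eyeballing the log, the "synchronized to" messages can be counted per time source. This is a sketch with a made-up function name; it assumes the syslog line layout shown in the excerpts above (month, day, time, host, ntpd[pid]:, then the message).

```shell
# Hypothetical helper: count "synchronized to <source>" messages per
# source in syslog-format ntpd lines read from stdin.
count_ntp_syncs() {
    awk '$5 ~ /^ntpd\[/ && $6 == "synchronized" && $7 == "to" {
             src = $8
             sub(/,$/, "", src)      # drop the trailing comma
             n[src]++
         }
         END { for (s in n) print s, n[s] }' | LC_ALL=C sort
}

# Example (not run here):
#   count_ntp_syncs < /var/log/messages
```

A count for LOCAL(0) that keeps climbing would mean ntpd is still falling back to the local clock.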

More Info About OpenVZ and Memory

Found this extremely useful post over on maxgarrick.com about how OpenVZ resource limits work. I’m re-posting a very handy diagram that was part of the blog post so that I have a local copy for my own references here:

[Diagram: OpenVZ memory resource limits, from the maxgarrick.com post]

I highly recommend that you check out the post if you want a great description of how the memory resource limits work in OpenVZ.

Backing Up OpenVZ

Now that I’ve been using OpenVZ for several months I thought it would be a good time to devote a few cycles to putting together a backup routine for the 10+ OpenVZ VEs that I’ve set up. Here is the procedure I put together to back up one of my VEs. I should mention that all my VEs make use of automounts, and through experimentation I figured out that I need to stop the automounter while I’m performing a backup of a VE. From my HN, I determined that this was the minimal set of commands needed to back up a single VE.

vzctl exec 101 /etc/init.d/autofs stop			# stop automounter
vzdump --compress --suspend 101				# backup VE(Virtual Environment) 101
vzctl exec 101 /etc/init.d/autofs start			# start automounter
mv vzdump-101.log vzdump-101.tgz /mnt/vz_backups/	# move backed up file to on-line storage area

Here is the full transcript from a backup

% vzctl exec 101 /etc/init.d/autofs stop
Stopping automount: [  OK  ]
 
% vzdump --compress --suspend 101
INFO: Starting new backup job - vzdump --compress --suspend 101
INFO: Starting Backup of VM 101 (openvz)
INFO: status = CTID 101 exist mounted running
INFO: starting first sync /vz/private/101 to /var/tmp/vzdumptmp22651
INFO: Number of files: 35178
INFO: Number of files transferred: 25940
INFO: Total file size: 800591555 bytes
INFO: Total transferred file size: 616512883 bytes
INFO: Literal data: 616512883 bytes
INFO: Matched data: 0 bytes
INFO: File list size: 712364
INFO: File list generation time: 0.001 seconds
INFO: File list transfer time: 0.000 seconds
INFO: Total bytes sent: 618585522
INFO: Total bytes received: 679718
INFO: sent 618585522 bytes  received 679718 bytes  10069353.50 bytes/sec
INFO: total size is 800591555  speedup is 1.29
INFO: first sync finished (61 seconds)
INFO: suspend vps
INFO: Setting up checkpoint...
INFO:   suspend...
INFO:   get context...
INFO: Checkpointing completed succesfully
INFO: final sync /vz/private/101 to /var/tmp/vzdumptmp22651
INFO: Number of files: 35178
INFO: Number of files transferred: 0
INFO: Total file size: 800591555 bytes
INFO: Total transferred file size: 0 bytes
INFO: Literal data: 0 bytes
INFO: Matched data: 0 bytes
INFO: File list size: 712364
INFO: File list generation time: 0.001 seconds
INFO: File list transfer time: 0.000 seconds
INFO: Total bytes sent: 716728
INFO: Total bytes received: 4363
INFO: sent 716728 bytes  received 4363 bytes  480727.33 bytes/sec
INFO: total size is 800591555  speedup is 1110.25
INFO: final sync finished (1 seconds)
INFO: resume vps
INFO: Resuming...
INFO: vps is online again after 3 seconds
INFO: creating archive '/vz/dump/vzdump-101.dat' (/var/tmp/vzdumptmp22651/101)
INFO: Total bytes written: 641587200 (612MiB, 11MiB/s)
INFO: file size 201MB
INFO: Finished Backup of VM 101 (00:02:02)
 
% vzctl exec 101 /etc/init.d/autofs start
Starting automount: [  OK  ]
 
% mv vzdump-101.log vzdump-101.tgz /mnt/vz_backups/.

I put together this one-liner which will handle the backing up of all 11 of my VEs.

for i in `seq 101 111`;do vzctl exec $i /etc/init.d/autofs stop; vzdump --compress --suspend $i; vzctl exec $i /etc/init.d/autofs start;done
mv vzdump-*.tgz /mnt/vz_backups/.
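
A slightly more defensive variant of the loop would restart the automounter even when vzdump fails and report which VE had the problem. This is a sketch, not the routine used above; the function name backup_ve and the VZCTL/VZDUMP override variables are assumptions made for illustration.

```shell
# Both commands can be overridden (e.g. VZCTL=echo VZDUMP=echo) for a dry run.
VZCTL=${VZCTL:-vzctl}
VZDUMP=${VZDUMP:-vzdump}

# backup_ve CTID: stop autofs, dump the VE, then restart autofs
backup_ve() {
    ctid=$1
    $VZCTL exec "$ctid" /etc/init.d/autofs stop
    $VZDUMP --compress --suspend "$ctid"
    status=$?
    # restart the automounter whether or not the dump succeeded
    $VZCTL exec "$ctid" /etc/init.d/autofs start
    return $status
}

# Example (not run here):
#   for i in $(seq 101 111); do
#       backup_ve "$i" || echo "backup of VE $i failed" >&2
#   done
```

Moving the resulting .tgz files to the on-line storage area would still be a separate step after the loop finishes.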