
Tuning Oracle VM for Oracle 10g / 11g RAC

This article covers best practices for running Oracle 10g/11g RAC on Oracle VM 2.2. Virtualization is a hot topic in the Oracle world too, and now that there are several virtualization solutions, the question of comparison naturally arises. Of VMware Server, VMware ESXi (hypervisor), Oracle VirtualBox, and Oracle VM, I found that Oracle VM suits my requirements best (performance is key for me), and I am quite impressed with its performance. Comparing or benchmarking these solutions is not the topic of this article.

I have observed node evictions on Oracle VM guests, and it took some tuning in Oracle VM to make RAC stable and avoid them. Only the tuning is covered here. Step-by-step installation of Oracle 10gR2 RAC on Oracle VM Server is covered in a separate post.

The following are the server names:
Oracle VM Server (Dom-0): ovm.freeoraclehelp.com
Oracle RAC Node1 (VM Guest): rac1.freeoraclehelp.com
Oracle RAC Node2 (VM Guest): rac2.freeoraclehelp.com

Network Requirements
Oracle RAC servers (VM guests) need at least two network interfaces: one for the public network, which carries almost all traffic to the server (client connections, VIP, and SCAN), and one for the private interconnect, which Oracle RAC uses exclusively for inter-node communication. In a real production setup you may also want to bond multiple interfaces.

Oracle VM builds Xen bridges on top of the physical interfaces, and VM guests attach to these bridges. So, to give a guest two interfaces, we need at least two bridges on the Oracle VM Server. The Oracle VM Server installation creates network bridges based on the physical network interfaces present in the server. My server has two network interfaces, so Oracle VM created two bridges (xenbr0 and xenbr1) by default.

However, if you don’t have two xen bridges, you can create the second bridge by:

[root@ovm ~]# brctl addbr xenbr1
Then, assign these bridges to VM Guests with or without MAC addresses.
[root@ovm ~]# grep ^vif /etc/xen/rac?
/etc/xen/rac1:vif = [ 'bridge=xenbr0',  'bridge=xenbr1', ]
/etc/xen/rac2:vif = ['mac=00:16:3e:29:ae:b5, bridge=xenbr0', 'mac=00:16:3e:29:ae:b6, bridge=xenbr1',]
[root@ovm ~]# 
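The vif lines above can also be checked with a small script before starting the guests. This is only a sketch: the function names count_vif_bridges and has_two_bridges are made up for illustration, and it assumes the vif definition sits on a single line, as in the listing above.

```shell
# Count the bridge= entries on the vif line of a Xen guest config file.
# Prints the count; a RAC guest needs at least 2 (public + private).
# Assumes the whole vif list is written on one line.
count_vif_bridges() {
    cfg=$1
    grep '^vif' "$cfg" | grep -o 'bridge=' | wc -l | tr -d ' '
}

# Succeed only if the guest config attaches at least two bridges.
has_two_bridges() {
    [ "$(count_vif_bridges "$1")" -ge 2 ]
}
```

For example, "has_two_bridges /etc/xen/rac1 || echo 'rac1 needs a second bridge'" would flag a guest that has only a public interface.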
In this example, physical interface eth0 backs xenbr0 and eth1 backs xenbr1:
[root@ovm ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=static
HWADDR=00:XX:XX:XX:XX:34
IPADDR=192.168.1.99
NETMASK=255.255.255.0
NETWORK=192.168.1.0
ONBOOT=yes
[root@ovm ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=static
HWADDR=00:XX:XX:XX:XX:35
ONBOOT=yes
[root@ovm ~]# 
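If the Oracle VM Server has more than two interfaces, you can bond them and build the bridges on the bonds. In this example, eth0 and eth1 are teamed as bond0, which backs xenbr0; eth2 and eth3 are teamed as bond1, which backs xenbr1. xenbr1 carries the private interconnect traffic, and no IP address is assigned to it. Because the bridges are then defined in the ifcfg files, first stop the default Xen bridges and create a dummy network script that does nothing: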
[root@ovm ~]# /etc/xen/scripts/network-bridges stop

[root@ovm ~]# cat > /etc/xen/scripts/network-bridge-dummy <<EOF
#!/bin/sh
/bin/true
EOF
[root@ovm ~]# chmod +x /etc/xen/scripts/network-bridge-dummy

Change /etc/xen/xend-config.sxp to call the new script:
Replace
(network-script network-bridges)
with
(network-script network-bridge-dummy)
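The replacement above can be scripted rather than done by hand. The following is a minimal sketch, assuming the active line reads exactly as shown above; set_network_script is a made-up helper name, and it keeps a .bak copy of the file.

```shell
# Point xend at the dummy network script, keeping a .bak copy of the file.
# $1: path to the xend configuration (for example /etc/xen/xend-config.sxp)
# Only an uncommented line that starts with "(network-script" is changed.
set_network_script() {
    sxp=$1
    cp -p "$sxp" "$sxp.bak" || return 1
    sed 's/^(network-script network-bridges)/(network-script network-bridge-dummy)/' \
        "$sxp.bak" > "$sxp"
}
```

Writing the result back into the original file (instead of moving a new file over it) leaves the ownership and permissions of xend-config.sxp as they were.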
Configure bond interfaces:
[root@ovm ~]# cat /etc/modprobe.conf
alias bond0 bonding
alias bond1 bonding
..
..

[root@ovm ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
USERCTL=no
HWADDR=00:XX:XX:XX:XX:34

[root@ovm ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
USERCTL=no
HWADDR=00:XX:XX:XX:XX:35

[root@ovm ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
ONBOOT=yes
BRIDGE=xenbr0
BONDING_OPTS="mode=active-backup miimon=100 downdelay=200 updelay=200 use_carrier=1"

[root@ovm ~]# cat /etc/sysconfig/network-scripts/ifcfg-xenbr0
DEVICE=xenbr0
ONBOOT=yes
STP=off
IPADDR=192.168.1.99
NETMASK=255.255.255.0
NETWORK=192.168.1.0

[root@ovm ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth2
DEVICE=eth2
BOOTPROTO=none
ONBOOT=yes
MASTER=bond1
SLAVE=yes
USERCTL=no
HWADDR=00:XX:XX:XX:XX:36

[root@ovm ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth3
DEVICE=eth3
BOOTPROTO=none
ONBOOT=yes
MASTER=bond1
SLAVE=yes
USERCTL=no
HWADDR=00:XX:XX:XX:XX:37

[root@ovm ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
ONBOOT=yes
BRIDGE=xenbr1
BONDING_OPTS="mode=active-backup miimon=50 downdelay=200 updelay=200 use_carrier=1"

[root@ovm ~]# cat /etc/sysconfig/network-scripts/ifcfg-xenbr1
DEVICE=xenbr1
ONBOOT=yes
STP=off
[root@ovm ~]# 
Reboot the VM Server (Dom-0) for the network changes to take effect.
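Before rebooting, it can help to confirm that every physical interface points at the bond you intended. The sketch below only reads the ifcfg files; list_bond_slaves is a made-up helper name, not part of Oracle VM.

```shell
# Print the DEVICE name of every ifcfg file in directory $1
# whose MASTER is the bond named in $2.
list_bond_slaves() {
    dir=$1
    bond=$2
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        if grep -q "^MASTER=$bond\$" "$f"; then
            sed -n 's/^DEVICE=//p' "$f"
        fi
    done
}
```

With the files shown above, "list_bond_slaves /etc/sysconfig/network-scripts bond0" should print eth0 and eth1. After the reboot, the live state of a bond can be read from /proc/net/bonding/bond0.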

Disk Tuning
I recommend non-sparse (fully preallocated) disk images for Oracle RAC, because disk response time is critical and disk timeouts cause node evictions. Writing zeros with dd allocates every block of the image up front:
[root@ovm ~]# dd if=/dev/zero of=/OVS2/running_pool/rac_storage/ocr_asm.img bs=1M count=400
[root@ovm ~]# dd if=/dev/zero of=/OVS2/running_pool/rac_storage/voting_asm.img bs=1M count=400
[root@ovm ~]# dd if=/dev/zero of=/OVS2/running_pool/rac_storage/asm_data1.img bs=1M count=8196
..
..
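To check whether an existing image is sparse, compare the space it occupies on disk with its apparent size. This is a rough sketch that relies on GNU stat; is_sparse is a made-up helper name, and filesystems that compress data may report fewer blocks even for a fully written file.

```shell
# Succeed if the file occupies fewer bytes on disk than its apparent size,
# which is the usual sign of a sparse file. Returns 2 if stat fails.
is_sparse() {
    img=$1
    size=$(stat -c '%s' "$img") || return 2     # apparent size in bytes
    blocks=$(stat -c '%b' "$img") || return 2   # allocated blocks
    blksz=$(stat -c '%B' "$img") || return 2    # bytes per reported block
    [ $((blocks * blksz)) -lt "$size" ]
}
```

For example, "is_sparse /OVS2/running_pool/rac_storage/asm_data1.img && echo 'image is sparse'" would warn about an image that was not fully written.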
CPU Tuning
CPU response is very important as well. I could not assign more than one vCPU during the install, so I updated the Xen guest configurations afterwards to use two vCPUs:
[root@ovm ~]# grep ^vcpu /etc/xen/rac?                     
/etc/xen/rac1:vcpus=2
/etc/xen/rac2:vcpus=2
[root@ovm ~]# 
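The same change can be made from a script. The following sketch uses a made-up helper, set_vcpus, that rewrites an existing vcpus line or appends one if it is missing. It assumes the setting sits on its own line, and the guest presumably has to be shut down and started again before the new value is used.

```shell
# Set vcpus=N in a Xen guest config file.
# $1: config file (for example /etc/xen/rac1), $2: number of vCPUs
set_vcpus() {
    cfg=$1
    n=$2
    if grep -q '^vcpus' "$cfg"; then
        # Rewrite the existing line, then copy the result back in place.
        sed "s/^vcpus[ ]*=.*/vcpus=$n/" "$cfg" > "$cfg.tmp" &&
            cat "$cfg.tmp" > "$cfg" &&
            rm -f "$cfg.tmp"
    else
        printf 'vcpus=%s\n' "$n" >> "$cfg"
    fi
}
```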
Time Synchronization
Time drift is a common problem in virtual machines (guests). RAC is time sensitive: CRS evicts nodes if the clocks of the cluster nodes differ.
On all RAC nodes (VM guests), add the following parameter to /etc/sysctl.conf so that each guest keeps its own wall clock instead of following Dom-0, then run sysctl -p to make it effective immediately:
[root@rac1 ~]# grep ^xen /etc/sysctl.conf
xen.independent_wallclock=1
[root@rac1 ~]# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
kernel.shmall = 2097152
kernel.shmmax = 2147551232
kernel.shmmni = 4096
kernel.sem = 256 32000 100 142
fs.file-max = 6815744
net.ipv4.ip_local_port_range = 10000 65000
kernel.msgmni = 2878
kernel.msgmax = 8192
kernel.msgmnb = 65535
xen.independent_wallclock = 1
fs.aio-max-nr = 1048576
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048586
[root@rac1 ~]# 
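When several guests have to be changed, the parameter can be appended only where it is missing, so that running the step twice does not create duplicate lines. This is a sketch with made-up helper names (has_sysctl_key, ensure_sysctl_line); it does not change a key that is already present with a different value.

```shell
# Succeed if file $1 already contains an uncommented setting for key $2.
has_sysctl_key() {
    awk -F= -v k="$2" '
        { name = $1; gsub(/[ \t]/, "", name) }
        name == k { found = 1 }
        END { exit !found }
    ' "$1"
}

# Append "key=value" to file $1 unless the key is already set there.
# $2: key, $3: value
ensure_sysctl_line() {
    has_sysctl_key "$1" "$2" || printf '%s=%s\n' "$2" "$3" >> "$1"
}
```

On a guest this would be called as "ensure_sysctl_line /etc/sysctl.conf xen.independent_wallclock 1", followed by sysctl -p.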
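Do not use the standard pool host names for the NTP servers; use IP addresses instead. The listings below show the default configuration, the addresses a pool name resolves to, and the edited configuration: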
[root@rac1 ~]# grep ^server /etc/ntp.conf 
server 0.rhel.pool.ntp.org
server 1.rhel.pool.ntp.org
server 2.rhel.pool.ntp.org
server 127.127.1.0
[root@rac1 ~]# 

[root@rac1 ~]# nslookup 0.rhel.pool.ntp.org
Server:         192.168.1.1
Address:        192.168.1.1#53

Non-authoritative answer:
Name:   0.rhel.pool.ntp.org
Address: 199.4.29.166
Name:   0.rhel.pool.ntp.org
Address: 173.9.142.98
Name:   0.rhel.pool.ntp.org
Address: 72.26.125.125

[root@rac1 ~]# 


[root@rac1 ~]# grep ^server /etc/ntp.conf           
server 207.7.148.214
server 199.4.29.166
server 74.118.152.85
server 127.127.1.0
[root@rac1 ~]# 
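A quick way to confirm that no host names are left in the configuration is to list every server entry that is not a dotted IPv4 address. This is a sketch only; list_hostname_servers is a made-up helper name, and it would also report IPv6 addresses as host names.

```shell
# Print each "server" entry in an ntp.conf-style file ($1) whose address
# is not written as a dotted-quad IPv4 address.
list_hostname_servers() {
    awk '$1 == "server" && $2 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print $2 }' "$1"
}
```

An empty result from "list_hostname_servers /etc/ntp.conf" means every server line already uses an IP address.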

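Also make sure ntpd runs with the -x option, which makes it slew the clock gradually instead of stepping it, and then restart ntpd:
[root@rac1 ~]# grep OPTIONS /etc/sysconfig/ntpd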
OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid -x"
[root@rac1 ~]# /sbin/service ntpd restart
CRS Configuration
Once Oracle RAC is installed, set diagwait to 13 seconds to prevent evictions. This setting also produces more debugging information.
Stop all CRS Services:
[root@rac1 ~]# crsctl stop crs
[root@rac1 ~]# <ORA_CRS_HOME>/bin/oprocd stop
Ensure that the Clusterware stack is down on all nodes by running:
[root@rac1 ~]# ps -ef |egrep "crsd.bin|ocssd.bin|evmd.bin|oprocd"
Set diagwait to 13 seconds:
[root@rac1 ~]# crsctl set css diagwait 13 -force
Verify and start services:
[root@rac1 ~]# crsctl get css diagwait
[root@rac1 ~]# crsctl start crs
[root@rac1 ~]# crsctl check crs
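The check that the stack is down, used in the procedure above, can be wrapped in a small function so that a script refuses to continue while any daemon is still running. This is a sketch: clusterware_procs and stack_is_down are made-up names, and the process list is the same one used in the ps command above.

```shell
# Read "ps -ef" style output on stdin and print only the lines that belong
# to clusterware daemons, dropping the grep process itself.
clusterware_procs() {
    grep -E 'crsd\.bin|ocssd\.bin|evmd\.bin|oprocd' | grep -v grep
}

# Succeed only if no clusterware daemon is running on this node.
stack_is_down() {
    [ -z "$(ps -ef | clusterware_procs)" ]
}
```

A script could then run "stack_is_down || { echo 'clusterware still running'; exit 1; }" on each node before changing diagwait.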
If your network is slow and you see a lot of heartbeat failures, you can increase misscount from the default of 30 to 60:
[root@rac1 ~]# crsctl set css misscount 60
Remove the automatic startup link for CRS. Instead, wait two minutes after the guest has started and then run /etc/init.d/init.crs start manually.
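The two-minute wait can be derived from the guest's uptime, so the start is not delayed longer than needed when the guest has already been up for a while. This is a sketch with made-up function names (remaining_wait, start_crs_after_delay); the 120-second default mirrors the two minutes suggested above.

```shell
# Print how many seconds remain until the uptime reaches the target.
# $1: current uptime in whole seconds, $2: target uptime in seconds
remaining_wait() {
    up=$1
    target=$2
    if [ "$up" -ge "$target" ]; then
        echo 0
    else
        echo $((target - up))
    fi
}

# Wait until the guest has been up for $1 seconds (default 120),
# then start CRS with the init script named in the text above.
start_crs_after_delay() {
    up=$(cut -d. -f1 /proc/uptime)
    sleep "$(remaining_wait "$up" "${1:-120}")"
    /etc/init.d/init.crs start
}
```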

If a node still gets evicted after all this, clean up the following files before starting the cluster. CRS will not start unless $ORA_CRS_HOME/log/<node_name>/cssd/<node_name>.pid is an empty (0-byte) file; if it still contains a pid value, truncate it.
[root@rac1 ~]# rm -f /usr/tmp/.oracle/*
[root@rac1 ~]# rm -f /tmp/.oracle/*
[root@rac1 ~]# rm -f /var/tmp/.oracle/*
[root@rac1 ~]# > $ORA_CRS_HOME/log/<node_name>/cssd/<node_name>.pid
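The pid file condition can be tested before truncating anything. In this sketch, pid_file_needs_reset and reset_pid_file are made-up helper names, and the path is passed in by the caller.

```shell
# Succeed if the pid file exists and is not empty,
# which is the state that keeps CRS from starting.
pid_file_needs_reset() {
    [ -s "$1" ]
}

# Truncate the pid file to 0 bytes, as the last command above does.
reset_pid_file() {
    : > "$1"
}
```

For example: f=$ORA_CRS_HOME/log/<node_name>/cssd/<node_name>.pid, then "pid_file_needs_reset "$f" && reset_pid_file "$f"".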
