I have a couple of these. How to configure one is pretty easy to find (maybe later I'll go over the setup here too). The problem is that, with everything else running perfectly fine, OCFS2 partitions are never mounted automagically on boot. Just recently I found out why.
In SysV-style init (Red Hat, Debian, a whole bunch of other Linux distros and Solaris use it), each startup script has a precedence defined by a number. For example, runlevel 2 (system + networking) on the CentOS box in question is described in /etc/rc.d/rc2.d and looks like this:
[root@serverYYY ~]# ls -la /etc/rc.d/rc2.d
total 220
drwxr-xr-x 2 root root 4096 Sep 9 06:13 .
drwxr-xr-x 10 root root 4096 Jun 15 10:55 ..
lrwxrwxrwx 1 root root 17 Jun 15 10:55 K01dnsmasq -> ../init.d/dnsmasq
lrwxrwxrwx 1 root root 18 Jun 25 03:41 K01yum-cron -> ../init.d/yum-cron
lrwxrwxrwx 1 root root 22 Jul 9 06:10 K02avahi-daemon -> ../init.d/avahi-daemon
lrwxrwxrwx 1 root root 24 Jul 9 06:10 K02avahi-dnsconfd -> ../init.d/avahi-dnsconfd
lrwxrwxrwx 1 root root 13 Jun 15 10:55 K05atd -> ../init.d/atd
lrwxrwxrwx 1 root root 19 Jul 5 02:52 K10dc_server -> ../init.d/dc_server
lrwxrwxrwx 1 root root 16 Jun 15 10:56 K10psacct -> ../init.d/psacct
lrwxrwxrwx 1 root root 19 Jul 5 02:52 K12dc_client -> ../init.d/dc_client
lrwxrwxrwx 1 root root 15 Jun 25 03:41 K15httpd -> ../init.d/httpd
lrwxrwxrwx 1 root root 18 Jul 9 09:03 K15lighttpd -> ../init.d/lighttpd
lrwxrwxrwx 1 root root 20 Jun 15 10:55 K44rawdevices -> ../init.d/rawdevices
lrwxrwxrwx 1 root root 20 Jun 15 10:55 K50netconsole -> ../init.d/netconsole
lrwxrwxrwx 1 root root 15 Sep 9 06:13 K50snmpd -> ../init.d/snmpd
lrwxrwxrwx 1 root root 19 Sep 9 06:13 K50snmptrapd -> ../init.d/snmptrapd
lrwxrwxrwx 1 root root 14 Jun 15 11:54 K74nscd -> ../init.d/nscd
lrwxrwxrwx 1 root root 14 Jun 15 10:56 K74ntpd -> ../init.d/ntpd
lrwxrwxrwx 1 root root 15 Jun 15 10:55 K75netfs -> ../init.d/netfs
lrwxrwxrwx 1 root root 15 Jun 15 10:55 K80kdump -> ../init.d/kdump
lrwxrwxrwx 1 root root 15 Jun 15 10:56 K85mdmpd -> ../init.d/mdmpd
lrwxrwxrwx 1 root root 20 Jun 15 10:54 K87multipathd -> ../init.d/multipathd
lrwxrwxrwx 1 root root 18 Jun 15 10:54 K89netplugd -> ../init.d/netplugd
lrwxrwxrwx 1 root root 17 Sep 9 06:13 K89openibd -> ../init.d/openibd
lrwxrwxrwx 1 root root 15 Jun 15 10:53 K89rdisc -> ../init.d/rdisc
lrwxrwxrwx 1 root root 15 Jun 25 03:41 K95kudzu -> ../init.d/kudzu
lrwxrwxrwx 1 root root 25 Jun 15 10:56 K99readahead_later -> ../init.d/readahead_later
lrwxrwxrwx 1 root root 23 Jun 15 10:55 S00microcode_ctl -> ../init.d/microcode_ctl
lrwxrwxrwx 1 root root 22 Jun 15 11:52 S02lvm2-monitor -> ../init.d/lvm2-monitor
lrwxrwxrwx 1 root root 25 Jun 15 10:56 S04readahead_early -> ../init.d/readahead_early
lrwxrwxrwx 1 root root 18 Jun 15 10:52 S08iptables -> ../init.d/iptables
lrwxrwxrwx 1 root root 18 Jun 15 10:55 S08mcstrans -> ../init.d/mcstrans
lrwxrwxrwx 1 root root 17 Jun 15 10:55 S10network -> ../init.d/network
lrwxrwxrwx 1 root root 16 Jun 15 10:54 S11auditd -> ../init.d/auditd
lrwxrwxrwx 1 root root 21 Jun 15 10:57 S12restorecond -> ../init.d/restorecond
lrwxrwxrwx 1 root root 16 Jun 15 11:54 S12syslog -> ../init.d/syslog
lrwxrwxrwx 1 root root 20 Jun 15 10:55 S13irqbalance -> ../init.d/irqbalance
lrwxrwxrwx 1 root root 15 Jun 25 03:41 S13named -> ../init.d/named
lrwxrwxrwx 1 root root 19 Jun 25 12:58 S15mdmonitor -> ../init.d/mdmonitor
lrwxrwxrwx 1 root root 14 Jul 9 08:52 S24o2cb -> ../init.d/o2cb
lrwxrwxrwx 1 root root 15 Jul 9 06:10 S25ocfs2 -> ../init.d/ocfs2
lrwxrwxrwx 1 root root 20 Sep 9 06:13 S26lm_sensors -> ../init.d/lm_sensors
lrwxrwxrwx 1 root root 14 Jun 15 10:57 S55sshd -> ../init.d/sshd
lrwxrwxrwx 1 root root 14 Jul 9 06:10 S56cups -> ../init.d/cups
lrwxrwxrwx 1 root root 16 Jun 25 03:41 S64mysqld -> ../init.d/mysqld
lrwxrwxrwx 1 root root 17 Jun 25 03:41 S65dovecot -> ../init.d/dovecot
lrwxrwxrwx 1 root root 14 Jul 9 08:52 S70drbd -> ../init.d/drbd
lrwxrwxrwx 1 root root 19 Sep 9 06:13 S75heartbeat -> ../init.d/heartbeat
lrwxrwxrwx 1 root root 22 Jun 25 03:41 S78spamassassin -> ../init.d/spamassassin
lrwxrwxrwx 1 root root 18 Jun 15 10:56 S80sendmail -> ../init.d/sendmail
lrwxrwxrwx 1 root root 13 Jun 15 10:52 S85gpm -> ../init.d/gpm
lrwxrwxrwx 1 root root 15 Jun 15 10:55 S90crond -> ../init.d/crond
lrwxrwxrwx 1 root root 17 Jun 15 10:53 S95anacron -> ../init.d/anacron
lrwxrwxrwx 1 root root 19 Jun 25 03:41 S95saslauthd -> ../init.d/saslauthd
lrwxrwxrwx 1 root root 11 Jun 15 10:55 S99local -> ../rc.local
S means Start, K means Kill, and the number is the position in the execution sequence: the lower the number, the sooner the script is executed when the system switches into this runlevel (on boot, or manually via telinit). K scripts are also run when the system enters the runlevel, to stop services that should not be running there; the K links in rc0.d and rc6.d are the ones that run on halt and reboot. They follow the same ordering rule.
What we are interested in, in particular, are DRBD (the low-level block devices) and OCFS2 (the cluster file system that resides on those devices).
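To see the start order of a runlevel without reading the whole listing, a small helper can print just the S links in the order the shell sorts them. This is a minimal sketch: `start_order` and its `rc_dir` argument are names made up for this example, and it assumes the rc script walks the links with a plain `S*` glob, as the Red Hat style rc script does as far as I know.

```shell
#!/bin/sh
# start_order [DIR]: print the S (start) links of one runlevel directory,
# one per line, in the order a shell glob returns them.
# DIR is a hypothetical parameter; it defaults to the rc2.d directory shown above.
start_order() {
    rc_dir=${1:-/etc/rc.d/rc2.d}
    # The shell expands S* in sorted order, so S10network comes before S25ocfs2.
    for link in "$rc_dir"/S*; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$link" ] || [ -L "$link" ] || continue
        basename "$link"
    done
}
```

Calling `start_order /etc/rc.d/rc3.d` would list the start links of runlevel 3 in the same way.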
[root@serverYYY ~]# ls -la /etc/rc.d/rc2.d| grep -iE 'ocfs2|drbd'
lrwxrwxrwx 1 root root 15 Jul 9 06:10 S25ocfs2 -> ../init.d/ocfs2
lrwxrwxrwx 1 root root 14 Jul 9 08:52 S70drbd -> ../init.d/drbd
For some reason the precedence is wrong: the system first tries to mount the OCFS2 file systems (S25) and only then starts the DRBD network block devices (S70).
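The same comparison can be scripted. The sketch below pulls the two-digit sequence number out of a link name and compares two services; `seq_num` and `starts_before` are hypothetical helper names, not part of any init tooling.

```shell
#!/bin/sh
# seq_num DIR PREFIX SERVICE: print the two-digit number of a link such as
# S25ocfs2 (PREFIX is S or K). Returns 1 if no such link exists.
seq_num() {
    for link in "$1"/"$2"[0-9][0-9]"$3"; do
        [ -e "$link" ] || [ -L "$link" ] || continue
        name=${link##*/}        # strip the directory part
        name=${name#"$2"}       # strip the S or K prefix
        digits=${name%"$3"}     # strip the service name, leaving the digits
        printf '%s\n' "$digits"
        return 0
    done
    return 1
}

# starts_before DIR FIRST SECOND: succeed if FIRST has a lower S number than
# SECOND in DIR. Returns 2 if either link is missing.
starts_before() {
    a=$(seq_num "$1" S "$2") || return 2
    b=$(seq_num "$1" S "$3") || return 2
    # Drop one leading zero so "08" is compared as plain 8.
    [ "${a#0}" -lt "${b#0}" ]
}
```

With the listing above, `starts_before /etc/rc.d/rc2.d drbd ocfs2` would fail, because 70 is not lower than 25.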
How can we fix this?
We could try to decrease the DRBD sequence number, but we don't want the system to attempt to start it before networking. So it is more practical to increase the OCFS2 sequence number, so that it is started AFTER DRBD. In /etc/rc.d/rc2.d:
mv S25ocfs2 S75ocfs2
We should also repeat the procedure for runlevel 3 (system with networking and network filesystems), and for runlevels 4 and 5 just to be consistent. The loop covers runlevel 2 as well, so skip the single mv above if you use it.
cd /etc/rc.d; for i in 2 3 4 5; do cd rc$i.d; mv S25ocfs2 S75ocfs2; cd ..; done
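If the link has already been renamed in one runlevel, or is missing in another, the one-liner prints mv errors. A slightly more defensive variant is sketched below; `renumber_link` is a hypothetical helper written for this post, and it only renames a link when the old name is actually there.

```shell
#!/bin/sh
# renumber_link BASE OLD NEW LEVEL...: rename link OLD to NEW inside
# BASE/rcN.d for every listed runlevel N, skipping levels without OLD.
renumber_link() {
    base=$1 old=$2 new=$3
    shift 3
    for level in "$@"; do
        dir=$base/rc$level.d
        # -L also catches a symlink whose target does not resolve.
        if [ -L "$dir/$old" ] || [ -e "$dir/$old" ]; then
            mv "$dir/$old" "$dir/$new" && echo "rc$level.d: $old -> $new"
        else
            echo "rc$level.d: no $old, skipped"
        fi
    done
}
```

The equivalent of the loop above would be `renumber_link /etc/rc.d S25ocfs2 S75ocfs2 2 3 4 5`.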
Now, on boot, the system will start DRBD first and then attempt to mount the OCFS2 filesystem located on that network block device.
But wait, there is more. System shutdown stops subsystems in the same sequential fashion. What happens if DRBD is stopped BEFORE OCFS2 is unmounted?
Right: the FS will be left in an inconsistent state, and on the next boot you will most likely run into problems.
Let's check whether this is the case:
[root@serverYYY rc.d]# find . -name K\*drbd -o -name K\*ocfs2
./rc4.d/K08drbd
./rc0.d/K19ocfs2
./rc0.d/K08drbd
./rc1.d/K19ocfs2
./rc1.d/K08drbd
./rc6.d/K19ocfs2
./rc6.d/K08drbd
./rc5.d/K08drbd
This is exactly what happens: DRBD (K08) is stopped first, and OCFS2 (K19) is left inconsistent. The problem has to be corrected the same way: we need to decrease the OCFS2 sequence number so that it is stopped BEFORE DRBD, not after (there is no point in trying to cleanly unmount a FS when its block device is gone, is there?).
cd /etc/rc.d; for i in 0 1 6; do cd rc$i.d; mv K19ocfs2 K03ocfs2; cd ..; done
Now there is a good chance that OCFS2 will be started after DRBD and stopped before it, not the other way around.
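To double-check the result across all runlevels, something like the following could be run after the renames. It is only a sketch: `check_order` is a hypothetical name, and the rule it encodes is the one from this post (drbd before ocfs2 among the S links, ocfs2 before drbd among the K links).

```shell
#!/bin/sh
# check_order [BASE]: for every BASE/rcN.d that holds both an ocfs2 and a
# drbd link of the same kind, report whether they are in the right order.
# Returns 1 if any directory has them the wrong way around.
check_order() {
    base=${1:-/etc/rc.d}
    bad=0
    for dir in "$base"/rc[0-6].d; do
        [ -d "$dir" ] || continue
        for kind in S K; do
            ocfs2=$(ls "$dir" | sed -n "/^${kind}[0-9][0-9]ocfs2\$/p")
            drbd=$(ls "$dir" | sed -n "/^${kind}[0-9][0-9]drbd\$/p")
            # Only compare when both links of this kind exist here.
            [ -n "$ocfs2" ] && [ -n "$drbd" ] || continue
            # Whichever name sorts first is the one that runs first.
            first=$(printf '%s\n' "$ocfs2" "$drbd" | LC_ALL=C sort | head -n 1)
            # Start: drbd must run first. Stop: ocfs2 must run first.
            if [ "$kind" = S ]; then want=$drbd; else want=$ocfs2; fi
            if [ "$first" = "$want" ]; then
                echo "${dir##*/}: OK    $ocfs2 / $drbd"
            else
                echo "${dir##*/}: WRONG $ocfs2 / $drbd"
                bad=1
            fi
        done
    done
    return $bad
}
```

Running `check_order` with no argument inspects /etc/rc.d; a non-zero exit status means at least one runlevel still has the two scripts in the wrong order.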