SG nodes fail when starting the multi-node package - This thread has been closed

Dima Kouznetsov
Oct 27, 2009 20:07:39 GMT    Attachment is 343385.txt 

Hello, I have been stuck on this one for a few days. Any advice would be appreciated:

I am configuring Serviceguard with Veritas Cluster File System on HP-UX 11i v3 and Oracle 10g R2 RAC. Right now I have Serviceguard running on two Integrity Virtual Machines hosted on an rx8640 server.

<root@vguest1:/>swlist | grep -i serviceguard
  T1905CA      A.11.19.00   Serviceguard
  T2777CB      A.02.01      HP Serviceguard Cluster File System for RAC
  PHSS_40152   1.0          Serviceguard A.11.19.00

I loaded the latest Patches from the ITRC website for Serviceguard Storage Management Suite A.02.01 for HP-UX 11i v3.

The cluster starts and runs just fine:
<root@vguest2:/etc/cmcluster/cfs>cmviewcl -v

CLUSTER      STATUS
cluster1     up

  NODE         STATUS       STATE
  vguest1      up           running

    Cluster_Lock_LVM:
    VOLUME_GROUP     PHYSICAL_VOLUME      STATUS
    /dev/vglock      /dev/dsk/c0t0d0      up

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           0/0/1/0      lan0
    STANDBY      up           0/0/2/0      lan1

  NODE         STATUS       STATE
  vguest2      up           running

    Cluster_Lock_LVM:
    VOLUME_GROUP     PHYSICAL_VOLUME      STATUS
    /dev/vglock      /dev/dsk/c0t2d0      up

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           0/0/2/0      lan0
    STANDBY      up           0/0/1/0      lan1

The problem comes when I create and start the package:
<root@vguest2:/etc/cmcluster/cfs>cfscluster config
CVM is now configured
<root@vguest2:/etc/cmcluster/cfs>cfscluster start
Starting CVM...

I attached a text file with the error output from that step. Basically, one node fails with an error saying a crucial package failed, then the other node goes down with the same error, and both nodes reboot.

Any ideas for what may be causing this, or what I can try to fix it? I will attach the SG-CFS-pkg.log in my next post (this only lets me attach one file it looks like).

Thank you for your help,
Dima
Dima Kouznetsov
Oct 27, 2009 20:08:52 GMT    N/A: Question Author    Attachment is 343370.log

Attached is the SG-CFS-pkg.log

Dima
Steven E. Protter
Oct 27, 2009 20:12:01 GMT  1 pts

Shalom,

Cluster works okay without this package?

Is it possible to run through another cmquerycl/cmcheckconf/cmapplyconf series on this system?

It would be nice to see what the package control script looks like versus the cluster configuration script.
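
The cmquerycl/cmcheckconf/cmapplyconf series mentioned above can be sketched roughly as follows. This is only an illustration: the configuration file path and the helper names run and rebuild_cluster_config are assumptions and do not come from the thread. The script defaults to a dry run that prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: regenerate, verify and apply a Serviceguard cluster configuration.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to
# really execute them on a cluster node.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: rebuild_cluster_config <ascii-file> <node> [<node> ...]
rebuild_cluster_config() {
    conf=$1
    shift
    node_args=""
    for n in "$@"; do
        node_args="$node_args -n $n"
    done
    # node_args is left unquoted on purpose so it splits into "-n <node>" pairs.
    run cmquerycl -v -C "$conf" $node_args &&
    run cmcheckconf -v -C "$conf" &&
    run cmapplyconf -v -C "$conf"
}

# Example (hypothetical file name, node names from this thread):
#   rebuild_cluster_config /etc/cmcluster/cluster.ascii vguest1 vguest2
```

The steps are chained with && so that a failed check stops the sequence before anything is applied. When run for real, cmapplyconf typically asks for confirmation before changing the cluster configuration.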

SEP
Dima Kouznetsov
Oct 27, 2009 20:27:51 GMT    N/A: Question Author    Attachment is 343386.txt

Thanks for the quick reply Steven. I attached the output from the commands that create the cluster from the beginning. After those commands is where I try to create the package and the error occurs:

# cfscluster config
# cfscluster start

Thanks,
Dima
freddy
Oct 28, 2009 03:47:36 GMT  2 pts

Did you run vxinstall to install the license for VxVM?

Regards
Freddy
Dima Kouznetsov
Oct 28, 2009 14:24:56 GMT    N/A: Question Author

I ran vxinstall on both nodes. I am able to view the presented disks with vxdisk list.
Stephen Doud (HP moderator, expert in this area)
Oct 28, 2009 14:25:00 GMT  4 pts

Create a 1-node cluster on the vguest2 host and attempt to build the SG-CFS-pkg. If it starts without failure, focus your attention on the other server that TOCs.

I have only seen CVM/CFS TOCs occur when CFS-related filesets are not properly patched. Check (swlist) the affected server for properly installed patches to the CFS (vx*) filesets.
The filesets that affect CFS are:

Also ensure it is using the same version of VxVM (4.0/5.0 etc.) as vguest2.
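
One way to act on the advice above (same patches and same VxVM version on every node) is to save the swlist output on each node and compare the two listings. The sketch below assumes the listings were already saved to files; the function names normalize and swlist_diff and the file names in the example are hypothetical.

```shell
#!/bin/sh
# Sketch: compare two saved swlist listings and show entries that differ.

# Keep only "tag revision" from non-comment lines, sorted for comm(1).
normalize() {
    awk 'NF >= 2 && $1 !~ /^#/ { print $1, $2 }' "$1" | LC_ALL=C sort -u
}

# usage: swlist_diff <listing-from-node1> <listing-from-node2>
swlist_diff() {
    t1=${TMPDIR:-/tmp}/swlist_diff.$$.1
    t2=${TMPDIR:-/tmp}/swlist_diff.$$.2
    normalize "$1" > "$t1"
    normalize "$2" > "$t2"
    # comm -3 hides common lines; entries unique to the second file are
    # printed with a leading tab, which the awk step uses to label them.
    LC_ALL=C comm -3 "$t1" "$t2" | awk -F'\t' '{
        if ($1 != "") print "only in first:  " $1
        else          print "only in second: " $2
    }'
    rm -f "$t1" "$t2"
}

# Example (hypothetical file names):
#   on each node:  swlist > /tmp/swlist.$(hostname)
#   then:          swlist_diff /tmp/swlist.vguest1 /tmp/swlist.vguest2
```

An empty result means both nodes list the same tags at the same revisions. It does not prove that every fileset inside a bundle installed cleanly.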
Dima Kouznetsov
Oct 28, 2009 15:05:29 GMT    N/A: Question Author

Stephen, thanks for your suggestion.

I tried to create the cluster on vguest2 with just itself as the node and run the package, and it also crashed. It gave this error message:

Message from syslogd@vguest2 at Wed Oct 28 10:34:05 2009 ...
vguest2 cmcld[25315]: Reason: A crucial package failed
Executing "/usr/sbin/cmrunpkg -v SG-CFS-pkg"
Unable to retrieve package status for package SG-CFS-pkg
cmrunpkg: Unable to start some package or package instances
Error: Failed to start CVM

So it looks like CVM is failing to start, which is a new symptom for me. I also tried the same thing with vguest1, and it gave the same error. Both machines have the same software loaded and are at the same patch level. I suspect you may be right about the patches, but I have installed everything I can think of.

Here is the list of CFS patches installed:

<root@vguest2:/etc/cmcluster/cfs>swlist -l patch | grep -i cfs
# Cluster-CFS A.11.19.00 HP SG Cluster CFS SD Product
# Cluster-CFS.CFS-ADMIN A.11.19.00 Cluster File System Admin SD Product
# Cluster-CFS.CFS-MAN A.11.19.00 CFS MAN pages
# PHSS_40152.CM-CVM-CFS 1.0 CVM CFS Package Fileset applied
# PHSS_40152.CM-CVM-CFS-COM 1.0 CVM CFS Package Fileset applied
# Package-CVM-CFS A.11.19.00 HP SG Cluster CVM CFS SD Product
# Package-CVM-CFS.CM-CVM-CFS A.11.19.00 CVM CFS Package Fileset
PHSS_40152.CM-CVM-CFS 1.0 CVM CFS Package Fileset applied
# Package-CVM-CFS.CM-CVM-CFS-COM A.11.19.00 CVM CFS Package Fileset
PHSS_40152.CM-CVM-CFS-COM 1.0 CVM CFS Package Fileset applied

Also, I don't know if this means anything, but I keep seeing these error messages in the SG-CFS-pkg.log:

10/28/09 10:34:04 cmrunserv -r 5 SG-CFS-sgcvmd >> /etc/cmcluster/cfs/SG-CFS-pkg.log 2>&1 /etc/cmcluster/cfs/SG-CFS-sgcvmd.sh
10/28/09 10:34:04 List current imported disk groups:
10/28/09 10:34:04 SG configuration monitor started (poll_interval = 10)
/etc/cmcluster/cfs/SG-CFS-vxfsckd.sh[53]: /usr/sbin/vxfsckd: not found.
10/28/09 10:34:04 ERROR: Unable to determine pid of vxfsckd

...and...

10/28/09 10:32:37 Starting VXFEN
10/28/09 10:32:42 Enabling cluster ODM
10/28/09 10:33:48 /sbin/init.d/odm start (exit=1)
ERROR: The module 'odm' has a dependency on the module 'fdd' that cannot be satisfied.
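
Picking failure markers out of the package log, as done by hand above, can be sketched as a small filter. The function name scan_pkg_log and the choice of markers (ERROR, "not found", non-zero exit codes) are assumptions based only on the excerpts quoted in this thread; other failures may be logged differently.

```shell
#!/bin/sh
# Sketch: list suspicious lines in a Serviceguard CFS package log.

# usage: scan_pkg_log <logfile>
# Prints matching lines prefixed with their line number.
scan_pkg_log() {
    grep -n -E 'ERROR|not found|\(exit=[1-9][0-9]*\)' "$1"
}

# Example (log path as quoted in the thread):
#   scan_pkg_log /etc/cmcluster/cfs/SG-CFS-pkg.log
```

The earliest match is usually the most informative one, since later errors are often consequences of the first failure.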

Also, I think you were going to list the filesets that affect CFS in your post, but the list didn't show up. Could you please post it again?

Thanks for all the suggestions,
Dima
Stephen Doud (HP moderator, expert in this area)
Oct 29, 2009 12:22:38 GMT  5 pts

Hello Dima,

Your SG-CFS-pkg.log contains very important clues to why it's not working.

First:
# ll /usr/sbin/vxfsckd
-r-xr-xr-x 1 root sys 285184 May 21 2005 /usr/sbin/vxfsckd

# swlist -l file | grep vxfsckd
PHSS_37601.CM-CVM-CFS-COM: /etc/cmcluster/cfs/SG-CFS-vxfsckd.sh
Package-CVM-CFS.CM-CVM-CFS-COM: /etc/cmcluster/cfs/SG-CFS-vxfsckd.sh
VRTSvxfs.VXFS-RUN: /opt/VRTS/bin/vxfsckd
VRTSvxfs.VXFS-RUN: /usr/sbin/vxfsckd

If you don't see the file, it is not installed, and most likely neither is the fileset that CVM needs!

Second: the messages you see about the ODM dependency on fdd not being satisfied point to the same problem. Apparently not all required filesets have been installed.
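
The file check from the first point above (is the binary on disk, and does any installed fileset claim it?) can be sketched as follows. It assumes the output of swlist -l file was saved to a file in the "fileset: /path" form shown above; the function name check_owned_file and the file name in the example are hypothetical.

```shell
#!/bin/sh
# Sketch: report whether a path exists on disk and which installed
# fileset(s), if any, register it in a saved "swlist -l file" listing.

# usage: check_owned_file <path> <saved-listing>
check_owned_file() {
    path=$1
    listing=$2
    if [ -e "$path" ]; then
        echo "on disk: yes"
    else
        echo "on disk: no"
    fi
    # Lines look like "FILESET: /some/path"; match on the last field.
    owners=$(awk -v p="$path" '$NF == p { sub(/:$/, "", $1); print $1 }' "$listing")
    if [ -n "$owners" ]; then
        echo "$owners" | sed 's/^/registered by: /'
    else
        echo "registered by: (none)"
    fi
}

# Example (hypothetical listing file):
#   swlist -l file > /tmp/swlist.files
#   check_owned_file /usr/sbin/vxfsckd /tmp/swlist.files
```

A result of "on disk: no" together with "registered by: (none)" suggests the owning fileset was never installed, rather than a file having been deleted afterwards.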


As for the list of filesets that are supplied with T2775CB (HP Serviceguard Cluster File System), it is too long for this forum. If you would like to get it, please email me and I will send you the list. But since ODM has unsatisfied dependencies, it's very likely the SG product bundle did not install properly.

As for patches - please use this website to get a list and download currently recommended patches for the version of SMS (storage management suite) installed on your servers:
http://www.hp.com/go/sgsms/patches
Dima Kouznetsov
Oct 29, 2009 13:35:41 GMT    N/A: Question Author

Thank you for your reply, that definitely makes sense now why they keep crashing. I am missing the /usr/sbin/vxfsckd file.
...and...
<root@vguest2:/>swlist -l file | grep vxfsckd
PHSS_40152.CM-CVM-CFS-COM: /etc/cmcluster/cfs/SG-CFS-vxfsckd.sh
Package-CVM-CFS.CM-CVM-CFS-COM: /etc/cmcluster/cfs/SG-CFS-vxfsckd.sh

I do remember that certain components had errors when installing Serviceguard CFS for RAC. I am not exactly sure why, though. I am installing the newest version of the HP-UX OS I have, 11.31.0909. Previously I was working on 11.31.0709. I will reinstall Serviceguard and let you know if that fixes the missing /usr/sbin/vxfsckd file. Is there a way to install only that component?
Stephen Doud (HP moderator, expert in this area)
Oct 29, 2009 19:55:23 GMT  2 pts

Dima,
Your swlist indicates that a patch (PHSS_40152) is applied to SG-CFS-vxfsckd.sh, which belongs to the Package-CVM-CFS.CM-CVM-CFS-COM fileset. The missing /usr/sbin/vxfsckd binary itself is part of the VRTSvxfs product.
You might be able to drill into the T2777CB product (in the swinstall GUI) and reinstall just VRTSvxfs, but you will also have to reinstall the patch. I don't experiment with such things, so I just don't know for certain. I hope it goes well for you.
Dima Kouznetsov
Oct 30, 2009 18:11:45 GMT    N/A: Question Author

OK, so I reinstalled one of the machines with 11.31.0909, loaded the latest SG CFS for RAC, and installed the latest patches from the ITRC website. I am still getting the package-failed TOC, but it seems like something else is going wrong now. The log below says that LLT failed to start, yet when my system boots, the start-up progress shows that LLT is started. The same start-up progress, however, shows "Starting VXFEN ......FAIL*". Before I installed some patches for VXFEN, the package syslog had an error message about vxfen failing; after installing the patches that message went away, but VXFEN still fails in the start-up progress. I am confused. Is there a patch for LLT that I also need to install outside the bundle?

10/30/09 13:35:57 ########### Node "vguest1": Starting package ###########
10/30/09 13:35:57 Starting service SG-CFS-vxconfigd
10/30/09 13:35:57 cmrunserv SG-CFS-vxconfigd >> /etc/cmcluster/cfs/SG-CFS-pkg.log 2>&1 /etc/cmcluster/cfs/SG-CFS-vxconfigd.sh
10/30/09 13:35:57 Cleaning up any old GAB/LLT configuration
10/30/09 13:35:57 /etc/cmcluster/cfs/vx-modules clean
10/30/09 13:35:58 Cleaning up old LLT/GAB
10/30/09 13:35:58 ERROR: Unable to reset GAB
10/30/09 13:35:58 rm -f /etc/llttab /etc/llthosts /etc/gabtab
10/30/09 13:35:58 Starting service SG-CFS-cmvxpingd
10/30/09 13:35:58 cmrunserv SG-CFS-cmvxpingd >> /etc/cmcluster/cfs/SG-CFS-pkg.log 2>&1 /usr/lbin/cmvxpingd -t 643
10/30/09 13:35:58 rm -f /var/adm/cmcluster/cmvxd.socket
10/30/09 13:35:58 Starting service SG-CFS-cmvxd
10/30/09 13:35:58 cmrunserv SG-CFS-cmvxd >> /etc/cmcluster/cfs/SG-CFS-pkg.log 2>&1 /usr/lbin/cmvxd run -s /var/adm/cmcluster/cmvxd.socket -t 643
10/30/09 13:35:58 Creating LLT configuration
10/30/09 13:35:58 Monitoring vxconfigd (pid= 363) every 20 secs
10/30/09 13:35:58 mktemp -d /etc
10/30/09 13:35:58 touch /etc/003890
10/30/09 13:35:58 chmod 644 /etc/003890
10/30/09 13:35:58 chmod 444 /etc/003890
10/30/09 13:35:58 mv /etc/003890 /etc/llttab
10/30/09 13:35:58 touch -r /etc/cmcluster/cfs/.SG-CFS-pkg.ref /etc/llttab
10/30/09 13:35:58 Creating GAB configuration
10/30/09 13:35:58 mktemp -d /etc
10/30/09 13:35:58 touch /etc/003915
10/30/09 13:35:58 chmod 644 /etc/003915
10/30/09 13:35:58 chmod 444 /etc/003915
10/30/09 13:35:58 mv /etc/003915 /etc/gabtab
10/30/09 13:35:58 touch -r /etc/cmcluster/cfs/.SG-CFS-pkg.ref /etc/gabtab
10/30/09 13:35:58 chmod 544 /etc/gabtab
10/30/09 13:35:58 Creating initial LLT hosts file
10/30/09 13:35:58 mktemp -d /etc
10/30/09 13:35:58 touch /etc/003935
10/30/09 13:35:58 chmod 644 /etc/003935
10/30/09 13:35:58 chmod 444 /etc/003935
10/30/09 13:35:58 mv /etc/003935 /etc/llthosts
10/30/09 13:35:58 touch -r /etc/cmcluster/cfs/.SG-CFS-pkg.ref /etc/llthosts
10/30/09 13:35:58 Starting Veritas stack
10/30/09 13:35:58 /etc/cmcluster/cfs/vx-modules start
10/30/09 13:35:58 Starting LLT
10/30/09 13:35:58 /sbin/init.d/llt start
LLT lltconfig ERROR V-14-2-15040 node ID is already set, use -o to override
10/30/09 13:35:58 /sbin/init.d/llt start (exit=1)
10/30/09 13:35:58 ERROR: Failed to start LLT
10/30/09 13:35:58 ERROR: Could not start veritas stack
10/30/09 13:35:58 ########### Node "vguest1": Package script failed ###########
Michael Steele (expert in this area)
Oct 30, 2009 20:59:59 GMT  2 pts

Hi

Everything that I have read on this error:

"...ERROR: Could not start veritas stack..."

ends up being an incompletely patched node in the cluster.

ALL NODES MUST BE IDENTICAL.

http://docs.hp.com/en/T2771-90028/ch01s04.html#d0e2254
Dima Kouznetsov
Oct 30, 2009 21:03:36 GMT    N/A: Question Author

Right now I am just trying to get this setup to work on one server (i.e., a cluster with one node). I made sure that I downloaded the latest patches from the HP website. At this point the only thing I can think of trying is going back to the "recommended" or older patches instead of the newest patches that I got from the website. Although I don't think that will fix my issue, I'm going to give it a shot.
Dima Kouznetsov
Nov 3, 2009 18:46:01 GMT    N/A: Question Author

Ok, FINALLY got it up and working. Here is what was wrong with my setup just in case anyone else runs into this issue:

I went back and checked the install of HP Serviceguard Cluster File System for RAC and saw that not all of the components had been installed; there were some errors during the install. According to the error messages, the system was missing Base-VxTools-50 and SGeRAC (I don't know why it needed that; is that normal?). After installing those two, I went back and reinstalled SGCFS for RAC, and it completed without errors. I also applied all the newest patches for SG, SGCFSRAC, and VxVM. The main underlying issue was that VXFEN was failing on system startup. After the complete install and a reboot, it finally started.

Right now it is working on one node (still huge progress). I will need to reinstall the second node to match it and will then try to get the two working together. I will close the thread as soon as I have both up and running. Thanks for all the help!

Dima
Dima Kouznetsov
Nov 23, 2009 21:02:35 GMT   Thread closed by author  

The patches definitely did the trick. What baffles me is why those patches weren't listed in the SG installation document. I guess it assumes certain things about your system before the install.
 
© 2009 Hewlett-Packard Development Company, L.P.