Our cluster file system is, fortunately, GPFS. This means I get high data reliability and speed, but I can’t manage it in the traditional way. I will probably post more on this subject as I learn how to fix the troubles I face. I get the system up and running with my GPFS folders mounted remotely by the MegaSysAdmin who manages all the HPC nodes of our company. On my side, I need to install some rpm packages (gpfs_220.127.116.11), then update, uninstall the kernel module and re-install the newer packages (gpfs_18.104.22.168). It does sound redundant to me, but I don’t argue, and the procedure seems to work fine when scripted. On the MegaSysAdmin side, the node needs to be added to the node list on the BB manager, and so far he has taken care of that.
So my node109 became unstable, or unresponsive, or it just had a bad day and needed to be rebooted. Then what? Apparently, GPFS mounts generally don’t start automatically after a reboot. After rebooting and finding that my /data GPFS is not there, I try what I think must be tried first:
root@node109 ~ ## > mount /data
mount: unknown filesystem type 'gpfs'
root@node109 ~ ## > systemctl start gpfs.service
root@node109 ~ ## > mount /data
mount: unknown filesystem type 'gpfs'
without luck. Then I go for the specifics. I check what is in /etc/fstab. The GPFS file system is there, and it reads something like this:
/dev/data /data \
gpfs rw,relatime,nosuid,nodev, \
dev=core.domain:data,ldev=data,noauto 0 0
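To check quickly which GPFS entries a node has in its fstab, and whether they carry the noauto option, something like the sketch below may help. The function name list_gpfs_fstab is my own invention, and it assumes each entry sits on a single line in the real file (the backslashes above are only there to wrap the line for display).

```shell
# Sketch: list GPFS entries in an fstab file and flag the noauto option.
# list_gpfs_fstab is a hypothetical helper name, not a GPFS tool.
list_gpfs_fstab() {
    # Default to the system fstab; a different file can be passed for testing.
    fstab="${1:-/etc/fstab}"
    # Field 2 is the mount point, field 3 the type, field 4 the options.
    awk '$1 !~ /^#/ && $3 == "gpfs" {
        mode = ($4 ~ /(^|,)noauto(,|$)/) ? "noauto" : "auto"
        print $2, mode
    }' "$fstab"
}
```

Calling list_gpfs_fstab without arguments reads /etc/fstab and prints one line per GPFS mount point, followed by either noauto or auto.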
Let’s see what the GPFS commands do:
root@node109 ~ ## > mmfsmount /dev/data
Incorrect parameter: /dev/data.
root@node109 ~ ## > mmfsmount /data
Incorrect parameter: /data.
root@node109 ~ ## > /usr/lpp/mmfs/bin/mmunmount /data
some-date-here CEST 2016: mmunmount: Unmounting file systems ...
root@sbnode109 ~ ## > /usr/lpp/mmfs/bin/mmmount /data
some-date-here CEST 2016: mmmount: Mounting file systems ...
mmremote: GPFS is not ready to handle commands yet.
mmmount: Command failed. Examine previous error messages to determine cause.
And then I found the link already posted above and typed:
root@node109 ~ ## > /usr/lpp/mmfs/bin/mmstartup
Fri Sep 16 14:49:38 CEST 2016: mmstartup: Starting GPFS ...
root@node109 ~ ## > /usr/lpp/mmfs/bin/mmmount /data
Fri Sep 16 14:49:41 CEST 2016: mmmount: Mounting file systems ...
Et voilà! GPFS file system successfully mounted! And one problem less on my list of never-ending, never-visible problems.
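Since the "GPFS is not ready to handle commands yet" error shows that mounting too early fails, the two commands that worked could be wrapped in a small function that waits for the daemon before mounting. This is only a sketch under my own assumptions: the name gpfs_remount, the retry count and the delay are invented, and it assumes that mmgetstate (in the same directory as mmstartup) prints the word "active" for the local node once GPFS is ready.

```shell
# Sketch: start GPFS, wait until the daemon looks ready, then mount.
# gpfs_remount, the retry count and the delay are illustrative choices.
MMFS_BIN="${MMFS_BIN:-/usr/lpp/mmfs/bin}"

gpfs_remount() {
    fs="$1"            # mount point or device name, e.g. /data
    tries="${2:-30}"   # how many times to poll the daemon state
    delay="${3:-2}"    # seconds to sleep between polls

    # Exit status deliberately ignored: the state check below decides.
    "$MMFS_BIN/mmstartup"

    i=0
    while [ "$i" -lt "$tries" ]; do
        # Assumption: mmgetstate reports "active" once the node is ready.
        if "$MMFS_BIN/mmgetstate" | grep -qw active; then
            "$MMFS_BIN/mmmount" "$fs"
            return $?
        fi
        sleep "$delay"
        i=$((i + 1))
    done

    echo "GPFS did not become active after $tries checks" >&2
    return 1
}
```

As root, gpfs_remount /data would then replace the manual mmstartup and mmmount steps after a reboot. Polling instead of sleeping for a fixed time keeps the wait short when the daemon comes up quickly.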