Upgrading Integrated Manager for Lustre 5.0.0 to Integrated Manager for Lustre 5.1.0.0 and Lustre 2.12.2
Upgrade Integrated Manager for Lustre
The first component in the environment to upgrade is the Integrated Manager for Lustre server and software. The manager server upgrade can be conducted without any impact to the Lustre filesystem services.
Back Up the Existing Configuration
Prior to commencing the upgrade, it is essential that a backup of the existing configuration is completed. This will enable recovery of the original configuration in the event of a problem occurring during execution of the upgrade.
The following shell script can be used to capture the essential configuration information that is relevant to the Integrated Manager for Lustre software itself:
#!/bin/sh
# Integrated Manager for Lustre (IML) server backup script
BCKNAME=bck-$HOSTNAME-`date +%Y%m%d-%H%M%S`
BCKROOT=$HOME/$BCKNAME
mkdir -p $BCKROOT
tar cf - --exclude=/var/lib/chroma/repo \
/var/lib/chroma \
/etc/sysconfig/network \
/etc/sysconfig/network-scripts/ifcfg-* \
/etc/yum.conf \
/etc/yum.repos.d \
/etc/hosts \
/etc/passwd \
/etc/group \
/etc/shadow \
/etc/gshadow \
/etc/sudoers \
/etc/resolv.conf \
/etc/nsswitch.conf \
/etc/rsyslog.conf \
/etc/ntp.conf \
/etc/selinux/config \
/etc/ssh \
/root/.ssh \
| (cd $BCKROOT && tar xf -)
# IML Database
su - postgres -c "/usr/bin/pg_dumpall --clean" | /bin/gzip > $BCKROOT/pgbackup-`date +\%Y-\%m-\%d-\%H:\%M:\%S`.sql.gz
cd `dirname $BCKROOT`
tar zcf $BCKROOT.tgz `basename $BCKROOT`
Copy the backup tarball to a safe location that is not on the server being upgraded.
Note: This script is not intended to provide a comprehensive backup of the entire operating system configuration. It covers the essential components pertinent to Lustre servers managed by Integrated Manager for Lustre that are difficult to re-create if deleted.
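Before relying on the backup, it is worth confirming that the tarball and the database dump it contains are intact. The following sketch assumes the naming convention used by the backup script above; the `BCKROOT` value shown is a hypothetical example and should be replaced with the actual `bck-<host>-<timestamp>` name that was generated:

```shell
#!/bin/sh
# Sanity checks on the backup produced by the script above. The BCKROOT
# value is a hypothetical example; substitute the bck-<host>-<timestamp>
# directory that the backup script actually created.
BCKROOT=${BCKROOT:-$HOME/bck-manager-20190101-000000}

if [ -f "$BCKROOT.tgz" ]; then
    # The archive should list cleanly and the database dump should be a
    # valid gzip stream:
    tar ztf "$BCKROOT.tgz" > /dev/null && echo "archive readable"
    gzip -t "$BCKROOT"/pgbackup-*.sql.gz && echo "database dump intact"
fi
```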
Stopping the filesystem
IML requires that the filesystem(s) associated with each node to be upgraded be stopped. Follow these steps:
- Navigate to Configuration->Filesystems
- For each filesystem listed:
  - Click the filesystem’s Actions button
  - Select Stop
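Once the stop jobs complete, you can confirm from the manager node that no Lustre targets remain mounted on the servers. This is a hedged sketch; the hostnames are examples and root SSH access from the manager node is assumed:

```shell
#!/bin/sh
# Check each storage server for mounted Lustre targets; the host list is
# an example and should match your environment. An empty result per host
# means all targets on that server are stopped.
for host in mds1.local mds2.local oss1.local oss2.local; do
    echo "== $host =="
    ssh root@"$host" "mount -t lustre || true"
done
```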
Install the Integrated Manager for Lustre Upgrade
The software upgrade process requires super-user privileges to run. Log in as the root user, or use sudo to elevate privileges as required.
- Download the latest Integrated Manager for Lustre release repo on the manager node:

yum-config-manager --add-repo=https://github.com/whamcloud/integrated-manager-for-lustre/releases/download/v5.1.0/chroma_support.repo
- Verify that the old iml-5.0 repo file has been removed from the repolist and that the 5.1 repo has been added on the manager node:

yum repolist
- Update packages on the manager node:

yum clean metadata
yum update

Refer to the operating system documentation for details on the correct procedure for upgrading between minor OS releases. Note that a list of new installs and upgrades will be displayed. Review this list carefully and verify that python2-iml-manager is marked for upgrade and that it will be upgraded to 5.1.0.0.
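The pending version can also be checked explicitly before confirming the transaction. This is a sketch using standard rpm/yum queries, run on the manager node with the 5.1 repo already configured:

```shell
# Show the currently installed version for comparison:
rpm -q python2-iml-manager
# Show the version yum proposes to install; expect 5.1.0.0 here:
yum list updates python2-iml-manager
```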
- Run chroma-config setup to complete the installation.
- Perform a hard refresh on the browser and verify that IML loads correctly.
Upgrade the Lustre Servers
Lustre server upgrades can be coordinated as either an online roll-out, leveraging the failover HA mechanism to migrate services between nodes and minimize disruption, or as an offline service outage, which has the advantage of usually being faster to deploy overall, with generally lower risk.
The upgrade procedure documented here describes the faster and more reliable approach, which requires that the filesystem be stopped. It assumes that the Lustre servers have been installed in pairs, where each server pair forms an independent high-availability cluster built on Pacemaker and Corosync. Integrated Manager for Lustre deploys these configurations and uses both the stock Lustre resource agent and the ClusterLabs ZFS resource agent. Integrated Manager for Lustre can also configure STONITH agents to provide node fencing in the event of a cluster partition or loss of quorum.
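Before beginning server upgrades, it can be worth confirming that each Pacemaker/Corosync pair is healthy. A brief sketch using the standard Pacemaker tools, run on one node of each server pair (assuming the pcs CLI is installed, as is typical for these clusters):

```shell
# One-shot cluster summary: both nodes online, resources started,
# no failed actions:
crm_mon -1
# Full cluster status, including configured STONITH/fence devices:
pcs status --full
```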
Backup Existing Server Data
- As a precaution, create a backup of the existing configuration for each server. The following shell script can be used to capture the essential configuration information that is relevant to Integrated Manager for Lustre managed-mode servers:
#!/bin/sh
BCKNAME=bck-$HOSTNAME-`date +%Y%m%d-%H%M%S`
BCKROOT=$HOME/$BCKNAME
mkdir -p $BCKROOT
tar cf - \
/var/lib/chroma \
/etc/sysconfig/network \
/etc/sysconfig/network-scripts/ifcfg-* \
/etc/yum.conf \
/etc/yum.repos.d \
/etc/hosts \
/etc/passwd \
/etc/group \
/etc/shadow \
/etc/gshadow \
/etc/sudoers \
/etc/resolv.conf \
/etc/nsswitch.conf \
/etc/rsyslog.conf \
/etc/ntp.conf \
/etc/selinux/config \
/etc/modprobe.d/iml_lnet_module_parameters.conf \
/etc/corosync/corosync.conf \
/etc/ssh \
/root/.ssh \
| (cd $BCKROOT && tar xf -)
# Pacemaker Configuration:
cibadmin --query > $BCKROOT/cluster-cfg-$HOSTNAME.xml
cd `dirname $BCKROOT`
tar zcf $BCKROOT.tgz `basename $BCKROOT`
Note: This is not intended to be a comprehensive backup of the entire operating system configuration. It covers the essential components pertinent to Lustre servers managed by Integrated Manager for Lustre that are difficult to re-create if deleted. Make sure to backup any other important configuration files that may be on your system, such as multipath configurations.
The following files will need to be backed up if multipath is being used on the system:

/etc/multipath/*
/etc/multipath.conf
- Copy the backups for each server’s configuration to a safe location that is not on the server being upgraded.
Ensure IML 5.0 packages are latest
For a successful upgrade, the installed iml-device-scanner* packages must be at version 2.2.2-1 and the iml-update-check package must be at version 1.0.2-2.
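The check above can be scripted. The helper below compares versions with version-aware sorting; the `rpm -q` invocation in the comment and the placeholder values are illustrative, to be replaced with the output from each storage server:

```shell
#!/bin/sh
# ver_ge A B succeeds when version A >= version B, using sort -V
# (version-aware sort from coreutils).
ver_ge() {
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# On each storage server you would capture the installed versions, e.g.:
#   scanner_ver=$(rpm -q --qf '%{VERSION}-%{RELEASE}' iml-device-scanner)
# The values below are illustrative placeholders:
scanner_ver="2.2.2-1"
update_check_ver="1.0.2-2"

ver_ge "$scanner_ver" "2.2.2-1" && echo "iml-device-scanner OK"
ver_ge "$update_check_ver" "1.0.2-2" && echo "iml-update-check OK"
```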
Upgrade the repos on each storage server node
- Make sure yum is configured on each storage server node to pull down CentOS 7.7 packages.
- Update the repos on each storage server node. As an example, consider the following hosts: mds1.local, mds2.local, oss1.local, and oss2.local:

[root@manager]# iml update_repo --hosts mds[1,2].local,oss[1,2].local
- On each storage server, restart the iml-storage-server.target systemd target:

systemctl restart iml-storage-server.target
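With a small number of servers, the restart can be driven from the manager node in a loop. The hostnames below match the earlier example and root SSH access is assumed:

```shell
#!/bin/sh
# Restart the IML storage-server target on each storage server; the host
# list is an example and should match your environment.
for host in mds1.local mds2.local oss1.local oss2.local; do
    ssh root@"$host" systemctl restart iml-storage-server.target
done
```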
Run the updates
Next, navigate to the server page and proceed to update each of the servers:
- Navigate to Configuration->Servers
- Each storage server should report “Updates are ready for server X”
- Click the Install Updates button
- Select all storage servers for upgrade and begin the upgrade.
Summary
Start the filesystem once the upgrade job for each server completes. Connect a client and verify that it is able to access files on the filesystem.
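A minimal client-side check might look like the following, assuming a client node with the Lustre client stack installed. The MGS NID (mgs0@tcp), filesystem name (lustre1), and mount point are placeholders to be replaced with your own values:

```shell
#!/bin/sh
# Mount the filesystem on a client (placeholder NID, fsname, mount point):
mkdir -p /mnt/lustre
mount -t lustre mgs0@tcp:/lustre1 /mnt/lustre

# Simple read/write smoke test:
echo upgrade-check > /mnt/lustre/upgrade-check.txt
cat /mnt/lustre/upgrade-check.txt
rm /mnt/lustre/upgrade-check.txt
```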