Solaris 10 patching best practices

Patching all our Unix servers is a mammoth task every quarter, and every month for servers in the DMZ. Over time we have formulated a set of steps that makes life easier when patching Sun servers.

1. We apply the same patch cluster on every Sun server for a particular quarter, no matter when during that quarter the server is actually patched.

2. Once the patch cluster is downloaded to the centralized location, it is distributed to every server under the / directory. This may sound a little off track, since the patch cluster consumes a good 1 GB of space and the patching process can fail if there is not sufficient disk space on the / and /var file systems, but we have faced NFS issues before, where we had a lot of trouble bringing a system back up after the NFS server crashed in the middle of the patching process.
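
To avoid the NFS dependency, a small script can pull the cluster to local disk and refuse to proceed without enough free space. The following is a minimal sketch only; the central host name (patchrepo), the paths and the cluster file name are placeholders, not our actual layout:

    #!/bin/sh
    # Sketch: copy the quarterly cluster to local disk and check free space
    # on / and /var before patching. Host, paths and file name are placeholders.
    CLUSTER=10_Recommended.zip
    DEST=/patchcluster

    # Require roughly 2 GB free on / (cluster plus extracted contents)
    ROOT_FREE_KB=`df -k / | awk 'NR==2 {print $4}'`
    VAR_FREE_KB=`df -k /var | awk 'NR==2 {print $4}'`
    if [ "$ROOT_FREE_KB" -lt 2097152 -o "$VAR_FREE_KB" -lt 1048576 ]; then
        echo "Not enough free space on / or /var, aborting" >&2
        exit 1
    fi

    mkdir -p $DEST
    # Copy locally rather than patching over NFS, so an NFS outage
    # mid-patch cannot leave the system half-patched.
    scp patchrepo:/export/patches/$CLUSTER $DEST/ && \
        ( cd $DEST && unzip -q $CLUSTER )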

3. We inform the database team beforehand (1 week) about the patch cluster release and also share the release notes with them, so they can identify any issues between the database and the current patch cluster. If there is a set of point patches for the database that needs to be applied after server patching, they plan their production downtime in addition to ours.

4. Get the downtime approved by the Business Unit and all the changes approved through our organization's change control board. We take a blanket downtime approval from the change control board and later follow up with the Business Unit to approve the downtime as per our schedule. If there is a critical release, patching might get postponed to the following weekend, but never beyond the patching quarter. In the worst case, if we cannot patch the server, it is raised as an exception with the audit team and has to be approved by the Business Unit.

5. Before the patching starts (preferably 1 day before), we run a script that collects system information such as the vfstab, Veritas configuration, etc., a sketch of which follows below. We take a backup of the critical system folders/files and put it on a different server. We also ensure the Solaris DVD is inserted into the server, just in case we have to boot off the DVD.
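
As an illustration, the collection script might look roughly like this; the output directory, the backup host name (backuphost) and the remote path are assumptions, and the SVM/Veritas commands simply fail quietly if those products are not in use:

    #!/bin/sh
    # Sketch of a pre-patching information collector for Solaris 10.
    OUT=/var/tmp/prepatch.`hostname`.`date +%Y%m%d`
    mkdir -p $OUT

    # Core system configuration
    cp -p /etc/vfstab /etc/system /etc/release $OUT 2>/dev/null
    df -k       > $OUT/df-k.out
    ifconfig -a > $OUT/ifconfig-a.out
    showrev -p  > $OUT/patchlist.out                # currently installed patches
    metastat -p > $OUT/metastat-p.out 2>/dev/null   # SVM layout, if used
    metadb      > $OUT/metadb.out     2>/dev/null
    vxdisk list > $OUT/vxdisk.out     2>/dev/null   # Veritas, if installed
    vxprint -ht > $OUT/vxprint.out    2>/dev/null

    # Push the collection to another server so it survives a bad patch run
    ( cd /var/tmp && tar cf - `basename $OUT` ) | \
        ssh backuphost "cd /backup && tar xf -"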

6. Send out a notification to the users about the system unavailability during the patching window. This is usually communicated by the Helpdesk.

7. The implementation part is basically a sequence of steps which I will touch on briefly (a rough command-level sketch follows the list):
  • Break the mirror if you are using SVM, and detach it completely (metadetach, metaclear).
  • Mount the mirror and change the /mnt/etc/vfstab entries from md devices to the native device mapping, i.e. /dev/dsk/c1t0d0s0, and hash out the other entries.
  • Comment out the rootdev entry from the /etc/system file on the mirror disk.
  • Without unmounting the mirror, go to the OK prompt, just to ensure the boot archive gets updated on the mirror disk as well. If you haven't run installboot on the mirror disk, do so to make it bootable.
  • Boot off the mirror disk to ensure that the second disk is bootable, in case we run into any issues post patching.
  • Go back to the OK prompt and boot from the first disk. Initially patching had to be done in single-user mode, but now you can also patch the server at the normal run level.
  • Start the patching process
  • Reboot
  • Wait for a week for the system to stabilize and, if no issues are reported, reattach the mirror. If there are issues, you have the second disk to boot from.
  • Uncomment the vfstab entries and mount all the file systems.
  • Send out the notification to the DBAs to start the database.
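
As a rough command-level sketch of the sequence above (assuming an SVM root mirror d10 with submirrors d11 on c1t0d0s0 and d12 on c1t1d0s0, and the cluster staged under /patchcluster; all of these names are placeholders for your own layout):

    # 1. Break the mirror completely
    metadetach d10 d12
    metaclear d12

    # 2. Mount the detached half and point it at the native device
    mount /dev/dsk/c1t1d0s0 /mnt
    vi /mnt/etc/vfstab   # root entry -> /dev/dsk/c1t1d0s0 and /dev/rdsk/c1t1d0s0,
                         # hash out the remaining md entries
    vi /mnt/etc/system   # comment out the rootdev:/pseudo/md@... line

    # 3. Make the mirror disk bootable (SPARC)
    installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t1d0s0

    # 4. Test-boot the mirror disk from the OK prompt, then boot back off the
    #    first disk and run the cluster (the installer script name varies with
    #    the cluster vintage, e.g. install_cluster on older SunSolve bundles)
    cd /patchcluster/10_Recommended && ./installcluster --s10cluster

    # 5. Reboot, and once the system has been stable for a week,
    #    re-create and reattach the mirror
    metainit d12 1 1 c1t1d0s0
    metattach d10 d12

The same idea applies with the disks swapped if your layout differs; the point is simply that one bootable, unpatched copy of the root disk is preserved until the patched system has proven itself.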

8. Update the logbook.

