With the ever-increasing demand to get more from our existing infrastructures, many of you will be under pressure to extend the working day of your companies’ IT systems. This means that scheduling regular downtime to perform an appropriate backup is becoming more and more of an issue.
There are of course several solutions that allow us to carry out a backup with little or no downtime at all. A couple of my favourites are SAN-based snapshotting and performing the backup on your HA server. But while these solutions are robust, they are not without significant cost.
You could also upgrade your hardware. The chances are that if you max out the RAM in your server and migrate to the latest Fibre Channel adaptor and IBM tape library, you should also get faster backups. But this may well cost as much as the aforementioned HA software.
So I thought I would go back to basics and run you through a few options that could help you reduce the downtime you need for your backups without spending a penny.
You can mix and match many of these tips and if you get even a modest improvement, then I would consider that a result. However, I’m hoping you can do better than that. Typically, if I get called into a client site and implement some or all of these techniques, I can usually halve the downtime they need to run their daily backups.
A couple of these tips require specific releases or PTFs so I’ll group them based on IBM i OS level required in order to use them.
On any version of IBM i
Many folks think that the speed at which a tape drive can write is governed solely by the type of drive, but this is just not true. Yes, any LTO drive will save faster than a QIC model but other factors influence the speed too.
Let’s optimise your tape drive and media and start with a few basics that apply regardless of what release or hardware you are running. These are all about ensuring your system and tape drive are set up in their optimal configuration.
Check that the tape media you are using in your drive is optimal for that drive
If you are using an LTO4 tape drive, ensure that you are backing up onto LTO4 tape media. The LTO4 specification allows data to be written up to 50% faster to *ULTRIUM4 (LTO4)-formatted tape (120MB/s) than to *ULTRIUM3 (LTO3)-formatted tape (80MB/s).
Check that you are formatting the tapes to their optimum standards
This is less of an issue these days but it is still possible for some drives to initialise the same tape in the same drive with more than one format. As a rule of thumb, specify DENSITY(*CTGTYPE) on your INZTAP and you’ll be in good shape.
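As a rough illustration of that rule of thumb, an initialise command might look like the sketch below. The device name TAP01 and the volume ID DLY001 are placeholders I have made up, so substitute your own.

```cl
/* Initialise the cartridge in the density that matches its type.  */
/* TAP01 and DLY001 are placeholder names, not real objects.       */
INZTAP DEV(TAP01) NEWVOL(DLY001) DENSITY(*CTGTYPE)
```

Remember that INZTAP wipes the tape, so make sure the right cartridge is loaded before you run it.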
Ensure you don’t have slower devices on the same controller channel as your backup device
If you are still running on SCSI-based tape drives (and I know a lot of you are) then make sure you are not daisy-chaining SCSI devices unless you absolutely have to. If you need something like a DVD drive and an LTO4 tape on the same SCSI controller, then plug them in separate ports on the controller card so as to avoid falling foul of the lowest common denominator trap.
Check your save commands are using data compaction not data compression
Data compaction is usually done in the tape drive by a specialist hardware circuit. Data compression is carried out by your IBM i processor, which, while beautiful, is not optimised for compression. Not to mention that it is busy doing all of the other tasks needed to get that data offloaded to your tape drive.
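One way to make that choice explicit on a library save is sketched below. It assumes you want the drive to do the work and the processor to stay out of it. APPDATA and TAP01 are placeholder names, and the defaults on your release may already behave this way, so check the command help before changing anything.

```cl
/* Leave compaction to the tape drive, switch off data compression. */
/* APPDATA and TAP01 are placeholder names.                         */
SAVLIB LIB(APPDATA) DEV(TAP01) DTACPR(*NO) COMPACT(*DEV)
```

The same DTACPR and COMPACT parameters appear on the other save commands such as SAVOBJ and SAV.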
Next, let’s optimise your backup routine. Most IBM i shops still believe in doing a full user data backup of their system every day. This is great if you have time but, if not, let’s consider the options:
Separate the backup into two parts, dedicated and shared
If you do want to back up all user data every day but need to reduce the amount of time your system is down for your users, then consider separating the backup into two parts: dedicated and shared.
Dedicated is the primary backup where the data you are saving is the data that is actually waiting to be used by the system the moment the backup finishes. Once that completes, start your application environment, let your users back on and continue with the shared element of the backup routine which saves everything else on the server.
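A minimal CL sketch of that two-part routine might look like this. Everything named here is an assumption for illustration: APPSBS is a made-up application subsystem, APPDATA a made-up library holding the data your application needs at start-up, and TAP01 a placeholder device. Error handling is left out, and the fixed delay is only a crude stand-in for checking that the subsystem has really ended.

```cl
PGM
  /* Dedicated part: users off, save only the critical library.   */
  ENDSBS     SBS(APPSBS) OPTION(*CNTRLD) DELAY(120)
  DLYJOB     DLY(180)    /* crude wait for the subsystem to end   */
  SAVLIB     LIB(APPDATA) DEV(TAP01) ENDOPT(*LEAVE)

  /* Let the users back on as soon as the critical save is done.  */
  STRSBS     SBSD(APPSBS)

  /* Shared part: save everything else while users are working.   */
  SAVLIB     LIB(*ALLUSR) DEV(TAP01) OMITLIB(APPDATA) +
               ENDOPT(*UNLOAD)
ENDPGM
```

Bear in mind that objects locked by other jobs during the shared part will be skipped, so review the job log after the first few runs.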
Consider when to use Save While Active
Save While Active is a really useful function but it has both limits and penalties for use. Firstly, it slows the backup down while it waits to synchronise the objects in the library or directory you are saving. If you are clever and can get everyone who uses those objects off the system while it does this, then you are golden.
Secondly, and sadly all too often, Save While Active doesn’t manage to save an object that was being used. That is probably why IBM didn’t call it “save while locked” or something equally ghastly. (Note to reader: Please don’t ever ask me about IBM’s nomenclature. It is so bad I think it might have been this that made me go bald in my twenties. In fact, if it wasn’t for IBM’s appalling naming standards this Wolverhampton boy wouldn’t even need to have the word “nomenclature” in his vocabulary!)
So, the advice here is to review when you are using Save While Active. If you have closed down your application fully, then the data should not be in use and you should not need to use this function.
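To make the difference concrete, here are the two forms of the same save side by side. APPDATA and TAP01 are placeholder names, and the 120-second wait is an arbitrary figure chosen for the example rather than a recommendation.

```cl
/* Application fully ended: no Save While Active needed.          */
SAVLIB LIB(APPDATA) DEV(TAP01) SAVACT(*NO)

/* Application still running: synchronise the library, waiting    */
/* up to 120 seconds for locks, and report the checkpoint to      */
/* the QSYSOPR message queue.                                     */
SAVLIB LIB(APPDATA) DEV(TAP01) SAVACT(*LIB) SAVACTWAIT(120) +
         SAVACTMSGQ(QSYSOPR)
```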
From V5R4 onwards – saving spool files
Starting with V5R4, you could save your spool files. This is a great feature and something I encourage you to do but not at the expense of stopping your users from working. So why not review when you are saving the spool files and consider moving the process to a time when your users are not waiting to get back onto your system?
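For instance, the spool files could be picked up in a separate save that runs after the users are back on. The sketch below assumes your output queues live in a library I have called PRTLIB, which is a made-up name, as is TAP01.

```cl
/* Save the output queues in PRTLIB together with their spooled   */
/* files. PRTLIB and TAP01 are placeholder names.                 */
SAVOBJ OBJ(*ALL) LIB(PRTLIB) OBJTYPE(*OUTQ) DEV(TAP01) +
         SPLFDTA(*ALL)
```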
From V6R1 onwards – Asynchronous Bring
Did I mention my love of IBM’s naming conventions? Well, Asynchronous Bring, or AsyncBring as it is often referred to, was not delivered with V6R1 but was automatically enabled by PTFs somewhere around five years ago. It is designed to improve the performance of the SAV command, specifically when it has to save a large number of very small objects.
For example, I have a client who has over 250,000 PDFs, each less than 50KB, stored in three directories on their IFS. Simply adding the ASYNCBRING(*YES) parameter to their SAV command halved the time taken to back up these files, saving them 30 minutes. Your mileage will no doubt vary but it is simple to enable, so why not give it a try?
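A SAV command with the parameter switched on might look something like this. The directory /archive/pdf and the device TAP01 are invented for the example and are not the client's real names.

```cl
/* Save a directory of many small files with Asynchronous Bring.  */
/* /archive/pdf and TAP01 are placeholder names.                  */
SAV DEV('/QSYS.LIB/TAP01.DEVD') OBJ(('/archive/pdf')) +
      ASYNCBRING(*YES)
```

Time the save with and without the parameter on your own data before you rely on any particular saving.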
From V7R2 onwards – Go Save Opt21 with End TCP/IP wait time
I’m a big fan of regular Entire System Saves (Go Save, option 21). These bootable backups form the cornerstone of effectively recovering your server should the worst ever happen. They perform a complete backup of operating system, configuration, application and data all in a single backup. In an ideal world you would run one of these every day.
The problem with these saves is that you have to put the system in a restricted state (literally only the console session can be running) and this places an extra administrative burden as there is more to shutdown/startup.
You may also have noticed that since v6.1 when you start an Entire System Save, your system seems to do nothing for ten minutes before it starts and it is this delay that we will address next. When you start an Entire System Save at v6.1 and v7.1, behind the scenes your system runs the following commands:
ENDTCPSVR SERVER(*ALL)
ENDHOSTSVR SERVER(*ALL) ENDACTCNN(*DATABASE *FILE)
DLYJOB JOB(300)
ENDTCP CONFIRM(*NO)
DLYJOB JOB(300)
The two delay job statements seen above each delay the job by 300 seconds before allowing it to get on with the backup; that is ten minutes of wasted time. From v7.2 onwards, you have the option to change this TCP Wait Timer to zero seconds by specifying *NONE.
Now, ten minutes may not sound like a lot of time but I ran one of these Option 21 Entire System Saves last week on a client’s Power8 LPAR to an LTO6 tape. It saved just under 350GB in 61 minutes. With the ten-minute wait added, that would have been 71 minutes, so in this case just changing one parameter reduced the backup window by 14%.
Nice to see you…
Our next i-UG meeting will take place in central London on May 19 at Arrow ECS’s building near Bank underground station. We’ve already confirmed a number of excellent guest speakers including:
* David Spurway, who will demonstrate how buying a Power8 can be cheaper than keeping your current server, as well as showing you how you could use Watson to tell you more about your customers.
* Andrew Ireland, who will be showing you the latest and greatest from Web Query.
* Kevin Askew, who will be doing a show and tell on 3D printing. This is fascinating; he brings a small army of test printers and sample prints with him. If you ever loved Lego or Meccano, this is for you.
Last, and by all means least, I will be talking about performance monitoring using the new IBM i Director Navigator interface. This is quite literally the best monitoring tool I’ve ever seen from IBM and it’s free. Check out my PowerWire article “IBM i 7.2: Finding the job that is killing your system” for more details.
Hope to see you there. You can find further info and registration at the i-UG website.
Steve Bradshaw is the founder and managing director of Wolverhampton, UK-based Power Systems specialist Rowton IT Solutions and technical director of British IBM i user group i-UG. He has been a key contributor to PowerWire since 2012 and he also sits on the Common Europe Advisory Council (CEAC) which helps IBM shape the future of IBM i.