With IBM i being as robust as it is, damaged objects are a rare thing for most of us, but the problem is that they tend to turn up at the most inopportune moments. So, in this article I want to share a little-known tool called areVerify that IBM created to detect damaged objects whilst the system is still being used as normal.
As the name suggests, the tool is part of the Application Runtime Expert (ARE) suite, but the good news is that the part I’m about to share with you ships for free as part of the base operating system from v6.1. You may need some PTFs if you’re running a release earlier than 7.2; more details on these PTFs later.
If you have the full ARE product you can run or schedule this from the graphical interface but for now I’m going to assume you don’t and show you how to use it from the command line.
How do objects get damaged?
Well there are many reasons but two of the most common are:
Power failures or abnormal shutdowns
IBM i keeps a large amount of data in memory. This is one of the key benefits of the operating system, and it is very careful about looking after that data, but if a system is shut down abnormally, perhaps due to a utility power failure, then the OS cannot always protect all of the objects.
Jobs that fail midway through a rebuild
If a job is rebuilding an object and fails midway through the process, this too can cause the object to get damaged. This is not common with data files because of the extra protection afforded by DB2 for i, but it can happen to logical views and, more commonly, to program objects.
What can I do to repair a damaged object?
Most of the time these can be fixed by recreating or restoring the object in question and it’s seldom that this means data loss but it’s easier to fix things sooner rather than later. Most users don’t find out about damaged objects until they come to use them and, sod’s law being what it is, this is usually at the most inconvenient of times.
What I recommend is a regular, automated, off-peak check of your entire system, run weekly or monthly as part of your standard housekeeping process. That way, if there is an issue, you will know about it sooner and can schedule a more convenient time to fix it.
Imagine that you’ve had a new version of your month-end routine loaded on your system. It’s been thoroughly tested on a development system and you are confident that it will be just fine when run on your production server. Unbeknownst to you, the job that transferred the programs to your production system corrupted one of the executables, and the first time you find out about it is in the middle of the period-end routine – doesn’t sound like fun to me!
If we had a tool that routinely checked the entire system for damaged objects, then this program could be fixed well in advance of the month end routine so no users would ever know and no routines would have to be fixed.
You can do just that with this tool. It allows you to run a storage verification against your entire system, checking every object, even the operating system, and all whilst the objects are in use. Furthermore, it can be easily scheduled and it writes the results into a simple-to-understand log file.
How do you run it?
The command runs under Qshell, but don’t let that put you off. It’s super simple to run and the results are easy to understand.
Start Qshell with the STRQSH command
From the Qshell command line, issue the following command to run the tool against all disks:
/QIBM/ProdData/OS/OSGi/templates/bin/areVerify.sh -storage diskUnits=*ALL
If you run this interactively, you will be asked to confirm that you wish to start by entering a Y. The command can take a while to run depending on how much disk you have and how many objects you have stored upon them.
ProTip: If you think it might take a long time on your system, I would suggest you run it in batch, for example:
SBMJOB CMD(STRQSH CMD('/QIBM/ProdData/OS/OSGi/templates/bin/areVerify.sh -storage confirm=true diskUnits=*ALL')) JOB(CHK4DAMAGE)
ProTip: As this command is checking every object on the system, it does generate a significant disk I/O workload, so I would avoid running the command at your peak times.
Approximately every 10 minutes the command should update you with its progress.
When the command completes, the screen updates to show whether any objects are in need of attention. Hopefully you will get a message saying “No storage error found”.
ProTip: If you plan to run this regularly as part of a maintenance regime, I would recommend you create a CL program to launch the command in batch and email you the resulting log file ‘/tmp/areDodReport.txt’
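As a rough starting point for that kind of automation, here is a minimal shell sketch that decides whether a report looks clean. The function name report_is_clean is hypothetical, and the sketch assumes the log file contains the same “No storage error found” text that the tool shows on screen, which you should confirm against a real report on your own system before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not part of areVerify): decide whether a report
# looks clean.
# Assumption: the report contains the same "No storage error found" text
# the tool displays on screen; adjust the pattern if your report differs.

REPORT_DEFAULT=/tmp/areDodReport.txt

report_is_clean() {
    report=${1:-$REPORT_DEFAULT}
    # A missing or unreadable report is treated as a problem (status 2).
    [ -r "$report" ] || return 2
    # Status 0 when the phrase is present, 1 when it is not.
    grep -q "No storage error found" "$report"
}
```

A scheduled job could call report_is_clean once areVerify.sh has finished and only raise an alert, or send the email, when it returns a non-zero status.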
Targeting a specific disk unit or segment
There are a number of parameters you can specify on this command to customise how it runs. For example, if you had a disk unit swapped out to fix a failure, you might want to run this tool over a specific disk using a command like:
/QIBM/ProdData/OS/OSGi/templates/bin/areVerify.sh -storage diskUnits=4
You must specify either the diskUnits or the checkSegment parameter, syntax below:
• diskUnits= – Description: Specifies which disk units to check.
Examples: diskUnits=1 diskUnits=1,2,4 diskUnits=*ALL
• checkSegment= – Description: Check a specific segment by its address.
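If you build the command from a script, the diskUnits value is easy to get wrong, so here is a small sketch that assembles it from a list of unit numbers. The function name disk_units_arg is hypothetical, and the only formats it produces are the ones shown in the examples above (a single number, a comma-separated list, or *ALL).

```shell
#!/bin/sh
# Hypothetical helper: build the diskUnits= argument for areVerify.sh.
# Accepts either the literal *ALL or one or more numeric unit numbers.

disk_units_arg() {
    if [ $# -eq 0 ]; then
        echo "usage: disk_units_arg '*ALL' | unit [unit ...]" >&2
        return 1
    fi
    # Quote *ALL so the shell never tries to expand it as a filename pattern.
    if [ $# -eq 1 ] && [ "$1" = '*ALL' ]; then
        printf '%s\n' 'diskUnits=*ALL'
        return 0
    fi
    list=''
    for unit in "$@"; do
        case $unit in
            ''|*[!0-9]*)
                echo "invalid disk unit: $unit" >&2
                return 1 ;;
        esac
        # Join the unit numbers with commas, with no leading comma.
        list=${list:+$list,}$unit
    done
    printf 'diskUnits=%s\n' "$list"
}
```

For example, disk_units_arg 1 2 4 prints diskUnits=1,2,4, which can then be appended to the areVerify.sh -storage command.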
Prerequisites for running the tool
The tool ships with the operating system, but there are a few prerequisites needed to allow it to run:
IBM i 7.1 or higher
• 5770SS1 option 3 – Extended Base Directory Support
• 5770SS1 option 30 – QShell
• 5770SS1 option 33 – PASE
• 5761JV1 option 5 or 6 or 8 or 11 (Newer JVMs also supported)
IBM i 6.1
• 5761SS1 option 3 – Extended Base Directory Support
• 5761SS1 option 30 – QShell
• 5761SS1 option 33 – PASE
• 5761JV1 option 8 or 11 – J2SE (5 or 6) 32 bit Required
PTFs for v7.2 and above
PTFs for v7.1 shipped as part of Cumulative package C3298710 or higher
• SI50374, MF56898, MF56876, SI45469
PTFs for v6.1.0 shipped as part of Cumulative package C4197610 or higher
• SI45499, SI51025, SI30796, MF57435, MF57436
PTFs for v6.1.1 shipped as part of Cumulative package C4197610 or higher
• SI45499, SI51025, SI30796, MF57425, MF57426
Requirements to run the tool
The tool needs to be able to access every object on the system in order to check it, so the profile you run it under must have *ALLOBJ special authority. If in doubt, you can always use QSECOFR to start with.
The tool does not run if the job’s CCSID is 65535, so if this is the value of system value QCCSID on your system, you will need to set a different CCSID either for the user profile running the job (CHGUSRPRF, CCSID parameter) or for the job running the command (CHGJOB, CCSID parameter).
Other optional parameters
• skipDirDump= – Description: If true is specified, the directory dump phase will be skipped.
If you have run the tool before and there are no changes in storage that you care about, use this parameter to skip the directory dump phase, which takes a long time.
Default value: false
• dbName= – Description: Specifies the temporary library name to store the directory dump data.
This parameter is only for directory dump phase.
Default value: QTMPAREDDD
• dirType= – Description: Identifies the directories for which data should be collected.
This parameter is only for directory dump phase. The possible values for this parameter are:
T: Temporary, P: Sysbase permanent & user ASPs, I: Independent ASP
Default value: P
• iASP= – Description: Identifies the IASP number if ‘I’ was selected for the dirType (“Directory identifier”) parameter.
If a value other than ‘I’ was selected for dirType, this parameter is ignored. This parameter is only for directory dump phase.
Default value: 0
• jobQueue= – Description: identifies the name of the job queue which will be used for the background jobs which collect the requested data. This parameter is only for directory dump phase.
Default value: QCTL
• jobQueueLib= – Description: identifies the library of the job queue which will be used for the background jobs which collect the requested data. This parameter is only for directory dump phase.
Default value: QSYS
• jobCount= – Description: identifies the number of background jobs which will be used to collect the requested data. This value must be a number between 5 and 100. This parameter is only for directory dump phase.
Default value: 30
• skipPageVerification= – Description: Specifies whether the Page Verification phase is skipped.
Default value: true
• threadCount= – Description: Specifies the thread count for Page Verification.
This value must be a number between 1 and 100.
• op=[check | clear] – Description: Operation mode, used only when parameter diskUnits is specified.
check: check segment error in specified disk units.
clear: clear error flags in free space of the specified disk units.
Default value: check
• statusUpdateInterval= – Description: Specifies the status update interval, in minutes. The status message will be written to the console and/or the log file.
This value must be a number between 1 and 1440.
Default value: 10
• outputFile= – Description: Specifies the log file
Default value: /tmp/areDodReport.txt
• confirm= – Description: Controls whether user confirmation is required before starting the task.
If true, no user confirmation prompt is shown.
Default value: false
• version – Description: Shows the version of the tool.
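To show how a few of these optional parameters might be combined for an unattended run, here is a minimal sketch that prints a complete command line. The function name are_verify_cmd is hypothetical, and it assumes the key=value parameters can simply be listed together after -storage, in the same way confirm=true and diskUnits=*ALL are combined in the batch example earlier. The range check mirrors the documented limits for statusUpdateInterval.

```shell
#!/bin/sh
# Hypothetical helper: print an areVerify.sh command line for a
# non-interactive run over all disk units.
# Assumption: key=value parameters can be combined after -storage in any
# order, as the batch example does with confirm=true and diskUnits=*ALL.

ARE_VERIFY=/QIBM/ProdData/OS/OSGi/templates/bin/areVerify.sh

are_verify_cmd() {
    interval=${1:-10}                    # documented default: 10 minutes
    output=${2:-/tmp/areDodReport.txt}   # documented default log file
    # statusUpdateInterval must be a whole number between 1 and 1440.
    case $interval in
        ''|*[!0-9]*) echo "interval must be numeric" >&2; return 1 ;;
    esac
    if [ "$interval" -lt 1 ] || [ "$interval" -gt 1440 ]; then
        echo "interval must be between 1 and 1440" >&2
        return 1
    fi
    printf '%s -storage confirm=true diskUnits=*ALL statusUpdateInterval=%s outputFile=%s\n' \
        "$ARE_VERIFY" "$interval" "$output"
}
```

The function only prints the command, so you can review the result before pasting it into Qshell or wrapping it in STRQSH for a submitted job.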
Nice to see you
It was great to see so many of you at Wyboston Lakes for our last user group meeting. I’m always amazed by your energy, enthusiasm, ideas and ability to drink free beer!
Our next event will be in my home town of Wolverhampton on Wednesday 19th October. This event is free to i-UG members and only £50 if you’re not. Hope to see you there; more details and registration information are available at www.i-ug.co.uk