An interesting problem I came across recently, completely self-inflicted as these things often are, so hopefully a cautionary tale that saves others from repeating it!
I was investigating disk space usage on a customer server, trying to figure out whether they could do some housekeeping or whether it was time for them to invest in some more HDDs. I was left scratching my head after finding a logical file 10x larger than the physical file it was built over.
My first thought was that this might be a join logical file, in which case the size could perhaps be explained by it covering data from several files. Nope, single-format LF, so I was still scratching my head as to how this could happen.
The answer turned out to be the weekly RGZPFM task the customer runs: it was set to allow cancel and not to rebuild access paths.
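Something along these lines (the library and file names here are placeholders, not the customer's):

RGZPFM FILE(MNISLIB/MYPF) RBDACCPTH(*NO) ALWCANCEL(*YES)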
The ‘allow cancel’ is there because the customer has processes running pretty much 24×7, and so while they’d like to do a re-org they don’t want it getting in the way. ‘Rebuild access paths’ has also been specified as NO, again to try to ensure the maximum amount of processing is done, but it has the knock-on effect that it’s more likely my LFs aren’t being fully re-organised.
For those who want a break at this juncture, there’s a rather good ITJungle article from 2014 explaining this in more detail: https://www.itjungle.com/2014/07/30/fhg073014-story03/
So now that I’ve identified my problem, what do I do about it? I could re-work the RGZ task so that it does rebuild access paths and accept the associated increased risk of conflict with other processes, but what do I do about the LFs already affected? I’d rather not wait until the PF hits the deleted record threshold again for them to be re-done.
I’ll preface this by saying the customer is on a rather old release, so none of the modern SQL services from IBM are available. (I’m sure someone will point me to a Scott Forstie service that would do this without any effort on my part.)
The customer already runs a weekly DSPOBJD with DETAIL(*SERVICE), so I’ve got a copy of that as a starting point. I could sort by file name (on the assumption my logicals have an Lxx suffix) and check whether each LF is larger than its PF. But what if my LF names are a bit more random?
For that case I ran a DSPFD to output record format information, which means I can match on the format as well as the file/library to get a more reliable pairing.
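Roughly like this, with the outfile names chosen to match what the query below uses (MNISLIB being a stand-in library, as before):

DSPOBJD OBJ(MNISLIB/*ALL) OBJTYPE(*FILE) DETAIL(*SERVICE) OUTPUT(*OUTFILE) OUTFILE(MNISLIB/DSPOBJDSRV)
DSPFD FILE(MNISLIB/*ALL) TYPE(*RCDFMT) OUTPUT(*OUTFILE) OUTFILE(MNISLIB/DSPFDFMT)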
I can now join this back to the DSPOBJD output from earlier. To exclude source physical files, I’ve chosen to go to QADBXREF with DBXTYP = ‘D’, so my query ends up looking something like this.
I’ve included RFFTYP so I can see at a glance which is the LF/PF, and the object size so I can see where LF size > PF size.
You could also do it in SQL, if that’s your preferred flavour:
select a.rfname, a.rflib, a.rffile, a.rfftyp, b.odobsz
from mnislib.dspfdfmt as a
join mnislib.dspobjdsrv as b on a.rfname = b.odobnm and a.rflib = b.odlbnm and a.rffatr = b.odobat
join qsys.qadbxref as c on a.rffile = c.dbxfil and a.rflib = c.dbxlib
where c.dbxtyp = 'D'
order by a.rfname, a.rflib, a.rffile
You can also include/exclude specific libraries if you know, for example, that all the affected PFs/LFs are in MNISLIB, or that none of them are! If you’re feeling adventurous you could take the output from this and join it back to itself (record format = record format, but one side = 'P' and the other side = 'L'). If you don’t want to go to the system table directly, you could swap out QADBXREF for another DSPFD, this time on *MBR rather than *RCDFMT; MBDTAT is your friend there. As it happens the customer also runs that weekly as part of the re-org task, so it was my first port of call, but I then fancied going direct to the source instead.
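That self-join idea, sketched roughly against the same DSPFD outfile. Untested, so treat it as a starting point; note it matches purely on format name and library, so it won’t catch LFs living in a different library from their PF:

select p.rflib, p.rffile as pf, l.rffile as lf, p.rfname
from mnislib.dspfdfmt as p
join mnislib.dspfdfmt as l on l.rfname = p.rfname and l.rflib = p.rflib
where p.rfftyp = 'P' and l.rfftyp = 'L'
order by p.rflib, p.rffile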
In my case I’ve saved literal GBs of disk space by doing this, so I guess I won’t be placing an order for those HDDs any time soon.
This is duly filed under “making it up as I go along” so I’d be curious to hear if anyone else has had a similar problem, or does have a better solution to it, comments box is open as ever.