Editing RPGLE source members with Vim

I am very fond of the Vim text editor and will use it, out of preference, whenever possible. This includes writing RPG (and, obviously, RPGLE) programs on the IBM i. The only downside here is that, while Vim supports syntax highlighting for a multitude of languages, RPG isn’t one of them.

For a long time, I have relied on the syntax files written by Martin Rowe back in 2014. These, however, are getting more than a little long in the tooth. So, since I have a little time on my hands, I have started putting together a new set of syntax files which, hopefully, will be able to keep up with any new developments.

It’s all very early days so far, but you can find the Vim RPG Syntax highlighter here.

Feel free to download it and use it. And if you have any improvements to make or suggest, I would love to hear from you.

On indicators

RPG has, since the beginning of time, supported a set of indicators — one byte characters that can be either *on or *off. These are popular but really shouldn’t be used any more. Being named *IN01 to *IN99 makes for indicators whose function is unclear and you are much better off using built in functions and explicitly defined fields.

There is an exception, of course. When dealing with display files, you have no choice but to use indicators to determine what keys have been pressed. Even here, though, it is a good idea to overlay the indicators with human readable field names.

This is how I do it:

 * Global variables
d IndicatorPtr    s               *   inz(%addr(*in))
d                 ds                  based(IndicatorPtr)
d iExit                   3      3
d iRefresh                5      5
d iAdd                    6      6
d iCancel                12     12
d iValidCmdKey           25     25
d iError                 30     30
d iPageDown              26     26
d iSflInit               80     80
d iSflEnd                81     81
d iSflEmpty              82     82
d iSflPosCursor          83     83
d iSflDeleted            84     84
d iProtectKey            90     90
d iProtectData           91     91

So *IN03 is mapped to field iExit, *IN05 is mapped to field iRefresh, and so on.

With the indicators defined in this way, I can write code like this:

p Main            b
d Main            pi
 /free

     // Identify the current program
     program = RtvProgram();

     // Main processing loop
     dou iExit = *on;
         WorkWithMessageDescriptions();
     enddo;

     return;

 /end-free
p Main            e

Note also that this Main procedure is not setting on the *LR indicator. This is because, in the header spec, I have a named main procedure:

h main(Main)

Using this tells the compiler that I am not using the RPG logic cycle and that processing should start with procedure Main. Not only does this allow me to abandon setting on *LR but also reduces a lot of the progeram’s overhead.

Using regular expressions in DB2 for i

Regular expressions have been supported in DB2 for a while now and, while they are not something I often need, there are times when they come in very handy indeed. A simple example involves cleaning up some data so that it can be exported from an old (poorly maintained) database to a new, shiny one that (I’m promised) absolutely will be used.

The database in question is an employee master and, in order to protect the innocent, I have both simplified it and changed the names for the purpose if this example. There are only two fields to worry about empnumber and empname, which represent the employee number and name respectively. There are a couple of issues to address. The first one is that the empnumber is ten characters in the old database but five numeric in the new, so I need to ensure that all the numbers fit.

The second issue is that this database has been very poorly maintained. If you allow someone to enter a character in a numeric field, they will. And they did.

Here’s an idea of how the database looked:

EMPNUMBER   EMPNAME
00001       Bugs Bunny
OOOO2       Elmer Fudd
0000000003  Daffy Duck
-4          Speedy Gonzales
00005       Sylvester

As you can see, there are lots of ways in which the employee number can be invalid, and these have all been very well explored. And watch out for Elmer Fudd — he’s a bit confused about the difference between letters and numbers.

Fortunately REGEXP_INSTR is your friend:

The REGEXP_INSTR returns the starting position or the position after the end of the matched substring, depending on the value of the return_option argument.

So all I need is a regular expression that identifies which employee numbers are five digits followed by spaces, and which aren’t. And here it is:

select * from employees
where regexp_instr(empnumber, '^\d{5}\s+$') != 1

There are several optional parameters but I only need to use the source string (empnumber) and the pattern expression. The expression I am using looks for digits in the first five characters followed by spaces to the end of the field. And this is the result:

EMPNUMBER   EMPNAME
OOOO2       Elmer Fudd
0000000003  Daffy Duck
-4          Speedy Gonzales

I told you that Elmer Fudd was tricksy.

It should also be possible to use REGEXP_REPLACE to automate some of the cleanup, but you really do need your business users to buy in before you go down that particular rabbit hole.

Merging data with SQL MERGE

There have been times, many times, when I have needed to update the one or more columns in a table with values retrieved from another. A traditional approach to this is to use an UPDATE with subselects but there is a better way:

The MERGE statement updates a target (a table or view) using data from a source (result of a table reference). Rows in the target that match the input data may be updated or deleted as specified, and rows that do not exist in the target may be inserted as specified. Updating, deleting, or inserting a row in a view updates, deletes, or inserts the row into the tables on which the view is based if no INSTEAD OF trigger is defined on the view.

In order to play around with this, I build and populated a couple of test tables (imaginatively named TEST01 and TEST02) using the below scripts:

create table test01
 (key_field char(1) not null,
  data01 char(20),
  data02 char(30),
  data03 char(5),
  primary key (key_field));

insert into test01
 (Key_field, data01)
values('A', 'TEXT One');

insert into test01
 (Key_field, data01)
values('B', 'TEXT Two');

insert into test01
 (Key_field, data01)
values('D', 'TEXT Four');
create table test02
 (key_field char(1) not null,
  data01 char(20),
  data02 char(30),
  data03 char(5),
  primary key (key_field));

insert into test02
 (Key_field, data02)
values('A', 'More text one');

insert into test02
 (Key_field, data02, data03)
values('C', 'more text two', 'CODE');

insert into test02
 (Key_field, data01, data02)
values('E', 'TEXT five', 'More text five');

This gives me two tables with one matched row (key_field=’A’). To update the first table with the values in the second is really rather simple:

MERGE INTO test01 a
  USING (SELECT key_field, data01, data02, data03 from test02) b
  ON b.key_field = a.key_field
WHEN MATCHED THEN
   UPDATE SET data02 = b.data02, data03 = b.data03;

And the one matched row is updated in table TEST01 with the data from table TEST02.

If this was all I could do, I would be happy to have replaced a rather clunky approach with a more readable solution. But now the MERGE statement really starts to make a difference:

MERGE INTO test01 a
  USING (SELECT key_field, data01, data02, data03 from test02) b
  ON b.key_field = a.key_field
WHEN MATCHED THEN
   UPDATE SET data02 = b.data02, data03 = b.data03
WHEN NOT MATCHED THEN
   INSERT (key_field, data01, data02, data03)
   VALUES (b.key_field, b.data01, b.data02, b.data03)
NOT ATOMIC
  CONTINUE ON SQLEXCEPTION;

The WHEN NOT MATCHED clause means that not only can I update the matched rows, I can also do something about the rows that exist in table TEST02 that have now matched record in table TEST01. In this example, I have inserted the rows from table TEST02, other options allow me to either delete the row from table TEST01 or raise a SIGNAL statement.

And that NOT ATOMIC CONTINUE ON SQLEXCEPTION specifies that if an error is encountered on the insert, the MERGE will continue. Bad data happens, this saves be the burden of constantly having to refine my selection criteria to avoid the crap.

And a final note on the subject of selection criteria, the WHEN MATCHED and WHEN NOT MATCHED can both be extended with an AND to refine the selection to exactly what you want.

Do go and read up on the MERGE statement. It provides a powerful and flexible approach to updating tables from source data and one that I will be using in the very near future.

Better date conversion with timestamp_format

Handling dates and times on the IBM i can often be a bit of a challenge. Obviously, if you are working with a modern application that uses the Date, Time and Timestamp data types everything is nice and straightforward. However, there are plenty of old applications that use eight, seven or six digit numeric fields to represent dates and, sooner or later, you will need to extract these in a manner that looks sane in some other environment.

Timestamp_format is your friend.

The TIMESTAMP_FORMAT function returns a timestamp that is based on the interpretation of the input string using the specified format.

It’s that “interpretation of the input string” that makes this so handy. Put simply: you tell the function how to interpret the string and it will do the rest.

Here’s an example. I have a contracts table (CONTRACTS) with three columns: Contract Number (CTNR), Start Date (STDT) and End Date (ENDT). And the start and end date are both seven digit numbers. I need to list all of the currently active contracts (Start Date is before Today and End Date is after Today).

And the SQL looks like this:

select CTNR,
       date(timestamp_format(substr(digits(STDT), 2, 6), 'YYMMDD')),
       date(timestamp_format(substr(digits(ENDT), 2, 6), 'YYMMDD'))
from CONTRACTS
where date(timestamp_format(substr(digits(STDT), 2, 6), 'YYMMDD')) <= current_date
and date(timestamp_format(substr(digits(ENDT), 2, 6), 'YYMMDD')) >= current_date

Current_date is a special register that returns the current date, but that’s not important right now.

The date conversion part involves several nested functions to get to a final date, so here’s the breakdown using a date of 1st September 2015 (1150901):

  • digits(STDT) converts the 7 digit number into a 7 character string. So 1150901 is converted to ‘1150901’.
  • The IBM i can figure out which century it’s in by looking at the year, so I don’t need the century flag and the substr(digits(STDT), 2, 6) strips it out to give me a value of ‘150901’
  • The timestamp_format function takes the date string and uses the format string of ‘YYMMDD’ to generate a timestamp of 2015-09-01-00.00.00.000000. Since I only passed it a date, the hours, minutes and seconds are all zeroes.
  • And finally, I can use the date function to retrieve the date from the generated timestamp.

And here’s the result:

CTNR      DATE      DATE
C1000001  01/09/15  31/08/16

Pretty, and portable.

Using EXTFILE to override files within RPG

So here’s the situation: Five warehouses populating five sets of files (same filenames, different libraries) and I have an RPGLE program that needs to read through each of these to accumulate dispatch information.

The traditional way of doing this would involve writing a CL program to OVRDBF to each file before calling the RPGLE program. This works, but it’s a bit messy and this can cause future maintenance problems. It’s much better, therefore, to specify the file (and library) to open within the RPGLE program itself. You can do this with the EXTFILE keyword.

Here’s an example:

 * ---------------------------------------------------------------------- *
 * Program     : LSX001R                                                  *
 * Description : Dynamically select which file to open                    *
 * ---------------------------------------------------------------------- *
h main(Main)
fMOVEMENTS if   e           k DISK    usropn extfile(filename)

d Main            pr                  extpgm('LSX001R')
d  library                      10a

dfilename         s             21a

 * ------------------------------------------------------------------------
 * Main Procedure
 * ------------------------------------------------------------------------
p Main            b
d Main            pi
d  library                      10a
 /free

     filename = %trim(library) + '/' + 'MOVEMENTS';
     open MOVEMENTS;

     // Do whatever processing on the file needs to be done...

     close MOVEMENTS;

 /end-free
p Main            e

Hopefully this is reasonably straightforward. In the file spec for the MOVEMENTS file, I have used EXTFILE with a variable filename to specify the actual file opened. I have also used the USROPN keyword as I need to populate the filename variable before I attempt to open the file.

Populating this field is pretty simple. I pass the library name to the program as a parameter and the first thing the Main procedure does is concatenate the library and file with a ‘/’ separator.

Now I can open the correct file, do whatever processing is necessary, and then close the file afterwards.

A similar approach, using the EXTMBR keyword can also be used to open a specific file member. You can, of course, combine EXTFILE and EXTMBR in order to dynamically determine both the file and member, if you really want to.

It should be noted that this will only work if you are using traditional database IO. Any embedded SQL will remain unaffected by anything you do in the F-Spec. If you want to do something similar with SQL, you will need to look into the CREATE ALIAS statement.

Adding variables to ad-hoc SQL queries with CL

Last month I mentioned using REXX to build and execute ad-hoc SQL queries. For the sake of completeness, here is the same solution implemented in CL using the RUNSQL command.

The program does exactly the same as the previously describe REXX script — extracts all records flagged with today’s date so that they can be reviewed at a later date. And here it is:

pgm
dcl &date6 *char 6
dcl &date8 *char 8
dcl &statement *char 255

/* First retrieve the current system date (QDATE)                             */
/* and convert it to an 8 character value                                     */
RTVSYSVAL  SYSVAL(QDATE) RTNVAR(&DATE6)
CVTDAT     DATE(&DATE6) TOVAR(&DATE8) FROMFMT(*JOB) TOFMT(*YYMD) TOSEP(*NONE)

/* Next, build the SQL statement                                              */
CHGVAR     VAR(&STATEMENT) +
           VALUE('insert into EXPATPAUL1/INPUTDATA' *bcat +
                 'select * from EXPATPAUL1/PRODDATA where dhdate =' *bcat +
                 &date8)
runsql sql(&statement) commit(*none)

endpgm

Using RGZPFM to sort physical files

It’s rare that you need to sort a database file (or table), but I have encountered a circumstance in which a sorted file had to be delivered. Being naturally lazy, I started looking around for the least-work method of achieving this and ended up looking at the RGZPFM command. This is a command I have used quite often to compress a physical file (it removes the deleted records), but this time around it was the reorganise part of Reorganise Physical File Member that I was interested in.

From the help text:

If a keyed file is identified in the Key file (KEYFILE) parameter, the system reorganizes the member by changing the physical sequence of the records in storage to either match the keyed sequence of the physical file member’s access path, or to match the access path of a logical file member that is defined over the physical file.

The important thing here is that any keyed file can be specified. So if I have an appropriately keyed logical file, I can very easily sort the physical.

Some names have been changed, but if I have a physical file called ITEMLIST and I want it sorted by field BARCODE, I need to first create an index:

create index ITEMLIST01          
on ITEMLIST (BARCODE);           
                                 
label on index ITEMLIST01        
is 'Logical on ITEMLIST (Keyed: BARCODE)'; 

Then I can easily reorder the physical file ITEMLIST into BARCODE sequence with the command:

RGZPFM FILE(ITEMLIST) KEYFILE(ITEMLIST01 ITEMLIST01)

The only thing to take not of is that RGZPFM will issue an error if the file has no records. The approach I took wast to use RTVMBRD to retrieve the NBRCURRCD for the member, an alternative approach would be to use a MONMSG immediately after the RGZPFM.

Debugging RPG Programs in Batch

If you spend any time developing RPG applications you will find that, sooner or later, you will need to debug a program that runs in batch. There are a few steps that you need to take in order to do this and, because my memory is terrible, I’m putting them here.

  • Know what job queue your job is being submitted to. Hold this job queue.
  • Submit your job.
  • Your job is now sitting on the held job queue waiting to be released. Display the job and make a note of the user name, job name and number.
  • Start a service job using STRSRVJOB entering the name, job name and number from the previous step.
  • STRDBG PGM(ProgramName). You can’t enter any breakpoints yet, so hit F12 to exit the source display
  • Release the job queue
  • The Start Serviced Job screen will be displayed asking you to press F10 to enter debug commands for the job. Press F10.
  • DSPMODSRC and enter your breakpoints.
  • Press F12 to resume and F3 to return to the Start Serviced Job screen.
  • Press Enter to start your job.

The job will now begin running and will stop, passing control back to your screen, when the first breakpoint is reached.

Service Programs and Call Stack APIs

I’m a big fan of service programs. From a maintainability point of view, encapsulated procedures are great. And exported procedures, which mean you only need to develop any piece of functionality once, are even better.

However, I now find myself in the position of having to start creating a set of these from scratch (nothing to copy/paste) and, since they can be a bit fiddly, now seems like a good time to document the steps.

In the example that follows, I create a service program with a single procedure, RtvProgram, which returns the name of the program that called the service program. It sounds a bit recursive, I know, but bear with me. Whenever coding a display file (or a report, for that matter) I like to put the program name somewhere on the screen (top left, unless someone tells me otherwise). This means that when someone has a problem, we can very quickly identify what program they are talking about.

Obviously, hard-coding the program name in the display file would be easy. But code gets copied and pasted and, sooner or later, all hard coded values end up being wrong. I have also seen cases of one display file being used by two or more programs, so it is much better to put the program name in a field and retrieve it dynamically. That way, you are always getting the right program.

So on to the service program.

First, you need a copy member to contain the procedure prototype. You could, of course, code the prototype in each program that uses the service program, but that way madness lies.

In my case, I have created the copy member in QRPGLESRC and called it LSS001RP. It looks like this:

 * Retrieve Program Name
d RtvProgram      pr            10a   

And then you need to write the service program:

 * ---------------------------------------------------------------------- *
 * Program     : LSS001R                                                  *
 * Description : Program information service programs                     *
 * ---------------------------------------------------------------------- *
h nomain

 * ---------------------------------------------------------------------- *
 * Exportable Prototypes                                                  *
 * ---------------------------------------------------------------------- *
 /copy LSCLIB/qrpglesrc,lss001rp

 * ---------------------------------------------------------------------- *
 * API Prototypes                                                         *
 * ---------------------------------------------------------------------- *
d RtvCallStack    pr                  extpgm('QWVRCSTK')
d                             2000a
d                               10i 0
d                                8a
d                               56a
d                                8a
d                               15a

 * ---------------------------------------------------------------------- *
 * RtvProgram: Retrieve the program name                                  *
 * ---------------------------------------------------------------------- *
p RtvProgram      b                   export
d RtvProgram      pi            10a

d Var             ds          2000    qualified
d  BytAvl                       10i 0
d  BytRtn                       10i 0
d  Entries                      10i 0
d  Offset                       10i 0
d  Count                        10i 0

d JobID           ds                  qualified
d  QName                        26a   inz('*')
d  IntID                        16a
d  Res3                          2a   inz(*loval)
d  ThreadInd                    10i 0 inz(1)
d  Thread                        8a   inz(*loval)

d Entry           ds                  qualified
d  Length                       10i 0
d  Program                      10a   overlay(Entry: 25)
d  Library                      10a   overlay(Entry: 35)

d VarLength       s             10i 0 inz(%size(Var))
d RcvFormat       s              8a   inz('CSTK0100')
d JobIdFmt        s              8a   inz('JIDF0100')
d ApiError        s             15a
d i               s             10i 0
 /free

     RtvCallStack(Var: VarLength: RcvFormat: JobID: JobIdFmt : ApiError);
     for i = 1 to 2;
         Entry = %subst(Var: Var.Offset + 1);
         Var.Offset += Entry.Length;
     endfor;

     return Entry.Program;

 /end-free
p RtvProgram      e
 * ---------------------------------------------------------------------- * 

I’m not going to go into too much detail here. The service program LSS001R contains one procedure, RtvProgram which uses the QWVRCSTK API to retrieve the current call stack then it reads back two entries: The first entry is the service program and the second entry is the calling program. And this is the program name that it returns.

You now need to create the RPG Module. Note the terminology here — you are not creating a Bound RPG Program (it’s the difference between options 15 and 14 in PDM).

I also need the binding source. In this case, the member is called LSS001S and I have put it in the QSRVSRC source file. It looks like this:

STRPGMEXP  PGMLVL(*CURRENT)
    EXPORT SYMBOL('RTVPROGRAM')
ENDPGMEXP

Note that the capitalisation is actually important here.

And now I’m ready to create the service program:

CRTSRVPGM SRVPGM(LSCLIB/LSS001R) MODULE(LSCLIB/LSS001R) SRCFILE(LSCLIB/QSRVSRC) SRCMBR(LSS001S)

Since I’m doing this from scratch, I need to create a binder directory:

CRTBNDDIR BNDDIR(LSCLIB/LSBNDDIR) TEXT('General purpose binding directory')

And add the service program to it:

ADDBNDDIRE BNDDIR(LSCLIB/LSBNDDIR) OBJ((LSS001R))

And we’re ready to go. All I have to do now is make a couple of amendments to the main program to take advantage of the service program:

The control spec needs this line:

h bnddir('LSBNDDIR')                                                       

Obviously, I need to copy the prototype definition somewhere in the definition specification:

 /copy LSCLIB/qrpglesrc,lss001rp                                       

And when the program starts, I need to identify the name of the program:

 /free

     // Identify the current program
     program = RtvProgram();
                                                                           

And that’s it.