KB Article #170407

Two jobs each submit a command successfully, but only one command is performed

Problem

-- One command is lost

-- Two commands have the same CFTCOM row number



Resolution


The CFTSHARE QNAME needs to be set in MIM / GRS, as per the documentation:

GRS Multi-System Protection by CFT/MVS
CFT uses ENQs on the CFTFILES QNAME to protect the files being transferred.
The CFTFILES ENQs do not need to be broadcast to all systems in the GRS RING.
In a multi-CPU or SYSPLEX configuration, you only need to broadcast the CFTSHARE QNAME to all systems in the GRS RING.

Explanation:

When an issue occurs on a system, or if the monitor is forcibly killed, a command can be registered without the 'next to read' command indicator being updated in the header.

The UCONF variables below ensure that the com file is rescanned on a regular basis to automatically fix that condition.

cft.server.com.rescan.enable
cft.server.com.rescan.wscan_factor


These variables allow CFT to re-compute the 'next to read' command indicator in the com file header at an interval of (CFTCOM WSCAN) x (cft.server.com.rescan.wscan_factor).
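
The arithmetic can be sketched as follows. This is only an illustration of the product described above: the function name and the sample values are hypothetical, and the result is expressed in whatever unit CFTCOM WSCAN uses.

```python
def rescan_interval(wscan: int, wscan_factor: int) -> int:
    """Return the period between two re-computations of the 'next to read' indicator.

    wscan        : value of the CFTCOM WSCAN parameter (illustrative)
    wscan_factor : value of the rescan factor UCONF variable (illustrative)
    The result is in the same unit as wscan.
    """
    if wscan <= 0 or wscan_factor <= 0:
        raise ValueError("wscan and wscan_factor must be positive")
    return wscan * wscan_factor


# Hypothetical values, chosen only to show the multiplication.
print(rescan_interval(wscan=5, wscan_factor=12))  # 5 x 12 = 60
```

With these sample values, the indicator would be re-computed once every 12 scans of the com file.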

-- Important note:

When the issue is reported without being related to a side issue on the system, it has been proven in ALL cases reported so far to be caused by bad management of file locks (ENQs). On open systems, it is directly related to the file system and mount options used. On z/OS systems, it is directly related to the fact that the CFTSHARE QNAME has not been propagated to all systems in the GRS ring.

All write operations to the communication file MUST be serialized, and this mandatory serialization relies on the system ENQ or on a lock on the file.
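
To show why this serialization matters, here is a minimal sketch for an open system, assuming a toy file layout (one command per line, with the row number taken as the count of lines already present). It is not the real CFTCOM format and not how CFT is implemented; the names append_command and run_two_writers are hypothetical. It uses an exclusive advisory lock (flock) so that "read the next free row" and "write the command" happen as one unit.

```python
import fcntl
import os
import tempfile
import threading
import time


def append_command(path: str, command: str, use_lock: bool = True) -> int:
    """Append one command to a toy com file and return the row it was given."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        if use_lock:
            # Exclusive advisory lock: a second writer blocks here until release.
            fcntl.flock(fd, fcntl.LOCK_EX)
        # Read step: the next free row is the number of lines already written.
        data = os.pread(fd, os.fstat(fd).st_size, 0)
        row = data.count(b"\n")
        time.sleep(0.01)  # widen the window between read and write for the demo
        # Write step: register the command under that row number.
        os.write(fd, f"{row}:{command}\n".encode())
        return row
    finally:
        if use_lock:
            fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def run_two_writers(use_lock: bool = True) -> list:
    """Run two concurrent writers and return the sorted rows they obtained."""
    rows = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "comfile.txt")

        def worker(command: str) -> None:
            rows.append(append_command(path, command, use_lock))

        threads = [threading.Thread(target=worker, args=(c,)) for c in ("SEND", "RECV")]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    return sorted(rows)


print(run_two_writers(use_lock=True))  # [0, 1]
```

Calling run_two_writers(use_lock=False) removes the serialization, and both writers can then read the same "next free row" before either one writes, which is the "two commands with the same row number" symptom in miniature. The same effect appears if the lock is taken but not honored, for example when the file system or mount options do not propagate locks between nodes.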

-- Therefore, when a reported case ends with the RESCAN option of the communication file being turned on, it is important to verify the above information about file locks, and to turn the rescan option back off once the root cause of the issue has been identified and fixed.

-- For a cluster installation, keeping the rescan feature turned on should be considered (it helps in case the com file gets corrupted after a node crash).