KB Article #67885
UNIX: When are inbound files available for post-processing?
* Since our backend processes will continually be polling Interchange's ediin, xmlin, and binaryin directories, how can we know when incoming files are fully 'ready' to be grabbed?
* If a large file takes 10 seconds to be fully written into the in directory, and we poll that directory at the 3rd second and see the file there, wouldn't we grab an incomplete file?
* What can prevent this from happening?
Resolution
* When Interchange receives an inbound file, it first places the file in the backup directory, where the file is verified and unpackaged. The payload file is then placed into the appropriate ediin, xmlin, or binaryin directory via the UNIX mv (move) command. When the backup and in directories are on the same filesystem, this transfer is effectively instantaneous, since mv simply renames the file's directory entry rather than copying its data. Therefore, when the file "appears" in the ediin, xmlin, or binaryin directory, it is already complete and your backend processes can safely grab it.
* If you placed Interchange's backup directory on one filesystem and the ediin, xmlin, and binaryin directories on another (which is not recommended), mv cannot simply rename the file across the boundary between the two filesystems. Interchange always attempts the mv command first, but if mv fails, Interchange falls back to a cp/rm (copy/delete) model to transfer the file. It is during this copy across filesystems that your backend could grab an incomplete file.
* If you followed the default Interchange installation, all of Interchange's directories are on the same fs and this scenario does not arise.
Otherwise:
-- Populate the Post-Process field with the name of a shell script that does the following: take the latest file that appears in the in directory and use the mv command (not cp) to place that file immediately into another directory that resides on the same fs as the in directory. Your backend then collects files from that directory rather than from the in directory.
-- For a further margin of safety, your script can "sleep" for a few seconds after the mv step and then notify the backend to come fetch the new arrival.
-- The shell script needs to be owned by a non-root user and have permissions of at least 500 (read and execute for the owner) set via chmod.
-- The advantage of using Interchange's Post-Process feature is that it is invoked for each and every file that comes in, but not until that file is fully and safely deposited into the appropriate ediin, xmlin, or binaryin directory.
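The post-process steps above might be sketched as the following script. This is only an illustration under stated assumptions. The function names stage_latest and mount_point_of are hypothetical, as is the convention of passing the in directory and the staging directory as two arguments; this article does not describe how Interchange invokes the Post-Process script. The same-filesystem check compares the mount points reported by df -P, which is a simple heuristic and can be fooled by mount points that contain spaces.

```shell
#!/bin/bash
# Hypothetical post-process helper: names and arguments are assumptions,
# not part of Interchange itself.

# Print the mount point of the filesystem that holds directory $1.
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

# Move the newest regular file in $1 into $2 and print its new path.
stage_latest() {
    local in_dir="$1"
    local ready_dir="$2"
    local newest=""
    local f

    # mv is only a quick rename when both directories share one filesystem.
    if [ "$(mount_point_of "$in_dir")" != "$(mount_point_of "$ready_dir")" ]; then
        echo "error: $in_dir and $ready_dir are on different filesystems" >&2
        return 1
    fi

    # Find the most recently modified regular file in the in directory.
    for f in "$in_dir"/*; do
        [ -f "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    [ -n "$newest" ] || return 0    # nothing to move

    mv -- "$newest" "$ready_dir"/ &&
        printf '%s\n' "$ready_dir/$(basename -- "$newest")"
}

# When called with two arguments (in directory, staging directory), do the move.
if [ "$#" -eq 2 ]; then
    stage_latest "$1" "$2"
    # Optionally: sleep a few seconds here, then notify the backend
    # (the notification mechanism is site-specific and not shown).
fi
```

If saved as, say, stage_latest.sh (a hypothetical name), the script could be given the permissions described above with chmod 500 stage_latest.sh. The filesystem check makes the script refuse to run in exactly the cross-filesystem layout where a move would degrade into a copy.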