KB Article #183116
Monitoring the Event Queue with SQL queries
Problem
Server initiated transfers in ST are coordinated by events in the Event table in the database. In the past most of the valuable information was hidden in two binary large object fields - data
and sessionData
- which are difficult to use with standard SQL queries in order to aggregate the events for monitoring purpose.
In SecureTransport 5.5-20220127 a new feature was introduced for events queue management, which brings changes to the Event table to help with the better aggregation of the events. The feature works only via the REST API 2.0 resource /events
. In ST 5.5-20231130 the feature was further extended by creating a dedicated page in Admin UI. Originally designed to handle individual events, it has no ordering (sorting) or grouping capabilities yet, but only filter and delete capabilities. This works for the management of events, but is not sufficient for monitoring.
Resolution
For monitoring purposes an SQL query is the best option for now until the Events Queue Management is extended for monitoring. The SQL query has a different syntax for different supported database types (MariaDB, PostgreSQL, Oracle, MSSQL) due to the different way of converting the timestamps to a human readable format, and of extracting the UPLOAD_FOLDER
from FULLTARGET
. Here are versions of the SQL query for Linux paths.
Note that this article applies only for Server Initiated Transfers (SITs). Client Initiated Transfers (CITs) do not insert any events in the event queue. Any CIT became SIT when file arrive to subscription folder.
MariaDB
SELECT clusternode, count(status) as numberOfEvents, status, eventgroup, retryCount, left(fulltarget,char_length(fulltarget)-char_length(substring_index(fulltarget,'/',-1))) as targetFolder, Account.name as accountName, FROM_UNIXTIME(MIN(arrivalTime)/1000) as oldestArrivalTime, Subscription.folder as subscriptionFolder FROM Event LEFT JOIN Account ON accountid=Account.id LEFT JOIN Subscription ON subscriptionid=Subscription.id GROUP BY clusternode, status, eventgroup, retryCount, left(fulltarget,char_length(fulltarget)-char_length(substring_index(fulltarget,'/',-1))), Account.name, Subscription.folder ORDER BY 8 ASC, 2 DESC;
PostgreSQL
SELECT clusternode, count(status) as numberOfEvents, status, eventgroup, retryCount, LEFT(fulltarget, LENGTH(fulltarget) - POSITION('/' in REVERSE(fulltarget))) as targetFolder, Account.name as accountName, to_timestamp(TRUNC(MIN(event.arrivaltime)/1000)) as oldestArrivalTime, Subscription.folder as subscriptionFolder FROM Event LEFT JOIN Account ON accountid=Account.id LEFT JOIN Subscription ON subscriptionid=Subscription.id GROUP BY clusternode, status, eventgroup, retryCount, LEFT(fulltarget, LENGTH(fulltarget) - POSITION('/' in REVERSE(fulltarget))), Account.name, Subscription.folder ORDER BY 8 ASC, 2 DESC;
Oracle
SELECT clusternode, count(status) as numberOfEvents, status, eventgroup, retryCount, substr(fulltarget,1, instr(fulltarget,'/',-1)) as targetFolder, Account.name as accountName, TO_CHAR(DATE '1970-01-01' + ( 1 / 24 / 60 / 60 / 1000) * MIN(arrivaltime), 'YYYY-MM-DD HH24:MM:SS') as oldestArrivalTime, Subscription.folder as subscriptionFolder FROM Event LEFT JOIN Account ON accountid=Account.id LEFT JOIN Subscription ON subscriptionid=Subscription.id GROUP BY clusternode, status, eventgroup, retryCount, substr(fulltarget,1, instr(fulltarget,'/',-1)), Account.name, Subscription.folder ORDER BY 8 ASC, 2 DESC;
MSSQL
SELECT clusternode, count(status) as numberOfEvents, status, eventgroup, retryCount, LEFT(fulltarget, LEN(fulltarget) - CHARINDEX('/', REVERSE(fulltarget), 0)) as targetFolder, Account.name as accountName, DATEADD(second, cast(MIN(arrivalTime)/1000 as int), convert(datetime, '01/01/1970', 101)) as oldestArrivalTime, Subscription.folder as subscriptionFolder FROM Event LEFT JOIN Account ON accountid=Account.id LEFT JOIN Subscription ON subscriptionid=Subscription.id GROUP BY clusternode, status, eventgroup, retryCount, LEFT(fulltarget, LEN(fulltarget) - CHARINDEX('/', REVERSE(fulltarget), 0)), Account.name, Subscription.folder ORDER BY 8 ASC, 2 DESC;
Comments:
• The clusternode
field is populated on LEC cluster only and contains the IP address of the relevant server.
• The status
field can have the following values:
0 – Ready. The event is in the queue and waiting for some condition/resources to be executed.
1 – Active. The event has assigned thread, and it is in progress.
2 – Recovering. Upon TM restart events are recovered for reprocessing.
Under normal conditions only active (1) and ready (0) statuses are present.
• The eventgroup
field represents a subscription and can have a value in the following format:
• For SIT pull <subscriptionId>_subscription_PARTNER-IN
• For SIT push <subscriptionId>
.
• REST API events and some PeSIT events do not have eventgroup
assigned.
• The retryCount
field is applicable only for the Basic Application retry mechanism.
• The targetFolder
represents the full system path where the file resides for pull or push. It could be any subscription folder or subfolder of subscription folder under the account home folder. It is extracted from fulltarget
by removing the file name. For Linux the query uses "/" as path delimiter. For Windows this must be changed to "\\".
• If you remove the extraction of filename
and use the fulltarget
field in the SELECT and GROUP BY clauses directly, this effectively removes the grouping and shows individual events.
• The oldestArrivalTime
is calculated from the time when events were generated and inserted in the Event table. This time never changes until the event is processed and removed from the Event table. The value represents the earliest (oldest) event in the grouped events list.
• The SQL query sorts the result by oldestArrivalTime
(oldest on top) and then by number of events (higher number of events first).
Sample output of the SQL query
Even better solution
The above SQL query shows all events in the Event table at the time of execution. The Event Queue, however, is very dynamic and events will be created and removed very quickly. When the events stay longer in the queue, they become a subject of interest. The above SQL query can be enhanced to filter only events older than a specific number of seconds (calculated based on the current DB server time). Here are versions of the enhanced SQL query showing events older than a minute (60 seconds).
MariaDB
SELECT e.*, TRUNCATE((UNIX_TIMESTAMP() - e.oldestArrivalUnixTime) / 86400, 0) || ' days ' || DATE_FORMAT('1970-01-01 00:00:00' + INTERVAL UNIX_TIMESTAMP() - e.oldestArrivalUnixTime SECOND, '%T') AS arrivedSince FROM ( SELECT clusternode, count(status) as numberOfEvents, status, eventgroup, retryCount, left(fulltarget,char_length(fulltarget)-char_length(substring_index(fulltarget,'/',-1))) as targetFolder, Account.name as accountName, FROM_UNIXTIME(MIN(arrivalTime)/1000) as oldestArrivalTime, Subscription.folder as subscriptionFolder, MIN(arrivalTime)/1000 AS oldestArrivalUnixTime FROM Event LEFT JOIN Account ON accountid=Account.id LEFT JOIN Subscription ON subscriptionid=Subscription.id GROUP BY clusternode, status, eventgroup, retryCount, left(fulltarget,char_length(fulltarget)-char_length(substring_index(fulltarget,'/',-1))), Account.name, Subscription.folder ) e WHERE (UNIX_TIMESTAMP() - e.oldestArrivalUnixTime) > 60 ORDER BY 8 ASC, 2 DESC;
PostgreSQL
SELECT e.*, TO_CHAR(justify_hours(interval '1 sec' * (CAST(EXTRACT(epoch FROM NOW()) AS INT) - e.oldestArrivalUnixTime)), 'DD" days "HH24:MI:SS') AS arrivedSince FROM ( SELECT clusternode, count(status) as numberOfEvents, status, eventgroup, retryCount, LEFT(fulltarget, LENGTH(fulltarget) - POSITION('/' in REVERSE(fulltarget))) as targetFolder, Account.name as accountName, to_timestamp(TRUNC(MIN(arrivaltime)/1000)) as oldestArrivalTime, Subscription.folder as subscriptionFolder, MIN(arrivalTime)/1000 AS oldestArrivalUnixTime FROM Event LEFT JOIN Account ON accountid=Account.id LEFT JOIN Subscription ON subscriptionid=Subscription.id GROUP BY clusternode, status, eventgroup, retryCount, LEFT(fulltarget, LENGTH(fulltarget) - POSITION('/' in REVERSE(fulltarget))), Account.name, Subscription.folder ) e WHERE (CAST(EXTRACT(epoch FROM NOW()) AS INT) - e.oldestArrivalUnixTime) > 60 ORDER BY 8 ASC, 2 DESC;
Oracle
SELECT e.*, trunc(trunc(sysdate) + numtodsinterval(CAST((CAST(SYS_EXTRACT_UTC(SYSTIMESTAMP) AS DATE) - DATE'1970-01-01') * 86400 AS INTEGER) - e.oldestArrivalUnixTime, 'second')) - trunc(sysdate) || ' days ' || to_char(trunc(sysdate) + numtodsinterval(CAST((CAST(SYS_EXTRACT_UTC(SYSTIMESTAMP) AS DATE) - DATE'1970-01-01') * 86400 AS INTEGER) - e.oldestArrivalUnixTime, 'second'), 'hh24:mi:ss') AS arrivedSince FROM ( SELECT clusternode, count(status) as numberOfEvents, status, eventgroup, retryCount, substr(fulltarget,1, instr(fulltarget,'/',-1)) as targetFolder, Account.name as accountName, TO_CHAR(DATE '1970-01-01' + ( 1 / 24 / 60 / 60 / 1000) * MIN(arrivaltime), 'YYYY-MM-DD HH24:MM:SS') as oldestArrivalTime, Subscription.folder as subscriptionFolder, MIN(arrivaltime / 1000) as oldestArrivalUnixTime FROM Event LEFT JOIN Account ON accountid=Account.id LEFT JOIN Subscription ON subscriptionid=Subscription.id GROUP BY clusternode, status, eventgroup, retryCount, substr(fulltarget,1, instr(fulltarget,'/',-1)), Account.name, Subscription.folder ) e WHERE (CAST((CAST(SYS_EXTRACT_UTC(SYSTIMESTAMP) AS DATE) - DATE'1970-01-01') * 86400 AS INTEGER) - e.oldestArrivalUnixTime) > 60 ORDER BY 8 ASC, 2 DESC;
MSSQL
SELECT e.*, CAST(FLOOR((DATEDIFF(s, '1970-01-01', GETUTCDATE()) - e.oldestArrivalUnixTime) / 86400) AS VARCHAR(10)) + ' days ' + CONVERT(VARCHAR(8), DATEADD(SECOND, (DATEDIFF(s, '1970-01-01', GETUTCDATE()) - e.oldestArrivalUnixTime), '19000101'), 8) AS arrivedSince FROM ( SELECT clusternode, count(status) as numberOfEvents, status, eventgroup, retryCount, LEFT(fulltarget, LEN(fulltarget) - CHARINDEX('/', REVERSE(fulltarget), 0)) as targetFolder, Account.name as accountName, DATEADD(second, cast(MIN(arrivalTime)/1000 as int), convert(datetime, '01/01/1970', 101)) as oldestArrivalTime, Subscription.folder as subscriptionFolder, MIN(arrivalTime)/1000 AS oldestArrivalUnixTime FROM Event LEFT JOIN Account ON accountid=Account.id LEFT JOIN Subscription ON subscriptionid=Subscription.id GROUP BY clusternode, status, eventgroup, retryCount, LEFT(fulltarget, LEN(fulltarget) - CHARINDEX('/', REVERSE(fulltarget), 0)), Account.name, Subscription.folder ) e WHERE (DATEDIFF(s, '1970-01-01', GETUTCDATE()) - e.oldestArrivalUnixTime) > 60 ORDER BY 8 ASC, 2 DESC;
Comments:
• Increasing the value 60 in the WHERE clause to something like 3600 (1 hour) helps to filter stuck subscriptions and slow transfers of huge files.
• The first line has a complex expression to convert the time difference between the current time and the event's arrival time to a human readable format. You can use the expression from the WHERE clause in its place to get the difference in seconds.
Sample output of the enhanced SQL query
I hope you enjoy using SecureTransport 5.5!
Evgeni Evangelov