KB Article #193302

Recover transactions in Business Insights and API Analytics lost due to unprocessed events

Problem

In some cases, events cannot be processed or become corrupted. This can lead to missing transactions in Business Insights/Organization Usage or in API Analytics, depending on the case. The Traceability Agent is responsible for metrics and usage in Business Insights and Organization Usage, while the Admin Node Manager handles event processing for API Analytics, so each is addressed separately below.

Resolution

This KB article covers cases in which events could not be processed and explains how to re-process them.

  1. Traceability Agent

    It is important to know which path EVENT_LOG_PATHS points to in the Traceability Agent configuration. It is generally set to the /apigateway/events path; however, in some cases it can also be /apigateway/events/processed. The rest of this section uses the /apigateway/events path.

    If events have not been processed and you want the agent to re-process them, you can add them to the /apigateway/events folder again, as duplicates. As long as they are not corrupted, or have been sanitized, the agent will read them again and process them as new events.

    Generally, when the agent starts again after having been stopped, it does not read event logs older than an hour. The agent keeps track of the system timestamp of the event files, NOT the timestamp in the event name or the epoch timestamp in the logs. So if your agent was stopped for longer than an hour and you lost some events, you can add them to the events directory again and the agent will process them. Note, however, that the transactions from these events will be reflected in Business Insights and Organization Usage at the time they were successfully processed by the agent, not at the time they originally occurred.
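    The re-adding step described above could be scripted roughly as follows. This is only a minimal sketch, not an official tool: the function name requeue_events and the idea of keeping the sanitized event files in a separate backup directory are assumptions made for illustration. It copies each file into the events directory and gives the copy a current system timestamp, since that is the timestamp the agent is described as tracking.

```python
import os
import shutil
from pathlib import Path


def requeue_events(source_dir, events_dir):
    """Copy event files from source_dir into events_dir with a fresh system timestamp.

    Hypothetical helper: source_dir is assumed to hold sanitized copies of the
    unprocessed event files. Returns the list of files written to events_dir.
    """
    source = Path(source_dir)
    target = Path(events_dir)
    copied = []
    for src in sorted(source.iterdir()):
        if not src.is_file():
            continue
        dest = target / src.name
        if dest.exists():
            # Conservative choice of this sketch: never overwrite an existing file.
            continue
        # copyfile copies the contents only, so the old modification time is not carried over.
        shutil.copyfile(src, dest)
        # Explicitly set access and modification time to "now".
        os.utime(dest, None)
        copied.append(dest)
    return copied
```

    For example, requeue_events("/tmp/events_backup", "/apigateway/events") would copy the files from a (hypothetical) backup location into the events path used in this article. Files whose names already exist in the target directory are skipped rather than overwritten.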

    For a better understanding of the situation, we can describe this flow using an example:

  • The TA stops at 1:00 PM and does not start again until 3:00 PM.
  • Once the agent starts again, it will automatically read events from 2:00 PM to 3:00 PM. The events from 1:00 PM to 2:00 PM will not be read, since they are older than one hour. In Business Insights this will be reflected as an empty graph between 1:00 PM and 2:00 PM.
  • However, say you decide to add the unread event logs back into the events directory at 5:00 PM. The agent processes these events, and they will be reflected in Business Insights as metrics starting at 5:00 PM. They will NOT be reflected in Business Insights at the time they originally happened (between 1:00 PM and 2:00 PM), but at the time they were successfully processed by the agent (starting at 5:00 PM).
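    The one-hour rule from the example above can be modelled in a few lines. This is a simplified illustration of the behaviour described in this article, not the agent's actual logic; the function name is_read_on_restart and the date used are made up for the example.

```python
from datetime import datetime, timedelta

# Assumption taken from the text: files older than one hour at restart are skipped.
READ_WINDOW = timedelta(hours=1)


def is_read_on_restart(file_time, restart_time, window=READ_WINDOW):
    """Return True if a file with system timestamp file_time would be read
    by an agent that starts at restart_time (simplified model)."""
    return restart_time - file_time <= window


restart = datetime(2024, 1, 1, 15, 0)           # agent starts again at 3:00 PM
written_1330 = datetime(2024, 1, 1, 13, 30)     # event file written at 1:30 PM
written_1430 = datetime(2024, 1, 1, 14, 30)     # event file written at 2:30 PM

skipped = not is_read_on_restart(written_1330, restart)  # older than one hour
picked_up = is_read_on_restart(written_1430, restart)    # within the last hour
```

    In this model the 1:30 PM file is skipped and the 2:30 PM file is read. A file added again at 5:00 PM receives a new system timestamp, which is why it is processed and reported at 5:00 PM rather than at its original time.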
  2. API Analytics

    In the case of corrupted events, API Analytics can also be affected, and usage would be missing there as well. The same method can be used to recover the events for API Analytics. This is where the difference between the /events and /events/processed paths becomes important.

    If you are using /apigateway/events/processed as the TA events path, you are probably using /apigateway/events for API Analytics. In this case, if you want to recover the events for both, you can add the sanitized events directly to the /apigateway/events folder. From there, they will be processed by the Admin Node Manager, reflected in API Analytics, and sent to /apigateway/events/processed, from where the Traceability Agent will read them.