KB Article #176196

How to get Interchange to throw an error when a database deadlocks

Problem


If there is a database deadlock Interchange will block completely and no errors will be thrown in the Interchange logs files.

Resolution

How to show error information in the logs files:
1. Open the <Interchange>/conf/datastoreconfig.xml
2. Identify your database profile. Look for a line similar to this <ConnectionInfoRef>DB2</ConnectionInfoRef>. In this case it's DB2

DB2:
3. Go to the line <ConnectionInfo id="DB2">
4. Modify the JDBC url. Look a line similar to <ConnectionURL>jdbc:db2://{Host}:{Port}/{DatabaseName}</ConnectionURL>. Add to the end of the url this ":blockingReadConnectionTimeout=300;". Where 300 is the numbers of seconds to wait until throwing an error.
The line should now look like this <ConnectionURL>jdbc:db2://{Host}:{Port}/{DatabaseName}:blockingReadConnectionTimeout=300;</ConnectionURL>

MySQL/Axway (Synchrony) Database:
3. Go to the line <ConnectionInfo id="MYSQL"> or <ConnectionInfo id="Synchrony Database">
4. Modify the JDBC url. Look a line similat to <ConnectionURL>jdbc:mysql://{Host}:{Port}/{DatabaseName}?zeroDateTimeBehavior=convertToNull</ConnectionURL>. Add to the end of the url this "&amp;connectTimeout=300000&amp;socketTimeout=300000". Where 300000 is the numbers of miliseconds to wait until throwing an error.
The line should now look like this  <ConnectionURL>jdbc:mysql://{Host}:{Port}/{DatabaseName}?zeroDateTimeBehavior=convertToNull&amp;connectTimeout=300000&amp;socketTimeout=300000</ConnectionURL>

5. Restart Interchange

Note: If in cluster mode, this needs to be done on all nodes

Notes2: To check if this parameter is working with your JDBC/Database Driver.
Start Interchange and let it finish it's startup process.
Identify the database process (on the database machine) and then halt the database process for a few more seconds then the timeout you set. After resume the process. Make sure you pause the right process and do this only on a test environment.
On linux:
Pause the process: kill -STOP <database pid>
Resume the process: kill -CONT <database pid>

On Windows
Pause the process: with Process Explorer, right click on the process -> Suspend
Resume the process: With Process Explorer, right click on the process -> Resume

You should see the following error(s) in the Interchange cn logs:

Important: When choosing the timeout, don't choose a value that is too low, as you could have queries that take longer to execute to be falsely reported as timeouts. A safe value would be around 5 minutes (300 seconds). If a query takes longer than 5 min to execute it indicates there could be another problem that needs attention.


DB2:
2014-06-24 18:06:20,756 - WARN  [HeartbeatUpdater] (HeartbeatUpdater.updateHeartbeat:100) - Error updating heart beat
com.ibm.db2.jcc.am.yn: [jcc][t4][2030][11211][3.57.82] A communication error occurred during operations on the connection's underlying socket, socket input stream,
or socket output stream.  Error location: Reply.fill().  Message: Read timed out. ERRORCODE=-4499, SQLSTATE=08001
        at com.ibm.db2.jcc.am.bd.a(bd.java:319)
        at com.ibm.db2.jcc.t4.a.a(a.java:365)
        at com.ibm.db2.jcc.t4.a.a(a.java:360)
        at com.ibm.db2.jcc.t4.ab.b(ab.java:202)
        at com.ibm.db2.jcc.t4.ab.c(ab.java:243)
        at com.ibm.db2.jcc.t4.ab.c(ab.java:354)
        at com.ibm.db2.jcc.t4.ab.v(ab.java:1139)
        at com.ibm.db2.jcc.t4.db.a(db.java:38)
        at com.ibm.db2.jcc.t4.t.a(t.java:32)
        at com.ibm.db2.jcc.t4.sb.h(sb.java:135)
        at com.ibm.db2.jcc.am.km.eb(km.java:1934)
        at com.ibm.db2.jcc.am.lm.ic(lm.java:2798)
        at com.ibm.db2.jcc.am.lm.b(lm.java:3517)
        at com.ibm.db2.jcc.am.lm.Wb(lm.java:690)
        at com.ibm.db2.jcc.am.lm.executeUpdate(lm.java:673)
        at com.axway.cluster.bus.membership.HeartbeatUpdater.doHeartbeatUpdate(HeartbeatUpdater.java:168)
        at com.axway.cluster.bus.membership.HeartbeatUpdater.updateHeartbeat(HeartbeatUpdater.java:95)
        at com.axway.cluster.bus.membership.HeartbeatUpdater.run(HeartbeatUpdater.java:70)
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(Unknown Source)
        at com.ibm.db2.jcc.t4.ab.b(ab.java:195)
        ... 14 more
2014-06-24 18:06:26,278 - ERROR [Thread-38] (EventedThread.run:90) - EventedThread error.  Stopping.
kodo.util.DataStoreException: [jcc][t4][2030][11211][3.57.82] A communication error occurred during operations on the connection's underlying socket, socket input stream,
or socket output stream.  Error location: Reply.fill().  Message: Read timed out. ERRORCODE=-4499, SQLSTATE=08001 {prepstmnt 167515798 SELECT t0.OID, t0.JDOVERSION, t0.CONGESTIONCHECK,
        at kodo.jdbc.sql.DBDictionary.newDataStoreException(DBDictionary.java:3403)
        at kodo.jdbc.sql.SQLExceptions.getDataStore(SQLExceptions.java:77)
        at kodo.jdbc.sql.SQLExceptions.getDataStore(SQLExceptions.java:63)


MySQL / Axway (Synchrony) Database:
2014-06-20 19:55:06,291 - WARN  [HeartbeatUpdater] (HeartbeatUpdater.updateHeartbeat:100) - Error updating heart beat
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 42,047 milliseconds ago.  The last packet sent successfully to the server was 40,046 milliseconds ago.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)
        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1118)
        at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3055)
        at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2941)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3489)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2562)
        at com.mysql.jdbc.ConnectionImpl.setAutoCommit(ConnectionImpl.java:4956)
        at com.axway.cluster.bus.membership.HeartbeatUpdater.doHeartbeatUpdate(HeartbeatUpdater.java:161)
        at com.axway.cluster.bus.membership.HeartbeatUpdater.updateHeartbeat(HeartbeatUpdater.java:95)
        at com.axway.cluster.bus.membership.HeartbeatUpdater.run(HeartbeatUpdater.java:70)
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(Unknown Source)
        at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:114)
        at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:161)
        at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:189)
        at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2499)
        at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2952)
        ... 9 more


When the connection is restored you should see something like this in the logs
2014-06-20 19:55:17,653 - INFO  [HeartbeatUpdater] (HeartbeatUpdater.updateHeartbeat:131) - Recovered from heart beat update error


Extra info about the JDBC parameters:
DB2:
http://www-01.ibm.com/support/knowledgecenter/?lang=en#!/SSEPGG_9.1.0/com.ibm.db2.udb.apdv.java.doc/doc/rjvdsprp.htm
http://www-01.ibm.com/support/knowledgecenter/api/content/SSEPGG_9.1.0/com.ibm.db2.udb.apdv.java.doc/doc/rjvdsprp.htm?locale=en&ro=kcUI
MySQL:
http://dev.mysql.com/doc/connector-j/en/connector-j-reference-configuration-properties.html