KB Article #178019

ENTERPRISE CLUSTER: Configure Coherence unicast discovery to use a specific IP address

Problem

SecureTransport ships with a Coherence cluster implementation that is configured for unicast discovery. The well-known addresses list is populated at runtime from the environment-specific network configuration.


If the well-known addresses list is not populated, or contains incorrect information, the Coherence cluster does not form, admind does not start, and the following error is logged in the $FILEDRIVEHOME/tomcat/admin/log/catalina.out file:


com.tangosol.net.RequestTimeoutException: Timeout during service start


Resolution

In most cases, these symptoms can be fixed by correcting the network environment so that the machine hostname resolves to a valid IP address. If this cannot be achieved, Coherence can be reconfigured to use a hardcoded list of well-known addresses.
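
Before changing the Coherence configuration, it can help to confirm what the machine hostname currently resolves to. The following is a minimal Python sketch and is not part of SecureTransport. The helper names resolve_host and classify are hypothetical, and flagging loopback addresses is an assumption that matters mainly when other cluster nodes need to reach this machine.

```python
import ipaddress
import socket


def resolve_host(hostname):
    """Return the sorted, unique IP addresses for hostname, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    # info[4] is the socket address tuple; its first item is the IP string.
    return sorted({info[4][0] for info in infos})


def classify(address):
    """Label an address as 'loopback', 'unspecified', or 'ok'."""
    # Drop an IPv6 zone suffix such as '%eth0' before parsing.
    ip = ipaddress.ip_address(address.split("%", 1)[0])
    if ip.is_loopback:
        return "loopback"
    if ip.is_unspecified:
        return "unspecified"
    return "ok"


if __name__ == "__main__":
    name = socket.gethostname()
    addresses = resolve_host(name)
    if not addresses:
        print(f"{name} does not resolve to any IP address")
    for address in addresses:
        print(f"{name} -> {address} ({classify(address)})")
```

If the script reports no address, or only a loopback address on a node that belongs to a multi-node cluster, the hostname resolution is a likely cause of the symptoms described above.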


Disclaimer: The procedure below is devised to work around certain network configuration restrictions, and is provided "as is".


Steps to reconfigure Coherence:


1. Back up and edit the $FILEDRIVEHOME/conf/tangosol-coherence-override.xml file.


2. Replace "localhost" with the actual IP address of the server. That is, under the <unicast-listener> tag, remove the lines


    <address>localhost</address>
    <port>8088</port>


and replace them with:


    <address>IP_ADDRESS_OF_THE_SERVER</address>
    <port>8088</port>


where IP_ADDRESS_OF_THE_SERVER is the correct IP address of the current cluster node.


3. Replace the discovery class with a list of specific hosts. Under the <unicast-listener> tag, remove the lines


<well-known-addresses>
    <address-provider>
        <class-name>com.tumbleweed.st.server.cluster.coherence.conf.StRefreshableAddressProvider</class-name> 
    </address-provider>
</well-known-addresses>


and replace them with:


<well-known-addresses>
    <socket-address id="1">
        <address>IP_ADDRESS_OF_THE_SERVER</address>
        <port>PORT_OF_THE_SERVER</port>
    </socket-address>
</well-known-addresses>


where IP_ADDRESS_OF_THE_SERVER is the correct IP address of the current cluster node, and PORT_OF_THE_SERVER is the correct port of the server (8088 by default).


If the server is part of a cluster, all cluster nodes must be defined in the well-known addresses list on each node. The nodes must also be able to reach each other on the defined IP addresses and ports.


When defining multiple cluster nodes, the well-known addresses list must be the same on every cluster member. Otherwise, some members may operate independently from the rest of the cluster.


Example for Node 1:


<well-known-addresses>
    <socket-address id="1">
        <address>IP_ADDRESS_OF_NODE_1</address>
        <port>PORT_OF_NODE_1</port>
    </socket-address>
    <socket-address id="2">
        <address>IP_ADDRESS_OF_NODE_2</address>
        <port>PORT_OF_NODE_2</port>
    </socket-address>
</well-known-addresses>


Example for Node 2:


<well-known-addresses>
    <socket-address id="1">
        <address>IP_ADDRESS_OF_NODE_1</address>
        <port>PORT_OF_NODE_1</port>
    </socket-address>
    <socket-address id="2">
        <address>IP_ADDRESS_OF_NODE_2</address>
        <port>PORT_OF_NODE_2</port>
    </socket-address>
</well-known-addresses>


where IP_ADDRESS_OF_NODE_1 and IP_ADDRESS_OF_NODE_2 are the correct IP addresses of Node 1 and Node 2 of the cluster, and PORT_OF_NODE_1 and PORT_OF_NODE_2 are the corresponding ports (8088 by default).
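
Because the same list has to appear on every node, one option is to generate the <well-known-addresses> block once and paste it into each node's file. The sketch below is a hypothetical helper, not a SecureTransport tool. The function name build_well_known_addresses and the 192.0.2.x addresses are placeholders, and ET.indent requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET


def build_well_known_addresses(nodes):
    """Render a <well-known-addresses> block from a list of (ip, port) pairs."""
    root = ET.Element("well-known-addresses")
    for index, (address, port) in enumerate(nodes, start=1):
        # Each node gets a <socket-address> entry with a sequential id.
        entry = ET.SubElement(root, "socket-address", id=str(index))
        ET.SubElement(entry, "address").text = address
        ET.SubElement(entry, "port").text = str(port)
    ET.indent(root, space="    ")  # Pretty-print; Python 3.9+.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder addresses; replace with the real IP and port of each node.
    cluster_nodes = [("192.0.2.10", 8088), ("192.0.2.11", 8088)]
    print(build_well_known_addresses(cluster_nodes))
```

The printed block has the same shape as the two-node examples above and can be used unchanged on Node 1 and Node 2.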


4. Repeat these steps on each node, then restart the admin and TM services.
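
Before restarting, it may be worth confirming that every node lists the same well-known addresses. The following Python sketch shows one way to compare the lists, assuming the text of each node's tangosol-coherence-override.xml has been collected in one place. The function names are hypothetical, and treating the order of entries as irrelevant is an assumption.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop an XML namespace prefix of the form '{uri}name', if present."""
    return tag.rsplit("}", 1)[-1]


def well_known_addresses(xml_text):
    """Return the (address, port) pairs listed under <well-known-addresses>."""
    root = ET.fromstring(xml_text)
    pairs = []
    for wka in root.iter():
        if local_name(wka.tag) != "well-known-addresses":
            continue
        for entry in wka:
            if local_name(entry.tag) != "socket-address":
                continue
            # Collect the text of the child elements, keyed by element name.
            fields = {local_name(c.tag): (c.text or "").strip() for c in entry}
            pairs.append((fields.get("address", ""), fields.get("port", "")))
    return pairs


def lists_match(node_configs):
    """node_configs maps a node label to the text of its override file."""
    lists = [sorted(well_known_addresses(text)) for text in node_configs.values()]
    # Every node's list is compared with the first one; order is ignored.
    return all(entries == lists[0] for entries in lists)
```

For example, lists_match({"node1": text_of_node1_file, "node2": text_of_node2_file}) returns True only when both files contain the same set of address and port pairs.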