Friday, June 13, 2008

Solaris Kernel SCTP Issue

We recently came across an interesting issue with Solaris kernel SCTP. If you write an SCTP server program and then initiate thousands of connections from an SCTP client program running in a loop, the server establishes the SCTP associations without trouble. Note how the client simulates thousands of connections: on each iteration it creates a socket and binds it to the client machine's IP address and a different port number. With this setup we were able to create thousands of associations towards the server. But when we tested in a live setup, where thousands of client machines (each a different machine, and hence a different IP address) tried to connect to the SCTP server, the Solaris kernel SCTP drove CPU utilization on the server to nearly 100%. Interestingly, all the clients were bound to the same port number. If we changed the port numbers used by the client machines so that they differed, CPU utilization at the server came back down to normal.

My hunch is that the hash function kernel SCTP uses to index its table of SCTP associations takes only the remote client's port number as input, not the combination of remote port and IP address. If all clients are bound to the same port number, the hash function then generates the same hash value for every one of them. All the client connections collide in a single bucket, which effectively reduces each lookup to O(n) or O(log n), depending on the collision resolution scheme in use.
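To make the hunch concrete, here is a small simulation in Python. It is only a sketch of the suspected behaviour, not the actual Solaris code: the table size, both hash functions, the addresses and the port numbers are all made up for illustration. It compares how the two workloads described above spread across buckets when the hash uses the port alone and when it also mixes in the IP address.

```python
from collections import Counter
import ipaddress

BUCKETS = 256  # hypothetical hash table size, not taken from Solaris


def hash_port_only(ip, port):
    # Hypothetical: the bucket is chosen from the remote port alone.
    return port % BUCKETS


def hash_ip_and_port(ip, port):
    # Hypothetical alternative: mix the remote address into the hash.
    return (int(ipaddress.ip_address(ip)) ^ port) % BUCKETS


def longest_chain(peers, hash_fn):
    # Size of the fullest bucket, i.e. the worst-case chain to search.
    counts = Counter(hash_fn(ip, port) for ip, port in peers)
    return max(counts.values())


# Lab test: one client IP, a different source port per association.
lab_peers = [("10.0.0.1", 20000 + i) for i in range(4096)]

# Live setup: a different client IP each, all bound to the same port.
base = ipaddress.ip_address("10.0.0.0")
live_peers = [(str(base + i), 5000) for i in range(1, 4097)]

for name, peers in (("lab", lab_peers), ("live", live_peers)):
    for fn in (hash_port_only, hash_ip_and_port):
        print(name, fn.__name__, longest_chain(peers, fn))
```

Under these assumptions the lab workload spreads evenly with either hash (16 entries per bucket), while the live workload puts all 4096 associations in one bucket when only the port is hashed. That would match what we saw: the loop-based test looks fine, and the same-port live setup degrades.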

See this forum thread for the original complaint:
http://forum.java.sun.com/thread.jspa?threadID=5288890
