Evaluation of Sailfin

This article contains the results of my evaluation of SailFin, an implementation of a SIP Servlet container.

Just a little bit of background. SailFin is an open-source SIP Servlet container hosted as a java.net project. Big parts of it were written by Ericsson. SailFin is not a stand-alone container; it is built on top of Glassfish.

Using Sailfin

If you want to play with SailFin yourself, you should first download a binary package. I used “milestone 4”. The file you download is an “unpacker”, which unpacks a large directory structure containing Glassfish with Sailfin. The second thing to do is to run the “setup.xml” script using the pre-bundled ant. I had problems at this stage, but they were solved when I removed the ANT_HOME environment variable and also removed my existing ant from CLASSPATH. The “setup.xml” script creates one “domain”, which is an entity administered by an instance of Glassfish. A domain is represented by a folder, and all its configuration, deployments and logs are stored there. Then I started Glassfish and connected to its management console over HTTP. Everything worked.

Next I tried to use some SIP servlet. The distribution contains a directory with samples. However, for my purposes those samples are too complex. It seems that they are designed to show the benefits of SIP-JEE integration, which was not interesting to me at the moment. So I wrote my own servlet which does simple proxying. The samples have build.xml scripts for ant which do everything including deployment, but those scripts refer to other scripts, which in turn refer to other scripts, making it difficult for me to figure out which properties I should set. Instead of hacking those scripts I made a simple script myself just for compilation and packaging, and deployed my servlet manually by copying it into the auto-deployment directory. Then I checked through the management console that my servlet had started successfully.

I have a habit of testing intermediate SIP nodes by calling through them from my softphone to a media server; if I can hear a recorded announcement, I assume the node is working. Unfortunately, the attempt to call the media server through my servlet failed. As a developer of server software, I tried to find logs which would contain any clues. The only one I found was called “server.log”, and it didn’t contain anything useful. I thought “OK, maybe I need to set a more verbose log level”. After a long search in the management console I finally found the place where I could specify the log level. At the bottom of the page “Application Server/Logging/Log levels” there is a small table called “additional properties”. The text fields are short, so you can’t see whole property names without scrolling through the field with the cursor keys (or the property names are too long); that’s why it took me so long to find it. One of those “additional properties” is called “javax.enterprise.system.container.sip”, which corresponds to the log level of the SIP logger. The default value for this property is “INFO”, which I changed to “FINE” and pressed “save”. Then I made another call, and this time I got a much more verbose log. Now I could see that the request was received, but an exception was thrown while parsing the Request-URI:

ReqUri = sip:annc@10.50.3.83:5060;early=no;
play=file:////opt/snowshore/prompts/generic/en_US/try_again.wav

javax.servlet.sip.ServletParseException: Unexpected exception while parsing URI: java.lang.StringIndexOutOfBoundsException: String index out of range: -20
        at com.ericsson.ssa.sip.SipURIImpl.<init>(SipURIImpl.java:84)

It seems that the SIP URI of the media server is too complex for Sailfin, so I tried calling another softphone instead. This time the attempt was successful. I changed the log level back to INFO and called the softphone again. There was nothing about the new call in the logs. I think it is wrong to log errors at “FINE” severity.
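To illustrate why the parse failure was invisible at the default level, here is a minimal sketch using plain java.util.logging. It assumes the SIP logger is an ordinary JUL logger registered under the property name from the console; the class name SipLogLevelSketch and the capturing handler are mine, not part of Sailfin.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class SipLogLevelSketch {
    // Logger name as shown in the console's "additional properties" table.
    // A strong reference is kept so the configured level is not lost to GC.
    static final Logger SIP = Logger.getLogger("javax.enterprise.system.container.sip");

    // Logs one parse failure at FINE while the logger is set to the given
    // level, and returns the messages that actually reached a handler.
    static List<String> publishedAt(Level loggerLevel) {
        final List<String> published = new ArrayList<>();
        Handler capture = new Handler() {
            @Override public void publish(LogRecord record) { published.add(record.getMessage()); }
            @Override public void flush() { }
            @Override public void close() { }
        };
        SIP.setUseParentHandlers(false); // keep the demo off the console
        SIP.addHandler(capture);
        try {
            SIP.setLevel(loggerLevel);
            SIP.fine("Unexpected exception while parsing URI");
        } finally {
            SIP.removeHandler(capture);
        }
        return published;
    }

    public static void main(String[] args) {
        System.out.println("logger at INFO: " + publishedAt(Level.INFO));
        System.out.println("logger at FINE: " + publishedAt(Level.FINE));
    }
}
```

With the logger at INFO the record is dropped before any handler sees it, which is exactly what happens to an error that is reported at FINE.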

Another little annoyance was that the SIP ports shown in the configuration of the “SIP Container” were not the actual values used by the SIP stack. Instead, there is another configurable entity called “SIP Service”, which has “SIP Listeners”, and these are what is actually used for processing incoming requests.

Reading source code

The source code bundle contains sources for both Glassfish and Sailfin. Sailfin as a whole is 780 classes, which in my impression is quite a lot, so I suspect it is a little bit overengineered. For source code analysis I applied my usual technique: find some class whose purpose I understand, then backtrack to the “main” class which manages the whole stack (“the origin”). I quickly discovered the classes which do parsing and network I/O, then tried to backtrack using the “Find usages” feature of my IDE. Unfortunately, this did not work, because I couldn’t find references which made sense. After some hacking through the code I understood the reason for the failure: the code uses reflection. Anyway, soon I understood the whole architecture.

Container startup

The starting point of the stack is located in the module sailfin/integration. There are two interesting classes here: SipContainerLifecycle and SipServiceListener. Those classes are specified in separate places of Glassfish’s configuration file (which is located in “domain1/config/domain.xml”). Both classes implement an interface called LifecycleListener, but in fact these are two different interfaces with the same name! SipContainerLifecycle implements com.sun.appserv.server.LifecycleListener, and SipServiceListener implements org.apache.catalina.LifecycleListener. I don’t know exactly why two interfaces are used, but I suspect that the first interface is used for the lifecycle of the whole cluster, and the second one is for a single node.

When the SipContainerLifecycle class starts, it reads a list of class names from the “stack configuration”. When SipServiceListener starts, it takes this list and instantiates those classes through reflection. Each class represents a processing stage for incoming and outgoing messages. Those classes are called “processing layers” and implement a common interface, Layer, used for bi-directional processing. Some of those layers also have lifecycle methods, which are also invoked by SipServiceListener through reflection. I would personally prefer having a special interface like LayerWithLifecycle which would extend Layer with lifecycle methods, but I’m not the author. SipServiceListener also connects the layers into a pipeline-like sequence.
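The idea can be sketched as follows. This is not Sailfin’s code: Layer here is a simplified one-directional stand-in that works on Strings, LayerWithLifecycle is the interface I proposed above, and TrimLayer, CountingLayer and LayerPipeline are hypothetical names invented for the example. Layers are still created from class names through reflection, but lifecycle methods are called through a typed interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for Sailfin's Layer interface.
interface Layer {
    String next(String message);
}

// The proposed extension: lifecycle methods are part of the type,
// so the container does not have to look them up reflectively.
interface LayerWithLifecycle extends Layer {
    void start();
    void stop();
}

class TrimLayer implements Layer {
    public String next(String message) { return message.trim(); }
}

class CountingLayer implements LayerWithLifecycle {
    private boolean started;
    private int seen;
    public void start() { started = true; }
    public void stop() { started = false; }
    public String next(String message) {
        if (!started) throw new IllegalStateException("layer not started");
        seen++;
        return seen + ":" + message;
    }
}

class LayerPipeline {
    private final List<Layer> layers = new ArrayList<>();

    // Instantiates each configured class by name and starts those layers
    // that declare a lifecycle.
    LayerPipeline(List<String> classNames) {
        for (String name : classNames) {
            try {
                Layer layer = (Layer) Class.forName(name).getDeclaredConstructor().newInstance();
                if (layer instanceof LayerWithLifecycle) {
                    ((LayerWithLifecycle) layer).start();
                }
                layers.add(layer);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("cannot create layer " + name, e);
            }
        }
    }

    // Passes the message through the layers in configured order.
    String process(String message) {
        for (Layer layer : layers) {
            message = layer.next(message);
        }
        return message;
    }

    public static void main(String[] args) {
        LayerPipeline pipeline = new LayerPipeline(
                Arrays.asList(TrimLayer.class.getName(), CountingLayer.class.getName()));
        System.out.println(pipeline.process("  INVITE sip:bob@example.com  "));
    }
}
```

The only reflective step left is the constructor call; everything after that is checked by the compiler.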

By default there is nothing specified in the configuration file, so the stack uses a hard-coded default configuration (from StackConfig.defineStatic() ). Being able to configure which layers are present in the stack gives a lot of flexibility. For instance, you can choose at which level to have overload protection: between the parser layer and the transaction layer (the default) or before the application dispatcher. If you know for sure that your AS will host only proxy-style servlets, then you can remove DialogManager and get a small performance boost because your requests will not be checked against ongoing dialogs. Unfortunately, it doesn’t work as I thought (see later). However, such a generic layered mechanism allows you to create a configuration which doesn’t make sense; for example, you can put DialogManager before TransactionManager. Also, most layers simply cannot be removed without breaking the stack. So I would prefer having a specific interface at each inter-layer border.

Processing of incoming messages

The lowest layer is called NetworkManager and includes the functionality which RFC 3261 calls “parsing layer” and “transport layer”. There are two implementations: a straightforward one (called OLDNetworkManager) and another based on Grizzly (called GrizzlyNetworkManager). Both of them implement the Runnable interface, so SipServiceListener recognizes this and runs them in a dedicated thread. The implementation of the run() method in those managers does multiplexed reading from several network channels using an NIO Selector.
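For readers who have not used it, a multiplexed read loop over a Selector looks roughly like this. The sketch is mine and is not taken from either NetworkManager: it uses java.nio Pipe channels in place of sockets so that it runs without a network, and the names SelectorLoopSketch, readChunks and demo are invented for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SelectorLoopSketch {
    // Registers all channels with one selector and keeps selecting until
    // the expected number of chunks has been read.
    static List<String> readChunks(List<Pipe.SourceChannel> sources, int expected) throws IOException {
        List<String> received = new ArrayList<>();
        ByteBuffer buffer = ByteBuffer.allocate(2048);
        try (Selector selector = Selector.open()) {
            for (Pipe.SourceChannel source : sources) {
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);
            }
            while (received.size() < expected) {
                selector.select(); // blocks until at least one channel is readable
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove(); // the selected-key set is not cleared automatically
                    buffer.clear();
                    int n = ((ReadableByteChannel) key.channel()).read(buffer);
                    if (n > 0) {
                        buffer.flip();
                        received.add(StandardCharsets.US_ASCII.decode(buffer).toString());
                    }
                }
            }
        }
        return received;
    }

    // Writes one chunk into each of two pipes and reads both back through
    // a single selector. The order of the result is not guaranteed.
    static List<String> demo() {
        try {
            Pipe a = Pipe.open();
            Pipe b = Pipe.open();
            try {
                a.sink().write(ByteBuffer.wrap("REGISTER".getBytes(StandardCharsets.US_ASCII)));
                b.sink().write(ByteBuffer.wrap("OPTIONS".getBytes(StandardCharsets.US_ASCII)));
                return readChunks(Arrays.asList(a.source(), b.source()), 2);
            } finally {
                a.sink().close();
                a.source().close();
                b.sink().close();
                b.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A real network manager would register DatagramChannel and SocketChannel instances instead, and would hand each filled buffer to the parser rather than decoding it to a String.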

When NetworkManager receives some SIP data, it reads it into a ByteBuffer and hands it over for parsing. The parser is hand-written (as opposed to automatically generated parsers based on a BNF grammar), so it should be fast. The parser creates an object tree for the SIP message. The root object is either SipServletRequestImpl or SipServletResponseImpl, and it contains the headers as a map from header name (String) to a collection of values (com.ericsson.ssa.sip.Header). This collection of header values provides a common interface for two cases: a single value (SingleLineHeader) or a variable number of values (MultiLineHeader). I’ve used the same approach in my own code for a long time, so this looks very familiar to me.
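A minimal sketch of this data structure, written from scratch: the class names Header, SingleLineHeader and MultiLineHeader are borrowed from the description above, but the methods, the HeaderMapSketch class and the short list of single-valued headers are my assumptions, not Sailfin’s actual API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Common view over "exactly one value" and "any number of values".
interface Header {
    String first();
    List<String> all();
    void add(String value);
}

class SingleLineHeader implements Header {
    private String value;
    public String first() { return value; }
    public List<String> all() { return Collections.singletonList(value); }
    public void add(String value) { this.value = value; } // a later value replaces the earlier one
}

class MultiLineHeader implements Header {
    private final List<String> values = new ArrayList<>();
    public String first() { return values.isEmpty() ? null : values.get(0); }
    public List<String> all() { return Collections.unmodifiableList(values); }
    public void add(String value) { values.add(value); }
}

class HeaderMapSketch {
    // Illustrative subset of headers that may appear only once in a message.
    private static final Set<String> SINGLE = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
    static {
        SINGLE.addAll(Arrays.asList("Call-ID", "CSeq", "From", "To", "Max-Forwards"));
    }

    // SIP header names are case-insensitive, hence the comparator.
    private final Map<String, Header> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    void add(String name, String value) {
        Header header = headers.get(name);
        if (header == null) {
            header = SINGLE.contains(name) ? new SingleLineHeader() : new MultiLineHeader();
            headers.put(name, header);
        }
        header.add(value);
    }

    Header get(String name) { return headers.get(name); }

    public static void main(String[] args) {
        HeaderMapSketch message = new HeaderMapSketch();
        message.add("Via", "SIP/2.0/UDP proxy.example.com;branch=z9hG4bK1");
        message.add("Via", "SIP/2.0/UDP client.example.com;branch=z9hG4bK2");
        message.add("Call-ID", "a84b4c76e66710");
        System.out.println(message.get("via").all());
        System.out.println(message.get("CALL-ID").first());
    }
}
```

Code that only needs the topmost value (for example the top Via) calls first() and does not care which of the two implementations it got.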

The next layer is TransactionManager. As I expected, it contains two ConcurrentHashMaps, one for client transactions and one for server transactions. Sailfin uses Strings as transaction identifiers, but I prefer using special objects which simply implement equals() and hashCode() correctly. Correct String keys require concatenation of several message fields, which Sailfin doesn’t implement, so its implementation of transaction lookup is not 100% correct. However, there are // TODO marks saying that correct transaction lookup will be implemented in the future. I also suspect there is a bug in that each received retransmission of CANCEL creates another NonInviteServerTransaction.
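Here is a sketch of the kind of key object I mean. It follows the server transaction matching rule of RFC 3261 section 17.2.3 for requests whose branch carries the magic cookie (top Via branch, top Via sent-by, and method, with ACK matched to INVITE) and ignores the older RFC 2543 matching. ServerTransactionKey and TransactionTableSketch are hypothetical names; this is not how Sailfin does it.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class ServerTransactionKey {
    private final String branch; // branch parameter of the top Via
    private final String sentBy; // sent-by (host and port) of the top Via
    private final String method;

    ServerTransactionKey(String branch, String sentBy, String method) {
        this.branch = branch;
        this.sentBy = sentBy;
        // An ACK with the same branch belongs to the INVITE transaction.
        this.method = "ACK".equals(method) ? "INVITE" : method;
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ServerTransactionKey)) return false;
        ServerTransactionKey other = (ServerTransactionKey) o;
        return branch.equals(other.branch)
                && sentBy.equals(other.sentBy)
                && method.equals(other.method);
    }

    @Override public int hashCode() {
        return Objects.hash(branch, sentBy, method);
    }
}

class TransactionTableSketch {
    // The value would be the transaction object; a String is enough here.
    private final ConcurrentMap<ServerTransactionKey, String> serverTransactions =
            new ConcurrentHashMap<>();

    // Returns true if the request created a new server transaction and
    // false if it matched an existing one (for example a retransmission).
    boolean onRequest(String branch, String sentBy, String method) {
        ServerTransactionKey key = new ServerTransactionKey(branch, sentBy, method);
        return serverTransactions.putIfAbsent(key, method) == null;
    }

    int size() { return serverTransactions.size(); }

    public static void main(String[] args) {
        TransactionTableSketch table = new TransactionTableSketch();
        System.out.println(table.onRequest("z9hG4bK74bf9", "10.0.0.1:5060", "INVITE"));
        System.out.println(table.onRequest("z9hG4bK74bf9", "10.0.0.1:5060", "CANCEL"));
        System.out.println(table.onRequest("z9hG4bK74bf9", "10.0.0.1:5060", "CANCEL"));
    }
}
```

Because CANCEL carries the same branch as the INVITE it cancels, including the method in the key gives the CANCEL its own transaction, and a retransmitted CANCEL then matches that transaction instead of creating a new one.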

After several other layers an incoming message is processed by ApplicationDispatcher. For initial requests ApplicationDispatcher selects the servlet responsible for processing the request. Based on the actions of the servlet, the container decides whether the servlet acts as a proxy or as a user agent, and inserts some kind of PathNode (either ProxyContext or UA) into the pipeline. If the servlet acts as a proxy, then the forwarded request is also processed by ApplicationDispatcher, leading either to handing it to another servlet (which adds another PathNode) or to sending the request into the network.

For subsequent requests ApplicationDispatcher merely passes the message to the first PathNode. Responses are processed by servlets in reverse order, so they are passed to the last PathNode.

Preliminary conclusion

I’ll continue describing the Sailfin sources next time. My overall impression of the source code is that it is well written and quite professional. Variable, class and method naming is quite good. What I don’t like is the overall bloat and the overuse of reflection. I also don’t like how logging is done. I prefer that every branch in message processing logic which depends on previous state (like processing in the transaction layer, which depends on whether the transaction exists or not) is logged.

Testing

A very bad thing about Sailfin is the absence of unit tests. I decided to apply my own suite of unit tests and see what would happen. My unit tests are written on the assumption that the “unit” is the whole container, so they treat the container as a “black box” with two interaction points: servlet and network.

I replaced the existing NetworkManagers with my own, which doesn’t actually send anything, but just remembers which bytes were sent to which addresses. I also replaced ApplicationDispatcher with my own simple one, which has only one particular Servlet that remembers all the requests and responses it was given. Each test performs some actions on one side (either network or application) and then checks the outcome on the other side.

While adapting my tests for Sailfin, I found many interesting things. For example, sending is done asynchronously. I also learned that DialogManager and ResolverManager are mandatory layers.

Modifying Sailfin was not an easy task. It is not as modular as I expected, and the horrible singleton pattern is used too often. It seems that this design is what prevents it from having unit tests.

With my tests I found several minor bugs (for example, session.getRemoteParty() and session.getLocalParty() are reversed for incoming requests) and some annoyances (like an exception when invoking getProxy() on subsequent requests). The thing I dislike most is that the request-URI or Route header of in-dialog requests must contain a special “dialog fragment identifier”. Yes, RFC 3261 specifies that a UA should put the remote target URI in the request-URI, but this is a requirement for another network element, and Sailfin should not depend on it. RFC 3261 clearly states that a dialog is identified only by Call-ID, remote tag and local tag, so the “fragment identifier” should not be needed.
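To make the last point concrete, here is a sketch of a dialog identifier built only from the three values RFC 3261 names. DialogId and its factory methods are hypothetical names of mine. Which tag counts as local depends on the direction of the request, which is also the kind of place where a local/remote mix-up like the one above can creep in.

```java
import java.util.Objects;

final class DialogId {
    private final String callId;
    private final String localTag;
    private final String remoteTag;

    private DialogId(String callId, String localTag, String remoteTag) {
        this.callId = callId;
        this.localTag = localTag;
        this.remoteTag = remoteTag;
    }

    // For a request we receive, the peer wrote From, so the From tag is
    // the remote tag and the To tag is our local tag.
    static DialogId forIncomingRequest(String callId, String fromTag, String toTag) {
        return new DialogId(callId, toTag, fromTag);
    }

    // For a request we send, From is ours and To is the peer's.
    static DialogId forOutgoingRequest(String callId, String fromTag, String toTag) {
        return new DialogId(callId, fromTag, toTag);
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof DialogId)) return false;
        DialogId other = (DialogId) o;
        return callId.equals(other.callId)
                && localTag.equals(other.localTag)
                && remoteTag.equals(other.remoteTag);
    }

    @Override public int hashCode() {
        return Objects.hash(callId, localTag, remoteTag);
    }

    @Override public String toString() {
        return callId + ";local=" + localTag + ";remote=" + remoteTag;
    }

    public static void main(String[] args) {
        // Both requests belong to the same dialog as seen from one side.
        DialogId in = forIncomingRequest("a84b4c76e66710", "alice", "bob");
        DialogId out = forOutgoingRequest("a84b4c76e66710", "bob", "alice");
        System.out.println(in.equals(out));
    }
}
```

A lookup keyed this way needs nothing from the request-URI or the Route header.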

Conclusion

Sailfin is not yet ready for production. However, it looks quite promising. I think it is really hard to maintain such a huge pile of code without unit tests. Good luck, guys!

  1. Binod
    July 4, 2008 at 6:41 am

    Look like you have spent serious amount of time on sailfin code base :-) And thanks for the feedback. I have filed 3 issues in sailfin from your feedback.

    https://sailfin.dev.java.net/issues/show_bug.cgi?id=1011
    https://sailfin.dev.java.net/issues/show_bug.cgi?id=1012
    https://sailfin.dev.java.net/issues/show_bug.cgi?id=1013

    About Layering architecture you mentioned, think of augmenting sailfin with an additional capability by a customer.
    A simple case might be some kind of auditing support.

    SIP and related technologies have an ever growing list of RFCs. If a customer need support for a not so common RFC, then you could try to add support for that without modifying supported sailfin code base.

    BTW, your analysis about sailfin unit tests is not exactly correct. Please see the unit test matrix that developers run before checking in.
    http://wiki.glassfish.java.net/Wiki.jsp?page=TestMatrix
    Fisheye links for

    1. Developer Tests:
    http://fisheye5.cenqua.com/browse/sailfin/sailfin-tests/community

    2. Quick Look Tests:
    http://fisheye5.cenqua.com/browse/sailfin/sailfin-tests/quicklook

    This is in addition to continuous test execution of a much larger test suite using hudson. Please see the results here.

    https://sailfin.dev.java.net/servlets/SummarizeList?listName=hudson

    Apart from all these, quality and performance team in sailfin (quality@sailfin.dev.java.net) execute a large number of tests (longevity, performance, function tests etc).

    Again, thanks for the great feedback. Please file any issue you face at https://sailfin.dev.java.net/servlets/ProjectIssues.

    Keep using sailfin!

  2. July 7, 2008 at 12:00 pm

    Hi!

    Quite nice analysis and some of the feedback hurts a little since it’s never good to have too many TODO’s left. I just wanted to comment on the Layer structure and just confirm that many of the layers are mandatory as you say. The benefit is to put in new ones or enhance the existing to fit your purposes. One example is if you have a broken client and you must support it. You could write a layer and put it after the NetworkManager. But the other example you mentioned is when we rewrote the network layer. We can introduce a new technology like Grizzly in a controlled way. Another example is the Dialog manager can be in memory but if you have some other mechanism you could rewrite that as we did with session replication.

    One more interesting layer that you probably did not see is the ConvergedLoadBalancer (CLB). That one fixes the CANCEL problem you mentioned. We use a hashing load distribution and the transaction layer is on the backend after the load balancing. So if you have multiple nodes and the algorithm is working well it should end up on the same instance as the initial invite. Also another interesting thing is to write your own application router rather than the application dispatcher. Ok there is a difference in what can be done then but the router interface is a standard jsr289 interface so it will be more portable.

    But other than that thanks for the feedback and we will try to get the transaction manager more efficient. We have been too focused on the memory consumption of sessions :-)
