
A study on Java APIs for SIP. Part 1: JAIN SIP API


This is the first article in a series that will study two popular Java APIs for SIP: JAIN SIP API and SIP Servlets API. My intention is to analyze what is good and what is bad, and why. These articles represent my personal opinion; however, I’m not just going to tag things as “good” or “bad”. Instead, I’ll try to explain why I like or dislike something. The study will focus on the technical aspects of the APIs. Since I’m not satisfied with the current state of affairs, I also want to propose a better API. Yes, the purpose of this study is to justify the need for another SIP API, because I believe that the API is very important.

I don’t think that a long introduction is needed, so let’s start with JAIN SIP API.


This API is quite curious because it is very maximalistic in its implementation of SIP syntax and quite minimalistic in its implementation of SIP behaviour. I have a very strong impression that the author (exactly as I did) focused at the beginning on syntax, believing that the parser is the only re-usable part of SIP and that all messaging scenarios are so volatile that they should be implemented in applications. Version 1.0 of this API covers only a parser together with a stateless sender/receiver. Version 1.2, however, also covers some re-usable behavioural layers, such as the transaction layer and the dialog layer.

Syntax representation

A noticeable feature of JAIN SIP API is that you can work with protocol syntax without a running stack. You just need to obtain a specific factory and then create syntax objects through it.

There are two interfaces which represent the two kinds of SIP messages: Request and Response. These interfaces have a common parent interface, Message, which contains the common syntax-related functionality. I think that the existence of the Message interface is a bad thing. Occam’s razor should be applied here: RFC 3261 doesn’t say anything about messages in general, so this abstraction is totally unnecessary. Let’s do a simple check: are there any methods of JAIN SIP API which accept Message as a parameter or return it as a result? No. This is an example of the tendency of OOP people to apply abstraction everywhere, and also of an implementation detail showing through an API.

The Request/Message/Response interfaces provide methods for obtaining, adding and removing header fields, modifying the start line, and obtaining and setting the body. These methods are defined quite well, except for the methods for obtaining headers. Instead of a single method which accepts a header name and returns an abstract Header which has to be downcast, I propose having specific methods for each known header, for example getViaHeader(). This will make the code clearer and will involve the compiler in error checking.

JAIN SIP API strives to have an interface for every documented SIP header. The interfaces for all headers are descendants of the Header interface. This interface has a getName() method, which is correct in my opinion, meaning that the header name defines the format of the header value. Unfortunately, there is no getValue() method; such a method is available only for ExtensionHeader. This is bad, because for some headers (like ‘Content-Type’, for example) it is often necessary to obtain the whole value instead of accessing its parts.
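To illustrate the difference between the two styles of header access, here is a minimal sketch. The types in it (Header, ViaHeader, SketchMessage) are simplified stand-ins written for this example, not the real javax.sip interfaces, and getViaHeader() is the accessor proposed above, not an existing method.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-ins for illustration only; not the real javax.sip types.
class HeaderAccessSketch {
    interface Header {
        String getName();
    }

    static final class ViaHeader implements Header {
        private final String branch;

        ViaHeader(String branch) {
            this.branch = branch;
        }

        public String getName() {
            return "Via";
        }

        String getBranch() {
            return branch;
        }
    }

    static final class SketchMessage {
        private final Map<String, Header> headers = new LinkedHashMap<>();

        void setHeader(Header header) {
            headers.put(header.getName(), header);
        }

        // Generic style: the caller must know the name and downcast the result.
        Header getHeader(String name) {
            return headers.get(name);
        }

        // Proposed style (hypothetical): the return type is checked by the compiler.
        ViaHeader getViaHeader() {
            return (ViaHeader) headers.get("Via");
        }
    }

    public static void main(String[] args) {
        SketchMessage message = new SketchMessage();
        message.setHeader(new ViaHeader("z9hG4bK776asdhds"));

        // A typo in the name or a wrong cast is detected only at run time.
        ViaHeader viaGeneric = (ViaHeader) message.getHeader("Via");

        // A typo in the method name is a compile-time error.
        ViaHeader viaTyped = message.getViaHeader();

        System.out.println(viaGeneric.getBranch().equals(viaTyped.getBranch()));
    }
}
```

The cast does not disappear; it moves into the library, where it is written once instead of in every application.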

I don’t have very much to say about Address, URI and SipURI. They are OK.

As you can see, I’m quite satisfied with syntax part of JAIN SIP API. These objects are just data structures reflecting syntax structure of SIP messages.

Stack management: SipStack, SipProvider and ListeningPoint

The SIP stack is managed through the SipStack interface. Besides the lifecycle methods start() and stop(), this interface also has factory methods for ListeningPoints and for SipProviders. These two classes are then supplied to the methods add()/remove() to define which network endpoints will be served by which provider.

ListeningPoint is a combo-class for an InetSocketAddress and a String which specifies the name of a transport protocol. This class could be avoided altogether. In all methods where a ListeningPoint is passed as a parameter, it could be replaced with the combination of an InetSocketAddress and a String. There is a method that returns ListeningPoints, but it can very well be replaced with two methods: one that returns InetSocketAddresses, and another that returns the transports for a provided InetSocketAddress. So the only point of this class is to make argument lists shorter. I think that it is better to add one or two methods to an existing class than to have another useless class.

The SipProvider interface is used to bind a specific SipListener to a specific network endpoint. Thus, JAIN SIP API can only dispatch incoming traffic based on the address where it was received. If you can handle all the traffic with a single listener, fine, but you still need to create and maintain a separate SipProvider for each local endpoint. SipProviders can be added to SipStack or removed from it only while it is stopped. This means that in order to listen on one more port you need to stop listening on all other ports. Such a restriction is a very bad idea.
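A minimal sketch of what endpoint management could look like without a ListeningPoint class. EndpointRegistrySketch and all of its method names are assumptions made for this illustration; nothing here is taken from JAIN SIP API, and a real stack would also open and close sockets in these methods.

```java
import java.net.InetSocketAddress;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical registry: a plain address plus a transport name, no ListeningPoint.
class EndpointRegistrySketch {
    private final Map<InetSocketAddress, Set<String>> endpoints = new LinkedHashMap<>();

    // Takes the two values that a ListeningPoint would otherwise bundle.
    synchronized void addEndpoint(InetSocketAddress address, String transport) {
        endpoints.computeIfAbsent(address, a -> new LinkedHashSet<>())
                 .add(transport.toLowerCase(Locale.ROOT));
    }

    synchronized void removeEndpoint(InetSocketAddress address, String transport) {
        Set<String> transports = endpoints.get(address);
        if (transports != null) {
            transports.remove(transport.toLowerCase(Locale.ROOT));
            if (transports.isEmpty()) {
                endpoints.remove(address);
            }
        }
    }

    // First of the two methods that replace "return all ListeningPoints".
    synchronized Set<InetSocketAddress> getLocalAddresses() {
        return Collections.unmodifiableSet(new LinkedHashSet<>(endpoints.keySet()));
    }

    // Second method: the transports served on a given local address.
    synchronized Set<String> getTransports(InetSocketAddress address) {
        Set<String> transports = endpoints.get(address);
        return transports == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(new LinkedHashSet<>(transports));
    }
}
```

Because both mutating methods are synchronized and return snapshots, endpoints can be added or removed while other threads are reading the registry.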

Another responsibility of SipProvider is to be a factory for transactions and dialogs. This implicitly means that all events for these transactions and dialogs will be handled by the listener associated with SipProvider, and all outgoing messages will be sent from the local endpoint associated with this provider. A third responsibility is to serve as a facility for stateless sending of requests and responses.

I suggest that Occam’s razor should be applied to this interface, because its responsibilities are vague. It’s essentially a context for a very few things, which could be provided explicitly. All methods of this class could be moved to SipStack, so that the setListener() method would accept an InetSocketAddress, the factory methods for transactions and dialogs would accept a SipListener, and the local endpoint for sending would be chosen automatically.


Transactions

Client and server transactions are represented by the interfaces ClientTransaction and ServerTransaction. These interfaces have a common parent interface, Transaction, which (very much like Message) has no apparent use and could also be removed without anybody noticing. ClientTransaction has methods for sending a request and creating a cancel; ServerTransaction has a method for sending responses.

Let’s try to apply Occam’s razor to these classes and see if it is possible to replace them with methods. For ServerTransaction, the answer seems to be no, because sometimes server transactions are created automatically by the stack.

Maybe a client transaction can be replaced with a method sendStatefully() on SipStack? In case of an incoming response or a timeout, an application needs a context to handle these events. JAIN SIP API is built in a way that transactions are used as contexts, thus transactions are needed. But maybe it is possible to replace the separate factory method and sending method with just one method, which would send the message statefully and return a transaction object? This would also eliminate vague problems like: what should the stack do if the ‘Via’ branch of the request has changed after the creation of the transaction? The problem with such an approach is a threading issue: an event can happen before the thread which invokes sendStatefully() retrieves the transaction, and this event will be handled by another thread which will not find a context for the event. However, this problem can be solved if the application provides something as a context, instead of re-using transactions for that purpose. Thus, sendStatefully() would accept a context from the application and use that context in ResponseEvent. Thus, client transactions can be avoided.

In fact, transactions do have some support for application-provided contexts through the methods setApplicationData() and getApplicationData(). They have been introduced as a convenient way to avoid having lookup facilities for contexts in applications. A good idea, but it is implemented in the API in a way that forces the application to be written like BASIC:

10 let transaction=provider.createTransaction(request);

20 transaction.setApplicationData(context);

30 transaction.sendRequest();

instead of single invocation:

stack.sendRequest(request, context);

Since send() is invoked only once, there is no need to have separate setApplicationData() method.
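A minimal sketch of how a single sending call could carry the context. StatefulSenderSketch, its methods, and the use of the Via branch as the lookup key are assumptions made for this illustration and are not part of JAIN SIP API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical sketch; none of these names come from the JAIN SIP API.
class StatefulSenderSketch<C> {
    // Application contexts keyed by the Via branch of the outgoing request.
    private final Map<String, C> contexts = new ConcurrentHashMap<>();
    // Stand-in for the network: receives the serialized request.
    private final Consumer<String> transport;

    StatefulSenderSketch(Consumer<String> transport) {
        this.transport = transport;
    }

    // One call instead of create / setApplicationData / send.
    void sendRequest(String branch, String request, C context) {
        // The context is registered BEFORE the request reaches the network,
        // so a response handled on another thread can always find it.
        contexts.put(branch, context);
        transport.accept(request);
    }

    // Called by the (hypothetical) receiving thread when a final response arrives.
    // Returns the context supplied at send time, or null if none is registered.
    C onFinalResponse(String branch) {
        return contexts.remove(branch);
    }
}
```

The ordering inside sendRequest() is the whole point: the application never observes a window in which the request is on the wire but the context is not yet attached.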

Transactions have a getState() method which returns the transaction state as defined in the RFC. This method is more an implementation detail than a useful thing. Applications are not really interested in the difference between the COMPLETED and TERMINATED states. Instead, they are interested in things like canRespond(), canCancel() or requestSent().
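A minimal sketch of how such questions could be answered from the RFC 3261 states. The enum and the helper methods are local to this example (they are not the javax.sip types), and the rules are deliberately simplified.

```java
// Hypothetical helper: maps RFC 3261 transaction states to application-level questions.
class TransactionQueriesSketch {
    // State names follow RFC 3261, section 17; the enum itself is local to this sketch.
    enum State { CALLING, TRYING, PROCEEDING, COMPLETED, CONFIRMED, TERMINATED }

    // A server transaction accepts responses from the application only
    // until a final response has been passed to it.
    static boolean canRespond(State serverState) {
        return serverState == State.TRYING || serverState == State.PROCEEDING;
    }

    // A CANCEL for an INVITE may be sent only after a provisional response
    // has been received and before a final one (RFC 3261, section 9.1).
    static boolean canCancel(State inviteClientState) {
        return inviteClientState == State.PROCEEDING;
    }
}
```

An application that calls canRespond() does not need to know whether the transaction is in COMPLETED, CONFIRMED or TERMINATED; all three simply mean "no".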

So, what exactly are the responsibilities of transaction classes? Implementation details which they expose are not really useful. The answer is that they are more than transactions as they are defined in RFC.

For client transactions, the method send() includes some functionality which is common to proxies and user agents, such as arranging “Route” headers and determining the remote address. This functionality in fact belongs to another layer (as described here).

Another purpose of transaction classes is to serve as a protocol-level context. For example, it is possible to obtain the original request. ClientTransaction has a method createCancel(), which should be in MessageFactory.

In my opinion, cancellation in JAIN SIP API is done very badly, in the same BASIC style:

10 Request cancel = clientTran.createCancel();

20 ClientTransaction cancelTran = provider.createTransaction(cancel);

30 cancelTran.sendRequest();

instead of simple:


which is not subject to errors caused by thread races. I mean: what will happen if a final response is received between lines 20 and 30? JAIN SIP API doesn’t give you an answer. Since the stack is a multithreaded module, sending should be done through “atomic” actions that do several things at once under internal locking.

Another example of not taking threading issues into account is the existence of a factory method for server transactions. I strongly believe that server transactions should always be created implicitly for incoming requests (except ACK, of course). The usefulness of a stateless proxy is minimal, so the way JAIN SIP API supports it doesn’t justify the problems it brings. For example, what will happen if a retransmission is received while the application has delayed processing of the incoming request? Yes, the retransmission will be processed too, and the application will get an exception when it tries to create a server transaction for the retransmitted request.

Processing of incoming CANCEL is another weak point of JAIN SIP API. I think it is the responsibility of the stack to discover which server transaction should be cancelled, but JAIN SIP API leaves this work to the application. But even if there were a special CancelServerTransaction with a method getCancelledTransaction(), this behaviour would be plagued with threading issues. Thus, such an improvement will be useful only if server transactions are created automatically.

At least two implementations (NIST SIP and SIP from OpenCloud) do recognize that server transactions should be created automatically. NIST SIP creates a hidden “prototype” for a transaction. The OpenCloud stack introduces a method on the transaction which removes it. These are bad hacks, because the problem should be fixed at the API level.
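A minimal sketch of what an “atomic” cancel could look like inside a stack. The class and its methods are hypothetical and are not part of JAIN SIP API; as a simplification, cancel() refuses to act before a provisional response has arrived, where a real stack would more likely defer the CANCEL.

```java
// Hypothetical client transaction with an atomic cancel(); not part of JAIN SIP.
class AtomicCancelSketch {
    private final Object lock = new Object();
    private boolean provisionalReceived;
    private boolean finalReceived;
    private boolean cancelSent;

    // Called by the stack's receiving thread for every response.
    void onResponse(int statusCode) {
        synchronized (lock) {
            if (statusCode >= 200) {
                finalReceived = true;
            } else {
                provisionalReceived = true;
            }
        }
    }

    // The check and the send happen under one lock, so a final response
    // cannot slip in between "create the CANCEL" and "send the CANCEL".
    // Returns true if a CANCEL was sent by this call.
    boolean cancel() {
        synchronized (lock) {
            if (finalReceived || !provisionalReceived || cancelSent) {
                return false;
            }
            cancelSent = true;
            // A real stack would build and transmit the CANCEL here.
            return true;
        }
    }
}
```

With this shape the application gets a definite answer from a single call, instead of having to guess what happened between three separate steps.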
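A minimal sketch of implicit server transaction creation, with retransmission absorption and CANCEL matching done by the stack. All names are hypothetical. The matching key (Via branch, sent-by, method) follows RFC 3261, section 17.2.3, with ACK handling omitted, and the CANCEL lookup assumes that the cancelled request is an INVITE.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a server transaction table owned by the stack.
class ServerTransactionTableSketch {
    static final class ServerTransaction {
        final String key;

        ServerTransaction(String key) {
            this.key = key;
        }
    }

    private final Map<String, ServerTransaction> table = new ConcurrentHashMap<>();

    private static String key(String branch, String sentBy, String method) {
        return branch + '|' + sentBy + '|' + method;
    }

    // Returns the new transaction for a new request, or null for a
    // retransmission, which is absorbed and never shown to the application.
    ServerTransaction onRequest(String branch, String sentBy, String method) {
        ServerTransaction created = new ServerTransaction(key(branch, sentBy, method));
        // putIfAbsent is atomic: exactly one thread wins for a given key.
        ServerTransaction existing = table.putIfAbsent(created.key, created);
        return existing == null ? created : null;
    }

    // A CANCEL carries the same branch and sent-by as the request it cancels,
    // so the stack can find the target itself. Returns null if nothing matches.
    ServerTransaction findCancelled(String cancelBranch, String cancelSentBy) {
        return table.get(key(cancelBranch, cancelSentBy, "INVITE"));
    }
}
```

Because the INVITE transaction is registered before the application sees the request, a CANCEL that arrives on another thread can always be matched.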

Transactions have method getDialog() which should return a dialog corresponding to that transaction. First, I don’t see any practical reason for this method. And second, what should this method do in case of dialog forking?


Dialogs

I’m more or less satisfied with the implementation of dialogs in JAIN SIP API. There are several problems with them, but these problems are not as bad as in other areas.

Existence of separate factory methods for normal requests and ACK is a bad idea. Method for incrementing local sequence number is a bad idea.

Ability to set application context for dialog through setApplicationData() is a good thing. To have this method here is not as bad as for transactions, since dialogs live longer, so application context may change. Of course, changing application context can lead to complex errors because of threading, but it is still a useful thing.

JAIN SIP API doesn’t describe how the stack should behave in case of dialog forking. The instance of Dialog which was created by the application will be returned with the response from the first destination. Responses from other destinations will return other dialogs, but how will the application recognize them? There is no event like “DialogForked” to let the application know that the new dialog is related to an existing one.
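A minimal sketch of how a stack could detect forking on the UAC side and report it. The class, the method and the Outcome values are hypothetical; the only fact taken from RFC 3261 is that a dialog is identified by Call-ID, local tag and remote tag, so a second remote tag for the same Call-ID and local tag means a forked dialog.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of fork detection; not part of JAIN SIP.
class DialogForkSketch {
    enum Outcome { FIRST_DIALOG, EXISTING_DIALOG, DIALOG_FORKED }

    // Remote tags seen so far, grouped by Call-ID plus local tag.
    private final Map<String, Set<String>> remoteTagsByCall = new ConcurrentHashMap<>();

    // Called for every dialog-creating response to the same initial request.
    Outcome onDialogCreatingResponse(String callId, String localTag, String remoteTag) {
        Set<String> remoteTags =
                remoteTagsByCall.computeIfAbsent(callId + '|' + localTag, k -> new HashSet<>());
        synchronized (remoteTags) {
            boolean first = remoteTags.isEmpty();
            if (!remoteTags.add(remoteTag)) {
                return Outcome.EXISTING_DIALOG;
            }
            // A new remote tag on a call that already has a dialog is a fork.
            return first ? Outcome.FIRST_DIALOG : Outcome.DIALOG_FORKED;
        }
    }
}
```

A stack that returns DIALOG_FORKED could deliver the new dialog together with a reference to the original one, which is exactly the relationship the application cannot currently discover.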


JAIN SIP API is not a truly layered API. By looking at it you may think that Dialog.sendRequest() invokes ClientTransaction.sendRequest() which, in turn, invokes SipProvider.sendRequest(). While the first may be true, the second can’t be, because both ClientTransaction.sendRequest() and SipProvider.sendRequest() perform the same actions, which may modify the request (by exchanging the “Route” header value and the request-URI if the “Route” value doesn’t have “;lr”). Thus it is not possible to build your own transaction layer on top of SipProvider. It is also not possible to build your own dialog layer, because server dialogs should move to the CONFIRMED state when a successful response is sent through a server transaction, or to the TERMINATED state when an unsuccessful response is sent, but there is no means for your dialog to be notified about that.
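A minimal sketch of the strict-routing adjustment mentioned above, to show why it cannot safely be performed by two layers in a row. The Request class is a simplified model with plain string URIs, written for this example only. The rule itself follows RFC 3261: if the first Route URI has no “;lr” parameter, it becomes the request-URI, and the old request-URI is appended as the last Route value.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified request model for illustration; URIs are plain strings.
class RouteRewriteSketch {
    static final class Request {
        String requestUri;
        final List<String> routes = new ArrayList<>();

        Request(String requestUri, List<String> routes) {
            this.requestUri = requestUri;
            this.routes.addAll(routes);
        }
    }

    // Exchanges the request-URI and the first Route value for a strict router.
    static void adjustForStrictRouter(Request request) {
        if (request.routes.isEmpty()) {
            return;
        }
        String first = request.routes.get(0);
        if (hasLr(first)) {
            return; // loose router: nothing to exchange
        }
        request.routes.remove(0);
        request.routes.add(request.requestUri);
        request.requestUri = first;
    }

    // Simplified check for an "lr" URI parameter.
    static boolean hasLr(String uri) {
        String[] parts = uri.split(";");
        for (int i = 1; i < parts.length; i++) {
            String name = parts[i].split("=", 2)[0].trim();
            if (name.equalsIgnoreCase("lr")) {
                return true;
            }
        }
        return false;
    }
}
```

With two strict routers in the route set, applying adjustForStrictRouter() a second time moves the request-URI on to the second router, so the adjustment is not idempotent and must be done by exactly one layer.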

SipListener and events

The application is notified about incoming messages and changes in the state of objects through the SipListener callback interface. There are just two methods for processing incoming messages, processRequest() and processResponse(), so it’s the application’s job to dispatch processing based on the content of events. Implementations of these methods are usually trees of “if/else” statements which analyze transactions, dialogs and application data. A much better way would be to allow setting specific listeners for particular transactions and dialogs. These specific listeners would easily replace all application-provided contexts and would eliminate any dispatch code in the application. Thus, by applying the IoC principle to its full extent, it is possible to turn the stack into a good protocol-based framework. (Update: I’ve explained this in more detail in a subsequent post)
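A minimal sketch of per-request listeners. The class and its methods are hypothetical and are not part of JAIN SIP API; the listener is reduced to a callback that receives the status code, and the Via branch is assumed to be the lookup key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntConsumer;

// Hypothetical sketch: a response listener is supplied per request,
// so the application needs no central if/else dispatch.
class PerRequestListenerSketch {
    private final Map<String, IntConsumer> listeners = new ConcurrentHashMap<>();

    void sendRequest(String branch, IntConsumer onResponse) {
        // Registered before transmission, so no response can be missed.
        listeners.put(branch, onResponse);
        // A real stack would transmit the request here.
    }

    // Called by the stack when a response with this Via branch arrives.
    void dispatchResponse(String branch, int statusCode) {
        // A final response (2xx-6xx) ends the exchange, so the listener is removed.
        IntConsumer listener = statusCode >= 200
                ? listeners.remove(branch)
                : listeners.get(branch);
        if (listener != null) {
            listener.accept(statusCode);
        }
    }
}
```

Each listener already knows which call or registration it belongs to, so the application-provided context and the dispatch code disappear together.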


Let’s summarize everything I’ve said about JAIN SIP API.


Pros

  • Syntax objects are separated from the behavioural part
  • Fairly complete
  • Easy to understand and use for simple tasks by developers who like BASIC-style imperative programming
  • Has semantics close to the RFC
  • Rather easy to implement
  • Since it is not restrictive, it is flexible and extensible


Cons

  • A parser for messages is not available
  • Stack management is unnecessarily complex and restrictive
  • Transactions expose implementation details and implement a protocol context badly
  • Doesn’t help with productivity
  • Three ways to send a request. Two ways to send a response.
  • Is not fully complete. For example, doesn’t cover proxying.
  • Doesn’t prevent you from making mistakes
  • Doesn’t allow you to override some layers
  • It is not a real framework (expanded here)
  • Has holes in the specification

Be careful, you have been warned!

In the next article I’ll discuss SIP Servlets API.

  1. whitelassiblog
    July 7, 2009 at 3:56 pm

    I had some questions:

    1. What is meant by that JAIN SIP does not cover proxying ?

    2. what are the holes in the specification ?

    3. What stack layers do you wish to override? and why ? I have not had any such requirement.

    4. Stack management is complex? as in ? What do you wish to manage at the stack level (at runtime) ?

    5. Message parser ?? Headers are parsed as per ABNF, arranged in a message and given to the application. How do you parse messages ? Where does the SIP spec mandate message parsing ?

    6. What is meant by a real framework ? A stack is not a framework. A framework has to be built on top of the stack. Eg: SIP Servlets, JAIN SLEE etc.

  2. whitelassiblog
    July 7, 2009 at 4:04 pm

    PS: WRT to the layering part..

    ClientTransaction.sendRequest() sends a request transaction statefully and SipProvider.sendRequest() sends it statelessly (for implementing stateless SIP proxies of RFC 3261).

    WRT forking:

    DialogForked events are not the responsibility of the sip stack. Forking proxies are a kind of SIP application servers. It is the responsibility of the UAC/UAS core to identify such requests and take necessary action as per RFC 3261 procedures. DialogForked events are best delivered by a framework (on top of the stack) like the JAIN SLEE SIP RA.

  3. July 13, 2009 at 8:02 am

    Hello! Sorry for late answers, just back from vacation.

    1. When I say that JAIN SIP API does not cover proxying, I mean that API does not have an interface for proxy facility, so you have to write it yourself or use some implementation-specific facility. In contrast, SIP Servlet API includes such facility (see interface ‘javax.servlet.sip.Proxy’).

    2. Holes. Too many to list them here. Full answer requires another post. But, just to give an idea: when you create a transaction based on a request, does stack create a copy of a request or uses an original request? This is important and should be specified.

    3. You are asking why I want to override some stack layers. For example, to implement a fault-tolerant solution by using transaction replication. Service interaction is another example: an outgoing request is not actually sent to the network but returned to upper layers. This logic should be put just on top of the transaction layer. I call this layer the “client/server” layer; for details please check this post.

    4. When I say that “stack management is complex” I mean that having SipStack, SipProvider and ListeningPoint is too much for simple tasks. I’m not a stupid guy so I can handle this, but I don’t want to do things in such way. I want to be able to add/remove listening points at runtime. Graceful shutdown is also a very useful feature.

    5. Yes, message parser. What stack receives from a network is a message, consisting of a start line, headers and a body. First stack parses a whole message, then it parses start line and each header. Chapter 7 of RFC 3261 contains ABNF for message, request/response start lines and headers.

    6. I call the stack a framework because it uses the “Inversion of control” pattern. SIP Servlets and JAIN SLEE are more than frameworks: they are application servers, because they use inversion of control not only for processing of incoming messages, but also for dependency injection. This was explained briefly in “SipListener and events”. A full answer requires a separate post.

    7. I strongly suspect that you didn’t understand what I mean. First, forking proxies are not a kind of application server. They are defined in RFC 3261, which doesn’t specify software architecture. You can build them without using the application server approach (for example, as a monolithic application or on top of JAIN SIP API). Second, dialog forking does not take place on the proxy node, it takes place on the UAC node. For details, please refer to this post. Third, when a dialog has been forked, a new representation of the dialog state should be created (it will have a different remote tag, a different route set and a different remote target), and this cannot be implemented on top of the stack, only in the stack itself. And last, you make a mistake when you call JAIN SLEE SIP RA a “framework”. This is just an adaptor of one framework to another, with zero protocol logic.

  4. horsson
    July 21, 2009 at 5:38 pm


  5. July 27, 2009 at 10:09 pm


    I must implement a Java API for SIP click-to-call; I use an Asterisk PBX.

    I want to use a mature open source library for a Java SIP application.

    I have some doubts about JAIN; is this the best choice?

    thanks in advance

  6. whitelassiblog
    August 15, 2009 at 8:56 am


    Sorry for the late response. Nice posts..

    1. The proxy facility is not the function of the SIP Stack. A SIP proxy resides on top of the transaction layer of the stack (if its stateful). If you are developing a SIP Proxy server, you are the Transaction User (TU).

    If the stack starts providing utilities to implement proxy behavior, it will not remain just a stack. It will start resembling a container, which it is not. I agree that Inversion of Control is one of the many characteristics of a framework.

    However, according to me, in addition to IoC, once a software component starts deciding the life cycle, sequencing and behavior of your application logic, then it becomes a framework. The SIP Stack is only providing you with the SIP messages. What you do with them and how you do it is the part of the application logic.

    The SIP Stack cannot govern your application’s business logic (whether you wish to use dialogs or not, whether you are transaction stateful or stateless etc) , it can only assist the application by providing a ‘standard’ interface to communicate with other SIP entities and exposing a friendly API to go about creating requests/responses and dialogs.

    Just like you mentioned in the point about the SIP Servlet API (and its implementation in the form of a container), in your response above, it is important to note that it is a full application server..and not a generic SIP Stack. If you look at the Mobicents SIP Servlets implementation, they also ‘build upon’ the JAIN SIP Stack to provide the application (SIP Servlet) with proxy utilities and even B2BUA utilities. It even provides utilities to route sip messages, STUN support, DNS utilities etc that contribute to the ease of development even if the developer has modest knowledge of SIP.

    2. When the stack creates a transaction for a request, it stores the original request as part of the transaction context for retransmissions (to be sent by the CT or filtered at the ST). At the application, you need to get the request from the transaction, clone it and then work on it (add/remove headers, decrement max forwards etc). If there are more holes, it will be very nice to dedicate a post for it. It will be worth it and will be for the better of the community.

    3. For fault tolerance, you need not override the stack layers/or behavior. As for example, if you need fault tolerant dialogs, you will first need to use a transactional cache (such as JBOSS cache) and simply cache the created dialog object and then retrieve it later like this from the application and re-create it in the stack:

    //a. Get the dialog object from the Persistent Cache like JBOSS Cache here.

    //b. Set the SipProvider for the Dialog like this:

    //c. Put the dialog in the SIP stack’s dialog table here: ((SipStackImpl)sipProvider.getSipStack()).putDialog((SIPDialog)null);

    Where is the need for overriding the stack’s behavior here ?

    4. Listeners, Providers etc are present in every stack. I have worked with SIP Stacks from 3 more vendors (implemented in C++). They are top vendors in this space. Stack management in their products was similar to what JAIN SIP provides ‘conceptually’.
    By the way, why do you wish to add/remove listening points at runtime? You have a multi-homed host? It is possible in JAIN SIP to create/delete Listening Points and Sip providers at runtime. Refer to this: http://snad.ncsl.nist.gov/proj/iptel/jain-sip-1.2/javadoc/javax/sip/SipStack.html

    5. Message Parsing is already happening at the SIP Stack. What i understand is that you want the stack to provide a more friendly API to return the Start line well parsed with the SIP version, method, request line separately using getters. This is also being done at the Stack.

    Use the getters of the RequestLine object to get the sip version, uri, method etc like this:

    Request request = requestEvent.getRequest();
    RequestLine line = ((SIPRequest) request).getRequestLine();


    6. Nice post. If for once we start calling the SIP Stack a framework, even then functions such as proxying, or other helper utilities are outside its scope. As per 3261, it is the responsibility of the Transaction User (TU).

    7. The JAIN SLEE SIP RA contains protocol logic. Refer to Appendix D a of JSR 240. It has its own transparent handling of request cancellation and also request forking. The SIP RA also fires the DialogForked events, characterized by javax.sip.Dialog.FORKED net.java.slee 1.2 event type. The SIP RA also provides an API to create dialogs at the stack and activities at the RA. The DialogActivity interface of the SIP RA provides dialog management utilities..like associating SIP transactions to dialogs, creating requests and responses that need to be sent in-dialog etc. All these protocol specific utilities are provided in the SIP RA SBB interface and are documented in the specification. If you look at appendix section D.12.1, it provides the entire state machine implemented at the SIP RA for its internal dialog management, which of course matches the states given in 3261. Section D.14.3 shows the SIP RA’s implementation of forking behavior and its FSM.

    Thus the RA is indeed an architectural framework of the JSLEE standard and is not just a plain adaptation layer for the JSIP API. It can encapsulate any network protocol (standard or proprietary), and provides a well defined life cycle for all its activities and passes events of interests to event router. It influences the application’s behavior by asking it to attach/detach to activity contexts, that the SLEE container maps from the Resource Adaptor. If we remove the RAs from the JSLEE container, it will be rendered redundant. Thats why i call it as part of a framework. The Container cannot be used without the RA.

  7. whitelassiblog
    August 15, 2009 at 9:02 am

    PS: When i said that forking proxies are a kind of app servers, i meant that they are a kind of UAC core / proxy servers. I didn’t mean that they require a JSLEE/Sip Servlet app server to be implemented.

  8. August 20, 2009 at 8:34 am


    1. The debate where is a border between stack and application is a long one. Let’s do it one more time.

    There are three layers which I can see: stack, container and application, each one has clear responsibilities. Stack handles protocol-level logic, application handles business logic, container “manages” them: wires them together (sets application as listener of stack) and controls their lifecycle. JAIN SIP API is an API between stack and container+application. SIP Servlets API is an API between stack+container and application.

    You say that “If you are developing a SIP Proxy server, you are the Transaction User (TU)”. Yes, I know that. I suppose you mean that the criterion is “Stack is everything below TU”. But this is not the case for JAIN SIP API. This API supports dialogs, which are also part of the TU layer. I strongly suspect that JAIN SIP API doesn’t have a real criterion for what should be a part of the stack: they just include features they feel are necessary.

    You say that “The SIP Stack cannot govern your application’s business logic (whether you wish to use dialogs or not, whether you are transaction stateful or stateless etc)”: I totally agree, and I never said the opposite. When I say that the stack should support proxying, I mean that it is the application which will decide if a message will be handled in the UAS way, the B2BUA way or the Proxy way. But I believe that inventing a proxy over and over again in applications is a bad idea, especially because proxying is so well specified. And I don’t see how support of proxying turns SipServlet API into an “app server/container” type.

    So, I say again that a stack having features from the TU layer, such as support of dialogs and support of proxying, is a good thing. It is the stack which is re-usable, and the application which is volatile.

    And I totally agree that Sip Servlets API is not a generic stack, but an application server. You cannot build a softphone on it. My idea of a “perfect” stack is that it will be even more powerful and easier to use than a SipServlet container, but without the “container” module and the overhead of “enterprise features” like deployment. Such a stack could be used in building both desktop applications and enterprise applications. A SipServlet layer on top of such a stack will be very thin.

    2. I’ll do it

    3. I meant that JAIN SIP API doesn’t allow you to implement fault tolerance in portable way. Your example applies only to NIST implementation.

    4. I’m sorry, but this doesn’t convince me. Even if they “are top vendors in this space”, such stack management is shit. One interface is enough. SipProvider and ListeningPoint are bad. Three entities instead of one: this brings relationships between them. Why overcomplicate things?

    5. Like in 3, I meant an API feature, not a NIST stack feature.

    6. Thank you for that information, I didn’t know that. So, JAIN SIP RA includes both protocol-level features and SLEE-specific features. In my opinion, this is not good. The RA should contain only SLEE-specific features, and all protocol-level features should be part of the stack. This is my opinion about modular design: functionality of one type is in one module, and different functionalities are in separate modules.

    Please come back, nice to talk with you!
