Introduction
This is a first article of the series which will study popular Java APIs for SIP: JAIN SIP API and SIP Servlets API. My intention is to analyze what is good and what is bad, and why it is so. These articles represent my personal opinion, however I’m not just going to tag things as “good” or “bad”. Instead, I’ll try to explain why I like or dislike something. The study will focus on technical aspects of APIs. Since I’m not satisfied with current state of affairs, I also want to propose a better API. Yes, the purpose of this study is to justify the need for another SIP API, because I believe that API is very important.
I don’t think that a long intoduction is needed, so let’s start with JAIN SIP API.
JAIN SIP API
This API is quite curious because it is very maximalistic in implementation of SIP syntax and quite minimalistic in implementation of SIP behaviour. I have a very strong impression that the author (exactly as I did) at the beginning has focused his attention on syntax believing that parser is the only re-usable part of SIP, and all messaging scenarios are so volatile that they should be implemented in applications. A version of 1.0 of this API covers only a parser together with a stateless sender/receiver. However, in version 1.2 this API also covers some re-usable behavioural layers, such as transaction layer and dialog layer.
Syntax representation
A noticeable feature of JAIN SIP API is that you can work with protocol syntax without a running stack. You just need to obtain a specific factory and then create syntax objects through it.
There are two interfaces which represent two kinds of SIP messages: Request and Response. These interfaces have a common parent interface Message, which contains common syntax-related functionality. I think that existence of Message interface is a bad thing. Occam’s razor should be applied here: RFC 3261 doesn’t say anything about messages in general, so this abstraction is totally unnecessary. Let’s do a simple check: are there any methods of JAIN SIP API which accept Message as parameter or return it as a result? No. This is an example of OOP people to apply abstraction everywhere, and also an implementation detail showing though an API.
Interfaces Request/Message/Response provide methods for obtaining/adding/removing header fields, modify start line and obtaining/setting body. These methods are defined quite well, except of methods for obtaining headers. Instead of single method which accepts header name and returns abstract Header which should be downcasted, I propose having specific methods for each known header, for examle getViaHeader(). This will make code more clear and will involve compiler into error checking.
JAIN SIP API strives to have an interface for every documented SIP header. Interfaces for all headers are descendants of Header interface. This interface has getName() method, which is correct in my opinion, meaning that header name defines a format for header value. Unfortunatelly, there is no getValue() method, such method is available only for ExtensionHeader. This is bad, because for some headers (like ‘Content-Type’, for example) it is often needed to obtain whole value instead of accessing it’s parts.
I don’t have very much to say about Address, URI and SipURI. They are OK.
As you can see, I’m quite satisfied with syntax part of JAIN SIP API. These objects are just data structures reflecting syntax structure of SIP messages.
Stack management: SipStack, SipProvider and ListeningPoint
SIP stack is managed through interface SipStack. Besides lifecycle methods start() and stop(), this interface also has factory methods for ListeningPoints and for SipProviders.These two classes are then supplied to methods add()/remove() to define which network endpoints will be served by which provider.
ListeningPoint is a combo-class for an InetSocketAddress and a String which specifies a name of a transport protocol. This class could be avoided at all. In all methods where ListeningPoint is passed as a parameter it could be replaced with combo of InetSocketAddress and String. There is a method that returns ListeningPoints, but it can be very well replaced with two methods: one that returns InetSocketAddresses, and another that returns transports for provided InetSocketAddress. So the only point behind this class is to make argument list shorter. I think that it is better to add one or two methods to an existing class then having another useless class.
Interface SipProvider is used to bind a specific SipListener to a specific network endpoint. Thus, JAIN SIP API can only dispatch incoming traffic based on address where it was received. If you can handle all the traffic by single listener – fine, but you still need to create and maintain separate SipProvider for each local endpoint. SipProviders can be added to SipStack or removed from it only while it is stopped. This means that in order to listen on one more port you need to stop listening on all other ports. Such restriction is a very bad idea.
Another responsibility of SipProvider is to be a factory for transactions and dialogs. This implicitly means that all events for these transactions and dialogs will be handled by the listener associated with SipProvider, and all outgoing messages will be sent from the local endpoint associated with this provider. A third responsibility is to serve as a facility for stateless sending of requests and responses.
I suggest that Occam’s razor should be applied to this interface, because it’s responsibilities are vague. It’s essentially a context for a very few things, which could be provided explicitly. All methods of this class could be moved to SipStack, so setListener() method would accept an InetSocketAddress, factory methods for transactions and dialogs whould accept SipListener, and local endpoint for sending whould be choosen automatically.
Transactions
Client and server transactions are represented by interfaces ClientTransaction and ServerTransaction. These interfaces have common parent interface Transaction, which (very similar to Message) has no apparent use and could also be removed without anybody noticing. ClientTransaction has method for sending request and creating a cancel, ServerTransaction has method for sending responses.
Let’s try to apply Occam’s razor to these classes and see if it is possible to replace them with methods. For ServerTransaction, an answer seems to be no, because sometimes server transactions are created automatically by stack.
Maybe a client transaction can be replaced with method sendStatefully() on SipStack? In case of incoming response or timeout, an application needs a context to handle these events. JAIN SIP API is built in a way that transactions are used as contexts, thus transactions are needed. But, maybe it is possible to replace separate factory method and sending method with just one method, which whould send message statefully and return a transaction object? This whould also eliminate vague problems like: what should stack do if ‘Via’ branch of the request has changed after the creation of transaction? The problem with such apporach is caused by threading issue: an event can happen before a thread which invokes sendStatefully() will retrieve transaction, and this event will be handled by another thread which will not find context for the event. However, this problem can be solved by application providing something as a context instead of re-using transactions for that purpose. Thus, sendStatefully() whould accept a context from application, and use that context in ResponseEvent. Thus, client transactions can be avoided.
In fact, transactions do have some support for application-provided contexts through methods setApplicationData() and getApplicationData(). They have been introduced as a convenient way to avoid having lookup facilities for context in applications. A good idea, but it is implemented in API in a way that makes application to be written like in BASIC:
10 let transaction=provider.createTransaction(request);
20 transaction.setApplicationData(context);
30 transaction.sendRequest();
instead of single invocation:
stack.sendRequest(request, context);
Since send() is invoked only once, there is no need to have separate setApplicationData() method.
Transactions have getState() method which returns transaction state as defined in RFC. This method is more an implementation detail rather then useful thing. Applications are not really interested in difference between COMPLETED and TERMINATED states. Instead, they are interested in things like canRespond(), canCancel() or requestSent().
So, what exactly are the responsibilities of transaction classes? Implementation details which they expose are not really useful. The answer is that they are more than transactions as they are defined in RFC.
For client transactions method send() includes some functionality which is common for proxies an user agents, such as arranging “Route” headers and determining of remote address. This funcionality in fact belongs to another layer (as described here).
Another purpose of transaction classes is to serve as protocol-level context. For example, it is possible to obtain original request. ClientTransactions has method createCancel(), which should be in MessageFactory.
In my opinion, cancellation in JAIN SIP API is done very badly, in the same BASIC style:
10 Request cancel = clientTran.createCancel();
20 ClientTransaction cancelTran = provider.createTransaction(cancel);
30 cancelTran.sendRequest();
instead of simple:
stack.cancel(request);
which is not a subject to errors which can occur because of thread races. I mean: what will happen if a final response will be received between lines 20 and 30 ? JAIN SIP API doesn’t give you an answer. Since stack is a multithreaded module, sending should be done through “atomic” actions, doing several things at once by implementing internal locking.
Another example of not taking threading issues into account is an existence of factory method for server transactions. I strongly believe that server transactions should be always implicitly created for incoming requests (except ACK, of course). Usefulness of stateless proxy is minimal, so attempt to support it in a way that JAIN SIP API does doesn’t justify the problems it brings. For example, what will happen if a retransmission is received while application has delayed a processing of incoming request ? Yes, a retransmission will be processed, and application will get an exception when it will try to create a server transaction for retransmitted request.
Processing of incoming CANCEL is another weak point of JAIN SIP API. I think that it is a responsibility of stack to discover which server transaction should be cancelled, but JAIN SIP API makes it a work for application. But even if there would be a special CancelServerTransaction with method getCancelledTransaction(), this behaviour whould be plagued with threading issues. Thus, such improvement will be useful only if server transactions are created automatically.
At least two implementations (NIST SIP and SIP from OpenCloud) do recognize that server transactions should be created automatically. NIST SIP creates a hidden “prototype” for a transaction. Stack of OpenCloud introduces a method on transaction which removes it. These are bad hacks, because problem should be fixed on API level.
Transactions have method getDialog() which should return a dialog corresponding to that transaction. First, I don’t see any practical reason for this method. And second, what should this method do in case of dialog forking?
Dialogs
I’m more or less satisfied with implementation of dialogs in JAIN SIP API. There are several problems with them, but these problems are not as bad as in other areas.
Existence of separate factory methods for normal requests and ACK is a bad idea. Method for incrementing local sequence number is a bad idea.
Ability to set application context for dialog through setApplicationData() is a good thing. To have this method here is not as bad as for transactions, since dialogs live longer, so application context may change. Of course, changing application context can lead to complex errors because of threading, but it is still a useful thing.
JAIN SIP API doesn’t describe how stack should behave in case of dialog forking. An instance of Dialog which was created by application will be returned with response from first destination. Responses from other destinations will return other dialogs, but how application will recognize them ? There are no event like “DialogForked” so application will know that new dialog is related to existing one.
Layering
JAIN SIP API is not a truly layered API. By looking at it you may think that Dialog.sendRequest() invokes ClientTransaction.sendRequest() which, in turn, invokes SipProvider.sendRequest(). While first may be true, the second can’t be, because both ClientTransaction.sendRequest() and SipProvider.sendRequest() perform the same actions which may modify the request (by exchanging “Route” header and request-URI if “Route” value doesn’t have “;lr”). Thus it is not possible to built your own transaction layer on top of SipProvider. It is also not possible to build your own dialog layer, because server dialogs should move to CONFIRMED state when succesfull response is sent through server transaction, or to TERMINATED state when unsuccesfull response is sent, but there are no means for your dialog to be notified about that.
SipListener and events
Application is notified about incoming messages and changes in state of objects through SipListener callback interface. There are just two methods for processing incoming messages: processRequest() and processResponse() so its an application job to dispatch a processing based on content of events. An implementation of these methods are usually trees of “if/else” operators which analyze transactions, dialogs and application data. A much better way would be to allow setting specific listeners for particular transactions and dialogs. These specific listeners would easily replace all application-provided contexts and will eliminate any dispatch code in application. Thus by applying IoC principle to full extent it is possible to turn stack into good protocol-based framework. (Update: I’ve explained this in more detail in subsequent post)
Conclusion
Let’s summarize all what I’ve said about JAIN SIP API.
Advantages:
- Syntax objects are separated from the behavioural part
- Fairly complete
- Easy to understand and use for simple tasks by developers who like BASIC-style imperative programming
- Has semantics which is close to RFC
- Rather easy to implement
- Since it is not restrictive, it is flexible and extensible
Disadvantages:
- Parser for messages is not available
- Stack management is unnesessary complex and restrictive
- Transactions show implementation details rather and badly implement a protocol context
- Doesn’t help with productivity
- Thee ways to send a request. Two ways to send a response.
- Is not fully complete. For example, doesn’t cover proxying.
- Doesn’t prevent you from doing mistakes
- Doesn’t allow you to override some layers
- It is not a real framework (expanded here)
- Has holes in specification
Be careful, you have been warned!
In next article I’ll discuss SIP Servlets API.