Archive for the ‘Java’ Category

Is a network stack a framework?

July 13, 2009

My recent articles about JAIN SIP API and SIP Servlets API often use the term “framework”. I had planned to discuss this term in depth with regard to SIP stacks, but forgot. Of course, this raised some questions, so I’m doing it now.

I call a software module a “framework” if it is built around the “Inversion of control” pattern. You provide callbacks, and the framework invokes them according to its specification. Sometimes you can manage a framework, but you cannot customize it beyond a certain degree. Your code is not active; it is either passive or reactive.

The event-driven programming model is one example of a framework architecture. Software containers are another.

Now you can see why I call SIP stacks which implement JAIN SIP API or SIP Servlets API frameworks. They read data from the network, handle it and then invoke a supplied listener (either SipServlet or SipListener). This invocation takes place in a thread of the SIP stack, so you should not block it. JAIN SIP API dispatches incoming traffic based on the local endpoint, and SIP Servlets API has method-based dispatch, but this is not a very significant difference.

Why are SIP stacks implemented as frameworks? To answer this, let’s imagine a stack which is implemented as a library. You create a socket, read from it, and then pass a byte array to the stack for handling. The SIP stack returns a result to you in functional style:

ByteBuffer data = ByteBuffer.allocate(MAX_MESSAGE);

SocketAddress remote = channel.receive(data);

SipResult result = stack.handle(data, remote);

Since there are many possible results of handling a SIP message, you now have to analyze the result and dispatch according to it: was the message parsed correctly or not, was it a request or a response, was it a retransmission or not, and many other choices. If a request was parsed correctly but some mandatory headers are missing, then the result should contain an error response which you can send through the stack object. Such dispatch code is large, and it should be written once because its behaviour is well specified in RFC 3261. This is the first reason why stacks are implemented as frameworks: they include the common dispatch code.
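To give a feel for that dispatch code, here is a minimal sketch of what an application might have to write on top of such a library-style stack. The SipResult type, its Kind values and the returned action strings are all invented for illustration; no real SIP stack is claimed to expose this API.

```java
// Hypothetical result type of a library-style stack; not part of any real API.
final class LibraryStyleDispatch {
    enum Kind { PARSE_ERROR, MISSING_MANDATORY_HEADER, REQUEST, RESPONSE, RETRANSMISSION }

    static final class SipResult {
        final Kind kind;
        final String detail; // method name, status code or error text (assumed layout)

        SipResult(Kind kind, String detail) {
            this.kind = kind;
            this.detail = detail;
        }
    }

    // The dispatch that every application would have to repeat for itself.
    static String dispatch(SipResult result) {
        switch (result.kind) {
            case PARSE_ERROR:
                return "drop: " + result.detail;
            case MISSING_MANDATORY_HEADER:
                return "send error response: " + result.detail;
            case RETRANSMISSION:
                return "resend last response";
            case REQUEST:
                return "handle request " + result.detail;
            case RESPONSE:
                return "handle response " + result.detail;
            default:
                throw new IllegalStateException("unknown kind " + result.kind);
        }
    }
}
```

Even this toy version already has five branches; a real one would also have to distinguish methods, status classes, transactions and dialogs.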

The second reason is that application programmers are often afraid of working with threads and sockets directly. They consider that to be “system-level” code which should be hidden from them. It is the developers of SIP stacks who should worry about performance, race conditions and other complex stuff.

Thus, SIP stacks are frameworks, and I think that this is the right way. By the way, most HTTP stacks are also frameworks.

Now I will explain why I think that JAIN SIP API and SIP Servlets API are not perfect frameworks.

JAIN SIP API has a single callback interface called SipListener. It has only two methods for processing incoming messages: processRequest() and processResponse(). Thus, the SIP stack does very little dispatch for you. If you are writing a stateless proxy, you’ll have very simple logic there. But for a UA or a stateful proxy there will be one large “if” statement. It can be implemented in different ways. One way is to map transactions and dialogs to your application contexts. In this case you’ll have to look them up in maps. Another way is to bind application contexts to transactions and dialogs using the setApplicationData() method. In this case you’ll need to invoke getApplicationData() and then cast the result to your application context. Once you have your application context, there is additional dispatch to do. JAIN SIP API is flexible here, but this dispatching code is reusable, so it should be written once. Written once, this dispatch code makes a better protocol framework on top of the framework provided by SipListener.

A better protocol framework should have the following capabilities:

  • ServerTransactionListener, which can be attached to a specific server transaction. This listener is notified when the transaction terminates, when the final response retransmission timeout fires, and when a CANCEL is received
  • ClientTransactionListener, which can be attached to a specific client transaction. This listener is notified when a response is received
  • DialogListener, which can be attached to a specific dialog. This listener is notified when the dialog has been forked and when the dialog has been terminated
  • ServerListener, which is invoked for incoming requests. There should be one “global” listener, and there could be a specific listener for each dialog.

Such a protocol framework will allow you to write applications with much less dispatch code.
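A minimal sketch of the idea, assuming a stand-in ClientTransaction class and a one-method ClientTransactionListener (the listener name comes from the list above; its method, the stand-in transaction and the demo are invented for illustration and are not part of JAIN SIP API):

```java
import java.util.ArrayList;
import java.util.List;

final class ProtocolFrameworkSketch {
    // Listener named in the list above; its single method is an assumption.
    interface ClientTransactionListener {
        void responseReceived(int statusCode);
    }

    // Stand-in for a client transaction; a real stack would own timers and network I/O.
    static final class ClientTransaction {
        private final ClientTransactionListener listener;

        ClientTransaction(ClientTransactionListener listener) {
            this.listener = listener;
        }

        // Would be called by the stack when a response matching this transaction arrives.
        void onResponse(int statusCode) {
            listener.responseReceived(statusCode);
        }
    }

    // Application code: the context lives inside each listener, so there is
    // no map lookup, no getApplicationData() and no cast.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        ClientTransaction invite = new ClientTransaction(code -> log.add("INVITE got " + code));
        ClientTransaction register = new ClientTransaction(code -> log.add("REGISTER got " + code));
        invite.onResponse(180);
        register.onResponse(200);
        invite.onResponse(200);
        return log;
    }
}
```

Each response lands directly in the code that knows what to do with it, which is exactly the dispatch the application no longer has to write.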

Most of that is also true for SIP Servlets API. You have to extract attributes from the SipSession and dispatch your execution based on them. However, it does some things better:

  • You can specify which servlet will handle which SipSession. Unfortunately, servlets are stateless.
  • Method-based dispatch is provided by SipServlet class

Thus, SIP Servlets API doesn’t provide a powerful protocol framework. Instead, it provides an application framework: you can compose servlets, and you have listeners for various things such as binding attributes to sessions.

I hope I have explained why I consider SIP stacks to implement a “framework” architecture. I also hope I have explained why I think that they could be better frameworks.

And, finally, what do I call an “application server”? An application server:

  • Is a server for some protocol
  • Is implemented as a framework for this protocol
  • Is implemented as a component container

Thus, SIP Servlets API and JAIN SLEE describe an application server, but JAIN SIP API does not.

A study on Java APIs for SIP. Part 3: future

May 28, 2009

Comparing JAIN SIP API and SIP Servlets API makes it clear that these APIs play on different fields.

JAIN SIP API focuses on implementation simplicity. It is not a stack, but a Stack SDK. If you need to quickly implement some well-known call flow, JAIN SIP API is a bad choice. But if you want to implement some new SIP feature, then JAIN SIP API may come in handy.

SIP Servlets API focuses on developer productivity and forces implementations to be very powerful, because it is not an API for a SIP stack but an API provided by an application server. It is especially suited for well-known areas of SIP. However, it doesn’t allow you to do anything unusual. Other minor drawbacks are bloat and manual contextualization.

By the way, there is an interesting move by a company called OpenCloud. They support two APIs: one is JAIN SIP API, just slightly extended, and the other is called “EasySIP”, which is a complete rip-off of SIP Servlets API. They have introduced separate classes for incoming and outgoing requests and responses.

While both APIs could be used with some effort, I’m not satisfied with them. I want an API which:

  • Has stack management methods much more powerful than those in JAIN SIP API. It should be possible to add/remove local endpoints while the stack is running. A listener should receive a reference to the local endpoint with incoming requests.
  • Has much better built-in support for “protocol context”. For an incoming ACK or PRACK it should be possible to get the response it acknowledges. For an incoming CANCEL it should be possible to get the request it cancels
  • Has syntax separated from behaviour.
  • Will produce compilation errors if you try to respond to an incoming ACK or CANCEL, to respond to an outgoing request, to acknowledge an outgoing response, or to cancel an incoming response.
  • Is a protocol framework. It should be possible to specify a listener for any particular client transaction, server transaction and dialog.

Such an API will be powerful enough to provide the same level of developer productivity as SIP Servlets API. But it will also be extensible, and could be used in any type of application, not only on the server side.

A study on Java APIs for SIP. Part 2: SIP Servlets API

May 19, 2009

Now let’s take a look at JAIN SIP API’s younger brother, called SIP Servlets API. This guy was ambitious from birth, so he joined a mob called “JEE” to receive money from big business. But big business never gives money for free, so SIP Servlets API had to cover all the behaviour specified for SIP and to provide as large a framework as possible, so that programmers who work for big business would not have to think about protocol details or the execution model. The attempt failed miserably, because SIP, unlike HTTP, is not about content.

SipServlet interface

The first obvious problem of SIP Servlets API is that it is an extension of the Generic Servlets API. The cornerstone interface of the Generic Servlets API is called Servlet, and is built around three lifecycle methods: init(), service() and destroy(). In my opinion, the main problem here is that this interface is essentially a too generic framework. The behaviour of a container is always protocol-dependent, so servlets are also protocol-dependent. Still, everybody is forced to use this narrow interface for interaction. Are there any convergent servlets which handle several protocols through their service() method? No, there are convergent applications instead. I think that the service() method should be removed from Servlet. Instead, each concrete servlet type would have its own service() method, accepting protocol-specific requests and responses.

I understand that most people don’t bother with problem of downcasting request and response in service() method, so it is not a big deal. I just don’t like it working in this way.

Applications always do some dispatch for incoming messages, so APIs provide some dispatch out of the box to help application developers. Unlike JAIN SIP API, which has dispatch based on local endpoint and on message type (request/response), SIP Servlets API has built-in dispatch based on request method and response status code. I think that this approach is more useful for applications.

Syntax

The next bad thing in SIP Servlets API is the existence of the SipServletMessage interface. It’s exactly the same case as with JAIN SIP API: this abstraction is not used by anyone. Yes, it is good that the signatures of methods in SipServletRequest and SipServletResponse are the same, but nothing would break if these signatures differed. The authors of SIP Servlets API have ignored the lesson of HTTP Servlets API, which doesn’t have a common interface. I understand that HTTP is much more asymmetric than SIP: in HTTP servlets people read headers of requests and write headers and bodies to responses, so syntactic similarity is not related to behaviour. But I still don’t see how the syntactic symmetry of SIP requests and responses could be used in practice.

The syntax part of SIP Servlets API is much smaller and simpler than that of JAIN SIP API. Obtaining a header returns a string, and adding or changing a header also accepts the value as a string. Additional parsing is supported only for address headers.

Behaviour

The SipServletMessage interface includes not only syntax-related methods but also the method send(). A message is a nice context for send(), because it is the message which should be sent. The fact that send() belongs to the SipServletMessage interface shows that SIP Servlets API doesn’t strive for separation of syntax and behaviour. SipServletMessage is not just headers, a start line and a body; it is a gateway to the SIP stack, hiding transactions, dialogs and all other protocol layers. This means that you can’t just “forward” an incoming message, because it can’t be separated from all the internal state. Instead you should either create a new message and manually copy all necessary data from the original message into the new one, or use hacks provided by the API (like proxyTo() or the createRequest() method which accepts the original request). Thus the number of interfaces in the API is low, but the number of behavioural methods is large, and their semantics are more complex. However, since the “message=syntax+state” approach was selected, all methods which implement SIP behaviour also belong to the SipServletRequest or SipServletResponse classes, so the message is the context for an action. Such an approach is easy for beginners to understand.

Another problem with send() is that the compiler will not complain if you try to invoke it on an incoming message. Of course the SIP stack will not send such a message; you’ll get an IllegalStateException at runtime. Thus, SIP Servlets API is not designed to use the type system for preventing errors. A correct solution would be to have separate classes like IncomingRequest, OutgoingRequest, IncomingResponse, OutgoingResponse. There are other behaviour-related methods which could be moved to specific classes instead of throwing an IllegalStateException: createCancel(), createResponse(), createAck(). Some special requests (like ACK and CANCEL) could also be represented by specific classes which would not have a createResponse() method.
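A minimal sketch of what such typed messages could look like. IncomingRequest and OutgoingResponse are the class names suggested above; IncomingAck, the fields and the method bodies are assumptions made up for the example and do not exist in SIP Servlets API.

```java
final class TypedMessages {
    // Only outgoing messages know how to be sent.
    static final class OutgoingResponse {
        final int status;

        OutgoingResponse(int status) {
            this.status = status;
        }

        String send() {
            return "sent " + status; // a real stack would transmit the response here
        }
    }

    // An incoming request can be answered, but it has no send() at all.
    static final class IncomingRequest {
        final String method;

        IncomingRequest(String method) {
            this.method = method;
        }

        OutgoingResponse createResponse(int status) {
            return new OutgoingResponse(status);
        }
    }

    // ACK is never answered, so this type simply has no createResponse().
    static final class IncomingAck {
        final long cseq;

        IncomingAck(long cseq) {
            this.cseq = cseq;
        }
    }

    static String demo() {
        IncomingRequest invite = new IncomingRequest("INVITE");
        // invite.send();                            // would not compile: no such method
        // new IncomingAck(1L).createResponse(200);  // would not compile either
        return invite.createResponse(200).send();
    }
}
```

The two commented-out lines are the point: the mistakes that currently end in an IllegalStateException become compilation errors.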

SIP Servlets API provides a large application framework, with different listeners. However, it doesn’t provide a powerful protocol framework. Instead, you have a SipSession which represents either a dialog or a proxying session. This SipSession is used for two purposes. First, as a factory for in-dialog requests. Second, as storage for the application context. This means that for every subsequent incoming message an application should restore its context from the SipSession object. Such manual contextualization is not a very convenient thing to program, but it allows the application server to be distributed and fault-tolerant. The ability to set a servlet which will handle the session can’t be considered a protocol framework, since that servlet also can’t have its own context and should contextualize itself from the SipSession. A little more information is given in a subsequent post.

Conclusion

SIP Servlets API is not an API for a SIP stack. Instead it is an API for an application server, which means that it tries to be as complete as possible and is not designed to be extensible. The container should provide all necessary functionality, and servlets should just contain the business logic needed to handle incoming messages. This makes it attractive to beginners who enjoy its protocol power. This API will not allow you to violate SIP rules; however, this is usually enforced by throwing exceptions at runtime. A protocol framework is absent: all you have is a SipSession to store and restore your context.

Next article will compare both APIs, will discuss lots of general API-related stuff and will propose better solutions.

A study on Java APIs for SIP. Part 1: JAIN SIP API

May 15, 2009

Introduction

This is the first article in a series which will study popular Java APIs for SIP: JAIN SIP API and SIP Servlets API. My intention is to analyze what is good and what is bad, and why it is so. These articles represent my personal opinion; however, I’m not just going to tag things as “good” or “bad”. Instead, I’ll try to explain why I like or dislike something. The study will focus on the technical aspects of the APIs. Since I’m not satisfied with the current state of affairs, I also want to propose a better API. Yes, the purpose of this study is to justify the need for another SIP API, because I believe that the API is very important.

I don’t think that a long introduction is needed, so let’s start with JAIN SIP API.

JAIN SIP API

This API is quite curious because it is very maximalistic in its implementation of SIP syntax and quite minimalistic in its implementation of SIP behaviour. I have a very strong impression that the author (exactly as I did) initially focused his attention on syntax, believing that the parser is the only reusable part of SIP and that all messaging scenarios are so volatile that they should be implemented in applications. Version 1.0 of this API covers only a parser together with a stateless sender/receiver. However, in version 1.2 this API also covers some reusable behavioural layers, such as the transaction layer and the dialog layer.

Syntax representation

A noticeable feature of JAIN SIP API is that you can work with protocol syntax without a running stack. You just need to obtain a specific factory and then create syntax objects through it.

There are two interfaces which represent the two kinds of SIP messages: Request and Response. These interfaces have a common parent interface, Message, which contains common syntax-related functionality. I think that the existence of the Message interface is a bad thing. Occam’s razor should be applied here: RFC 3261 doesn’t say anything about messages in general, so this abstraction is totally unnecessary. Let’s do a simple check: are there any methods of JAIN SIP API which accept Message as a parameter or return it as a result? No. This is an example of the tendency of OOP people to apply abstraction everywhere, and also of an implementation detail showing through an API.

The Request/Message/Response interfaces provide methods for obtaining/adding/removing header fields, modifying the start line and obtaining/setting the body. These methods are defined quite well, except for the methods for obtaining headers. Instead of a single method which accepts a header name and returns an abstract Header which has to be downcast, I propose having specific methods for each known header, for example getViaHeader(). This will make code clearer and will involve the compiler in error checking.
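The difference can be sketched with a few stand-in types. Header, ViaHeader and Message below are simplified classes written only for this example, not the real javax.sip interfaces; getViaHeader() is the accessor proposed above and is not part of JAIN SIP API.

```java
import java.util.HashMap;
import java.util.Map;

final class TypedHeaderAccess {
    interface Header {
        String getName();
    }

    static final class ViaHeader implements Header {
        final String branch;

        ViaHeader(String branch) {
            this.branch = branch;
        }

        public String getName() {
            return "Via";
        }
    }

    static final class Message {
        private final Map<String, Header> headers = new HashMap<>();

        void setHeader(Header header) {
            headers.put(header.getName(), header);
        }

        // Generic style: the caller must know the name and downcast the result.
        Header getHeader(String name) {
            return headers.get(name);
        }

        // Proposed style: the cast is written once, inside the API.
        ViaHeader getViaHeader() {
            return (ViaHeader) headers.get("Via");
        }
    }

    static String demo() {
        Message message = new Message();
        message.setHeader(new ViaHeader("z9hG4bK-1"));
        // A misspelled name or a wrong cast here is only detected at run time.
        String viaCast = ((ViaHeader) message.getHeader("Via")).branch;
        // Here the compiler checks both the method name and the return type.
        String viaTyped = message.getViaHeader().branch;
        return viaCast + "/" + viaTyped;
    }
}
```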

JAIN SIP API strives to have an interface for every documented SIP header. The interfaces for all headers are descendants of the Header interface. This interface has a getName() method, which is correct in my opinion, meaning that the header name defines the format of the header value. Unfortunately, there is no getValue() method; such a method is available only for ExtensionHeader. This is bad, because for some headers (like ‘Content-Type’, for example) it is often necessary to obtain the whole value instead of accessing its parts.

I don’t have very much to say about Address, URI and SipURI. They are OK.

As you can see, I’m quite satisfied with syntax part of JAIN SIP API. These objects are just data structures reflecting syntax structure of SIP messages.

Stack management: SipStack, SipProvider and ListeningPoint

The SIP stack is managed through the SipStack interface. Besides the lifecycle methods start() and stop(), this interface also has factory methods for ListeningPoints and for SipProviders. These two classes are then supplied to the add()/remove() methods to define which network endpoints will be served by which provider.

ListeningPoint is a combo-class for an InetSocketAddress and a String which specifies the name of a transport protocol. This class could be avoided altogether. In all methods where a ListeningPoint is passed as a parameter it could be replaced with a combination of InetSocketAddress and String. There is a method that returns ListeningPoints, but it can very well be replaced with two methods: one that returns InetSocketAddresses, and another that returns the transports for a provided InetSocketAddress. So the only point behind this class is to make argument lists shorter. I think that it is better to add one or two methods to an existing class than to have another useless class.

Interface SipProvider is used to bind a specific SipListener to a specific network endpoint. Thus, JAIN SIP API can only dispatch incoming traffic based on address where it was received. If you can handle all the traffic by single listener – fine, but you still need to create and maintain separate SipProvider for each local endpoint. SipProviders can be added to SipStack or removed from it only while it is stopped. This means that in order to listen on one more port you need to stop listening on all other ports. Such restriction is a very bad idea.

Another responsibility of SipProvider is to be a factory for transactions and dialogs. This implicitly means that all events for these transactions and dialogs will be handled by the listener associated with SipProvider, and all outgoing messages will be sent from the local endpoint associated with this provider. A third responsibility is to serve as a facility for stateless sending of requests and responses.

I suggest that Occam’s razor should be applied to this interface, because its responsibilities are vague. It’s essentially a context for very few things, which could be provided explicitly. All methods of this class could be moved to SipStack, so the setListener() method would accept an InetSocketAddress, the factory methods for transactions and dialogs would accept a SipListener, and the local endpoint for sending would be chosen automatically.

Transactions

Client and server transactions are represented by interfaces ClientTransaction and ServerTransaction. These interfaces have common parent interface Transaction, which (very similar to Message) has no apparent use and could also be removed without anybody noticing. ClientTransaction has method for sending request and creating a cancel, ServerTransaction has method for sending responses.

Let’s try to apply Occam’s razor to these classes and see if it is possible to replace them with methods. For ServerTransaction, an answer seems to be no, because sometimes server transactions are created automatically by stack.

Maybe a client transaction can be replaced with a sendStatefully() method on SipStack? In the case of an incoming response or a timeout, an application needs a context to handle these events. JAIN SIP API is built in such a way that transactions are used as contexts, thus transactions are needed. But maybe it is possible to replace the separate factory method and sending method with just one method, which would send the message statefully and return a transaction object? This would also eliminate vague problems like: what should the stack do if the ‘Via’ branch of the request has changed after the creation of the transaction? The problem with such an approach is a threading issue: an event can happen before the thread which invokes sendStatefully() has retrieved the transaction, and this event will be handled by another thread which will not find a context for the event. However, this problem can be solved by having the application provide something as a context instead of re-using transactions for that purpose. Thus, sendStatefully() would accept a context from the application, and use that context in ResponseEvent. Thus, client transactions can be avoided.

In fact, transactions do have some support for application-provided contexts through the methods setApplicationData() and getApplicationData(). They were introduced as a convenient way to avoid having lookup facilities for contexts in applications. A good idea, but it is implemented in the API in a way that forces applications to be written as if in BASIC:

10 let transaction=provider.createTransaction(request);

20 transaction.setApplicationData(context);

30 transaction.sendRequest();

instead of single invocation:

stack.sendRequest(request, context);

Since send() is invoked only once, there is no need to have separate setApplicationData() method.
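A minimal sketch of the single-call style, assuming an invented Stack class that keys contexts by the ‘Via’ branch. The class, the ResponseListener interface and all method signatures are hypothetical; they only illustrate how registering the context before the request leaves closes the window in which a response could arrive without a context.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ContextualSend {
    // Hypothetical callback: the stack hands the application its own context back.
    interface ResponseListener {
        void processResponse(int status, Object context);
    }

    static final class Stack {
        private final Map<String, Object> contexts = new ConcurrentHashMap<>();
        private final ResponseListener listener;

        Stack(ResponseListener listener) {
            this.listener = listener;
        }

        // The context is stored before anything is transmitted, so a response
        // handled by another thread always finds it.
        void sendRequest(String branch, Object context) {
            contexts.put(branch, context);
            // ... a real stack would serialize and transmit the request here ...
        }

        // Would be called from the stack's receiving thread.
        void onResponse(String branch, int status) {
            Object context = contexts.get(branch);
            if (context != null) {
                listener.processResponse(status, context);
            }
        }
    }

    static String demo() {
        StringBuilder seen = new StringBuilder();
        Stack stack = new Stack((status, context) -> seen.append(context).append(':').append(status));
        stack.sendRequest("z9hG4bK-1", "call-42");
        stack.onResponse("z9hG4bK-1", 200);
        stack.onResponse("z9hG4bK-unknown", 200); // no context registered, silently ignored
        return seen.toString();
    }
}
```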

Transactions have a getState() method which returns the transaction state as defined in the RFC. This method is more an implementation detail than a useful thing. Applications are not really interested in the difference between the COMPLETED and TERMINATED states. Instead, they are interested in things like canRespond(), canCancel() or requestSent().

So, what exactly are the responsibilities of transaction classes? Implementation details which they expose are not really useful. The answer is that they are more than transactions as they are defined in RFC.

For client transactions, the method send() includes some functionality which is common to proxies and user agents, such as arranging “Route” headers and determining the remote address. This functionality in fact belongs to another layer (as described here).

Another purpose of transaction classes is to serve as a protocol-level context. For example, it is possible to obtain the original request. ClientTransaction has a createCancel() method, which should be in MessageFactory.

In my opinion, cancellation in JAIN SIP API is done very badly, in the same BASIC style:

10 Request cancel = clientTran.createCancel();

20 ClientTransaction cancelTran = provider.createTransaction(cancel);

30 cancelTran.sendRequest();

instead of simple:

stack.cancel(request);

which is not subject to the errors that can occur because of thread races. I mean: what will happen if a final response is received between lines 20 and 30? JAIN SIP API doesn’t give you an answer. Since the stack is a multithreaded module, sending should be done through “atomic” actions, which do several things at once by implementing internal locking.
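A minimal sketch of such an “atomic” cancel, assuming an invented InviteTransaction class whose state is guarded by a single lock. It ignores most of the cancellation rules of RFC 3261 and only shows the check-and-send step being done as one action.

```java
final class AtomicCancel {
    // Hypothetical transaction object; the two flags stand in for the real state machine.
    static final class InviteTransaction {
        private boolean finalResponseSeen;
        private boolean cancelSent;

        // The state check and the "send" happen under the same lock, so a final
        // response cannot slip in between them.
        synchronized boolean cancel() {
            if (finalResponseSeen || cancelSent) {
                return false; // nothing left to cancel
            }
            cancelSent = true; // a real stack would build and transmit CANCEL here
            return true;
        }

        // Would be called from the stack's receiving thread.
        synchronized void onFinalResponse() {
            finalResponseSeen = true;
        }
    }
}
```

With this shape the application makes one call and gets a definite answer, instead of performing three steps with a race between them.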

Another example of not taking threading issues into account is the existence of a factory method for server transactions. I strongly believe that server transactions should always be created implicitly for incoming requests (except ACK, of course). The usefulness of a stateless proxy is minimal, so the attempt to support it the way JAIN SIP API does doesn’t justify the problems it brings. For example, what will happen if a retransmission is received while the application has delayed the processing of an incoming request? Yes, the retransmission will be processed, and the application will get an exception when it tries to create a server transaction for the retransmitted request.

Processing of an incoming CANCEL is another weak point of JAIN SIP API. I think that it is the responsibility of the stack to discover which server transaction should be cancelled, but JAIN SIP API makes it the application’s job. But even if there were a special CancelServerTransaction with a getCancelledTransaction() method, this behaviour would be plagued with threading issues. Thus, such an improvement will be useful only if server transactions are created automatically.

At least two implementations (NIST SIP and SIP from OpenCloud) do recognize that server transactions should be created automatically. NIST SIP creates a hidden “prototype” for a transaction. Stack of OpenCloud introduces a method on transaction which removes it. These are bad hacks, because problem should be fixed on API level.

Transactions have method getDialog() which should return a dialog corresponding to that transaction. First, I don’t see any practical reason for this method. And second, what should this method do in case of dialog forking?

Dialogs

I’m more or less satisfied with implementation of dialogs in JAIN SIP API. There are several problems with them, but these problems are not as bad as in other areas.

Existence of separate factory methods for normal requests and ACK is a bad idea. Method for incrementing local sequence number is a bad idea.

Ability to set application context for dialog through setApplicationData() is a good thing. To have this method here is not as bad as for transactions, since dialogs live longer, so application context may change. Of course, changing application context can lead to complex errors because of threading, but it is still a useful thing.

JAIN SIP API doesn’t describe how the stack should behave in the case of dialog forking. The instance of Dialog which was created by the application will be returned with the response from the first destination. Responses from other destinations will return other dialogs, but how will the application recognize them? There is no event like “DialogForked” to tell the application that a new dialog is related to an existing one.

Layering

JAIN SIP API is not a truly layered API. By looking at it you may think that Dialog.sendRequest() invokes ClientTransaction.sendRequest() which, in turn, invokes SipProvider.sendRequest(). While the first may be true, the second can’t be, because both ClientTransaction.sendRequest() and SipProvider.sendRequest() perform the same actions which may modify the request (by exchanging the “Route” header and the request-URI if the “Route” value doesn’t have “;lr”). Thus it is not possible to build your own transaction layer on top of SipProvider. It is also not possible to build your own dialog layer, because server dialogs should move to the CONFIRMED state when a successful response is sent through the server transaction, or to the TERMINATED state when an unsuccessful response is sent, but there is no means for your dialog to be notified about that.

SipListener and events

The application is notified about incoming messages and changes in the state of objects through the SipListener callback interface. There are just two methods for processing incoming messages: processRequest() and processResponse(), so it’s the application’s job to dispatch processing based on the content of events. Implementations of these methods are usually trees of “if/else” statements which analyze transactions, dialogs and application data. A much better way would be to allow setting specific listeners for particular transactions and dialogs. These specific listeners would easily replace all application-provided contexts and would eliminate any dispatch code in the application. Thus by applying the IoC principle to its full extent it is possible to turn the stack into a good protocol-based framework. (Update: I’ve explained this in more detail in subsequent post)

Conclusion

Let’s summarize everything I’ve said about JAIN SIP API.

Advantages:

  • Syntax objects are separated from the behavioural part
  • Fairly complete
  • Easy to understand and use for simple tasks by developers who like BASIC-style imperative programming
  • Has semantics which is close to RFC
  • Rather easy to implement
  • Since it is not restrictive, it is flexible and extensible

Disadvantages:

  • Parser for messages is not available
  • Stack management is unnecessarily complex and restrictive
  • Transactions expose implementation details and implement a protocol context badly
  • Doesn’t help with productivity
  • Three ways to send a request. Two ways to send a response.
  • Is not fully complete. For example, doesn’t cover proxying.
  • Doesn’t prevent you from making mistakes
  • Doesn’t allow you to override some layers
  • It is not a real framework (expanded here)
  • Has holes in specification

Be careful, you have been warned!

In next article I’ll discuss SIP Servlets API.

Changing request URI for in-dialog requests

May 14, 2009

In SIP Servlets API there is a concept of “system headers” which cannot be changed, because changing them could violate SIP rules. An attempt to change these headers will result in an IllegalArgumentException being thrown from the container. These headers can never be changed. But SIP rules are more complex. For in-dialog requests it is mandatory that the request URI and “Route” headers contain values obtained from the dialog state. Thus, the methods addHeader(“Route”), setHeader(“Route”), removeHeader(“Route”), pushRoute() and setRequestURI() should throw IllegalStateException for in-dialog requests. Unfortunately, this is not specified in the SIP Servlets spec. Implementations also don’t fully follow those rules. For example, Sailfin will throw IllegalStateException upon pushRoute(), but will allow changing this header through addHeader(), setHeader() and removeHeader(). It will also allow you to change the request URI of an in-dialog request. Since SIP Servlets API strives to enforce SIP rules, these things should be taken into account.

Adding headers through iterator

March 2, 2009

Both JAIN SIP API and SIP Servlets API have method getHeaders(String headerName) on object representing a message (javax.sip.message.Message and javax.servlet.sip.SipServletMessage, respectively). This method returns ListIterator over all headers which have provided name. This iterator is a convenient way to go step-by-step through multi-value headers.

Both APIs provide only very basic means of header manipulation. The method addHeader() always adds a value to the end of the list, and removeHeader() deletes all the headers with the same name. Thus, an iterator provides a very convenient way to do more sophisticated things, like inserting a header value into the middle of the list or removing a header value from the middle of the list. Without iterators, these actions would require a combination of several get()/add()/remove() operations, complicating the code.
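A minimal sketch of such edits using the standard java.util.ListIterator on a plain list of strings. The strings stand in for header values; with a real stack the iterator would come from getHeaders(), and whether add() and remove() work on that iterator depends on the implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;

final class HeaderIteratorEdit {
    // Inserts newValue right after the first occurrence of 'after'.
    static List<String> insertAfter(List<String> values, String after, String newValue) {
        List<String> copy = new ArrayList<>(values);
        ListIterator<String> it = copy.listIterator();
        while (it.hasNext()) {
            if (it.next().equals(after)) {
                it.add(newValue); // lands immediately after the element just returned
                break;
            }
        }
        return copy;
    }

    // Removes every occurrence of 'unwanted', leaving the other values in order.
    static List<String> removeValue(List<String> values, String unwanted) {
        List<String> copy = new ArrayList<>(values);
        ListIterator<String> it = copy.listIterator();
        while (it.hasNext()) {
            if (it.next().equals(unwanted)) {
                it.remove(); // removes the element last returned by next()
            }
        }
        return copy;
    }
}
```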

Unfortunately, the APIs don’t explicitly specify whether the add()/remove()/set() operations of iterators should be supported. However, it seems that Sailfin and the NIST implementation of JAIN SIP API do support them to some extent. Since both implementations have separate classes handling single-value headers (SIPHeader and SingleLineHeader, respectively) and multiple-value headers (SIPHeaderList and MultiLineHeader, respectively), there are also different classes for the iterators. The implementation of ListIterator for multi-value headers is easy and straightforward, because both implementations use lists as storage for values and just re-use or wrap the ListIterator of the Java collections framework. The implementations of ListIterator for single-value headers are written manually, and they are not complete. In the case of Sailfin, the methods add(), remove() and set() are not implemented. In the case of NIST SIP, the methods add() and set() are not implemented.

Another interesting question is what SIP stacks should return if there are no headers with the given name. To be consistent with other cases, a non-null iterator should be returned, whose next() throws NoSuchElementException and whose add() adds a new header to the message. After a header has been added, the iterator should support both navigation with next() and previous() and modification with remove(), add() and set().

NIST SIP returns an iterator over an empty LinkedList. This iterator supports all modifier methods, but the modifications are fake: the message is not modified. Sailfin returns the iterator of the EMPTY_LIST singleton of java.util.Collections. This iterator doesn’t implement modifiers.
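The difference between these behaviours can be reproduced with plain java.util classes. The sketch below is not code from either stack: a List<String> named message stands in for the header storage of a message, and the three methods (all names are hypothetical) model the throw-away LinkedList, the immutable empty list, and the behaviour I would expect.

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

final class EmptyIteratorDemo {

    // Iterator over a throw-away empty LinkedList: add() succeeds,
    // but the list that stands in for the message never sees the value.
    static boolean addViaDetachedList(List<String> message, String value) {
        ListIterator<String> it = new LinkedList<String>().listIterator();
        it.add(value);
        return message.contains(value);
    }

    // Iterator of the immutable empty list: add() is rejected outright.
    static boolean addViaEmptyListSingleton(String value) {
        ListIterator<String> it = Collections.<String>emptyList().listIterator();
        try {
            it.add(value);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    // Iterator over the message's own storage: add() really modifies the message.
    static boolean addViaLiveIterator(List<String> message, String value) {
        ListIterator<String> it = message.listIterator();
        it.add(value);
        return message.contains(value);
    }

    public static void main(String[] args) {
        List<String> message = new LinkedList<>();
        String value = "<sip:p1.example.com;lr>"; // sample data only
        System.out.println("detached list modified the message: " + addViaDetachedList(message, value));
        System.out.println("EMPTY_LIST iterator accepted add(): " + addViaEmptyListSingleton(value));
        System.out.println("live iterator modified the message: " + addViaLiveIterator(message, value));
    }
}
```

Only the third variant leaves the value in the message, which is the behaviour argued for above.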

Hey, you guys out there! You don’t support corner cases properly. Fix up your code, and don’t forget to give me credit for finding these bugs (just kidding).

Networking in Java: non-blocking NIO, blocking NIO and IO

January 29, 2009

The standard run-time library in Java provides two interfaces for networking. One, which has existed in Java since the beginning, is called “basic IO”, because it is based on the generic framework of “input streams” and “output streams” defined in package “java.io”. Sun did a good thing by providing a uniform way of accessing files and sockets, following the Unix philosophy. However, stream-based access has some drawbacks, so Sun created another set of interfaces located in the “java.nio” package. This package also provides uniform access to files and sockets, and is much more flexible than basic IO.

The main problem with basic IO was scalability with the number of connections. Operation read() blocks until some data becomes available. This is not a problem if your program accesses files, because file operations never block for a long time. You just read the data until you reach the end of file. Reading after the end of file returns immediately with “-1” as the result. Another good thing is that most programs access a fairly small number of files. In other words, when working with files it is the data that waits for the program to process it, while the program can decide what size of internal buffer to use for processing.

But for networking the model of basic IO is not so convenient. First, a read() operation may block the executing thread for a long time. This means that to handle several connections simultaneously you need as many threads as you have incoming connections. There is a small thing which can help you not to block forever: you can specify a timeout for socket operations. But it does not solve the scalability problem.

Another problem is related to the “message”-based structure of most protocols. Often you don’t know how much data you’ll receive. So, you have to organize your code in one of these ways:

  • Always read data one byte at a time, then assemble the data array from those bytes. The code is simple, but slow.
  • Read one byte first, then use the available() method to determine whether there is more data to read. If there is, read the remaining data using a bulk operation. The code is more complex, but faster than the previous way.
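The second approach can be sketched as follows. This is a minimal illustration, not production code: the class and method names are made up, and a returned chunk is just “whatever has arrived so far”, which is not necessarily one complete protocol message.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class AvailableReadDemo {

    // Blocks for the first byte, then drains whatever the stream reports as available.
    // Returns null at end of stream.
    static byte[] readChunk(InputStream in) throws IOException {
        int first = in.read();            // blocks until at least one byte arrives
        if (first < 0) {
            return null;                  // end of stream
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(first);
        int pending = in.available();     // estimate of bytes readable without blocking
        if (pending > 0) {
            byte[] rest = new byte[pending];
            int n = in.read(rest, 0, pending); // bulk read of the remainder
            if (n > 0) {
                out.write(rest, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A byte array stands in for a socket stream so the example runs anywhere.
        InputStream in = new ByteArrayInputStream("REGISTER sip:example.com SIP/2.0\r\n".getBytes("US-ASCII"));
        byte[] chunk = readChunk(in);
        System.out.println(chunk.length + " bytes read in one chunk");
    }
}
```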

NIO helps you to deal with both these problems. I’ll explain its features in the order which seems most logical to me.

First, NIO introduces “Buffers”, which combine data with the information used to process it (such as position and limit). There are also “Channels”, which can read into buffers and write from buffers.

To simplify your “basic IO” code you can just call the Channels.newChannel() method on your input stream. The resulting channel implements a read() operation which either blocks or fills the provided ByteBuffer with data, moving the buffer position to the place right after the last byte read. This makes the code much simpler.
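Here is a small sketch of reading through such a wrapped channel. The helper name readOnce is my own, and a ByteArrayInputStream stands in for a socket’s input stream so the example runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

final class WrappedChannelDemo {

    // Performs one read() into a fresh buffer and returns the bytes as ASCII text.
    // Returns null at end of stream.
    static String readOnce(ReadableByteChannel channel, int capacity) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(capacity);
        int n = channel.read(buffer);   // blocks until some data is read; -1 at end of stream
        if (n < 0) {
            return null;
        }
        buffer.flip();                  // position is right after the last byte read; flip for reading
        return StandardCharsets.US_ASCII.decode(buffer).toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "OPTIONS sip:bob@example.com SIP/2.0\r\n".getBytes(StandardCharsets.US_ASCII);
        ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(data));
        System.out.print(readOnce(channel, 64));
    }
}
```

Compared with the available() approach, there is no byte counting at all: the buffer keeps track of how much was read.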

You can avoid wrapping by creating a SocketChannel directly. This gets you almost the same result. It is called “blocking NIO”, and I strongly advise using it in simple cases, when thread blocking is not a problem for you.

The only difference between “blocking NIO” and “NIO wrapped around IO” is that you can’t use a socket timeout with SocketChannels. Why? Read the javadoc for Socket.setSoTimeout(). It says that this timeout applies to read() calls on the InputStream associated with the socket. However, you can use a trick to make it work:

// remoteAddress is a placeholder for the peer's SocketAddress
SocketChannel socketChannel = SocketChannel.open(remoteAddress);

socketChannel.socket().setSoTimeout(500); // milliseconds

InputStream inStream = socketChannel.socket().getInputStream();

ReadableByteChannel wrappedChannel = Channels.newChannel(inStream);

In this example, reading from socketChannel directly will not be interrupted by the timeout, but reading from wrappedChannel will be. To find out why this is so, you can take a look inside the Java RT library. The socket timeout is used by the OS-specific implementation of SocketInputStream, but it is not used by the OS-specific implementation of SocketChannel.

However, NIO has much better tools for solving the scalability problem. First, you can put a channel into non-blocking mode. This means that a read() operation returns immediately if there is no data to read. Thus, you can create a single thread which checks all SocketChannels in a loop and reads data when it is available.

Having a single thread is nice, but if it spins around the read() operation it wastes lots of CPU cycles. To help with performance NIO has a class called “Selector” which WILL block on non-blocking channels. The difference is that it can monitor any number of channels, resuming execution when at least one of those channels has some readable data. This idea was copied from Unix, but with one big flaw: Selector can use only non-blocking channels.
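A minimal sketch of both points, assuming a java.nio.channels.Pipe as a stand-in for a network connection so that it runs without any sockets (a SocketChannel is registered in exactly the same way). The class and method names are mine. The first method blocks in select() until the channel is readable; the second shows that registering a channel which is still in blocking mode is rejected.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.IllegalBlockingModeException;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

final class SelectorDemo {

    // Writes a non-empty payload into a pipe, waits in select() until the other
    // end is readable, then reads it back. The payload must fit into 256 bytes.
    static String selectAndRead(String payload) throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open()) {
            pipe.source().configureBlocking(false);               // required before register()
            pipe.source().register(selector, SelectionKey.OP_READ);
            pipe.sink().write(ByteBuffer.wrap(payload.getBytes(StandardCharsets.US_ASCII)));

            StringBuilder received = new StringBuilder();
            selector.select();                                    // blocks until a channel is ready
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isReadable()) {
                    ByteBuffer buffer = ByteBuffer.allocate(256);
                    ((ReadableByteChannel) key.channel()).read(buffer);
                    buffer.flip();
                    received.append(StandardCharsets.US_ASCII.decode(buffer));
                }
            }
            selector.selectedKeys().clear();                      // keys are not removed automatically
            return received.toString();
        } finally {
            pipe.sink().close();
            pipe.source().close();
        }
    }

    // Pipe channels start in blocking mode, so this registration must fail.
    static boolean blockingChannelIsRejected() throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open()) {
            pipe.source().register(selector, SelectionKey.OP_READ);
            return false;
        } catch (IllegalBlockingModeException e) {
            return true;
        } finally {
            pipe.sink().close();
            pipe.source().close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(selectAndRead("ping"));
        System.out.println("blocking channel rejected: " + blockingChannelIsRejected());
    }
}
```

In a real server the select() call sits in a loop and many channels are registered with the same Selector.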

I don’t know why Sun introduced this limitation. This article focuses on reading, but both basic IO and NIO also support writing. Since the blocking/non-blocking mode applies to both the read and write directions simultaneously, using a Selector makes connect() and write() operations more complex. Anyway, it is the only way to have a single thread reading from several network connections.

Let’s finish for today. It’s quite easy to understand what to use. If scalability is an issue, then use “non-blocking NIO”. Otherwise, use “blocking NIO” with a thread per connection. You can make those threads daemons so they will not prevent the application from terminating when all other threads stop. Another way to stop those threads is to close the channels they are reading from. This causes the read operation to be interrupted with an exception.
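Stopping a reader by closing its channel can be sketched like this. Again a Pipe stands in for a network connection, the names are made up, and the 200 ms sleep is only a crude way to let the reader reach read() first. If the close wins that race the reader still stops, just with a plain ClosedChannelException instead of its subclass AsynchronousCloseException.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.util.concurrent.atomic.AtomicReference;

final class CloseToStopDemo {

    // Starts a daemon reader blocked in read(), closes its channel from the calling
    // thread, and returns the exception that ended the read (or null if the reader
    // did not stop within the time limit).
    static IOException closeWhileReading() throws IOException, InterruptedException {
        Pipe pipe = Pipe.open();
        AtomicReference<IOException> failure = new AtomicReference<>();
        Thread reader = new Thread(() -> {
            try {
                pipe.source().read(ByteBuffer.allocate(16)); // blocks: nothing is ever written
            } catch (IOException e) {
                failure.set(e);                              // the close lands here
            }
        });
        reader.setDaemon(true);      // would not keep the JVM alive even if it stayed blocked
        reader.start();
        Thread.sleep(200);           // give the reader time to block in read()
        pipe.source().close();       // wakes the reader up with an exception
        reader.join(5000);
        pipe.sink().close();
        return failure.get();
    }

    public static void main(String[] args) throws Exception {
        IOException e = closeWhileReading();
        System.out.println("reader stopped by: " + (e == null ? "nothing" : e.getClass().getName()));
    }
}
```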

I hope I’ve shown that NIO is simple. So, don’t use NIO frameworks. They are bad.

SIP stories, part 2: dialog forking

November 5, 2008

When I changed the way my SIP stack deals with dialogs, I ran into a problem: two unit tests stopped working. These unit tests covered “corner cases” which are not very well described in RFC 3261, so I had implemented them based mostly on my own understanding of SIP. I had two options: to declare those use cases wrong and remove them, or to fix my implementation. Obviously, I started with use case analysis.

When a “200 OK” response is sent to an INVITE, the corresponding client and server transactions are instantly terminated on the UAC, the UAS and all proxies involved. The UAS should retransmit “200 OK” until it receives an ACK, but the ACK may take a different path than the INVITE. Since reliable delivery of the “200 OK” response is specific to UA elements, it is implemented not in the transaction layer, but in the UA core of the TU layer.

Because of proxy forking, the sender of an INVITE can receive dialog-establishing provisional responses from several UASes. When one of them answers with a “200 OK” response, the proxy must cancel all other branches. However, another UAS may also respond with “200 OK” before receiving the CANCEL. When this second “200 OK” response arrives at the UAC, the client transaction for the INVITE is already terminated. But this response should still be delivered to the upper layers. RFC 3261 specifies (in chapter 13.2.2.4) that such responses should be matched against ongoing dialogs, and if no matching dialog is found then a new dialog must be constructed. Matching a response against a dialog has a purpose: to cut off retransmissions of “200 OK” responses. The logic is simple: if a response matches a dialog in confirmed state, then it should not be reported to the application layer. However, this whole idea of matching responses against dialogs has lots of flaws:

  • The described logic means that for any “stray” response the UA must construct a dialog, then pass it to the application layer. There is no way to check whether the response corresponds to a request actually sent from this node. The idea is that only responses to sent requests should be processed, and processing should stop some time after receiving the first “200 OK” response. But there is no way to ensure that.
  • The application is usually interested in the context of a response. For example, it may want to know the request for which the response was received. For normal responses the request can be obtained from the client transaction. The described procedure doesn’t explain how this can be done for “stray” responses.
  • re-INVITEs are sent within existing dialogs. Retransmitted “200 OK” responses to a re-INVITE will always be reported to the application.

It is clear that RFC 3261 has a big flaw here. In my old implementation I had a workaround. Before sending an INVITE I prepared a “dialog prototype”. All “stray” responses were checked against dialog prototypes by comparing the “Call-ID” header, the “From” header and the URI of the “To” header, and if there was a match then I created a dialog based on the prototype. A prototype also held a reference to the INVITE, so I could provide a full context for the response. This homebrew solution worked, but I firmly decided to follow RFC 3261 to the letter. Searching for ideas, I took a look at open source Java implementations of SIP stacks.
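The prototype lookup can be sketched roughly like this. It is a reconstruction of the idea rather than my old code: the class name and methods are hypothetical, headers and the INVITE are plain strings, and uriOf() is a deliberately naive way to strip the display name and the tag from a “To” header.

```java
import java.util.HashMap;
import java.util.Map;

final class DialogPrototypes {

    // Prototype key -> the INVITE that was sent (a string stands in for the request object).
    private final Map<String, String> inviteByKey = new HashMap<>();

    // Header values cannot contain a line break, so it is a safe separator.
    private static String key(String callId, String from, String toUri) {
        return callId + "\n" + from + "\n" + toUri;
    }

    // Called just before an INVITE is sent.
    void remember(String callId, String from, String toUri, String invite) {
        inviteByKey.put(key(callId, from, toUri), invite);
    }

    // Called for a "stray" response. Returns the INVITE it answers, or null when
    // the response does not belong to any request sent from this node.
    String findInvite(String callId, String from, String toHeader) {
        return inviteByKey.get(key(callId, from, uriOf(toHeader)));
    }

    // Keeps only the URI of a To header value: the To tag differs for every
    // forked UAS, so it must not take part in matching. Simplified parsing.
    static String uriOf(String header) {
        int lt = header.indexOf('<');
        int gt = header.indexOf('>');
        if (lt >= 0 && gt > lt) {
            return header.substring(lt + 1, gt);
        }
        int semi = header.indexOf(';');
        return (semi >= 0 ? header.substring(0, semi) : header).trim();
    }

    public static void main(String[] args) {
        DialogPrototypes prototypes = new DialogPrototypes();
        prototypes.remember("a84b4c76e66710", "<sip:alice@a.example.com>;tag=1928301774",
                "sip:bob@b.example.com", "INVITE #1");
        // Two forked UASes answer with different To tags; both map to the same INVITE.
        System.out.println(prototypes.findInvite("a84b4c76e66710",
                "<sip:alice@a.example.com>;tag=1928301774", "Bob <sip:bob@b.example.com>;tag=uas1"));
        System.out.println(prototypes.findInvite("a84b4c76e66710",
                "<sip:alice@a.example.com>;tag=1928301774", "Bob <sip:bob@b.example.com>;tag=uas2"));
    }
}
```

A response whose Call-ID or From header was never registered finds no prototype, which is exactly the check that is missing from the RFC procedure.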

The NIST implementation of JAIN SIP API matches “stray” responses against ongoing dialogs based only on RFC rules. If some early dialog matches a response, then the initial transaction of this dialog is used as the context for the response. If the response doesn’t match an existing dialog, it is just passed to the application layer. So, in this part the NIST SIP stack is not compliant with RFC 3261. Some may argue that an application can implement this functionality, but I don’t buy it: since there is no context for the response (the request is unknown, the transaction is null), an application can do very little.

Sailfin implements an interesting hack. It doesn’t actually terminate the client transaction after receiving a “200 OK” response. Instead, it moves the transaction into an “established” state. This both helps to ensure that a response is received for a request which was actually sent, and provides a context for the response at upper layers. Such an approach seems to be a very good idea, except that it is not RFC 3261-compliant.

That’s all for today. As I promised M. Ranganathan, I’ve pointed out problems with RFC 3261 and JAIN SIP API. In an upcoming article I’ll cover the second use case, how I solved these problems, and why Sailfin’s hack is not a hack after all.

Application of HTTP parsers to SIP

July 18, 2008

SIP was intentionally designed to have the same syntax as HTTP. Apart from familiarity for network engineers, this allows using the same parser for both protocols. However, I never saw any case of a “common” parser. So I decided to try it myself. I took open source HTTP servlet containers, located their HTTP parsers, and tried to apply those parsers to SIP messages.

Jetty

The first container was Jetty. I used version 6.1.11. The parser functionality is easy to locate, because it is encapsulated in class org.mortbay.jetty.HttpParser. This class accepts some buffers for internal use, an EndPoint which is the source of data to parse, and an EventHandler which is notified when some message component has been parsed. The main method is parseNext(), which tries to parse as much data as is available. On reaching the end of available data, this method remembers the last “state” and last position before returning, so next time it continues from the place where it stopped. I will call such a parser “stateful non-blocking stream-oriented”.

Applying the Jetty parser to SIP messages is pretty easy. First, an unparsed message should be wrapped either in ByteArrayEndPoint (if the message is a byte array) or in StringEndPoint (if the message is a character sequence). Second, an EventHandler should be implemented which stores parsed data in an object representing a SIP message. Finally, an instance of HttpParser should be created and its parse() method invoked.

The Jetty parser is neither “complete” nor “strict”. For example, it doesn’t distinguish between requests and responses; it just parses the three components of the start line. Detailed parsing and validation of specific headers can be done later.

The code of the parser itself is quite complex. The parser works at byte level. It even provides message components as instances of org.mortbay.io.Buffer, which is very similar to java.nio.ByteBuffer. Such a buffer can be decoded into characters.
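To make the term “stateful non-blocking stream-oriented” more concrete, here is a toy parser written in that style. It is not Jetty’s code and does not use Jetty’s classes; the class name and the component() accessor are invented for this sketch. It handles only the start line, splits it into three components without caring whether it is a request or a response, and can be fed data in arbitrary pieces.

```java
import java.nio.charset.StandardCharsets;

final class IncrementalStartLineParser {

    private final StringBuilder[] fields = {
        new StringBuilder(), new StringBuilder(), new StringBuilder()
    };
    private int field = 0;          // which of the three components is being collected
    private boolean sawCr = false;  // a CR was seen, LF is expected next
    private boolean done = false;

    // Consumes the chunk and returns true once the whole start line has been seen.
    // All state is kept in fields, so the next call continues where this one stopped.
    boolean parseNext(byte[] chunk) {
        for (int i = 0; i < chunk.length && !done; i++) {
            char c = (char) (chunk[i] & 0xff);
            if (sawCr) {
                if (c != '\n') {
                    throw new IllegalStateException("CR not followed by LF");
                }
                done = true;
            } else if (c == '\r') {
                sawCr = true;
            } else if (c == ' ' && field < 2) {
                field++;                    // first two spaces separate the components
            } else {
                fields[field].append(c);    // third component may itself contain spaces
            }
        }
        return done;
    }

    String component(int index) {
        return fields[index].toString();
    }

    public static void main(String[] args) {
        IncrementalStartLineParser parser = new IncrementalStartLineParser();
        // The start line arrives in three pieces, as it could from a non-blocking socket.
        String[] pieces = { "INVITE sip:b", "ob@example.com SIP/2", ".0\r\n" };
        for (String piece : pieces) {
            boolean complete = parser.parseNext(piece.getBytes(StandardCharsets.US_ASCII));
            System.out.println("complete: " + complete);
        }
        System.out.println(parser.component(0) + " | " + parser.component(1) + " | " + parser.component(2));
    }
}
```

The caller never blocks: when parseNext() returns false it simply goes back to waiting for more data.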

Tomcat

The second container was Tomcat. I used version 6.0.16. The HTTP parser here is located in a component called “coyote”. The class which performs parsing is called org.apache.coyote.http11.InternalInputBuffer. Tomcat uses this class from method process() of class org.apache.coyote.http11.Http11Processor, which accepts a Socket as input parameter. Using Http11Processor is not convenient, so I used InternalInputBuffer directly, which is quite simple. Its constructor accepts an instance of org.apache.coyote.Request, which will store the parsed data. As a source of data InternalInputBuffer accepts any java.io.InputStream, so I just used a ByteArrayInputStream containing the whole SIP message. After invoking methods parseRequestLine() and parseHeaders() it is possible to extract all necessary data from the Request, such as the method and headers, then put that data into any suitable form.

When used from Http11Processor, the parser is provided with the InputStream of a Socket, so the read() operation used by the parser blocks if there is no data. I will call such a parser “blocking stream-oriented”. A big drawback of the Tomcat parser is that it isn’t able to parse SIP responses. The parser works at byte level, and all message components are stored in the Request class as instances of the MessageBytes class. This class performs byte-to-char conversion on demand.

Resin open-source

And finally, Resin open-source version 3.1.6. It is quite simple to locate the HTTP parser here. The class is called com.caucho.server.http.HttpRequest, and the particular method is handleRequest(). The constructor of HttpRequest requires an implementation of Connection, which is used as the source of data. I created a fake connection and assigned it a readStream with a StringReader which took the SIP message in String format. Another option was to use ReaderStream, which obtains data through any java.io.Reader (like InputStreamReader). The parser doesn’t do real UTF-8 decoding. Instead, it takes a byte, simply extends it to a character, does all parsing using this character, then stores it into an object field.
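The effect of extending bytes to characters is easy to demonstrate. The sketch below is my own illustration, not Resin’s code, and it assumes the byte is treated as unsigned. Such widening gives the same result as ISO-8859-1 decoding: pure ASCII text survives, while any multi-byte UTF-8 sequence turns into several unrelated characters.

```java
import java.nio.charset.StandardCharsets;

final class ByteWideningDemo {

    // Widens each byte to a char, which is equivalent to ISO-8859-1 decoding.
    static String widen(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            sb.append((char) (b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] ascii = "Alice".getBytes(StandardCharsets.UTF_8);
        byte[] accented = "Andr\u00e9".getBytes(StandardCharsets.UTF_8); // e-acute takes two bytes
        System.out.println(widen(ascii).equals("Alice"));
        System.out.println(widen(accented).length() + " chars instead of 5");
        System.out.println(widen(accented).equals(new String(accented, StandardCharsets.ISO_8859_1)));
    }
}
```

For SIP this matters mostly for display names and message bodies, where UTF-8 is allowed.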

However, it seems to me that the parser is not re-usable, because the constructor of HttpRequest accepts an object of the Server class as an argument, and I cannot fake this Server.

Conclusion

It is quite possible to use some HTTP parsers for SIP. Jetty is the winner here: you don’t need to modify it (just put a jar file on the classpath), it works perfectly with NIO, and the amount of code to write is small. The only drawback is that the code of the parser is quite complex. The Tomcat parser is simpler, but works only for requests and only with synchronous I/O. Resin is not re-usable without modifications.

Lost layer of SIP

July 10, 2008

SIP is defined by RFC 3261 as a layered protocol. This means that all the functionality is grouped into several layers, which have “entry points” where they accept data from other layers. The whole protocol is described as the internals of each layer plus inter-layer communication. Layers are not organized in a strict sequence: for example, the transaction user layer sometimes interacts with the transport layer directly, like when sending an ACK.

Implementations of SIP usually follow the layered model. However, since layers are defined as “functional”, it is possible to move the implementation of a layer up or down, as long as the result is the same. For example, the parser layer by definition is located below the transport layer. If an implementation follows this approach, then it first opens a socket, then encodes a message and writes the resulting bytes into that socket, possibly in streaming mode. But other implementations first encode a message, then pass a byte array to common network code which sends those bytes, possibly re-using an existing connection. It is even possible to store the result of encoding in a transaction, thus avoiding encoding the message again for each retransmission.

Unfortunately, the layered model of SIP is not perfect. It seems to me that there is an additional layer lost in the spec. This article is an attempt to point this out, and to show how existing SIP implementations deal with it.

Chapter 12.2.1.1 explains how to populate an outgoing request based on the route set of a dialog. Chapter 8.1.2 of RFC 3261 describes the actions a UAC should take to send an outgoing request. To find out which URI should be used to determine the request destination, it is not enough to have just the request: it must also be known whether the first element of the route set was a strict or a loose router. Once a URI is chosen, it should be converted to a list of IP addresses using the procedures described in RFC 3263. After that the request, together with IP+port+transport, should be passed to the transaction layer for sending, unless the request method is ACK. Chapter 8.1.3 describes response processing.
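The first two steps can be sketched as follows. This is my simplified reading of chapters 12.2.1.1 and 8.1.2, not code from any stack: URIs are plain strings, all names are invented, and the stripping of URI parameters that are not allowed in a Request-URI is left out. The sketch shows why the strict/loose flag has to travel together with the request: it decides whether RFC 3263 resolution is applied to the Request-URI or to the topmost Route value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class RouteSetDemo {

    static final class OutgoingRequest {
        final String requestUri;
        final List<String> routes;          // values of the Route header, in order
        final boolean firstRouteWasStrict;  // cannot be recovered from the request alone

        OutgoingRequest(String requestUri, List<String> routes, boolean firstRouteWasStrict) {
            this.requestUri = requestUri;
            this.routes = routes;
            this.firstRouteWasStrict = firstRouteWasStrict;
        }

        // Chapter 8.1.2: which URI is handed to the RFC 3263 procedures.
        String uriToResolve() {
            if (firstRouteWasStrict || routes.isEmpty()) {
                return requestUri;
            }
            return routes.get(0);
        }
    }

    // Chapter 12.2.1.1: builds Request-URI and Route values from dialog state.
    static OutgoingRequest populate(String remoteTarget, List<String> routeSet) {
        if (routeSet.isEmpty()) {
            return new OutgoingRequest(remoteTarget, Collections.<String>emptyList(), false);
        }
        String first = routeSet.get(0);
        if (isLooseRouter(first)) {
            return new OutgoingRequest(remoteTarget, new ArrayList<>(routeSet), false);
        }
        // Strict router: it goes into the Request-URI, the remote target goes last.
        List<String> routes = new ArrayList<>(routeSet.subList(1, routeSet.size()));
        routes.add(remoteTarget);
        return new OutgoingRequest(first, routes, true);
    }

    // Naive check for the "lr" URI parameter.
    static boolean isLooseRouter(String uri) {
        String[] parts = uri.split(";");
        for (int i = 1; i < parts.length; i++) {
            if (parts[i].equalsIgnoreCase("lr") || parts[i].toLowerCase().startsWith("lr=")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        OutgoingRequest loose = populate("sip:bob@192.0.2.4",
                Arrays.asList("sip:p1.example.com;lr", "sip:p2.example.com;lr"));
        OutgoingRequest strict = populate("sip:bob@192.0.2.4",
                Arrays.asList("sip:p1.example.com", "sip:p2.example.com;lr"));
        System.out.println(loose.requestUri + " " + loose.routes + " -> " + loose.uriToResolve());
        System.out.println(strict.requestUri + " " + strict.routes + " -> " + strict.uriToResolve());
    }
}
```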

Exactly the same steps are described for proxies in chapter 16.6, steps 6 and 7. My observation is that there should be an additional layer called “Client”. This layer would be common to proxies and UACs. There are also other pieces of functionality which naturally fit into this layer, like sending a CANCEL.

Sailfin, an open source SIP servlet container, has a layer called “ResolverManager” which is located above TransactionManager and below DialogManager. This layer partially implements my logical “client layer”. When sending a request it does IP/port/transport resolution, the result of which is stored in the “_remote” field of the request and later used by the transport layer. ResolverManager also handles 503 responses by re-trying another IP address, if one exists. Treating client timeouts as “408” responses is done entirely in the transaction layer. Strict routing is not supported by Sailfin.

The reference implementation of JAIN SIP API is much more obscure. There are two ways of sending requests: through SipProvider, and through ClientTransaction. Both use a “Router” to determine the IP address. If the application doesn’t define a special Router, a DefaultRouter is used, which does route info post-processing and URI determination. Address resolution is not performed as described in RFC 3263; instead it just uses the standard Java facility. So, the “client layer” here is hidden inside the transaction layer. Well, as I said at the beginning of this article, re-ordering of layers is possible if the whole picture stays the same. But this implementation makes it impossible to implement high availability by trying several client transactions when a hostname has several IP addresses.

I wish RFC 3261 were clearer. I also wish JAIN SIP API were much better than it is. The ClientTransaction interface is already suitable as an interface for the “client layer”; we just need to throw away methods like getBranchId(), getState() and createAck(). However, a proposal for a new, better and cleaner SIP API is a subject for another article. I’m open to suggestions!