Posts Tagged ‘Java’

Using JMX with SSL

December 13, 2017

The Java platform includes JMX, a technology developed to simplify management of distributed enterprise systems. The idea is that components of an enterprise application expose their management-related operations in a declarative way, and the enterprise container or framework discovers them and collects references in a single registry called the MBeanServer. Remote access to the MBean server is provided by a JMX connector to any software that needs to invoke management actions: generic GUI tools like JConsole or VisualVM, management consoles provided with enterprise application containers, or special-purpose tools.

JMX technology is also available to Java applications which are not running in any container or framework. The Java runtime library includes the PlatformMBeanServer, which is a singleton. However, registering manageable classes must be done explicitly in your app, since there is no container. Remote access is provided by a JMX connector which is configured via system properties, usually specified on the command line.
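Explicit registration without a container can be sketched as follows. This is a minimal illustration, not code from the guide: the Counter class, its CounterMBean interface and the object name "mycustomapp:type=Counter" are made up for the example. It relies on the standard MBean convention that the management interface is named after the implementation class with an "MBean" suffix.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class PlatformMBeanDemo {
    // Standard MBean convention: the interface is named <ImplClass>MBean.
    public interface CounterMBean {
        int getCount();   // exposed as read-only attribute "Count"
        void reset();     // exposed as operation "reset"
    }

    // Hypothetical manageable class; any application object could play this role.
    public static class Counter implements CounterMBean {
        private int count;
        public synchronized void increment() { count++; }
        public synchronized int getCount() { return count; }
        public synchronized void reset() { count = 0; }
    }

    // Registers a counter in the platform MBean server, reads it back
    // through the JMX API, and unregisters it again.
    static int registerAndRead() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("mycustomapp:type=Counter");
            Counter counter = new Counter();
            counter.increment();
            counter.increment();
            server.registerMBean(counter, name);
            try {
                return (Integer) server.getAttribute(name, "Count");
            } finally {
                server.unregisterMBean(name);
            }
        } catch (Exception e) {
            throw new IllegalStateException("MBean registration failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Count attribute = " + registerAndRead());
    }
}
```

In a real application you would keep the MBean registered for the lifetime of the process, so that JConsole or VisualVM can see it under the chosen object name.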

The JMX connector works on top of another Java technology called RMI (Remote Method Invocation), which the built-in JMX agent configures by default to work over secure transport. Using secure transport by default is a strange choice, because secure transport does not work by default: it requires some configuration. This configuration can also be done via system properties, but it will affect secure transport for the whole process, not just JMX connectivity. So it will not work for you if your Java process has several SSL connections, each with specific settings.

So here we will take a closer look at what happens with built-in JMX when you switch it on in your application, and how you can control various aspects of it.

The Java documentation includes the “Java SE Monitoring and Management Guide”, which explains various details of JMX technology. Chapter 2 is called “Monitoring and Management Using JMX Technology”, and it describes the system properties used to configure the built-in JMX agent. There is also a very important part called “Example of Mimicking Out-of-the-Box Management” with a piece of code which will be our base for modifications.

Take a closer look at this code. First, an RMI registry is created, which will live in the current Java process. Second, the PlatformMBeanServer is obtained. And, finally, a JMXConnectorServer is created, which is configured to use SslRMIClientSocketFactory and SslRMIServerSocketFactory. So, if we need to customize SSL parameters, those factories are the place to do that.
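The three steps can be condensed into the sketch below. It is written from the description above and is not the guide’s exact listing; the class name SecureJmxAgent, the host "localhost" and the "jmxrmi" binding name in the service URL are assumptions made for illustration.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.rmi.registry.LocateRegistry;
import java.util.HashMap;
import java.util.Map;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;
import javax.management.remote.rmi.RMIConnectorServer;
import javax.rmi.ssl.SslRMIClientSocketFactory;
import javax.rmi.ssl.SslRMIServerSocketFactory;

class SecureJmxAgent {
    // Connector environment: this is where the SSL socket factories are plugged in.
    static Map<String, Object> sslEnvironment() {
        Map<String, Object> env = new HashMap<>();
        env.put(RMIConnectorServer.RMI_CLIENT_SOCKET_FACTORY_ATTRIBUTE,
                new SslRMIClientSocketFactory());
        env.put(RMIConnectorServer.RMI_SERVER_SOCKET_FACTORY_ATTRIBUTE,
                new SslRMIServerSocketFactory());
        return env;
    }

    static JMXConnectorServer start(int port) throws IOException {
        // Step 1: an RMI registry living in the current process.
        LocateRegistry.createRegistry(port);
        // Step 2: the platform MBean server singleton.
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        // Step 3: a connector server bound into that registry (URL is an assumption).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:" + port + "/jmxrmi");
        JMXConnectorServer server =
                JMXConnectorServerFactory.newJMXConnectorServer(url, sslEnvironment(), mbs);
        server.start();
        return server;
    }
}
```

Calling SecureJmxAgent.start(port) without any keystore configured will start the connector, but clients will not be able to complete the SSL handshake, for the reasons explained next.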

In SSL, the server confirms its identity by encoding some value supplied by the client using the server’s private key, and the client checks that by decoding the value using the server’s public key. Those keys are part of the SSL configuration for server and client. In Java those keys are usually stored in files. The file with private keys is called the “KeyStore”, and the file with public keys of trusted entities is called the “TrustStore”; both files are protected with passwords. These files are supplied to an instance of SSLContext, which is used to create an SslRMIServerSocketFactory. If you do not supply any SSLContext to SslRMIServerSocketFactory, then a platform-default instance of SSLContext will be used. This instance is configured by system properties called “” and “”, and there are two more properties to supply passwords. There is no default value for the keystore of the default SSLContext, but there is a default value for its truststore. The default truststore is distributed with Java and includes public keys of several certificate authorities.

If you do not specify any keystore and start the application, you’ll get errors on your server when a client tries to connect: “no cipher suites in common”. This obscure error means that the server cannot find any private key to prove its identity.

Since the JMX connector is an SSL server, you’ll need to obtain a private key (or generate a self-signed key yourself). If you expose JMX to generic GUI tools like JConsole or VisualVM, then you’ll need to export the public key (certificate) that matches your private key using keytool, and then configure the GUI tools to use a truststore containing that public key.

If you need to add secure JMX to an application which already uses SSL, you have the following options:
1. If the default SSLContext is used by your application, you can add a new private key to the existing keystore. That’s easy to do, but you will not be able to control which private key is used by which SSL connection. If this is OK for you, then you don’t need to change any SSL-related properties already present in your application, since they are the only way to configure the default SSLContext. This context is used by default by the out-of-the-box JMX connector, so you can throw away all custom JMX code and use only command-line configuration.
2. If your application doesn’t use the default SSLContext, then you can use the default SSLContext for JMX only, without the risk of affecting anything else. In this case you can (and should!) configure both SSL and JMX via system properties.
3. If you don’t want to use the default SSLContext (either because it is already used by your application, or for other reasons), then you can create a separate SSLContext, configure it with your custom keystore, and use this SSLContext to create an SslRMIServerSocketFactory.

Options 1 and 2 are quite simple and don’t require additional explanation, so we recommend using them if possible, especially option 2. There are quite a lot of command-line keys to specify, but this is the price of flexibility. Option 3 is mostly for those who prefer to have everything explicit, and the remainder of the article follows option 3.
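A minimal sketch of option 3 could look like this. The class and method names are invented for the example, and it assumes the keystore password is also the key password. Passing null trust managers to SSLContext.init means the context falls back to the default trust material, which is enough for a server that does not require client certificates.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.rmi.ssl.SslRMIServerSocketFactory;

class DedicatedSslContext {
    // Loads a keystore from a stream; per the KeyStore API a null stream
    // yields an empty keystore, which is handy for experiments.
    static KeyStore loadKeyStore(InputStream in, char[] password) {
        try {
            KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
            keyStore.load(in, password);
            return keyStore;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("Cannot load keystore", e);
        }
    }

    // Builds an SSLContext that is independent of the platform-default one.
    static SSLContext contextFor(KeyStore keyStore, char[] keyPassword) {
        try {
            KeyManagerFactory kmf =
                    KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
            kmf.init(keyStore, keyPassword);
            SSLContext context = SSLContext.getInstance("TLS");
            // null trust managers and null SecureRandom select the defaults
            context.init(kmf.getKeyManagers(), null, null);
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Cannot create SSLContext", e);
        }
    }

    // Server socket factory for the JMX connector, bound to the dedicated context;
    // default cipher suites and protocols, no client authentication.
    static SslRMIServerSocketFactory serverSocketFactory(SSLContext context) {
        return new SslRMIServerSocketFactory(context, null, null, false);
    }
}
```

In an application you would pass something like Files.newInputStream(Paths.get("jmx-keystore.p12")) to loadKeyStore (the file name is hypothetical) and put the resulting factory into the connector environment instead of the default SslRMIServerSocketFactory.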

So, after you have customized SslRMIServerSocketFactory, you will probably take a look at SslRMIClientSocketFactory, and you will be surprised to see that you can’t customize it in any way! We will get back to it later.

When you need to troubleshoot your application, you should use a set of system properties to enable RMI logging, and also a set of system properties to enable SSL logging.
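The article does not name the properties, so the two used below are my assumption of a reasonable starting set: javax.net.debug for the SSL implementation and java.rmi.server.logCalls for incoming RMI calls. They are normally passed as -D options on the command line; setting them programmatically, as in this sketch, only works if it happens before the SSL and RMI classes are first used.

```java
class JmxTroubleshooting {
    // Must run early in main(), before any SSL or RMI class is initialized.
    static void enableDebugLogging() {
        // SSL debug output, narrowed to the handshake phase
        System.setProperty("javax.net.debug", "ssl:handshake");
        // log incoming RMI calls and the exceptions they raise
        System.setProperty("java.rmi.server.logCalls", "true");
    }
}
```

The command-line equivalent would be -Djavax.net.debug=ssl:handshake -Djava.rmi.server.logCalls=true.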

With logging enabled you’ll get to a situation where everything works, so you can connect to your application via JConsole. However, you’ll have lots of error messages in your server log with the following text: “Received fatal alert: certificate_unknown”. This error happens both with the default SSLContext and with a dedicated SSLContext. What is going on?

When you start the JMX connector, it will try to bind itself to the JMX URL in the RMI registry. This RMI registry is running locally in the same process, but there are no shortcuts in RMI, and a network connection is used to communicate with the registry. This connection is plain TCP, without any SSL. The RMI registry receives not just a serialized remote interface, but also a serialized instance of RMIClientSocketFactory which will be used to make remote calls to this object. Each RMI client obtains a remote reference from the RMI registry together with the corresponding RMIClientSocketFactory, and the client should not care whether the underlying transport is plain TCP or SSL, because the supplied RMIClientSocketFactory will be used to make the connection from client to server. Unfortunately, this transport independence works only in theory: in the case of SSL it is the client’s job to configure an SSLContext with a truststore, because the SSLContext is not serialized by SslRMIClientSocketFactory.

If the server’s key is signed by a certificate authority which is present in the client’s default truststore, then everything will work out of the box without any additional configuration. Otherwise, either the public key should be added to the default truststore, or a special truststore should be created and specified via a command-line key. That’s why you need to specify a truststore when starting JMX GUI tools.

The same RMIClientSocketFactory is also used by the registry to return the result of the bind invocation. However, SslRMIClientSocketFactory always uses the default SSLContext, which in our case is not the same as the SSLContext used by SslRMIServerSocketFactory. The server’s private key doesn’t have a public key counterpart in the default truststore, and that’s why you have this error in the log.

If your server’s private key is signed with certificate authority which is present in default truststore, then you are lucky and you will not get this error at all. Another solution is to put server’s public key into truststore of default SSLContext. Yes, this sounds odd, but if you are starting JMX-over-RMI-over-SSL server, then you need to configure not only keystore, but also truststore! But that is not always possible.

If your RMI registry were started as a separate process, you would have the same errors in the RMI registry, so a truststore would have to be configured for the RMI registry. So, having the RMI registry in the same process doesn’t save you from configuring a truststore. Funniest of all, even if SslRMIServerSocketFactory uses the default SSLContext, so that the private key is in the keystore of the default SSLContext, you still have to put the public key into the truststore of the same SSLContext! In other words, a process doesn’t trust itself. In my opinion, specifying only the private key should be enough to allow loopback connections.

The good news is that in most cases you can switch off debug logging and ignore this error. The JMX connector is successfully bound in the RMI registry, and the error happens when the registry is trying to return the result of this operation back to the server. If the server doesn’t care about the result, then this error can be ignored.

Can we solve this problem by providing a different RMIClientSocketFactory which will be serialized together with the server’s public key? Such a solution would also allow us to avoid any configuration for GUI clients! Nice idea, but it will not work. You’ll get errors like: “Failed to retrieve RMIServer stub: javax.naming.CommunicationException [Root exception is java.rmi.UnmarshalException: error unmarshalling return; nested exception is: java.lang.ClassNotFoundException: mycustomapp.SslRMIClientSocketFactory”. RMI serialization requires that the client has the classes being deserialized in its classpath. Configuring GUI JMX clients with the server’s custom classes is more difficult than configuring a truststore. The default SslRMIClientSocketFactory is a part of the Java runtime library, so it is present in the classpath of every JMX GUI tool; that’s why there are no problems with RMI serialization, and that’s why you can’t customize it in any way.

It is very interesting to note that the out-of-the-box JMX agent doesn’t suffer from this problem! You can specify only a keystore and there will be no errors in debug logs. How do they do that? Take a look at the class where the system properties are handled. You’ll see that a special SingleEntryRegistry is created. So, there is no need for an RMI call to bind an object into such a registry.

Categories: Java, Technology

Building OpenJDK 8 for Windows using MSYS

August 3, 2015

This article describes how to build OpenJDK 8 on Windows using MSYS. Since the building itself is performed by build scripts, we will focus on two things: installing the necessary libraries and compilers, and fixing the build scripts, since they don’t work out of the box. Like most of my articles, this one is written for my future self, because I’m sure I’ll get back to this task in the future, and I don’t like solving problems I know I’ve solved before but don’t remember how. Also, this article is not a simple list of steps; I’ve tried to explain the reasons behind the actions. The readme file of OpenJDK says that “building the source code for the OpenJDK requires a certain degree of technical expertise”, so let’s acquire this expertise by learning while doing.

Getting the source.

The first step is to get the source. OpenJDK developers use Mercurial version control system as a code storage. You can open this URL: in browser to see a list of projects hosted by OpenJDK. The project you need is jdk8. If you click on jdk8, you’ll see a list of repositories which jdk8 consists of. The top one is called jdk8, which makes a full URL: You may wonder why there are two jdk8 directories in the URL? This remained from some old times when there were so called “gate” repositories to which changes were pushed for integration, and once those changes were verified, they were merged into read-only main repositories. So, jdk8/jdk8 is a read-only repository. Gate repositories approach was abandoned, but for OpenJDK 8 the path to read-only repository remains the same. If you are curious, you can read more about OpenJDK Mercurial repositories here.

So, let’s get ourselves this repository. You will need Mercurial tools for this. I like GUI tools, so I’ve downloaded SmartGit/Hg. It took me a while to figure out why there is no Mercurial option when you try to clone a remote repository. To make this work, you need to download and install the official Mercurial command-line tools, and then go to the settings of SmartGit and point it to the hg.exe tool. This will make Mercurial appear in the list of VCSes. Thus, GUI tools are not a full replacement for command-line tools, they just make life a little easier. If you don’t like GUIs, you can skip them and use the command-line Mercurial tools, that’s quite easy. So go ahead and clone the repository to some local directory.

Structure of OpenJDK build

The top repository jdk8/jdk8 contains only build infrastructure; it doesn’t contain any real source code, which is sorted into several additional repositories. So, from this point we can either download those repositories right away, or do that later, when we have prepared everything. Let’s take the second approach and start with preparing for a build. Take a look at the repository we just cloned. There are two readme files: a short text file and a bigger HTML file. Both are worth reading. There are also two directories, common and make, and three scripts. One of the scripts will download all remaining sources using the Mercurial command-line tools, and we will postpone this until later. The two remaining scripts are the core of the build process.

C is not Java: there are many aspects of the language which are not defined and are compiler-specific, like, for example, the size of an int value. So C programmers achieve portability by having special cases for compiler-dependent things. This is usually done “at source level”: compiler-specific information is moved to a dedicated header file. So porting a C program to another compiler requires changing the compiler-dependent info and recompiling. To simplify this task, programs use scripts which probe the compiler they are running on and generate the compiler-dependent header files. By convention these scripts are called configure. And OpenJDK has this script. We need to run it at least once. After that we have to use the make tool to build everything, because we have a script for it, called Makefile. Such a two-stage configure/make approach is standard in the Unix world for open-source software.

Let’s take a look at the configure file. It is a Unix shell script which prepares a build. It is very small; all it does is execute another configure script, located in common/autoconf. This second configure does a little more, like parsing command-line parameters, about which you can read more in readme.html. The main job is done by another, big script. So, in order to run these scripts we need some Unix-like environment on Windows. There are two options: Cygwin and MSYS. Both environments are quite similar: each provides a shared library (dll) which implements some set of POSIX functions on Windows, and a set of Unix tools compiled as Windows executables, which rely on that dll. Cygwin is bigger, provides a larger set of POSIX calls and includes more Unix tools, so it’s like a complete Unix-like environment. MSYS (which means “minimal system”) supports a smaller set of POSIX calls and provides a set of Unix tools just big enough to run typical configure scripts. I like everything minimal, so I prefer MSYS.

Installing MSYS and dealing with configure.

MSYS itself is not an independent project, it is a part of another project called MinGW (Minimalist GNU for Windows), which is quite an interesting story worth telling. Most application programs written in C use the standard library, and there are many reasons for that. On Unix systems it’s a convenient and portable way to do system calls. The standard library also includes lots of useful functions, like string manipulation. Since the standard library relies on OS services, the OS kernel itself cannot use the standard library. Windows provides its own set of services for applications, called the Win32 API, but Microsoft’s compiler suite provides a standard library for compatibility and convenience. Some standard libraries are tied to specific compilers, but there are independent libraries: newlib, uClibc, dietlibc, musl.

When choosing a standard library one has to consider its features, performance, size, support of a particular OS/CPU, and also the licence. For example, using a library released under the GPL requires you to release your program under the GPL. The licence terms may differ depending on how you link against a library. There are two options: static linking (the library is included into the executable) and dynamic linking. Licensing terms for dynamic linking are usually less restrictive than for static linking. However, if you choose dynamic linking you should somehow ensure that the library is installed on the computers where your program will run.

So, knowing all this we can now get to MinGW. It is a version of the GCC compiler which produces Windows executables dynamically linked with the standard library supplied with Microsoft Visual C v 6.0 (msvcrt.dll). The license allows any code to dynamically link against it, and in practice this library is present on all Windows systems (it is used by Microsoft’s own applications), so you don’t need to distribute it yourself. Thus MinGW produces executables which can be released under any license and distributed in a very simple way. Technically MinGW consists of a set of header files for the standard library, an import library for msvcrt.dll and a version of GCC which produces Windows executables linked with the import library. Later some additional libraries were ported to MinGW and are now provided as a part of it. MinGW was also extended with include files and import libraries for the Windows API, so now you can use it to write native Windows software. MinGW made it easier to port software written in C from Unix to Windows, but that was not enough. Thus MSYS was born: it is an environment for running configure scripts.

OK, back to building OpenJDK. Go to the MinGW site and download the installer. Run it. It will show a list of packages you can install. You don’t actually need the MinGW compilers, since they are not used by the OpenJDK build, but I advise you to install them. You’ll definitely need make and autoconf. You’ll also need basic MSYS and several specific MSYS packages: bsd cpio, mktemp, zip, unzip.

Now that you have installed MSYS, you can start its shell (bash). You can use your Windows paths in a special way, for example “C:\projects\openjdk” should be written as “/c/projects/openjdk”. You can try to run the configure script right away. At the beginning this script will check the availability of required tools, so if you forgot to install the above-mentioned cpio, mktemp, zip and unzip, then configure will complain (that’s how I learned that I need them). So here we will encounter the first problem with the OpenJDK build environment which requires manual intervention. The script will fail to find cpio.
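The path mapping is mechanical, so it can be written down as a tiny function. This is only an illustration of the rule shown in the example above (drive letter becomes the first path component, backslashes become slashes); the class and method names are made up, and MSYS itself does this conversion natively.

```java
class MsysPaths {
    // Converts a Windows path such as C:\projects\openjdk to the MSYS form /c/projects/openjdk.
    static String toMsys(String windowsPath) {
        boolean hasDrive = windowsPath.length() >= 2
                && Character.isLetter(windowsPath.charAt(0))
                && windowsPath.charAt(1) == ':';
        if (hasDrive) {
            char drive = Character.toLowerCase(windowsPath.charAt(0));
            // drop "C:" and prepend "/c", then flip the separators
            return "/" + drive + windowsPath.substring(2).replace('\\', '/');
        }
        // relative path: only the separators change
        return windowsPath.replace('\\', '/');
    }

    public static void main(String[] args) {
        System.out.println(toMsys("C:\\projects\\openjdk"));
    }
}
```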

Learning autoconf

The script will fail finding cpio, since it is called bsdcpio. If you’ll try to track the problem (either by looking at source code or by reading log file) you’ll get to a script To fix our problem, we need to modify this script. However, editing it directly is a wrong way. This script is generated (hence the name) by a tool called autoconf from sources located in OpenJDK folder common/autoconf. So, let’s get there and edit the sources. The actual change should be made in file basics.m4. Replace cpio with bsdcpio.

To generate new you should execute But attempt to do it will fail, will complain that it can’t find autoconf. The reason is simple: autoconf was installed into MinGW location which is not available for MSYS by default. So, you should go to MSYS installation directory and find “etc” directory (on my machine it is located at c:\tools\mingw\msys\1.0\etc). Here you should create a file called fstab which will configure mounting of windows directories to msys filesystem. Take a look at fstab.sample to see how to do it, you may even copy it as fstab and edit it. Your task is to map root MinGW folder as /mingw. To apply changes in fstab you should restart MSYS bash. There is another file in etc called profile, which configures bash. By default this profile will add /mingw/bin into search path. So, if you did everything right, the result of “which autoconf” should be something like “/mingw/bin/autoconf”. Now you can get back and use to generate build script. Do it. Oops, another error.

This time autogen will complain that autoconf 2.69 or higher is required. However, MinGW includes version 2.68. When I encountered this error I’ve decided to try with 2.68, and believe me, it works perfectly fine. So, let’s hack OpenJDK build scripts and fix the required version. It is specified in file Again execute This time it should work. Ignore output about no custom hook found.

We just fixed our first configure-stage error, and there will be more. To simplify troubleshooting, you should take a look at the file called config.log, which contains the output produced by the configure script. If this log is not verbose enough, you can start configure with the command-line argument --debug-configure. It will make the script produce an additional log called debug-configure.log, which is very verbose.

Installing bootstrap JDK.

A large part of the JDK is written in Java, including the compiler. So building the JDK requires you to have some bootstrap JDK. I’ve never had any problems installing it. You can even install it into the default directory, or at any other path, even one which includes spaces.

Having fun with Microsoft Windows 7 SDK.

MinGW provides C and C++ compilers for Windows, but the only compiler officially supported by OpenJDK is Microsoft Visual C++, and we are going to use it. Otherwise configure will complain that it cannot find Visual Studio and quit. If you own Visual Studio, that’s great, and you can skip this part. However, in this article I’ll describe how to use minimalist development tools. So, we will use the Microsoft Windows 7 SDK, which includes the command-line C and C++ compilers from Visual Studio 2010. And it is free! You should download it from the official Microsoft site. There are a web installer and several ISO images: for 32-bit systems, for Itanium and for 64-bit systems (amd-64). During the installation you can select which components to install, and I suggest keeping the default settings, which include all necessary libraries and the compiler. If you encounter problems during the installation, check the installation logs for an exact description of the failure. I got an error saying that the SDK can’t install the redistributable runtime libraries. Even de-selecting these libraries in the list of installed components doesn’t help. This happens because you already have a more recent version of those libraries installed (I had version 10.0.40219, and the SDK installs 10.0.30319). It’s a shame for Microsoft to keep such bugs in the installer. The only workaround is to uninstall your current redistributable of the Microsoft Visual C runtime libraries, then install the Windows SDK, and then download and install the latest version of the runtime library.

Now let’s check if compilers are working. If you will skip this part, you may get nasty errors much later. So, go to “c:\Program files (x86)\Microsoft Visual Studio 10.0\VC\bin\amd64” and launch cvtres.exe. If it has started successfully, that’s good. But on some systems it fails with application error. In fact you can skip this error, since it will not manifest at configure stage, but you’ll get strange error messages later on make stage, so let’s fix it now. Careful investigation with Dependency Walker tool shows that cvtres.exe imports a bunch of functions from msvcr100_clr0400.dll, and this dll doesn’t have any exported functions. Actually a version of this library included in SDK is OK, but some update for Microsoft .Net framework overwrites it with no-export version. Nice. In order to fix this, you need to download a patch from Microsoft called Microsoft Visual C++ 2010 Service Pack 1 Compiler Update for the Windows SDK 7.1. It will fix dependency problem for cvtres.exe, it will use another version of runtime dll. Download the update, install it and check that cvtres.exe works.

No, that’s not all. The update we just applied broke another thing. Unbelievable. I’ve created an empty file called ammintrin.h just to get around this annoying thing.

Patching the build scripts

Having the Windows SDK will let you get further with configure, but eventually it will fail. That happens because the scripts for building OpenJDK 8 using MSYS have errors. These errors were fixed in the scripts for OpenJDK 9. The link to the fixes can be found in this mail thread. The initial letter from Volker Simonis contains a change request, and in subsequent messages Erik Joelsson extended it. Here is a list of changes:

  1. Fix for bsdcpio in basics.m4, which we have already applied
  2. Change in basics_windows.m4, in AC_DEFUN([BASIC_FIXUP_EXECUTABLE_MSYS]), which fixes the problem of configure being unable to find the set_env.cmd file of the Windows SDK
  3. Two fixes in toolchain_windows.m4: one for architecture type, and another with quotes for grep
  4. Fixes in platform.m4 for correct environment value. It’s a supplementary fix for other fixes to work.
  5. Fixes in NativeCompilation.gmk and will help if you’ll have an error during make. Without those fixes you’ll have to clean everything and re-make again from scratch, which takes a lot of time

So we should manually apply those fixes for OpenJDK 8. There are also change in, but you don’t need to apply it. Instead, generate it via autogen.


OpenJDK requires the FreeType library. You can build it yourself from sources; I’ve downloaded a pre-built version. However, this pre-built version was strange: it included an import library freetype.lib with all functions prefixed with an underscore (“_”). To fix this, I’ve created an import library manually from the dll using the lib tool included in the Microsoft Visual C command-line compiler suite (lib.exe /def:freetype6.def). This will produce a file freetype6.lib, which you should rename to freetype.lib, overwriting the existing file (I’ve made a backup copy of it called _freetype.lib). You also need to copy freetype6.dll from the bin directory into the lib directory and rename it to freetype.dll. And, finally, you need to explicitly specify the path to the location where you’ve installed FreeType. The corresponding command-line argument for the configure script is called --with-freetype.

Completing the configure

If you’ve done everything right, the configure step will successfully finish. The result will be stored in build directory of OpenJDK. The main item here is specs.gmk. Now you should download modules with source code.


Launch make all. If make hangs, edit specs.gmk and set JOBS=1. As a result you’ll get a directory called j2sdk-image; that’s your JDK!

Categories: Java, Windows

How RMI works

July 17, 2012

Sometimes the best way to teach someone how to use something is to explain how it works inside. This small explanation on Java RMI was written especially for me so I could quickly restore this knowledge in case I forget.

If you have some object and you want to make it accessible to remote parties, you have to “export” it into the RMI subsystem. RMI will generate some sort of identifier for your object and will store a binding between your object and this identifier inside some storage. When a remote party wants to make a call to your object, it makes a connection to your JVM and sends a protocol message containing the object identifier, the name of the method and the parameters in serialized form. The RMI subsystem finds the object by its identifier, deserializes the parameters and then performs the method invocation using reflection.

The serialized form of the parameters contains their exact class. So even if parameters are declared in the method as something abstract, the server first creates instances of their exact class and only then performs the upcast. This means that the exact classes of the parameters must be in the server’s classpath.

To communicate, a remote party must somehow obtain the identifier of the exported object. This is solved by making an additional lookup. A specific object named “Registry” is bound to a static identifier (let’s call it “1”). This object has its own storage and it allows mapping of other objects to strings. So to obtain a reference to a “registered” remote object, a client must know the string key which was used during the object’s registration. The client constructs a reference to the registry using the static identifier “1”, then asks it to return the identifier of the registered object.

This double-referencing seems complex. However, it provides some level of protection. Registered objects are “public” and anyone who knows a name can call them. Names by which objects are registered are not secret, and you can query a registry for a list of all names. A method call to a public object may return a remote reference to some “private” object, with randomly generated id which is hard to guess, so it will be available only to method’s caller.

If a server wishes to return a remote reference to an object instead of a serialized copy, then it must export this object to the RMI subsystem. The same is true for a client if it provides a callback parameter. UnicastRemoteObject is a class whose instances automatically export themselves in the constructor.
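The difference between explicit export and the self-exporting UnicastRemoteObject can be sketched like this. The Clock interface and both implementations are invented for the example. The sketch uses the fact that UnicastRemoteObject.unexportObject throws NoSuchObjectException for an object that was never exported, so it doubles as a check of whether an export has happened.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

class ExportDemo {
    // Hypothetical remote interface.
    public interface Clock extends Remote {
        long now() throws RemoteException;
    }

    // Plain implementation: unknown to RMI until somebody exports it.
    static class PlainClock implements Clock {
        public long now() { return System.currentTimeMillis(); }
    }

    // Extends UnicastRemoteObject: the superclass constructor exports the instance.
    static class AutoClock extends UnicastRemoteObject implements Clock {
        AutoClock() throws RemoteException { super(); }
        public long now() { return System.currentTimeMillis(); }
    }

    // Removes the object from the RMI runtime; returns false if it was never exported.
    static boolean unexport(Remote object) {
        try {
            return UnicastRemoteObject.unexportObject(object, true);
        } catch (NoSuchObjectException e) {
            return false;
        }
    }

    // Returns { plain object was exported, auto object was exported, explicitly exported object was exported }.
    static boolean[] demo() {
        try {
            boolean plain = unexport(new PlainClock());
            boolean auto = unexport(new AutoClock());
            PlainClock clock = new PlainClock();
            UnicastRemoteObject.exportObject(clock, 0); // port 0 = any free port
            boolean explicit = unexport(clock);
            return new boolean[] { plain, auto, explicit };
        } catch (RemoteException e) {
            throw new IllegalStateException("Export failed", e);
        }
    }
}
```

Unexporting matters in small programs too: while objects remain exported, the RMI runtime keeps the JVM alive.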

Let’s check if we understand everything by describing the process of registering an object. The registry is often started as a standalone process. If a server wants to register an object, it must first construct a remote reference to the registry. The interface itself is known (“java.rmi.registry.Registry”) and located in the runtime library. The object identifier is also known, it is static. So the server only has to provide the host and port where the RMI registry is running. The server exports its object, then invokes the bind() method. RMI understands that the argument of the remote call was exported, so it sends the object identifier and the names of the classes which are required for the remote interface (the interface itself, all super-interfaces, all declared parameter classes and all declared return classes). The string key is serialized. Now the serialized string, the identifier of the registry object and the info about the registered object are sent to the registry process. The RMI subsystem in the registry creates a remote reference with the object’s identifier which implements the object’s remote interface. Then RMI locates the registry object using the registry’s identifier and invokes its bind() method to store the remote reference together with the key. When a client invokes lookup(), it connects to the registry in the same way as the server, and the registry transfers the stored remote reference to the client. Now the client can connect directly to the server and make a call.
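The whole sequence (export, bind, lookup, call) can be tried in a single JVM. The sketch below is a simplification under stated assumptions: the Greeter interface and the key "greeter" are made up, and the registry object returned by createRegistry is used directly, whereas a separate server or client process would obtain its reference with LocateRegistry.getRegistry(host, port). The call through the looked-up stub still travels over a loopback TCP connection.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class RegistryRoundTrip {
    // Hypothetical remote interface shared by server and client.
    public interface Greeter extends Remote {
        String greet(String name) throws RemoteException;
    }

    static class GreeterImpl implements Greeter {
        public String greet(String name) { return "hello, " + name; }
    }

    static String roundTrip() {
        // Keep the demo on the loopback interface; must be set before RMI is first used.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        GreeterImpl impl = new GreeterImpl();
        Registry registry = null;
        try {
            registry = LocateRegistry.createRegistry(0);          // registry on any free port
            Remote stub = UnicastRemoteObject.exportObject(impl, 0); // server side: export
            registry.bind("greeter", stub);                       // server side: register under a key
            Greeter remote = (Greeter) registry.lookup("greeter"); // client side: lookup by key
            return remote.greet("world");                         // client side: call via the stub
        } catch (Exception e) {
            throw new IllegalStateException("RMI round trip failed", e);
        } finally {
            quietUnexport(impl);
            if (registry != null) {
                quietUnexport(registry);
            }
        }
    }

    // Unexports an object so the RMI runtime does not keep the JVM alive.
    private static void quietUnexport(Remote object) {
        try {
            UnicastRemoteObject.unexportObject(object, true);
        } catch (NoSuchObjectException ignored) {
            // the object was never exported, nothing to clean up
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```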

The bad thing about RMI is that, because of serialization, the server must be able to create the exact classes of parameter objects, and the client must be able to create the exact classes of return values. The registry also has to know a lot about all registered interfaces. This makes systems built on top of RMI not very flexible. However, there is a way for one side to tell the RMI classloader on the other side where the class files are located. It is the system property “java.rmi.server.codebase”. To make things easy I’ve written a simple HTTP server which can be deployed in any application which uses RMI, so you can be sure that if it compiles, then it works.

Categories: Java, Technology Tags: ,

A study on Java APIs for SIP. Part 2: SIP Servlets API

May 19, 2009 1 comment

Now let’s take a look at JAIN SIP API’s younger brother called SIP Servlets API. This guy was ambitious from birth, so he joined a mob called “JEE” to receive money from big business. But big business never gives money for free, so SIP Servlets API had to cover all behaviour specified for SIP and to provide as large a framework as possible, so that programmers who work for big business would not have to think about protocol details or about the execution model. The attempt has failed miserably, because SIP, unlike HTTP, is not about content.

SipServlet interface

The first obvious problem of SIP Servlets API is that it is an extension of the generic Servlets API. The cornerstone interface of the generic Servlets API is called Servlet, and its lifecycle is driven by only three methods: init(), service() and destroy(). In my opinion, the main problem here is that this interface is essentially a too generic framework. Behaviour of a container is always protocol-dependent, so servlets are also protocol-dependent. Still, everybody is forced to use this narrow interface for interaction. Are there any convergent servlets which handle several protocols through their service() method? No, there are convergent applications instead. I think that the service() method should be removed from the Servlet interface. Instead, all concrete servlets would have their own service() method, accepting protocol-specific requests and responses.

I understand that most people don’t bother with problem of downcasting request and response in service() method, so it is not a big deal. I just don’t like it working in this way.
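To make the complaint concrete, here is a minimal sketch of both shapes. All types in it are hypothetical stand-ins written for this example, not the real servlet classes.

```java
public class TypedServletSketch {

    // Hypothetical stand-ins for protocol-neutral and SIP-specific messages.
    interface GenericRequest {}
    interface GenericResponse {}

    static class SipRequest implements GenericRequest {
        final String method;
        SipRequest(String method) { this.method = method; }
    }

    static class SipResponse implements GenericResponse {
        int status;
    }

    /** Today's shape: a protocol-neutral entry point that forces a downcast. */
    interface GenericServlet {
        void service(GenericRequest req, GenericResponse resp);
    }

    /** Proposed shape: the entry point already carries protocol types. */
    interface TypedSipServlet {
        void service(SipRequest req, SipResponse resp);
    }

    static class DowncastingServlet implements GenericServlet {
        @Override
        public void service(GenericRequest req, GenericResponse resp) {
            // A wrongly wired container is detected only here, at run time.
            SipRequest sipReq = (SipRequest) req;
            SipResponse sipResp = (SipResponse) resp;
            sipResp.status = "INVITE".equals(sipReq.method) ? 180 : 200;
        }
    }

    static class TypedServlet implements TypedSipServlet {
        @Override
        public void service(SipRequest req, SipResponse resp) {
            // No casts: the compiler guarantees the protocol types.
            resp.status = "INVITE".equals(req.method) ? 180 : 200;
        }
    }
}
```

Both servlets do the same work; the only difference is whether a mismatch between container and servlet is caught by the compiler or by a ClassCastException.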

Applications always do some dispatch for incoming messages, so APIs provide some dispatch out of the box to help application developers. Unlike JAIN SIP API, which dispatches based on local endpoint and on message type (request/response), SIP Servlets API has built-in dispatch based on request method and response status code. I think that this approach is more useful for applications.


The next bad thing in SIP Servlets API is the existence of the SipServletMessage interface. It’s exactly the same case as with JAIN SIP API: this abstraction is not used by anyone. Yes, it is good that signatures of methods in SipServletRequest and SipServletResponse are the same, but nothing would break if these signatures differed. Authors of SIP Servlets API have ignored the lesson of HTTP Servlets API, which doesn’t have a common interface. I understand that HTTP is much more asymmetric than SIP: in HTTP servlets people read headers of requests and write headers and body to responses, so syntactic similarity is not related to behaviour. But I still don’t see how syntactic symmetry of SIP requests and responses could be used in practice.

The syntax part of SIP Servlets API is much smaller and simpler than that of JAIN SIP API. Obtaining a header returns a string, and adding or changing a header also accepts the value as a string. Additional parsing is supported only for address headers.


SipServletMessage interface includes not only syntax-related methods but also method send(). Message is a nice context for method send(), because it is the message which should be sent. The fact that method send() belongs to SipServletMessage interface shows that SIP Servlets API doesn’t strive for separation of syntax and behaviour. SipServletMessage is not just a header, start line and body, but it is a gateway to SIP stack, hiding transactions, dialogs and all other protocol layers. This means that you can’t just “forward” an incoming message, because it can’t be separated from all internal state. Instead you should either create a new message and manually copy all necessary data from the original message into the new one, or use hacks provided by the API (like proxyTo() or the createRequest() method which accepts the original request). Thus the number of interfaces in the API is low, but the number of behavioural methods is large, and their semantics is more complex. However, as long as the “message=syntax+state” approach was selected, all methods which implement SIP behaviour also belong to SipServletRequest or SipServletResponse, so the message is a context for an action. Such an approach is easy for beginners to understand.

Another problem of method send() is that the compiler will not complain if you try to invoke it on an incoming message. Of course the SIP stack will not send such a message; you’ll get IllegalStateException at run time. Thus, SIP Servlets API is not designed to use the type system for preventing errors. A correct solution would be to have separate classes like IncomingRequest, OutgoingRequest, IncomingResponse, OutgoingResponse. There are other behaviour-related methods which could be moved to specific classes instead of throwing an IllegalStateException: createCancel(), createResponse(), createAck(). Some special requests (like ACK and CANCEL) could also be represented by specific classes which would not have the method createResponse().
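A rough sketch of how such a split could look is given below. The class names follow the ones proposed in the paragraph above; everything else (fields, the answer() helper) is invented for illustration and is not part of SIP Servlets API.

```java
public class TypedMessagesSketch {

    /** Received from the network: can be answered, cannot be sent. */
    static final class IncomingRequest {
        final String method;
        IncomingRequest(String method) { this.method = method; }

        OutgoingResponse createResponse(int status) {
            return new OutgoingResponse(status);
        }
    }

    /** Built locally: the only kind of response that has send(). */
    static final class OutgoingResponse {
        final int status;
        private boolean sent;
        OutgoingResponse(int status) { this.status = status; }

        void send() { sent = true; } // a real stack would hand it to the transport here
        boolean isSent() { return sent; }
    }

    /** ACK never gets a response, so this class simply has no createResponse(). */
    static final class IncomingAck {}

    static OutgoingResponse answer(IncomingRequest request) {
        // request.send();  // would not compile: IncomingRequest has no send()
        OutgoingResponse response = request.createResponse(200);
        response.send();
        return response;
    }
}
```

With this shape the mistakes that currently end in IllegalStateException are rejected by the compiler, at the price of a few more classes in the API.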

SIP Servlets API provides a large application framework with different listeners. However, it doesn’t provide a powerful protocol framework. Instead, you have a SipSession which represents either a dialog or a proxying session. This SipSession is used for two purposes. First, as a factory for in-dialog requests. Second, as a storage for application context. This means that for every incoming subsequent message an application should restore its context from the SipSession object. Such manual contextualization is not a very convenient thing to program, but it allows the application server to be distributed and fault-tolerant. The ability to set a servlet which will handle the session can’t be considered a protocol framework, since that servlet also can’t have its own context and should contextualize itself from SipSession. A little bit more information is given in a subsequent post.


SIP Servlets API is not an API for a SIP stack. Instead it is an API for an application server, which means that it tries to be as complete as possible and is not designed to be extensible. The container should provide all necessary functionality, and servlets should just contain the business logic needed to handle incoming messages. This makes it attractive to beginners who enjoy its protocol power. This API will not allow you to violate SIP rules; however, this is usually enforced by throwing exceptions at run time. A protocol framework is absent; all you have is a SipSession to store and restore your context.

Next article will compare both APIs, will discuss lots of general API-related stuff and will propose better solutions.

Categories: Java, SIP, Telecom Tags: , ,

A study on Java APIs for SIP. Part 1: JAIN SIP API

May 15, 2009 8 comments


This is the first article in a series which will study popular Java APIs for SIP: JAIN SIP API and SIP Servlets API. My intention is to analyze what is good and what is bad, and why it is so. These articles represent my personal opinion, however I’m not just going to tag things as “good” or “bad”. Instead, I’ll try to explain why I like or dislike something. The study will focus on technical aspects of the APIs. Since I’m not satisfied with the current state of affairs, I also want to propose a better API. Yes, the purpose of this study is to justify the need for another SIP API, because I believe that API is very important.

I don’t think that a long introduction is needed, so let’s start with JAIN SIP API.


This API is quite curious because it is very maximalist in its implementation of SIP syntax and quite minimalist in its implementation of SIP behaviour. I have a very strong impression that the author (exactly as I did) focused at the beginning on syntax, believing that the parser is the only re-usable part of SIP, and that all messaging scenarios are so volatile that they should be implemented in applications. Version 1.0 of this API covers only a parser together with a stateless sender/receiver. However, version 1.2 also covers some re-usable behavioural layers, such as the transaction layer and the dialog layer.

Syntax representation

A noticeable feature of JAIN SIP API is that you can work with protocol syntax without a running stack. You just need to obtain a specific factory and then create syntax objects through it.

There are two interfaces which represent two kinds of SIP messages: Request and Response. These interfaces have a common parent interface Message, which contains common syntax-related functionality. I think that existence of the Message interface is a bad thing. Occam’s razor should be applied here: RFC 3261 doesn’t say anything about messages in general, so this abstraction is totally unnecessary. Let’s do a simple check: are there any methods of JAIN SIP API which accept Message as a parameter or return it as a result? No. This is an example of the tendency of OOP people to apply abstraction everywhere, and also of an implementation detail showing through an API.

Interfaces Request/Message/Response provide methods for obtaining/adding/removing header fields, modifying the start line and obtaining/setting the body. These methods are defined quite well, except for the methods for obtaining headers. Instead of a single method which accepts a header name and returns an abstract Header which has to be downcast, I propose having specific methods for each known header, for example getViaHeader(). This will make code clearer and will involve the compiler in error checking.
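The difference between the two styles of access can be shown with a small sketch. The types below are simplified stand-ins written for this example (a single header per name, a Via header with only a branch); they are not the JAIN SIP interfaces.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TypedHeadersSketch {

    interface Header {
        String getName();
    }

    /** Simplified, hypothetical Via header carrying only the branch parameter. */
    static final class ViaHeader implements Header {
        final String branch;
        ViaHeader(String branch) { this.branch = branch; }
        @Override
        public String getName() { return "Via"; }
    }

    static final class Message {
        private final Map<String, Header> headers = new LinkedHashMap<>();

        void addHeader(Header header) { headers.put(header.getName(), header); }

        /** Generic lookup: the caller has to know the name and downcast the result. */
        Header getHeader(String name) { return headers.get(name); }

        /** Proposed style: the cast is hidden in one place and the result is typed. */
        ViaHeader getViaHeader() { return (ViaHeader) headers.get("Via"); }
    }

    static String branchOf(Message message) {
        // Generic style would be: ((ViaHeader) message.getHeader("Via")).branch
        return message.getViaHeader().branch;
    }
}
```

A misspelled header name in the generic style yields a null at run time, while a misspelled getViaHeader() does not compile.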

JAIN SIP API strives to have an interface for every documented SIP header. Interfaces for all headers are descendants of the Header interface. This interface has a getName() method, which is correct in my opinion, meaning that the header name defines the format of the header value. Unfortunately, there is no getValue() method; such a method is available only for ExtensionHeader. This is bad, because for some headers (like ‘Content-Type’, for example) it is often necessary to obtain the whole value instead of accessing its parts.

I don’t have very much to say about Address, URI and SipURI. They are OK.

As you can see, I’m quite satisfied with syntax part of JAIN SIP API. These objects are just data structures reflecting syntax structure of SIP messages.

Stack management: SipStack, SipProvider and ListeningPoint

SIP stack is managed through interface SipStack. Besides lifecycle methods start() and stop(), this interface also has factory methods for ListeningPoints and for SipProviders. These two classes are then supplied to methods add()/remove() to define which network endpoints will be served by which provider.

ListeningPoint is a combo-class for an InetSocketAddress and a String which specifies the name of a transport protocol. This class could be avoided altogether. In all methods where ListeningPoint is passed as a parameter it could be replaced with a combo of InetSocketAddress and String. There is a method that returns ListeningPoints, but it can very well be replaced with two methods: one that returns InetSocketAddresses, and another that returns transports for a provided InetSocketAddress. So the only point behind this class is to make argument lists shorter. I think that it is better to add one or two methods to an existing class than to have another useless class.

Interface SipProvider is used to bind a specific SipListener to a specific network endpoint. Thus, JAIN SIP API can only dispatch incoming traffic based on address where it was received. If you can handle all the traffic by single listener – fine, but you still need to create and maintain separate SipProvider for each local endpoint. SipProviders can be added to SipStack or removed from it only while it is stopped. This means that in order to listen on one more port you need to stop listening on all other ports. Such restriction is a very bad idea.

Another responsibility of SipProvider is to be a factory for transactions and dialogs. This implicitly means that all events for these transactions and dialogs will be handled by the listener associated with SipProvider, and all outgoing messages will be sent from the local endpoint associated with this provider. A third responsibility is to serve as a facility for stateless sending of requests and responses.

I suggest that Occam’s razor should be applied to this interface, because its responsibilities are vague. It’s essentially a context for a very few things, which could be provided explicitly. All methods of this class could be moved to SipStack, so the setListener() method would accept an InetSocketAddress, factory methods for transactions and dialogs would accept a SipListener, and the local endpoint for sending would be chosen automatically.


Client and server transactions are represented by interfaces ClientTransaction and ServerTransaction. These interfaces have common parent interface Transaction, which (very similar to Message) has no apparent use and could also be removed without anybody noticing. ClientTransaction has method for sending request and creating a cancel, ServerTransaction has method for sending responses.

Let’s try to apply Occam’s razor to these classes and see if it is possible to replace them with methods. For ServerTransaction, an answer seems to be no, because sometimes server transactions are created automatically by stack.

Maybe a client transaction can be replaced with a method sendStatefully() on SipStack? In case of an incoming response or a timeout, an application needs a context to handle these events. JAIN SIP API is built in a way that transactions are used as contexts, thus transactions are needed. But maybe it is possible to replace the separate factory method and sending method with just one method, which would send the message statefully and return a transaction object? This would also eliminate vague problems like: what should the stack do if the ‘Via’ branch of the request has changed after the creation of the transaction? The problem with such an approach is caused by a threading issue: an event can happen before the thread which invokes sendStatefully() retrieves the transaction, and this event will be handled by another thread which will not find the context for the event. However, this problem can be solved by the application providing something as a context instead of re-using transactions for that purpose. Thus, sendStatefully() would accept a context from the application, and use that context in ResponseEvent. Thus, client transactions can be avoided.

In fact, transactions do have some support for application-provided contexts through methods setApplicationData() and getApplicationData(). They have been introduced as a convenient way to avoid having lookup facilities for context in applications. A good idea, but it is implemented in the API in a way that forces applications to be written like in BASIC:

10 let transaction=provider.createTransaction(request);

20 transaction.setApplicationData(context);

30 transaction.sendRequest();

instead of single invocation:

stack.sendRequest(request, context);

Since send() is invoked only once, there is no need to have separate setApplicationData() method.
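A sketch of how a stack could support such a single call is given below. The Stack class, its method names and the use of the Via branch as a lookup key are assumptions made for this illustration; none of this is part of JAIN SIP API, and for brevity only one (final) response per request is handled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ContextSendSketch {

    /** Callback that receives the response together with the caller's context. */
    interface ResponseListener {
        void onResponse(int status, Object context);
    }

    /** Hypothetical stack: remembers the context before anything is sent. */
    static final class Stack {
        private final Map<String, Object> contexts = new ConcurrentHashMap<>();
        private final ResponseListener listener;

        Stack(ResponseListener listener) { this.listener = listener; }

        void sendRequest(String branch, Object context) {
            // Stored first, so a response arriving on another thread always finds it.
            contexts.put(branch, context);
            // ... the request would be written to the network here ...
        }

        /** Would be called by the network thread when a final response arrives. */
        void responseReceived(String branch, int status) {
            Object context = contexts.remove(branch);
            listener.onResponse(status, context);
        }
    }
}
```

Because the context is registered inside sendRequest() itself, there is no window in which a response can arrive before the application has attached its data.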

Transactions have a getState() method which returns the transaction state as defined in the RFC. This method is more an implementation detail than a useful thing. Applications are not really interested in the difference between COMPLETED and TERMINATED states. Instead, they are interested in things like canRespond(), canCancel() or requestSent().

So, what exactly are the responsibilities of transaction classes? Implementation details which they expose are not really useful. The answer is that they are more than transactions as they are defined in RFC.

For client transactions, method send() includes some functionality which is common for proxies and user agents, such as arranging “Route” headers and determining the remote address. This functionality in fact belongs to another layer (as described here).

Another purpose of transaction classes is to serve as a protocol-level context. For example, it is possible to obtain the original request. ClientTransaction has a method createCancel(), which should be in MessageFactory.

In my opinion, cancellation in JAIN SIP API is done very badly, in the same BASIC style:

10 Request cancel = clientTran.createCancel();

20 ClientTransaction cancelTran = provider.createTransaction(cancel);

30 cancelTran.sendRequest();

instead of simple:


which is not subject to errors caused by thread races. I mean: what will happen if a final response is received between lines 20 and 30? JAIN SIP API doesn’t give you an answer. Since the stack is a multithreaded module, sending should be done through “atomic” actions, doing several things at once by implementing internal locking.
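One possible shape of such an “atomic” action is sketched below. The cancel() method and the ClientTransaction class here are hypothetical and written only for this example; they are not the JAIN SIP classes.

```java
public class AtomicCancelSketch {

    /** Hypothetical client transaction with an atomic cancel(). */
    static final class ClientTransaction {
        private boolean finalResponseReceived;
        private boolean cancelSent;

        /** Invoked by the stack thread when a final response arrives. */
        synchronized void onFinalResponse() {
            finalResponseReceived = true;
        }

        /**
         * Checks the state and sends CANCEL under one lock, so a final
         * response cannot slip in between the check and the send.
         *
         * @return true if CANCEL was sent, false if it was already too late
         */
        synchronized boolean cancel() {
            if (finalResponseReceived || cancelSent) {
                return false;
            }
            cancelSent = true; // a real stack would build and transmit CANCEL here
            return true;
        }
    }
}
```

The application gets a definite answer from a single call, and the question of what happens between two separate steps does not arise.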

Another example of not taking threading issues into account is the existence of a factory method for server transactions. I strongly believe that server transactions should always be created implicitly for incoming requests (except ACK, of course). The usefulness of a stateless proxy is minimal, so supporting it the way JAIN SIP API does doesn’t justify the problems it brings. For example, what will happen if a retransmission is received while the application has delayed processing of an incoming request? Yes, the retransmission will be processed, and the application will get an exception when it tries to create a server transaction for the retransmitted request.

Processing of incoming CANCEL is another weak point of JAIN SIP API. I think that it is a responsibility of the stack to discover which server transaction should be cancelled, but JAIN SIP API makes it a job for the application. But even if there were a special CancelServerTransaction with a method getCancelledTransaction(), this behaviour would be plagued with threading issues. Thus, such an improvement will be useful only if server transactions are created automatically.

At least two implementations (NIST SIP and SIP from OpenCloud) do recognize that server transactions should be created automatically. NIST SIP creates a hidden “prototype” for a transaction. Stack of OpenCloud introduces a method on transaction which removes it. These are bad hacks, because problem should be fixed on API level.

Transactions have method getDialog() which should return a dialog corresponding to that transaction. First, I don’t see any practical reason for this method. And second, what should this method do in case of dialog forking?


I’m more or less satisfied with implementation of dialogs in JAIN SIP API. There are several problems with them, but these problems are not as bad as in other areas.

Existence of separate factory methods for normal requests and ACK is a bad idea. Method for incrementing local sequence number is a bad idea.

Ability to set application context for dialog through setApplicationData() is a good thing. To have this method here is not as bad as for transactions, since dialogs live longer, so application context may change. Of course, changing application context can lead to complex errors because of threading, but it is still a useful thing.

JAIN SIP API doesn’t describe how the stack should behave in case of dialog forking. An instance of Dialog which was created by the application will be returned with the response from the first destination. Responses from other destinations will return other dialogs, but how will the application recognize them? There is no event like “DialogForked” that would tell the application that a new dialog is related to an existing one.


JAIN SIP API is not a truly layered API. By looking at it you may think that Dialog.sendRequest() invokes ClientTransaction.sendRequest() which, in turn, invokes SipProvider.sendRequest(). While the first may be true, the second can’t be, because both ClientTransaction.sendRequest() and SipProvider.sendRequest() perform the same actions which may modify the request (by exchanging “Route” header and request-URI if “Route” value doesn’t have “;lr”). Thus it is not possible to build your own transaction layer on top of SipProvider. It is also not possible to build your own dialog layer, because server dialogs should move to CONFIRMED state when a successful response is sent through a server transaction, or to TERMINATED state when an unsuccessful response is sent, but there are no means for your dialog to be notified about that.

SipListener and events

An application is notified about incoming messages and changes in the state of objects through the SipListener callback interface. There are just two methods for processing incoming messages: processRequest() and processResponse(), so it’s the application’s job to dispatch processing based on the content of events. Implementations of these methods are usually trees of “if/else” statements which analyze transactions, dialogs and application data. A much better way would be to allow setting specific listeners for particular transactions and dialogs. These specific listeners would easily replace all application-provided contexts and would eliminate any dispatch code in the application. Thus by applying the IoC principle to its full extent it is possible to turn the stack into a good protocol-based framework. (Update: I’ve explained this in more detail in a subsequent post)


Let’s summarize everything I’ve said about JAIN SIP API.


Good things:

  • Syntax objects are separated from the behavioural part
  • Fairly complete
  • Easy to understand and use for simple tasks by developers who like BASIC-style imperative programming
  • Has semantics which is close to RFC
  • Rather easy to implement
  • Since it is not restrictive, it is flexible and extensible


Bad things:

  • Parser for messages is not available
  • Stack management is unnecessarily complex and restrictive
  • Transactions expose implementation details and implement a protocol context badly
  • Doesn’t help with productivity
  • Three ways to send a request. Two ways to send a response.
  • Is not fully complete. For example, doesn’t cover proxying.
  • Doesn’t prevent you from making mistakes
  • Doesn’t allow you to override some layers
  • It is not a real framework (expanded here)
  • Has holes in specification

Be careful, you have been warned!

In next article I’ll discuss SIP Servlets API.

Categories: Java, SIP, Telecom Tags: , ,

Changing request URI for in-dialog requests

May 14, 2009 Leave a comment

In SIP Servlets API there is a concept of “system headers” which cannot be changed, because changing them could violate SIP rules. An attempt to change these headers will result in the container throwing IllegalArgumentException. These headers can never be changed. But SIP rules are more complex. For in-dialog requests it is mandatory that the request URI and “Route” headers contain values obtained from the dialog state. Thus, methods addHeader(“Route”), setHeader(“Route”), removeHeader(“Route”), pushRoute() and setRequestURI() should throw IllegalStateException for in-dialog requests. Unfortunately, this is not specified in the SIP Servlets spec. Implementations also don’t fully follow those rules. For example, Sailfin will throw IllegalStateException upon pushRoute(), but will allow changing this header through addHeader(), setHeader() and removeHeader(). It will also allow you to change the request URI of an in-dialog request. Since SIP Servlets API strives to enforce SIP rules, these things should be taken into account.

Categories: Java, SIP Tags: , ,

Networking in Java: non-blocking NIO, blocking NIO and IO

January 29, 2009 4 comments

Standard run-time library in Java provides two interfaces for networking. One, which has existed in Java since the beginning, is called “basic IO”, because it is based on a generic framework of “input streams” and “output streams” defined in package “”. Sun did a good thing by providing a uniform way of accessing files and sockets, following the Unix philosophy. However, there are some drawbacks in stream-based access, so Sun created another set of interfaces located in the “java.nio” package. This package also provides uniform access to files and sockets, and is much more flexible than basic IO.

The main problem with basic IO was scalability with the number of connections. Operation read() blocks until some data becomes available. It is not a problem if your program accesses files, because file operations never block for a long time. You just read the data until you reach the end of file. Reading after the end of file will immediately return with “-1” bytes read. Another good thing is that most programs usually access quite a small number of files. In other words, when working with files it is the data that waits for the program to process it, while the program can decide what size of internal buffer to use for processing.

But with networking the model of basic IO is not so convenient. First, a read() operation may block the executing thread for a long time. This means that to handle several connections simultaneously you’ll need as many threads as you have incoming connections. There is a small thing which can help you not to block forever: you can specify a timeout for socket operations. But it will not solve the scalability problem.

Another problem is related to “message”-based structure of most protocols. Often you don’t know how much data you’ll receive. So, you have to organize your code in a special way:

  • Always read data by one byte, then assemble data array from those bytes. Code is simple, but slow.
  • Read one byte first, then use the available() method to determine if there is more data to read. If there is, then read the remaining data using a bulk operation. Code is more complex, but faster than the previous way.
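The second way from the list above can be sketched as follows. This is a minimal illustration; the class and method names are invented for the example, and it relies on available() being only an estimate of what can be read without blocking.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class AvailableReadSketch {

    /** Blocks for the first byte, then takes whatever else has already arrived. */
    static byte[] readChunk(InputStream in) throws IOException {
        int first = in.read();               // blocks until data or end of stream
        if (first == -1) {
            return new byte[0];              // end of stream, nothing to return
        }
        int pending = in.available();        // estimate of bytes readable without blocking
        byte[] chunk = new byte[1 + pending];
        chunk[0] = (byte) first;
        // Bulk read of the rest; skipped when nothing more is pending.
        int n = pending > 0 ? in.read(chunk, 1, pending) : 0;
        return Arrays.copyOf(chunk, 1 + Math.max(n, 0));
    }
}
```

The result is one chunk of whatever has arrived so far; assembling complete protocol messages out of such chunks is still up to the caller.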

NIO helps you to deal with both these problems. I’ll explain them in a way which seems to me most logical.

First, NIO introduces “Buffers” which are used to combine data and information used to process it. There are also “Channels” which can read into buffers and write from buffers.

To simplify your “basic IO” code you can just call the Channels.newChannel() method for your input stream. The resulting channel implements a read() operation which will either block or fill the provided ByteBuffer with data, moving the buffer’s position to the place right after the last byte read. This makes the code much simpler.
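A short sketch of this wrapping is shown below. The class name, the buffer size and the choice of US-ASCII decoding are arbitrary assumptions for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

public class WrappedStreamSketch {

    /** Reads whatever one read() call delivers and returns it as text. */
    static String readOnce(InputStream in) throws IOException {
        ReadableByteChannel channel = Channels.newChannel(in);
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        int n = channel.read(buffer);    // blocks until something arrives or the stream ends
        if (n == -1) {
            return "";                   // end of stream
        }
        buffer.flip();                   // switch the buffer from filling to draining
        return StandardCharsets.US_ASCII.decode(buffer).toString();
    }
}
```

Compared with the byte-by-byte or available()-based code, there is no manual bookkeeping: the buffer itself remembers how much data it holds.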

You can avoid wrapping by creating SocketChannel directly. This will get you almost the same result. It is called “blocking NIO”, and I strongly advise using it in simple cases, when thread blocking is not a problem for you.

The only difference between “blocking NIO” and “NIO wrapped around IO” is that you can’t use a socket timeout with SocketChannels. Why? Read the javadoc for Socket.setSoTimeout(). It says that this timeout is used only by streams. However, you can use a trick to make it work:

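// Assumed to be opened and connected elsewhere, in blocking mode,
// with a timeout set through socketChannel.socket().setSoTimeout(...)
SocketChannel socketChannel;

// The stream obtained from the underlying socket honours the timeout
InputStream inStream = socketChannel.socket().getInputStream();

// Wrapping the stream gives back a channel interface with timeout support
ReadableByteChannel wrappedChannel = Channels.newChannel(inStream);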

In this example, reading from socketChannel directly will not be interrupted by the timeout, but reading from wrappedChannel will be. To find out why it is so, you can take a look inside the Java RT library. The socket timeout is used by the OS-specific implementation of SocketInputStream, but it is not used by the OS-specific implementation of SocketChannel.

However, NIO has much better things to solve the scalability problem. First, you can put a channel into non-blocking mode. This means that a read() operation will return immediately if there is no data to read. Thus, you can create a single thread which will check all SocketChannels in a loop and read data if it is available.

Having a single thread is nice, but if it spins around the read() operation it will waste lots of CPU cycles. To help with performance NIO has a class called “Selector” which WILL block on non-blocking channels. The difference is that it can monitor any number of channels, resuming execution when at least one of those channels has some readable data. This idea was copied from Unix, but with one big flaw: Selector can use only non-blocking channels.
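A minimal sketch of a single thread serving several connections through a Selector is given below. The class and method names, the buffer size and the stop condition (a fixed number of expected characters) are assumptions chosen to keep the example short and self-terminating.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SelectorSketch {

    /** Accepts and reads on one thread until `expected` characters have arrived. */
    static String serveUntil(ServerSocketChannel server, int expected) throws IOException {
        StringBuilder received = new StringBuilder();
        List<SocketChannel> accepted = new ArrayList<>();
        try (Selector selector = Selector.open()) {
            server.configureBlocking(false);             // Selector accepts only non-blocking channels
            server.register(selector, SelectionKey.OP_ACCEPT);
            ByteBuffer buffer = ByteBuffer.allocate(256);
            while (received.length() < expected) {
                selector.select();                       // blocks until some channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                            accepted.add(client);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        int n = client.read(buffer);
                        if (n == -1) {                   // peer closed the connection
                            client.close();
                            continue;
                        }
                        buffer.flip();
                        received.append(StandardCharsets.US_ASCII.decode(buffer));
                    }
                }
            }
        } finally {
            for (SocketChannel client : accepted) {
                client.close();
            }
        }
        return received.toString();
    }

    /** Two local clients send two characters each; one thread receives all four. */
    static String demo() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            try (SocketChannel a = SocketChannel.open(server.getLocalAddress());
                 SocketChannel b = SocketChannel.open(server.getLocalAddress())) {
                a.write(ByteBuffer.wrap("ab".getBytes(StandardCharsets.US_ASCII)));
                b.write(ByteBuffer.wrap("cd".getBytes(StandardCharsets.US_ASCII)));
                return serveUntil(server, 4);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

The order in which data from the two clients is received is not defined, so the output may start with either pair of characters.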

I don’t know why Sun has introduced this limitation. This article focuses on reading, but both basic IO and NIO also support writing. Since the blocking/non-blocking mode applies to both the read and write directions simultaneously, usage of Selector makes connect() and write() operations more complex. Anyway, it is the only way to have only one thread reading from several network connections.

Let’s finish for today. It’s quite easy to understand what to use. If scalability is an issue, then use “non-blocking NIO”. Otherwise, use “blocking NIO” with a thread per connection. You can make those threads daemons so they will not prevent the application from terminating when all other threads have stopped. Another way to stop those threads is to close the channels they are reading from. This will cause the read operation to be interrupted with an exception.
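Both ways of stopping a reader thread can be seen in the following sketch. The class and method names are invented for the example, and the 100 ms pause is only there to let the reader reach its blocking read() before the channel is closed.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.atomic.AtomicReference;

public class CloseToStopSketch {

    /** Starts a daemon reader on the channel; the reader records why it stopped. */
    static Thread startReader(SocketChannel channel, AtomicReference<Throwable> outcome) {
        Thread reader = new Thread(() -> {
            try {
                ByteBuffer buffer = ByteBuffer.allocate(256);
                while (channel.read(buffer) != -1) {
                    buffer.clear();              // data is simply discarded in this sketch
                }
            } catch (IOException e) {
                outcome.set(e);                  // closing the channel ends up here
            }
        });
        reader.setDaemon(true);                  // will not keep the JVM alive
        reader.start();
        return reader;
    }

    /** Returns the exception that terminated the reader after the channel was closed. */
    static Throwable demo() throws Exception {
        AtomicReference<Throwable> outcome = new AtomicReference<>();
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            SocketChannel client = SocketChannel.open(server.getLocalAddress());
            // The accepted side stays open and silent, so the reader really blocks.
            try (SocketChannel accepted = server.accept()) {
                Thread reader = startReader(client, outcome);
                Thread.sleep(100);
                client.close();                  // wakes up the blocked reader
                reader.join(5000);
            }
        }
        return outcome.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

The exception seen by the reader is a ClosedChannelException or one of its subclasses, depending on whether the close happened before or during the read.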

I hope I’ve shown that NIO is simple. So, don’t use NIO frameworks. They are bad.