I have recently developed a deeper interest in Node.js, especially in the area of server applications. Node's HTTP module makes it extremely easy to create small and very fast applications. As a polyglot software developer I decided to test how Node compares to other frameworks. For this I chose Rack, Sinatra, node/http and, due to my current occupation, Java servlets with Tomcat as the servlet container.
I started the benchmark with Siege and ab but soon realized that there was something wrong with the world. As it turned out, the problem was the performance of the request initiator itself. I have since switched to wrk, which does a much better job all round.
Ruby
I started the test with Rack on Thin. Rack is the interface that most Ruby web frameworks build on, so the idea was that it would be a good reference point for the others.
Running 10s test @ http://localhost:9292
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.38ms  619.35us  16.05ms   92.73%
    Req/Sec     1.42k   200.93     1.59k    95.56%
  25527 requests in 10.02s, 3.12MB read
  Socket errors: connect 10, read 0, write 0, timeout 0
Requests/sec:   2546.72
Transfer/sec:    318.34KB
I ended up running this test about 10 times with different threading models on Thin and couldn't get it past the 3k rps mark.
Sinatra
This is the most elegant solution of all. Sinatra is just hands down the cleanest one ever.
Performance-wise one can safely say that under Thin it performs very much OK.
Running 10s test @ http://localhost:4567/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.90ms    0.91ms  13.55ms   84.49%
    Req/Sec     1.75k   120.16     1.99k    68.00%
  34765 requests in 10.01s, 7.43MB read
Requests/sec:   3474.70
Transfer/sec:    760.09KB
I mean almost 3.5k requests per second, nice API to program against... What more can we expect?
Node + http
Having established the baseline, it was time to start poking around Node.js.
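For reference, a minimal sketch of the kind of server being measured here, using nothing but Node's built-in http module. The response body and the port-from-command-line convention are assumptions made for illustration; only port 8080 is taken from the wrk run in this section.

```javascript
const http = require('http');

// Answers every request with the same small plain-text body.
// The body text is an assumption; any tiny static response would do.
function handler(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello World\n');
}

const server = http.createServer(handler);

// Start listening only when a port is given, e.g.: node server.js 8080
const port = Number(process.argv[2]);
if (port) {
  server.listen(port, () => {
    console.log(`Listening on http://localhost:${port}`);
  });
}
```

Started with `node server.js 8080`, it can then be hit with wrk using the same 2 threads and 10 connections as in the other runs.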
Well, that is also very succinct, even though the asynchronous API might feel a bit weird at the beginning. Performance-wise it was exceptionally good!
Running 10s test @ http://localhost:8080
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   627.53us    1.19ms  37.32ms   99.26%
    Req/Sec     8.95k     1.06k   18.06k    97.01%
  179056 requests in 10.10s, 22.03MB read
Requests/sec:  17729.12
Transfer/sec:      2.18MB
17.7k requests per second! That is 5.1 times the throughput of Sinatra, the faster of the two Ruby setups!
The platform itself is single-threaded, so in order to make use of all the CPU power one would simply spin up a few instances on different ports and put a load balancer in front of them. Luckily there is a node module called loadbalancer which makes the whole experience quite approachable for mere mortals.
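To illustrate the idea only (this is not the loadbalancer module's API, which is not shown here), a round-robin proxy can be sketched with Node's built-in http module. The names `roundRobin` and `createBalancer`, the command-line layout and the backend ports are all hypothetical.

```javascript
const http = require('http');

// Returns a function that cycles through the given list, one item per call.
function roundRobin(items) {
  let next = 0;
  return () => {
    const item = items[next];
    next = (next + 1) % items.length;
    return item;
  };
}

// Creates (but does not start) a proxy that forwards each incoming
// request to the next backend in the list.
function createBalancer(backends) {
  const pick = roundRobin(backends);
  return http.createServer((clientReq, clientRes) => {
    const { host, port } = pick();
    const proxyReq = http.request(
      { host, port, method: clientReq.method, path: clientReq.url, headers: clientReq.headers },
      (backendRes) => {
        // Relay the backend's status, headers and body to the client.
        clientRes.writeHead(backendRes.statusCode, backendRes.headers);
        backendRes.pipe(clientRes);
      }
    );
    // Answer with 502 if the chosen backend cannot be reached.
    proxyReq.on('error', () => {
      if (!clientRes.headersSent) clientRes.writeHead(502);
      clientRes.end();
    });
    clientReq.pipe(proxyReq);
  });
}

// Hypothetical invocation: node balancer.js 8080 8081 8082 8083 8084
// (first port is the balancer's own, the rest are the Node instances).
const [listenPort, ...backendPorts] = process.argv.slice(2).map(Number);
if (listenPort && backendPorts.length > 0) {
  const backends = backendPorts.map((port) => ({ host: 'localhost', port }));
  createBalancer(backends).listen(listenPort);
}
```

Every request through such a proxy costs one extra HTTP round trip on localhost, which is worth keeping in mind when reading the numbers.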
Running 10s test @ http://localhost:8080/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   658.55us    0.94ms  17.34ms   95.67%
    Req/Sec     9.55k     1.40k   19.58k    83.08%
  191007 requests in 10.10s, 23.50MB read
Requests/sec:  18912.15
Transfer/sec:      2.33MB
The setup is way more complex, with two types of applications and all, but the gain (about 7% over a single instance) isn't what I would expect.
Since I assumed a proxy written in JavaScript had to be slow, I decided to give HAProxy a go. After all, it is a specialized application for balancing high-load traffic, so I expected way better performance than from the simplistic JavaScript-based load balancer. And so I downloaded the latest sources, built it, configured it and ran the tests.
And here are the test results:
Running 10s test @ http://localhost:8090/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     0.91ms    0.92ms  12.88ms   93.20%
    Req/Sec     6.53k   729.78    11.21k    80.60%
  130444 requests in 10.10s, 13.06MB read
Requests/sec:  12915.95
Transfer/sec:      1.29MB
What what what??? HAProxy is over 30% slower than a proxy written in JavaScript? Can this be true? I'm sure the configuration I came up with can be tuned, so I'm going to call this one a draw and move on.
Node has one more option to choose from: the cluster module. The idea is quite simple: fork the current process, bind the socket to the parent's port and let the parent distribute the load over the spawned children. It's brilliant in that it doesn't add the overhead of an additional proxy request. So it should be really fast!
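A minimal sketch of such a setup, assuming the same trivial handler as a single-instance server and a fixed count of 4 workers (one per core on the machine used for these tests). The port-from-command-line convention and the response body are assumptions made for illustration.

```javascript
const cluster = require('cluster');
const http = require('http');

// One worker per core on the 4-core test machine.
const WORKERS = 4;

// Same trivial plain-text response in every worker (body is an assumption).
function handler(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello World\n');
}

// Newer Node releases expose isPrimary; older ones only have isMaster.
const isPrimary = cluster.isPrimary ?? cluster.isMaster;

// Run as: node cluster.js 8080
// Forked workers receive the same command-line arguments as the parent.
const port = Number(process.argv[2]);
if (port && isPrimary) {
  // The parent only forks; it serves no requests itself.
  for (let i = 0; i < WORKERS; i++) {
    cluster.fork();
  }
} else if (port) {
  // Every worker listens on the same port; the parent owns the socket
  // and hands incoming connections to the workers.
  http.createServer(handler).listen(port);
}
```

On platforms other than Windows the cluster module hands out connections round-robin by default, so no extra configuration is needed for that.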
As you can see, it is very simple and also very expressive. Add to that the fact that this is actually the only file in the entire solution, and it starts to take a really nice shape. My computer has 4 cores, so I'll be spawning 4 processes and letting them handle the requests in a round-robin way. Now let's take a look at the results:
Running 10s test @ http://localhost:8080/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   592.01us    2.58ms  80.36ms   97.85%
    Req/Sec    15.96k     2.56k   19.74k    87.62%
  320693 requests in 10.10s, 38.54MB read
Requests/sec:  31751.61
Transfer/sec:      3.82MB
Wow!! 31.7k requests per second! Fricken amazing performance! JavaScript's V8 engine rocks! Let's leave it at that.
Java
Now, to get a sense of where Node with those 31.7k rps places itself on the landscape, I decided to test Java servlets. I didn't go with Spark or anything else of that sort, since I wanted to compare only the most prominent solutions (Sinatra being the exception, since you just can't ignore that extremely beautiful framework).
The setup is a Maven project with just one servlet and the web.xml. Let's see the performance on that baby:
Running 10s test @ http://localhost:8080/example/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.90ms    5.34ms  85.90ms   92.74%
    Req/Sec    21.62k    12.25k   43.35k    50.50%
  430345 requests in 10.00s, 47.69MB read
Requests/sec:  43028.84
Transfer/sec:      4.77MB
Hold your horses! Yes, it is faster, but one needs to remember that Java needs time to warm up before it reaches top performance. So I ran the test once again:
Running 10s test @ http://localhost:8080/example/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   804.67us    1.92ms  32.48ms   91.16%
    Req/Sec    31.80k     2.68k   37.12k    77.00%
  632656 requests in 10.00s, 70.10MB read
Requests/sec:  63258.42
Transfer/sec:      7.01MB
Now that is just unbelievable! 63.2k requests a second is twice the speed the fastest node solution was capable of yielding! Twice!
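To double-check the ratios quoted along the way, here is a small sketch that reuses only the Requests/sec figures from the wrk runs above. The key names and the `speedup` helper are made up for illustration.

```javascript
// Requests/sec values copied from the wrk runs above.
const rps = {
  rack: 2546.72,
  sinatra: 3474.70,
  node: 17729.12,
  nodeLoadbalancer: 18912.15,
  nodeHaproxy: 12915.95,
  nodeCluster: 31751.61,
  tomcatWarm: 63258.42,
};

// How many times the throughput of `b` setup `a` achieves,
// rounded to one decimal place and returned as a string.
function speedup(a, b) {
  return (rps[a] / rps[b]).toFixed(1);
}

console.log(`node vs sinatra:          ${speedup('node', 'sinatra')}x`);
console.log(`node cluster vs node:     ${speedup('nodeCluster', 'node')}x`);
console.log(`tomcat vs node cluster:   ${speedup('tomcatWarm', 'nodeCluster')}x`);
```

Keep in mind that these are single 10-second runs on one machine, so the ratios are rough indications rather than precise measurements.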
Post scriptum
In reality the performance of the platform doesn't really matter all that much. If you take the response times into consideration, they all average below 4 ms, which in turn means that a single call to the database already blows the budget, as that is going to cost you way more than just a couple of milliseconds. But it is really nice to know the characteristics and to know that performance-wise it really doesn't matter what you choose these days. The framework is going to perform at an acceptable level. Even Sinatra with its 3.5k requests per second (roughly 200,000 a minute) is more than fast enough for most corporate solutions out there.
For a much more complete comparison of many frameworks and platforms, check out the TechEmpower site.
Happy coding!