I have developed a deeper interest in Node.js recently, especially in the area of server applications. Node's HTTP module makes it extremely easy to create small and very fast applications. As a polyglot software developer I decided to test how Node compares to other frameworks. For this I chose Rack, Sinatra, Node's http module and, due to my current occupation, Java servlets with Tomcat as the servlet container.
I started the benchmark with Siege and ab but soon realized that something was wrong with the world. As it turns out, the bottleneck was the load generator itself, not the servers under test. I have since switched to wrk, which does a much better job all round.
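All the results below come from the same kind of run, two threads and ten connections for ten seconds, so the invocation looked more or less like this (only the port varies per framework):

wrk -t2 -c10 -d10s http://localhost:9292/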
Ruby
I started the test with Rack on Thin. Rack is the foundation nearly every Ruby web framework builds on, so the idea was that it would make a good reference point for the others.
# config.ru
run Proc.new { |env| ['200', {'Content-Type' => 'text/html'}, ['Hello, world!']] }
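I served it with Thin's rackup support. From memory, the invocation was something along these lines (9292 being the default rackup port that shows up in the results):

thin start -R config.ru -p 9292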
Running 10s test @ http://localhost:9292
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 3.38ms 619.35us 16.05ms 92.73%
Req/Sec 1.42k 200.93 1.59k 95.56%
25527 requests in 10.02s, 3.12MB read
Socket errors: connect 10, read 0, write 0, timeout 0
Requests/sec: 2546.72
Transfer/sec: 318.34KB
I ended up running this test about ten times with different threading models on Thin and couldn't get it past the 3k requests per second mark.
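Switching the threading model is just a flag away in Thin; if I recall correctly, the threaded runs were started like this:

thin start -R config.ru -p 9292 --threaded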
Sinatra
This is the most elegant solution of the lot. Sinatra is hands down the cleanest of them all.
require 'sinatra'

get "/" do
  "Hello, world!"
end
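Running it is just as minimal. A classic Sinatra app is self-hosting and picks Thin automatically when it is installed, and 4567 is Sinatra's default port. Assuming the file is called app.rb:

ruby app.rb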
Performance-wise, one might clearly say that under Thin it performs very much OK:
Running 10s test @ http://localhost:4567/
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.90ms 0.91ms 13.55ms 84.49%
Req/Sec 1.75k 120.16 1.99k 68.00%
34765 requests in 10.01s, 7.43MB read
Requests/sec: 3474.70
Transfer/sec: 760.09KB
I mean almost 3.5k requests per second, nice API to program against... What more can we expect?
Node + http
Having established the baseline, it was time to start poking around Node.js.
var http = require('http');

http.createServer(function(req, res) {
  res.end("Hello, world!");
}).listen(8080);
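No container and no configuration needed. Assuming the file is saved as server.js, it boots with a plain:

node server.js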
Well, that is also very succinct, even though the asynchronous API might feel a bit weird at first. Performance-wise it was exceptionally good!
Running 10s test @ http://localhost:8080
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 627.53us 1.19ms 37.32ms 99.26%
Req/Sec 8.95k 1.06k 18.06k 97.01%
179056 requests in 10.10s, 22.03MB read
Requests/sec: 17729.12
Transfer/sec: 2.18MB
17.7k requests per second! Compared to Ruby that is 5.1 times the throughput!
The platform itself is single-threaded, so in order to make use of all the CPU power one would simply spin up a few instances on different ports and put a load balancer in front of them. Luckily there is a Node module called loadbalancer which makes the whole experience quite approachable for mere mortals. The setup consists of three pieces: the app itself (app.js), the balancer configuration (config.json) and a launcher script:
// app.js
var http = require('http');

var port = process.argv[2] || 8081;
var message = process.argv[3] || "SERVER-" + port;

http.createServer(function(req, res) {
  res.end(message);
}).listen(port);
{
  "stickiness": false,
  "sourcePort": 8080,
  "targets": [
    { "host": "127.0.0.1", "port": 8081 },
    { "host": "127.0.0.1", "port": 8082 },
    { "host": "127.0.0.1", "port": 8083 },
    { "host": "127.0.0.1", "port": 8084 }
  ]
}
#!/bin/sh
node app.js 8081 &
node app.js 8082 &
node app.js 8083 &
node app.js 8084 &

loadbalancer --config config.json start
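One assumption baked into that script: the loadbalancer CLI is available on the PATH, which for an npm-published tool usually means a global install along the lines of npm install -g loadbalancer.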
Running 10s test @ http://localhost:8080/
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 658.55us 0.94ms 17.34ms 95.67%
Req/Sec 9.55k 1.40k 19.58k 83.08%
191007 requests in 10.10s, 23.50MB read
Requests/sec: 18912.15
Transfer/sec: 2.33MB
The setup is way more complex, there are two kinds of processes to babysit now, and yet the gain is not what I would expect: barely 7% over a single instance.
My first thought was that the JavaScript-based balancer is simply slow, so I decided to give HAProxy a go. After all, it is a specialized application for balancing high-load traffic, so I expected way better performance than from a simplistic load balancer written in JavaScript. I downloaded the latest sources and built them.
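The build itself is a short affair; the TARGET value depends on your platform, and if memory serves it was linux2628 in my case:

make TARGET=linux2628
sudo make install

With the binary in place, this is the configuration I ran with: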
# haproxy.cfg
global
    daemon
    nbproc 2
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend http-in
    bind *:8090
    default_backend servers

backend servers
    option redispatch
    server ServerA 127.0.0.1:8081 maxconn 32
    server ServerB 127.0.0.1:8082 maxconn 32
    server ServerC 127.0.0.1:8083 maxconn 32
    server ServerD 127.0.0.1:8084 maxconn 32
And here are the test results:
Running 10s test @ http://localhost:8090/
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 0.91ms 0.92ms 12.88ms 93.20%
Req/Sec 6.53k 729.78 11.21k 80.60%
130444 requests in 10.10s, 13.06MB read
Requests/sec: 12915.95
Transfer/sec: 1.29MB
What what what??? HAProxy is about 30% slower than a proxy written in JavaScript? Can this be true? I'm sure the configuration I came up with can be tuned, so I'm going to call this one a draw and move on.
Node has one more option to choose from: the cluster module. The idea is quite simple: fork the current process, bind the socket to the parent's port and let the parent distribute the load over the spawned children. It's brilliant in that it doesn't add the overhead of an additional proxy request, so it should be really fast!
var cluster = require('cluster'),
    os = require('os'),
    http = require('http');

if (cluster.isMaster) {
  // Spawn as many workers as there are CPUs in the system.
  for (var i = 0, n = os.cpus().length; i < n; i += 1) {
    cluster.fork({ SERVER_INDEX: i });
  }
} else {
  // Start the application
  app();
}

function app() {
  http.createServer(function(req, res) {
    res.end('SERVER-' + process.env.SERVER_INDEX);
  }).listen(8080);
}
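A quick sanity check with curl confirms that consecutive requests land on different workers (the exact order will vary; SERVER-2 and SERVER-0 here are just an illustration):

curl http://localhost:8080/    # => SERVER-2
curl http://localhost:8080/    # => SERVER-0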
The code is very simple and also very expressive, and it is the only file in the entire solution, which gives the whole thing a really nice shape. My computer has 4 cores, so I'll be spawning 4 processes and letting the master distribute the requests among them. Now let's take a look at the results:
Running 10s test @ http://localhost:8080/
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 592.01us 2.58ms 80.36ms 97.85%
Req/Sec 15.96k 2.56k 19.74k 87.62%
320693 requests in 10.10s, 38.54MB read
Requests/sec: 31751.61
Transfer/sec: 3.82MB
Wow!! 31.7k requests per second! Fricken amazing performance! JavaScript's V8 engine rocks! Let's leave it at that.
Java
Now, to get a sense of where Node with those 31.7k rps places itself on the landscape, I decided to test Java servlets. I didn't go with Spark or anything else of that sort since I wanted to compare only the most prominent solutions (well, except for Sinatra, but you just can't ignore that extremely beautiful framework).
package org.example;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(name = "HomeServlet", urlPatterns = { "/" })
public class HomeServlet extends HttpServlet {
  private static final long serialVersionUID = -2565906365375560143L;

  @Override
  protected void doGet(
      HttpServletRequest request,
      HttpServletResponse response)
      throws ServletException, IOException {
    response.getOutputStream().print("Hello, world!");
  }
}
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>org.example</groupId>
  <artifactId>hello-world</artifactId>
  <version>1.0</version>
  <packaging>war</packaging>

  <name>Hello, world!</name>

  <dependencies>
    <dependency>
      <groupId>javax.servlet</groupId>
      <artifactId>javax.servlet-api</artifactId>
      <version>3.0.1</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.1</version>
        <configuration>
          <source>1.7</source>
          <target>1.7</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.tomcat.maven</groupId>
        <artifactId>tomcat7-maven-plugin</artifactId>
        <version>2.2</version>
        <configuration>
          <port>8080</port>
          <path>/example</path>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd"
         version="3.0">
</web-app>
As you can see, we're doing a Maven project with just one servlet and an empty web.xml. Let's see the performance of that baby:
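Thanks to the tomcat7-maven-plugin declared in the pom, the whole thing starts with a single command, no standalone Tomcat installation needed:

mvn tomcat7:run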
Running 10s test @ http://localhost:8080/example/
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.90ms 5.34ms 85.90ms 92.74%
Req/Sec 21.62k 12.25k 43.35k 50.50%
430345 requests in 10.00s, 47.69MB read
Requests/sec: 43028.84
Transfer/sec: 4.77MB
Hold your horses! Yes, it is faster, but one needs to remember that Java needs time to reach its top performance: the JIT compiler only optimizes code paths once they get hot. So I ran the test once again:
Running 10s test @ http://localhost:8080/example/
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 804.67us 1.92ms 32.48ms 91.16%
Req/Sec 31.80k 2.68k 37.12k 77.00%
632656 requests in 10.00s, 70.10MB read
Requests/sec: 63258.42
Transfer/sec: 7.01MB
Now that is just unbelievable! 63.2k requests a second is twice what the fastest Node solution was capable of yielding! Twice!
Post scriptum
In reality the performance of the platform doesn't matter all that much. The response times are all in the low single-digit milliseconds, which means that a single database call will already dwarf the framework's overhead, as it is going to cost you way more than a couple of milliseconds. But it is really nice to know the characteristics, and to know that performance-wise it really doesn't matter what you choose these days: every framework here performs at an acceptable level. Even Sinatra, the slowest of the bunch at 3.5k requests per second, is still more than enough for most corporate solutions out there.
For a much more complete comparison of many frameworks and platforms, check out the TechEmpower benchmarks site.
Happy coding!