Saturday, April 30, 2011

Taking Baby Steps with Node.js – Creating TCP Servers

Here are the links to the previous installments:

  1. Introduction
  2. Threads vs. Events
  3. Using Non-Standard Modules
  4. Debugging with node-inspector
  5. CommonJS and Creating Custom Modules
  6. Node Version Management with n
  7. Implementing Events
  8. BDD Style Unit Tests with Jasmine-Node Sprinkled With Some Should
  9. “node_modules” Folders
  10. Pumping Data Between Streams
  11. Some Node.js Goodies
  12. The Towering Inferno

Just a quick post to show that, although creating HTTP servers clearly dominates most code samples that demonstrate the use of Node.js, it's also possible to create TCP servers with Node.js.

var net = require('net');

net.createServer(function(socket) {
    // Echo whatever data the client sends straight back
    socket.on('data', function(data) {
        socket.write(data);
    });
})
.listen(2500);

This code sample isn't very useful in and of itself, but it demonstrates that creating a TCP server is just as easy as creating an HTTP server with Node.js.
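To see the echo server in action, we can throw a tiny client at it. This is just a quick sketch, assuming the server above is running on port 2500:

var net = require('net');

// Connect to the echo server started above
var client = net.createConnection(2500);

client.on('connect', function() {
    client.write('Hello, echo server!');
});

client.on('data', function(data) {
    console.log('Echoed back: ' + data);
    client.end();
});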

There! Feeling all better now :-).

Wednesday, April 20, 2011

Taking Baby Steps with Node.js – The Towering Inferno

Here are the links to the previous installments:

  1. Introduction
  2. Threads vs. Events
  3. Using Non-Standard Modules
  4. Debugging with node-inspector
  5. CommonJS and Creating Custom Modules
  6. Node Version Management with n
  7. Implementing Events
  8. BDD Style Unit Tests with Jasmine-Node Sprinkled With Some Should
  9. “node_modules” Folders
  10. Pumping Data Between Streams
  11. Some Node.js Goodies

Because Node.js puts an event-based model at its core, it kindly guides you towards executing all I/O operations asynchronously. As discussed in previous posts, JavaScript perfectly fits this purpose by enabling you to provide a callback that gets executed when a particular I/O operation has finished its task. This is all great, but it also comes with a mental burden: the readability and maintainability of the JavaScript code can get out of control very quickly. Let’s discuss this further over some code.

var http = require('http'),
    fileSystem = require('fs'),
    path = require('path');

http.createServer(function(request, response) {
    var podcastsFilePath = path.join(__dirname, 'podcasts.txt');

    fileSystem.readFile(podcastsFilePath, 'utf8', function(error, originalData) {
        if(error)
            throw error;

        if(-1 == originalData.indexOf('Astronomy Podcast')) {
            var favoritePodcasts = originalData + '\n' + 'Astronomy Podcast';
            fileSystem.writeFile(podcastsFilePath, favoritePodcasts, function(error) {
                if(error)
                    throw error;

                writeResponse(favoritePodcasts);
            });
        }
        else {
            writeResponse(originalData);
        }
    });

    function writeResponse(responseData) {
        response.writeHead(200, {
            'Content-Type': 'text/html',
            'Content-Length': responseData.length
        });

        response.end(responseData);
    }
})
.listen(2000);

Here we create a simple HTTP server that reads in a text file when it receives a request. This file simply contains the names of all the podcasts that I personally enjoy listening to. After the content of the file is retrieved, we check whether the Astronomy Podcast is part of the list (yes, it’s that good :-) ). If it’s not in there, the Astronomy Podcast is added to the list of favorite podcasts and the result is written back to the text file. The request is then completed by returning the entire contents of the list in the response.

Have a look at the readability of the code that implements this simple example. It’s not that great, if you ask me. And the reason for that is the amount of nested callbacks.


These nested callbacks push the code further and further to the right, creating a sort of horizontal tower effect that makes the event-based approach a bit more cumbersome compared to more traditional, sequential programming. But the good news is that we’re able to improve this code by making use of the step library.

Step is a simple control-flow library for Node.js that makes serial execution, parallel execution, and error handling seamless and less painful. The step library simply provides a single function that in turn accepts any number of functions, which are executed in series and in the order that they are provided. Every step in the sequence also knows about the outcome of the previous function.
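To get a feel for the API before refactoring our example, here’s a minimal sketch adapted from the step documentation. Note that a step can also return a value synchronously, which is passed along to the next step just like a callback result would be:

var fileSystem = require('fs'),
    step = require('step');

step(
    function readSelf() {
        // 'this' acts as the callback that triggers the next step
        fileSystem.readFile(__filename, 'utf8', this);
    },
    function capitalize(error, text) {
        if (error) throw error;
        // A synchronous return value is handed to the next step as well
        return text.toUpperCase();
    },
    function showIt(error, newText) {
        if (error) throw error;
        console.log(newText);
    }
);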

Let’s take a look at the refactored code for our simple example.

var http = require('http'),
    fileSystem = require('fs'),
    path = require('path'),
    step = require('step');

http.createServer(function(request, response) {
    var podcastsFilePath, favoritePodcasts;

    step(
        function assembleFilePath() {
            podcastsFilePath = path.join(__dirname, 'podcasts.txt');
            // Because there's no callback needed for this step,
            // we have to manually call:
            this();
        },

        function readFavoritePodcastsFromFile() {
            fileSystem.readFile(podcastsFilePath, 'utf8', this);
        },

        function addNewFavoritePodcastToFile(error, originalData) {
            if(error)
                throw error;

            if(-1 == originalData.indexOf('Astronomy Podcast')) {
                favoritePodcasts = originalData + '\n' + 'Astronomy Podcast';
                fileSystem.writeFile(podcastsFilePath, favoritePodcasts, this);
            }
            else {
                favoritePodcasts = originalData;
                this();
            }
        },

        function writeResponse(error) {
            if(error)
                throw error;

            response.writeHead(200, {
                'Content-Type': 'text/html',
                'Content-Length': favoritePodcasts.length
            });

            response.end(favoritePodcasts);
        }
    );
})
.listen(2000);

Now this reads a lot nicer, don’t you think?

Using the step library, we were able to refactor the nested callbacks from our first implementation into separate functions that make up a sequence. In the first function we assemble the path for the text file. However, this code doesn’t require a callback, so we need to manually call this() in order to get to the next step in the sequence. I made this an explicit step to show that it’s not required to have a callback for a particular step.

For the second function, we read in the content of the text file. Notice that instead of providing a callback to the readFile() function, we simply provide this.

The third function checks whether the Astronomy Podcast is part of the favorites and, if not, writes the result back to the text file. Notice that the parameters of the addNewFavoritePodcastToFile function are exactly the same as those of the callback we provided to the readFile() function in our first implementation. This is definitely not a coincidence: the step library ensures that the outcome of a particular step’s callback is provided as input to the next step.

The last function in our sequence simply adds the list to the response object. Also notice that we can store data in variables that are accessible in every step of the sequence (e.g. the podcastsFilePath variable assembled in the first function).

Although executing functions in series is the default, the step library also enables us to execute operations in parallel.

var fileSystem = require('fs'),
    step = require('step');

step(
    // Load two files in parallel
    function loadStuff() {
        fileSystem.readFile(__filename, this.parallel());
        fileSystem.readFile("/etc/passwd", this.parallel());
    },
    // Show the results when both reads are done
    function showStuff(error, code, users) {
        if (error) throw error;

        console.log(code);
        console.log(users);
    }
);

Using the step library we’re really able to take baby steps with Node.js ;-). Until next time.

Tuesday, April 12, 2011

Taking Baby Steps with Node.js – Some Node.js Goodies

Here are the links to the previous installments:

  1. Introduction
  2. Threads vs. Events
  3. Using Non-Standard Modules
  4. Debugging with node-inspector
  5. CommonJS and Creating Custom Modules
  6. Node Version Management with n
  7. Implementing Events
  8. BDD Style Unit Tests with Jasmine-Node Sprinkled With Some Should
  9. “node_modules” Folders
  10. Pumping Data Between Streams

For this post I want to quickly share a couple of goodies that can really improve the overall developer experience when building Node.js applications.

The first one that we’re going to discuss is nodemon. When we start a server process (e.g. $ node server.js), this process isn’t automatically restarted when we make changes to the source files of our Node.js application. In order to load these new changes, we first need to manually stop the process and restart it, which typically involves switching to the command-line before we can actually see the effects of our changes. This is where nodemon steps in: it monitors our Node.js application for changes and automatically restarts the process. This can be a real time saver because we can directly see the effects of our changes without going through the whole stop/restart ceremony.

Installing nodemon is as simple as:

npm install -g nodemon

So instead of using:

node server.js

to start our application, we can now simply use:

nodemon server.js

That’s it! If we make a change to one of the JavaScript source files, nodemon automatically restarts the process.

[nodemon] v0.2.2
[nodemon] running server.js
[nodemon] starting node
[nodemon] reading ignore list
...
[nodemon] restarting due to changes...
[nodemon] ./server.js

nodemon also provides the ability to ignore some specific files, directories or file patterns using an ignore file (nodemon-ignore) that is automatically created in the root directory of the source code the first time the application is started.

# My ignore file

/images/*   # ignore all image files
*.css       # ignore all CSS files

Say that for some odd reason our Node.js application crashes while handling a request (sounds quite silly, I know, but just hear me out). In this case, nodemon just shows the regular error output and waits until one of the source files changes again before restarting the application.

[nodemon] app crashed - waiting for file change before starting...

The second life-saver that I want to mention here is a library named long-stack-traces. When we get a runtime error in our application, Node.js generally provides us with a very brief error message and stack trace.

ReferenceError: test is not defined
    at Server.<anonymous> (/cygdrive/c/server.js:9:17)
    at Server.emit (events.js:67:17)
    at HTTPParser.onIncoming (http.js:1102:12)
    at HTTPParser.onHeadersComplete (http.js:108:31)
    at Socket.ondata (http.js:1001:22)
    at Socket._onReadable (net.js:675:27)
    at IOWatcher.onReadable [as callback] (net.js:177:10)

Fortunately, long-stack-traces shows us a more extended stack trace, which enables us to find out more quickly which particular line of code caused the error.

Uncaught ReferenceError: test is not defined
    at Server.<anonymous> (/cygdrive/c/server.js:9:17)
    at Server.emit (events.js:67:17)
    at HTTPParser.onIncoming (http.js:1102:12)
    at HTTPParser.onHeadersComplete (http.js:108:31)
    at Socket.ondata (http.js:1001:22)
    at Socket._onReadable (net.js:675:27)
    at IOWatcher.onReadable [as callback] (net.js:177:10)
----------------------------------------
    at EventEmitter.addListener
    at new Server (http.js:947:10)
    at Object.createServer (http.js:964:10)
    at Object.<anonymous> (/cygdrive/c/server.js:4:6)
    at Module._compile (module.js:404:26)
    at Object..js (module.js:410:10)
    at Module.load (module.js:336:31)
    at Function._load (module.js:297:12)
    at Array.<anonymous> (module.js:423:10)

All you need to do is install the library into the node_modules directory for your application:

npm install long-stack-traces

and simply add:

require('long-stack-traces');

to your code.
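For reference, a minimal server along the following lines would produce the ReferenceError shown above. Note that this is a hypothetical reconstruction of the server.js from the stack traces, not code from an actual application:

var http = require('http');
require('long-stack-traces');

http.createServer(function(request, response) {
    // 'test' is never defined, so handling a request throws
    // the ReferenceError shown in the stack traces above
    response.end(test);
})
.listen(2000);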

There you go. I hope these two goodies can make your Node.js development experience even more pleasant. Until next time.

Wednesday, April 06, 2011

Taking Baby Steps with Node.js – Pumping Data Between Streams

Here are the links to the previous installments:

  1. Introduction
  2. Threads vs. Events
  3. Using Non-Standard Modules
  4. Debugging with node-inspector
  5. CommonJS and Creating Custom Modules
  6. Node Version Management with n
  7. Implementing Events
  8. BDD Style Unit Tests with Jasmine-Node Sprinkled With Some Should
  9. “node_modules” Folders

It’s already the 10th blog post in this series on Node.js! For this post we’ll be talking about a fairly common scenario when developing applications with Node.js, namely reading data from one stream and sending it to another stream. Suppose we want to develop a simple web application that reads a particular file from disk and sends it to the browser. The following code shows a very simple and naïve implementation that makes this happen.

var http = require('http'),
    fileSystem = require('fs'),
    path = require('path');

http.createServer(function(request, response) {
    var filePath = path.join(__dirname, 'AstronomyCast Ep. 216 - Archaeoastronomy.mp3');
    var stat = fileSystem.statSync(filePath);
    
    response.writeHead(200, {
        'Content-Type': 'audio/mpeg', 
        'Content-Length': stat.size
    });
    
    var readStream = fileSystem.createReadStream(filePath);
    readStream.on('data', function(data) {
        response.write(data);
    });
    
    readStream.on('end', function() {
        response.end();        
    });
})
.listen(2000);

Here we create a stream for reading the data of an mp3 file and writing it to the response stream. When we point our browser to http://localhost:2000, it pretty much behaves as we expect. The mp3 file either starts playing or the browser asks whether the file should be downloaded.

But as I mentioned earlier, this is a pretty naïve implementation. The big issue with this approach is that reading the data from disk through the read stream is usually faster than streaming the data through the HTTP response. So when the data of the mp3 file is read too fast, the write stream is not able to flush the data it is given in a timely manner, so it starts buffering this data. For this simple example this is not really a big deal, but if we want to scale this application to handle lots and lots of requests, then having Node.js compensate for this can impose an intolerable burden on the application.

So, the way to fix this problem is to check whether all the data gets flushed when we send it to the write stream. If this data is being buffered, then we need to pause the read stream. As soon as the buffers are emptied and the write stream gets drained, we can safely resume the data fetching process from the read stream.

var http = require('http'),
    fileSystem = require('fs'),
    path = require('path');

http.createServer(function(request, response) {
    var filePath = path.join(__dirname, 'AstronomyCast Ep. 216 - Archaeoastronomy.mp3');
    var stat = fileSystem.statSync(filePath);
    
    response.writeHead(200, {
        'Content-Type': 'audio/mpeg', 
        'Content-Length': stat.size
    });
    
    var readStream = fileSystem.createReadStream(filePath);
    readStream.on('data', function(data) {
        var flushed = response.write(data);
        // Pause the read stream when the write stream gets saturated
        if(!flushed)
            readStream.pause();
    });
    
    response.on('drain', function() {
        // Resume the read stream when the write stream gets hungry 
        readStream.resume();    
    });
    
    readStream.on('end', function() {
        response.end();        
    });
})
.listen(2000);

This example illustrates a fairly common pattern of throttling data between a read stream and a write stream. This pattern is generally referred to as the “pump pattern”. Because it’s so commonly used, Node.js provides a helper function that takes care of all the goo required to correctly implement this behavior.

var http = require('http'),
    fileSystem = require('fs'),
    path = require('path'),
    util = require('util');

http.createServer(function(request, response) {
    var filePath = path.join(__dirname, 'AstronomyCast Ep. 216 - Archaeoastronomy.mp3');
    var stat = fileSystem.statSync(filePath);
    
    response.writeHead(200, {
        'Content-Type': 'audio/mpeg', 
        'Content-Length': stat.size
    });
    
    var readStream = fileSystem.createReadStream(filePath);
    // We replaced all the event handlers with a simple call to util.pump()
    util.pump(readStream, response);
})
.listen(2000);

Using this utility function certainly clears up the code, making it more readable and easier to understand what is going on, don’t you think? If you’re curious, you might also want to check out the implementation of the util.pump() function.
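In case you don’t have the source at hand, the following is a simplified sketch of the throttling behavior that util.pump() implements. This is not the actual Node.js source and it leaves out some of the bookkeeping the real function does, but it shows the same pause/resume pattern we implemented by hand earlier:

function pump(readStream, writeStream) {
    readStream.on('data', function(data) {
        // Pause reading when the write stream can't flush fast enough
        if (writeStream.write(data) === false)
            readStream.pause();
    });

    writeStream.on('drain', function() {
        // Resume reading once the write stream has been drained
        readStream.resume();
    });

    readStream.on('end', function() {
        writeStream.end();
    });
}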

So get that data flowing already :-).