Raynos, github
Control flow is a set of techniques for managing asynchronous code.
Control flow is all about having techniques to handle how you write your code and how your code gets from a to b. When people come to node.js, one of their big complaints is "Dear god, spaghetti callback hell".
The way node.js handles continuations with callbacks makes it far too easy to nest. Every time you do anything asynchronous with an anonymous callback, you indent your code.
So we want to do a simple task. All you have to do is get a blog post out of the database. That shouldn't be hard, right?
Well, each one of those tasks leads to a new callback. And for each new callback you're going to get another layer of indenting. Welcome to callback hell.
step1(function (value1) {
    step2(function (value2) {
        step3(function (value3) {
            step4(function (value4) {
                step5(function (value5) {
                    step6(function (value6) {
                        // Do something with value6
                    });
                });
            });
        });
    });
});
The code would literally look like this, except we are going to have 10 levels of indenting because we did asynchronous tasks. And all that just to load a blog post on a page.
We need to do some real flow control for even the trivial tasks. *Go back to first slide*. If we look at the example of a blog engine using mongodb, you can see 16 levels of indentation. I don't know what you think, but I think that's pretty crazy.
What's wrong with it? Well, firstly I don't like reading horizontally. Secondly you have closures and inline functions everywhere. And every time you call the outer function you're making a new inner function. This opens you up to memory leaks and it's just expensive.
My personal style guide says you should only indent three times per function. That's one inner function, a loop and an if. Doing this helps readability and maintainability.
There are multiple types of primitives for flow control. The idea of a primitive is that it's a simple tool that stops you from getting into callback hell. It's a code organization tool.
Named functions allow you to use function declarations and extract those inline anonymous functions into external functions. This flattens your callback hierarchy and avoids the pyramid of doom.
Reference counting allows you to call a continuation callback after n asynchronous tasks have finished. This means you can run n asynchronous tasks in parallel and continue after they have all finished.
Next functions are the concept of having a list of functions and a single next function which calls the first function in the list and passes itself in. This allows you to run a function and that function can then say "Ok I'm done, call the next function in the list". It's a method for serial programming.
Event emitters give you a way of listening on something. Say you want to be notified when x happens, you can do that if x exposes an event emitter and an event you can listen on. This means that rather than saying "do y, then tell me", you can say "if you ever do y, then tell me".
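To make the "if you ever do y, then tell me" idea concrete, here is a minimal sketch using node's built-in `events` module. The `downloader` object and its `"done"` event are made-up names for illustration only.

```javascript
var EventEmitter = require("events").EventEmitter;

// hypothetical object that does some work and announces when it is done
var downloader = new EventEmitter();
var seen = [];

// "if you ever finish something, then tell me"
downloader.on("done", function (name) {
    seen.push(name);
});

// the emitter can fire the same event any number of times
downloader.emit("done", "a.txt");
downloader.emit("done", "b.txt");

console.log(seen); // [ 'a.txt', 'b.txt' ]
```

The listener is registered once and gets called for every event, which is the difference from a one-shot callback.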
articlesCollection.find({}, { 'sort':[['title', 1]] }, iterateCursors);
function iterateCursors(err, cursor) {
    cursor.each(logArticle);
}
function logArticle(err, article) {
    if(article != null) {
        console.log("[" + article.title + "]");
    } else {
        // cursor.each passes null once the cursor is exhausted
        console.log(">> Closing connection");
        db.close();
    }
}
Named functions basically involve turning all your anonymous functions into function declarations. This means there is now only one copy of your function and you don't create a new one whenever you call the outer function.
It also means you can move the function into an outer scope, thus reducing nesting.
So rather than indenting your functions directly, we take the inline function out, give it a name and put it in a higher scope.
But, can anyone see the problem with this code?
Storing data in closure scope causes issues with indenting. This cannot be solved by moving the functions into separate named functions.
Closures can cause you problems with your code. If you store state in your closure, then you can't just refactor your inline function into a function declaration, because it needs that data.
This means every function that needs the data has to be created inside your outer function. This can quickly get out of control, not to mention it makes memory leaks easier.
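As a rough sketch of the problem, assume a hypothetical `loadPost` that keeps `title` in closure scope. The `render` callback can't be lifted out into a top-level function declaration, because it would lose access to `title`. `fakeFind` is a stand-in for a database call, not a real API.

```javascript
// stand-in for an asynchronous database call (hypothetical)
function fakeFind(id, callback) {
    process.nextTick(function () {
        callback(null, "body of post " + id);
    });
}

function loadPost(id, title, done) {
    // render has to be created inside loadPost because it reads `title`
    fakeFind(id, function render(err, body) {
        if (err) return done(err);
        done(null, title + ": " + body);
    });
}

loadPost(1, "Hello", function (err, page) {
    console.log(page); // Hello: body of post 1
});
```

Every call to `loadPost` creates a fresh `render`, and every further callback that needs `title` ends up nested in there too.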
There are a few alternatives to having data stored in a closure. One of them is to use `bind` to curry data into a callback. However, `bind` returns a new function, so this has a performance penalty.
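Here is a small sketch of currying data in with `Function.prototype.bind`. The `withPrefix` function is a made-up example; the point is that the handler is declared once at the top level, but each `bind` call still hands back a new function object.

```javascript
// a shared handler declared once, outside any closure
function withPrefix(prefix, err, value) {
    return prefix + value;
}

// bind curries "post: " in as the first argument
var first = withPrefix.bind(null, "post: ");
var second = withPrefix.bind(null, "post: ");

console.log(first(null, "hello")); // post: hello

// every bind call allocates a new function object
console.log(first === second); // false
```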
An obvious solution is to just pass data to your function, but this means you can't pass your functions to someone else's API. You have to wrap that and call your own function with the extra data passed in. This is actually the same as binding because you're creating a new function.
An alternative solution is to put all your similar functions on an object and bind all those functions to the object. Then just store the data on `this` and pass your callbacks around by `this.foo`. This works nicely because all callbacks always have access to the data through `this` and they are nicely organized on an object, plus you can pass the functions directly to external APIs.
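One way this could look, as a sketch under assumptions: `PostLoader`, `onBody` and `fakeFind` are hypothetical names, and `fakeFind` calls back synchronously only to keep the example short.

```javascript
// hypothetical loader: state lives on `this`, callbacks live on the object
function PostLoader(title) {
    this.title = title;
    this.page = null;
    // bind once per object so the method can be handed to any API
    this.onBody = this.onBody.bind(this);
}

PostLoader.prototype.load = function (find) {
    // pass the callback directly, no wrapper function needed
    find(this.onBody);
};

PostLoader.prototype.onBody = function (err, body) {
    if (err) throw err;
    this.page = this.title + ": " + body;
};

// stand-in for an external API that takes a plain callback
function fakeFind(callback) {
    callback(null, "some text");
}

var loader = new PostLoader("Hello");
loader.load(fakeFind);
console.log(loader.page); // Hello: some text
```

The binding cost is paid once in the constructor instead of on every asynchronous call.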
var count = post.comments.length,
    results = [];
post.comments.forEach(fetchComment);
function fetchComment(commentId) {
    // comments is the database collection
    comments.find({ id: commentId }, addToComments);
}
function addToComments(err, result) {
    if (err) throw err;
    results.push(result);
    // reference counting: continue once every lookup has returned
    if (--count === 0) next();
}
function next() {
    /* do stuff with results */
}
Technique used by TJ in connect
(function loop() {
    var task = stack.shift();
    task && task(someData, proxy);
    function proxy(result) {
        /* do something with result */
        loop();
    }
}());
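A self-contained sketch of the same next-function idea follows. This is not connect's actual code; `runSerial` is a hypothetical helper, and the tasks call `next` synchronously just to keep it short.

```javascript
// run a list of tasks one after another, collecting what each passes to next
function runSerial(tasks, done) {
    var stack = tasks.slice(); // copy so the caller's array is untouched
    var results = [];
    (function loop() {
        var task = stack.shift();
        if (!task) return done(results);
        task(function next(result) {
            results.push(result);
            loop();
        });
    }());
}

var order = [];
runSerial([
    function (next) { order.push("a"); next(1); },
    function (next) { order.push("b"); next(2); }
], function (results) {
    console.log(order, results); // [ 'a', 'b' ] [ 1, 2 ]
});
```

Each task decides when it is finished by calling `next`, so the second task never starts before the first one says it is done.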
var req = http.request(options);
req.on("response", function (res) {
    res.on("data", function (chunk) {
        console.log("BODY: " + chunk);
    });
});
req.on("error", function (error) {
    console.log("oops: ", error);
});
req.end();
var next = after(arr.length, finished);
arr.forEach(doSomethingAsync);
function doSomethingAsync(item) {
    somethingAsync(item, next);
}
function finished() {
    var results = arguments;
    /* do some stuff with your results */
}
The after library uses reference counting internally and hides the details from you.
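To show what "reference counting internally" could mean, here is a minimal sketch of such a helper. It is not the library's real implementation; `countdown` is a hypothetical name, and passing results as a single array is an assumption made for this example.

```javascript
// minimal reference counter: call `callback` after `count` calls to next
function countdown(count, callback) {
    var results = [];
    return function next(value) {
        results.push(value);
        // fire the continuation once the last task has reported in
        if (--count === 0) callback(results);
    };
}

var next = countdown(3, function (results) {
    console.log(results); // [ 'a', 'b', 'c' ]
});
next("a");
next("b");
next("c");
```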
var files = [],
    next = after(1, finished);
readFolder(somePath);
function readFolder(loc) {
    fs.readdir(loc, function (err, names) {
        // one extra pending task per entry in this folder
        next.count += names.length;
        names.forEach(function (name) {
            var filePath = path.join(loc, name);
            // isFile is assumed to be a helper defined elsewhere
            isFile(name) ? fs.readFile(filePath, readFile) : readFolder(filePath);
        });
        // this folder itself is done
        next.count--;
    });
}
function readFile(err, file) {
    files.push(file);
    next();
}
after.map(post.comments, mapToComment, finished);
function mapToComment(value, callback) {
    comments.find({ id: value }, callback);
}
function finished(err, comments) {
    /* do stuff with comments */
}
Set iterations allow you to do something with every value in a set in parallel. This is a great sugar tool.
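Under the hood, a set iteration like this is just reference counting plus remembering where each result goes. Below is a sketch of a parallel map, assuming node-style `(err, value)` callbacks. `mapParallel` and `double` are hypothetical names and this is not how after.map is actually written.

```javascript
// start every iterator at once and keep results in input order
function mapParallel(items, iterator, done) {
    var results = [];
    var pending = items.length;
    var failed = false;
    if (pending === 0) return done(null, results);
    items.forEach(function (item, index) {
        iterator(item, function (err, value) {
            if (failed) return; // an earlier error was already reported
            if (err) {
                failed = true;
                return done(err);
            }
            results[index] = value;
            if (--pending === 0) done(null, results);
        });
    });
}

// stand-in for an asynchronous lookup
function double(value, callback) {
    setTimeout(function () {
        callback(null, value * 2);
    }, 0);
}

mapParallel([1, 2, 3], double, function (err, doubled) {
    console.log(doubled); // [ 2, 4, 6 ]
});
```

Storing by `index` rather than pushing means the output order matches the input order even if the lookups finish out of order.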
var fs = require("fs"),
    exec = require("child_process").exec,
    after = require("after"),
    stack = Stak.make(function () {
        exec('whoami', this.next);
    }, function () {
        var next = after(2, this.next);
        exec('groups', function (err, groups) {
            next("groups", groups);
        });
        fs.readFile(this.file, 'ascii', function (err, file) {
            next("file", file);
        });
    }, function () {
        var data = after.unpack(arguments);
        console.log("Groups : ", data.groups.trim());
        console.log("This file has " + data.file.length + " bytes");
    });
stack.handle({ file: __filename });