Tuesday, April 1, 2014

Asynchronous package management with NiJS

Last week, I implemented some additional features in NiJS: an internal DSL for Nix in JavaScript. One of these new features is an alternative formalism for writing package specifications, which enables some new use cases.

Synchronous package definitions


Traditionally, a package in NiJS can be specified in JavaScript as follows:

var nijs = require('nijs');

exports.pkg = function(args) {
  return args.stdenv().mkDerivation ({
    name : "file-5.11",
    
    src : args.fetchurl()({
      url : new nijs.NixURL("ftp://ftp.astron.com/pub/file/file-5.11.tar.gz"),
      sha256 : "c70ae29a28c0585f541d5916fc3248c3e91baa481f63d7ccec53d1534cbcc9b7"
    }),
    
    buildInputs : [ args.zlib() ],
    
    meta : {
      description : "A program that shows the type of files",
      homepage : new nijs.NixURL("http://darwinsys.com/file")
    }
  });
};

The above CommonJS module exports a function that specifies a build recipe for a package named file, which uses zlib as a dependency and executes the standard GNU Autotools build procedure (i.e. ./configure; make; make install) to build it.

The above module specifies how to build a package, but not which versions or variants of its dependencies should be used. The following CommonJS module specifies how to compose packages:

var pkgs = {

  stdenv : function() {
    return require('./pkgs/stdenv.js').pkg;
  },

  fetchurl : function() {
    return require('./pkgs/fetchurl').pkg({
      stdenv : pkgs.stdenv
    });
  },

  zlib : function() {
    return require('./pkgs/zlib.js').pkg({
      stdenv : pkgs.stdenv,
      fetchurl : pkgs.fetchurl
    });
  },
  
  file : function() {
    return require('./pkgs/file.js').pkg({
      stdenv : pkgs.stdenv,
      fetchurl : pkgs.fetchurl,
      zlib : pkgs.zlib
    });
  }
}

exports.pkgs = pkgs;

As can be seen, the above module includes the previous package specification and provides all its required parameters (such as a variant of the zlib library that we need). Moreover, all its dependencies are composed in the above module as well.

Asynchronous package definitions


The previous modules are synchronous package definitions, meaning that nothing else can be done while they are being evaluated. In the latest version of NiJS, we can also write asynchronous package definitions:

var nijs = require('nijs');
var slasp = require('slasp');

exports.pkg = function(args, callback) {
  var src;
  
  slasp.sequence([
    function(callback) {
      args.fetchurl()({
        url : new nijs.NixURL("ftp://ftp.astron.com/pub/file/file-5.11.tar.gz"),
        sha256 : "c70ae29a28c0585f541d5916fc3248c3e91baa481f63d7ccec53d1534cbcc9b7"
      }, callback);
    },
    
    function(callback, _src) {
      src = _src;
      args.zlib(callback);
    },
    
    function(callback, zlib) {
      args.stdenv().mkDerivation ({
        name : "file-5.11",
        src : src,
        buildInputs : [ zlib ],
    
        meta : {
          description : "A program that shows the type of files",
          homepage : new nijs.NixURL("http://darwinsys.com/file")
        }
      }, callback);
    }
  ], callback);
};

The above module defines exactly the same package as shown earlier, but asynchronously. For example, it does not return a value, but uses a callback function to pass the evaluation result back to the caller. I have used the slasp library to flatten its structure and make it more readable and maintainable.

Moreover, because packages implement an asynchronous function interface, we also have to define the composition module in a slightly different way:

var pkgs = {

  stdenv : function(callback) {
    return require('./pkgs-async/stdenv.js').pkg;
  },
   
  fetchurl : function(callback) {
    return require('./pkgs-async/fetchurl').pkg({
      stdenv : pkgs.stdenv
    }, callback);
  },
  
  zlib : function(callback) {
    return require('./pkgs-async/zlib.js').pkg({
      stdenv : pkgs.stdenv,
      fetchurl : pkgs.fetchurl
    }, callback);
  },
  
  file : function(callback) {
    return require('./pkgs-async/file.js').pkg({
      stdenv : pkgs.stdenv,
      fetchurl : pkgs.fetchurl,
      zlib : pkgs.zlib
    }, callback);
  }
}

exports.pkgs = pkgs;

Again, this composition module has the same meaning as the one shown earlier, but each object member implements an asynchronous function interface with a callback.

So why are these asynchronous package specifications useful? In NiJS, there are two use cases for them. The first use case is to compile them to Nix expressions and build them with the Nix package manager (which can also be done with synchronous package definitions):

$ nijs-build pkgs-async.js -A file --async
/nix/store/c7zy6w6ls3mfmr9mvzz3jjaarikrwwrz-file-5.11

The only minor difference is that in order to use asynchronous package definitions, we have to pass the --async parameter to the nijs-build command so that they are properly recognized.

The second (and new!) use case is to execute the functions directly with NiJS. For example, we can also use the same composition module to do the following:

$ nijs-execute pkgs-async.js -A file
/home/sander/.nijs/store/file-5.11

When executing the above command, the Nix package manager is not used at all. Instead, NiJS directly executes the build function implementing the corresponding package and all its dependencies. All resulting artifacts are stored in a so-called NiJS store, which resides in the user's home directory, e.g.: /home/sander/.nijs/store.

The latter command does not depend on Nix at all, making it possible for NiJS to act as an independent package manager, while still providing the most important features that Nix has.

Implementation


The implementation of nijs-execute is straightforward. Every package directly or indirectly invokes the same function that actually executes a build operation: args.stdenv().mkDerivation(args, callback).

The original implementation for nijs-build (that compiles a JavaScript composition module to a Nix expression) looks as follows:

var nijs = require('nijs');

exports.pkg = {
  mkDerivation : function(args, callback) {
    callback(null, new nijs.NixExpression("pkgs.stdenv.mkDerivation "
        +nijs.jsToNix(args)));
  }
};

To make nijs-execute work, we can simply replace the above implementation with the following:

var nijs = require('nijs');

exports.pkg = {
  mkDerivation : function(args, callback) {
      nijs.evaluateDerivation(args, callback);
  }
};

Instead of generating a Nix expression that invokes Nixpkgs' stdenv.mkDerivation {}, we now invoke nijs.evaluateDerivation() directly, which executes the build itself.

The evaluateDerivation() function translates the first parameter object (representing build parameters) to environment variables. Each key corresponds to an environment variable, and each value is translated as follows:

  • A null value is translated to an empty string.
  • true translates to "1" and false translates to an empty string.
  • Strings, numbers, and XML objects are translated to strings literally.
  • Objects that are instances of the NixFile and NixURL prototypes are also translated to strings literally.
  • Objects that are instances of the NixInlineJS prototype are converted into a separate builder script, which gets executed by the default builder.
  • Objects that are instances of the NixRecursiveAttrSet prototype, as well as arbitrary objects, are considered derivations that need to be evaluated separately.
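
For illustration, a minimal sketch of such a translation could look as follows (an assumption for clarity, not NiJS' actual implementation; the file/URL, inline JavaScript and derivation cases are omitted):

```javascript
// Illustrative sketch (not NiJS' actual code): translating build parameter
// values to environment variable strings according to the rules above.
function toEnvValue(value) {
    if (value === null) return "";              // null -> empty string
    if (value === true) return "1";             // true -> "1"
    if (value === false) return "";             // false -> empty string
    if (typeof value == "string" || typeof value == "number")
        return value.toString();                // literal translation

    // NixFile/NixURL instances, NixInlineJS objects and (recursive)
    // attribute sets would be handled by separate cases in reality
    throw new Error("unsupported value type: " + typeof value);
}
```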

Furthermore, evaluateDerivation() invokes a generic builder script with features similar to the one in Nixpkgs:

  • All environment variables are cleared or set to dummy values, such as HOME=/homeless-shelter.
  • It supports the execution of phases. By default, it runs the following phases: unpack, patch, configure, build, and install, and it can be extended with custom ones.
  • By default, it executes a GNU Autotools build procedure: ./configure; make; make install, with configurable settings (that have a default fallback value).
  • It can also take custom build commands so that a custom build procedure can be performed.
  • It supports build hooks so that the appropriate environment variables are set when providing a buildInputs parameter. By default, the builder automatically sets the PATH, C_INCLUDE_PATH and LIBRARY_PATH environment variables. Build hooks can be used to support settings for other languages and environments, such as Python (e.g. PYTHONPATH) and Node.js (e.g. NODE_PATH).
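
For example, the effect of the buildInputs parameter on the search path variables could be sketched as follows (an illustrative approximation; the actual builder script works differently and handles many more details):

```javascript
// Illustrative sketch (not the actual builder): deriving search path
// environment variables from the store paths of the build inputs.
function composeSearchPaths(buildInputs) {
    var env = { PATH: "", C_INCLUDE_PATH: "", LIBRARY_PATH: "" };

    buildInputs.forEach(function(input) {
        // each build input is assumed to be a store path,
        // e.g. /home/sander/.nijs/store/zlib-1.2.8
        env.PATH += (env.PATH == "" ? "" : ":") + input + "/bin";
        env.C_INCLUDE_PATH += (env.C_INCLUDE_PATH == "" ? "" : ":") + input + "/include";
        env.LIBRARY_PATH += (env.LIBRARY_PATH == "" ? "" : ":") + input + "/lib";
    });

    return env;
}
```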

Discussion


Now that NiJS can act as an independent package manager, in addition to serving as an internal DSL, we can deprecate Nix and its sub projects soon and use Nix (for the time being) as a fallback for things that are not supported by NiJS yet.

NiJS has the following advantages over Nix and its sub projects:

  • I have discovered that the Nix expression language is complicated and difficult to learn. Like Haskell, it has a solid theoretical foundation and powerful features (such as laziness), but it's too hard to learn for developers without an academic background.

    Moreover, I had some difficulties accepting JavaScript in the past, but after discovering how to deal with prototypes and asynchronous programming, I started to appreciate it and really love it now.

    JavaScript has all the functional programming abilities that we need, so why should we implement our own language to accomplish the same? Furthermore, many people have proven that JavaScript is the future and we can attract more users if we use a language that more people are familiar with.
  • NiJS also prevents some confusion with a future Linux distribution that is going to be built around it. For most people, it is too hard to make a distinction between Nix and NixOS.

    With NiJS this is not a problem -- NiJS is supposed to be pronounced in Dutch as: "Nice". The future Linux distribution that will be built around it will be called: "NiJSOS", which should be pronounced as "Nice O-S" in Dutch. This is much easier to remember.
  • The same thing holds for Disnix -- Nix sounds like "Nothing" in Dutch and Disnix sounds like: "This is nothing!". This strange similarity has prevented me from properly spreading the word to the masses. However, "DisniJS" sounds like "This is nice!", which (obviously) sounds much better and is much easier to remember.
  • NiJS also makes continuous integration more scalable than Nix. We can finally get rid of all the annoying Perl code (and the Template Toolkit) in Hydra and reimplement it in Node.js using all its powerful frameworks. Since in Node.js all I/O operations are non-blocking, we can make Hydra even faster and more scalable.

Conclusion


In this blog post, I have shown that we can also specify packages asynchronously in NiJS. Asynchronous package specifications can be built directly with NiJS, without requiring them to be compiled to Nix expressions that must be built with Nix.

Since NiJS has become an independent package manager and JavaScript is the future, we can deprecate Nix (and its sub projects) soon, since NiJS has significant advantages over Nix.

NiJS can be downloaded from my GitHub page and from NPM. NiJS can also bootstrap itself :-)

Moreover, soon I will create a website, set up mailing lists, create an IRC channel, and define the other sub projects that can be built on top of it.

Follow up


UPDATE: It seems that this blog post has attracted quite a bit of attention today. For example, there has been some discussion about it on the Nix mailing list as well as the GNU Guix mailing list. Apparently, I also made a few people upset :-)

Moreover, a lot of readers probably did not notice the publishing date! So let me make it clear:

IT'S APRIL FOOLS' DAY!!!!!!!!!!!!!!!

The second thing you probably wonder is: what exactly is this "joke" supposed to mean?

In fact, NiJS is not a fake package -- it actually does exist, can be installed through Nix and NPM, and is really capable of doing the stuff described in this blog post (as well as the two previous ones).

However, the intention to make NiJS a replacement for Nix was a joke! As a matter of fact, I am a proponent of external DSLs and Nix already does what I need!

Furthermore, only 1% of NiJS' features are actually used by me. Beyond that, the whole package is simply a toy, which I created to explore the abilities of internal DSLs and some "what if" scenarios, no matter how silly they would look :-)

Although NiJS can build packages without reliance on Nix, its mechanisms are extremely primitive! The new feature described in this blog post was basically a silly experiment to develop a JavaScript specification that can be both compiled (to Nix) and interpreted (executed by NiJS directly)!

Moreover, over the last few years I have heard a lot of funny, silly, and stupid things about all kinds of aspects related to Nix, NixOS, Disnix and Node.js, which I kept in mind. I (sort of) integrated these things into a story and used a bit of sarcasm as glue! What these things exactly are is left as an open exercise for the reader :-).

Tuesday, March 25, 2014

Structured asynchronous programming (Asynchronous programming with JavaScript part 3)

A while ago, I explained that JavaScript execution environments, such as a web browser or Node.js, do not support multitasking. Such environments have a single event loop and when JavaScript code is being executed, nothing else can be done. As a result, it might (temporarily or indefinitely) block the browser or prevent a server from handling incoming connections.

In order to execute multiple tasks concurrently, typically events are generated (such as ticks or timeouts), the execution of the program is stopped so that the event loop can process events, and eventually execution is resumed by invoking the callback function attached to an event. This model works as long as implementers properly "cooperate".

One of its undesired side effects is that code is much harder to structure due to the extensive use of callback functions. Many solutions have been developed to cope with this. In my previous blog posts I have covered the async library and promises as possible solutions.

However, after reading a few articles on the web, some discussion, and some thinking, I came to the observation that asynchronous programming -- that is, programming in environments in which executions have to be voluntarily interrupted and resumed between statements and, as a consequence, cannot immediately deliver their results within the same code block -- is an entirely different programming world.

To me, one of the most challenging parts of programming (regardless of what languages and tools are being used) is being able to decompose and translate problems into units that can be programmed using concepts of a programming language.

In an asynchronous programming world, you have to unlearn most of the concepts that are common in the synchronous programming world (to which JavaScript essentially belongs, in my opinion) and replace them with different ones.

Are callbacks our new generation's "GOTO statement"?


When I think about unlearning programming language concepts, a classic (and very famous) example that comes to mind is the "GOTO statement". In fact, a few other programmers using JavaScript claim that the usage of callbacks in JavaScript (and other programming languages as well) is our new generation's "GOTO statement".

Edsger Dijkstra said in his famous essay titled: "A case against the GO TO statement" (published as "Go To Statement Considered Harmful" in the March 1968 issue of the "Communications of the ACM") the following about it:

I became convinced that the go to statement should be abolished from all "higher level" programming languages (i.e. everything except -perhaps- plain machine code)

As a consequence, nearly every modern programming language used these days lacks the GOTO statement, and people generally consider it bad practice to use it. But I have the impression that most of us seem to have forgotten why.

To re-explain Dijkstra's essay a bit in my own words: it was mainly about getting programs correctly implemented by construction. He briefly refers to three mental aids programmers can use (which he explains in more detail in his manuscript titled: "Notes on Structured Programming") namely: enumeration, mathematical induction, and abstraction:

  • The first mental aid: enumeration, is useful to determine the correctness of a code block executing sequential and conditional (e.g. if-then-else or switch) statements.

    Basically, it is about stepping through each statement sequentially and reasoning, for each step, whether some invariant holds. Each step can be addressed independently with what he describes as "a single textual index".
  • The second mental aid: mathematical induction, comes in handy when working with (recursive) procedures and loops (e.g. while and doWhile loops).

    In his manuscript, he shows that validity of a particular invariant can be proved by looking at the basis (first step of an iteration) first and then generalize the proof to all successive steps.

    For these kinds of proofs, a single textual index no longer suffices to address each step. However, using an additional dynamic index that represents each successive procedure call or iteration step still allows one to uniquely address them. The previous index and this second (dynamic) index constitute something that he calls "an independent coordinate system".
  • Finally, abstraction (i.e. encapsulating common operations into a procedure) is useful in many ways to me. One of the things Dijkstra said about it is that somebody basically just has to think about "what it does", disregarding "how it works".

The advantage of "an independent coordinate system" is that the value of a variable can be interpreted only with respect to the progress of the process. According to Dijkstra, using the "GOTO statement" makes it quite hard (though not impossible) to define a meaningful set of such coordinates, making it harder to reason about correctness and to keep your program from becoming a mess.

So what are these coordinates really about, you may wonder? Initially, they sounded a bit abstract to me, but after some thinking, I have noticed that the way execution/error traces are presented in commonly used programming languages these days (e.g. when capturing an exception or using a debugger) uses a coordinate system like that, IMHO.

These traces have coordinates with two dimensions -- the first dimension is the name of the text file and the corresponding line number that we are currently at (assuming that each line contains a single statement). The second dimension is the stack of function invocations, each showing its location in the corresponding text file. It also makes sense to me that adding the effects of GOTOs (even when marking each of them with an individual number) to such traces is not helpful, because there could be so many of them that these traces become unreadable.

However, when using structured programming concepts (as described in his manuscript), such as sequential decomposition, alteration (e.g. if-then-else and switch), and repetition (e.g. while-do and repeat-until), the first two mental aids can be effectively used to prove validity, mainly because the structure of the program at runtime stays quite close to its static representation.

JavaScript language constructs


Like many other conventional programming languages in use these days, the JavaScript programming language supports structured programming concepts, as well as a couple of other concepts, such as functional programming and object-oriented programming through prototypes. Moreover, JavaScript lacks the goto statement.

JavaScript was originally "designed" to work in a synchronous world, which makes me wonder: what are the effects of using JavaScript's language concepts in an asynchronous world? And are the implications of these effects similar to the effects of using GOTO statements?

Function definitions


The most basic thing one can do in a language such as JavaScript is executing statements, such as variable assignments or function invocations. This is already something that changes when moving from a synchronous world to an asynchronous world. For example, take the following trivial synchronous function definition that simply prints some text on the console:

function printOnConsole(value) {
    console.log(value);
}

When moving to an asynchronous world, we may want to interrupt the execution of the function (yes I know it is not a very meaningful example for this particular case, but anyway):

function printOnConsole(value, callback) {
    process.nextTick(function() {
        console.log(value);
        callback();
    });
}

Because we generate a tick event when calling the function and then stop the execution, the function returns immediately without doing its work. The callback, which is invoked later, will do it instead.

As a consequence, we do not know when the execution is finished by merely looking when a function returns. Instead, a callback function (provided as a function parameter) can be used, that gets invoked once the work has been done. This is the reason why JavaScript functions in an asynchronous world use callbacks.

As a sidenote: I have seen some people claiming that merely changing the function interface to have a callback makes their code asynchronous. This is absolutely not true. Code becomes asynchronous if it interrupts and resumes its execution. The callback interface is simply a consequence of providing an equivalent for the return statement, which has lost its relevance in an asynchronous world.
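
This difference can be demonstrated with a small experiment. Both functions below take a callback, but only the second one is actually asynchronous (this example is mine, not from the NiJS or slasp code bases):

```javascript
// Both functions have a callback interface, but only asyncDouble interrupts
// and resumes its execution -- syncDouble runs entirely in the caller's turn.
function syncDouble(value, callback) {
    callback(value * 2); // invoked immediately: still synchronous
}

function asyncDouble(value, callback) {
    process.nextTick(function() { // execution resumes in a later tick
        callback(value * 2);
    });
}

var order = [];
syncDouble(1, function(result) { order.push("sync: " + result); });
asyncDouble(1, function(result) { order.push("async: " + result); });
order.push("end of code block");
// order will be: "sync: 2", "end of code block", "async: 2"
```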

The same thing holds for functions that return values, such as the following, which translates a numerical digit into a word:

function generateWord(num) {
    var words = [ "zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine" ];
    return words[num];
}

In an asynchronous world, we have to use a callback to pass the result to the caller:

function generateWord(digit, callback) {
    var words;
    process.nextTick(function() {
        words = [ "zero", "one", "two", "three", "four", "five",
            "six", "seven", "eight", "nine" ];
        callback(words[digit]);
    });
}

Sequential decomposition


The fact that function interfaces have become different and function invocations have to be done differently, affects all other programming language concepts in JavaScript.

Let's take the simplest structured programming concept: the sequence. Consider the following synchronous code fragment executing a collection of statements in sequential order:

var a = 1;
var b = a + 1;
var number = generateWord(b);
printOnConsole(number); // two

To me, it looks straightforward to use enumerative reasoning to conclude that the output shown in the console will be "two".

As explained earlier, in an asynchronous world, we have to pass callback functions as parameters to know when a function has finished. As a consequence, each successive statement has to be executed within the corresponding callback. If we do this in a dumb way, we probably end up writing:

var a = 1;
var b = a + 1;

generateWord(b, function(result) {
    var number = result;
    printOnConsole(number, function() {
        
    }); // two
});

As can be observed in the above code fragment, we end up one indentation level deeper every time we invoke a function, turning the code fragment into pyramid code.

Pyramid code is nasty in many ways. For example, it affects maintenance, because it has become harder to change the order of two statements. It has also become hard to add a statement, say, in the beginning of the code block, because it requires us to refactor all the successive statements. It also becomes a bit harder to read the code because of the nesting and indentation.

However, it also makes me wonder: is pyramid code a "new GOTO"? I would say no, because I think we still have not lost our ability to address statements through a "single textual index" and the ability to use enumerative reasoning.

We could also say that invoking a callback function for each function invocation introduces the second, dynamic index. On the other hand, we know that a given callback is only called by the same caller, so we can discard that second index.

My conclusion is that we still have enumerative reasoning abilities when implementing a sequence. However, the overhead of each enumeration step is (in my opinion) bigger, because we have to take the indentation and callback nesting into account.

Fortunately, I can create an abstraction to clean up this pyramid code:

function runStatement(stmts, index, callback, result) {
    if(index >= stmts.length) {
        if(typeof callback == "function")
            callback(result);
    } else {
        stmts[index](function(result) {
            runStatement(stmts, index + 1, callback, result);
        }, result);
    }
}

function sequence(stmts, callback) {
    runStatement(stmts, 0, callback, undefined);
}

The above sequence() function takes an array of functions, each requiring a callback as a parameter. Each function represents a statement. Moreover, since the abstraction is an asynchronous function itself, we also have to use a callback parameter to notify the caller when it has finished. I can refactor the earlier asynchronous code fragment into the following:

var a;
var b;
var number;

slasp.sequence([
    function(callback) {
        a = 1;
        callback();
    },

    function(callback) {
        b = a + 1;
        callback();
    },
    
    function(callback) {
        generateWord(b, callback);
    },
    
    function(callback, result) {
        number = result;
        printOnConsole(number, callback); // two
    }
]);

By using the sequence() function, we have eliminated all pyramid code, because we can indent the statements on the same level. Moreover, we can also maintain it better, because we do not have to fix the indentation and callback nesting each time we insert or move a statement.

Alteration


The usage of alteration constructs is also slightly different in an asynchronous world. Consider the following example that basically checks whether some variable contains my first name and lets the user know whether this is the case or not:

function checkMe(name) {
    return name == "Sander";
}
    
var name = "Sander";
    
if(checkMe(name)) {
    printOnConsole("It's me!");
    printOnConsole("Isn't it awesome?");
} else {
    printOnConsole("It's someone else!");
}

(As you may probably notice, I intentionally captured the conditional expression in a function, soon it will become clear why).

Again, I think it is straightforward to use enumerative reasoning to conclude that the output will be:

It's me!
Isn't it awesome?

When moving to an asynchronous world (which changes the signature of the checkMe() to have a callback) things become a bit more complicated:

function checkMe(name, callback) {
    process.nextTick(function() {
        callback(name == "Sander");
    });
}

var name = "Sander";

checkMe(name, function(result) {
    if(result) {
        printOnConsole("It's me!", function() {
            printOnConsole("Isn't it awesome?");
        });
    } else {
        printOnConsole("It's someone else!");
    }
});

We can no longer evaluate the conditional expression within the if-clause. Instead, we have to evaluate it earlier, use the callback to retrieve the result, and then use that result in the if-statement's conditional expression.

Although it is a bit inconvenient not being able to directly evaluate a conditional expression, I still do not think this affects the ability to use enumeration, for reasons similar to those for the sequential decomposition. The above code fragment basically just adds an additional sequential step, nothing more. So in my opinion, we still have not encountered a new GOTO.

Fortunately, I can also create an abstraction for the above pattern:

function when(conditionFun, thenFun, elseFun, callback) {
    sequence([
        function(callback) {
            conditionFun(callback);
        },
        
        function(callback, result) {
            if(result) {
                thenFun(callback);
            } else {
                if(typeof elseFun == "function")
                    elseFun(callback);
                else
                    callback();
            }
        }
    ], callback);
}

and use this function to express the if-statement as follows:

slasp.when(function(callback) {
    checkMe(name, callback);
}, function(callback) {
    slasp.sequence([
        function(callback) {
            printOnConsole("It's me!", callback);
        },
        
        function(callback) {
            printOnConsole("Isn't it awesome?", callback);
        }
    ], callback);
}, function(callback) {
    printOnConsole("It's someone else!", callback);
});

Now I can embed a conditional expression in my artificial when statement.

The same thing applies to the other alteration construct in JavaScript: the switch statement -- you also cannot evaluate its conditional expression directly if it involves an asynchronous function invocation. However, I can also make an abstraction (which I have called circuit) to cope with that.
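
Following the same pattern as when(), a possible shape for such a switch-like abstraction could look as follows (a hedged sketch of my own; slasp's actual circuit() function may have a different interface):

```javascript
// Hedged sketch of an asynchronous switch abstraction: the conditional
// expression is evaluated first, and its result selects the case to execute.
function asyncSwitch(conditionFun, cases, defaultFun, callback) {
    conditionFun(function(result) {
        var caseFun = cases[result];

        if (typeof caseFun == "function")
            caseFun(callback);                  // matching case
        else if (typeof defaultFun == "function")
            defaultFun(callback);               // default clause
        else
            callback();                         // no case applies
    });
}
```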

Repetition


How are the repetition constructs (e.g. while and do-while) affected in an asynchronous world? Consider the following example implementing a while loop:

function checkTreshold() {
    return (approx.toString().substring(0, 7) != "3.14159");
}

var approx = 0;
var denominator = 1;
var sign = 1;

while(checkTreshold()) {
    approx += 4 * sign / denominator;
    printOnConsole("Current approximation is: "+approx);
        
    denominator += 2;
    sign *= -1;
}

The synchronous code fragment shown above implements the Gregory-Leibniz formula to approximate pi up to 5 decimal places. To reason about its correctness, we have to use both enumeration and mathematical induction. First, we reason that the first two components of the series are correct; then we can use induction to reason that each successive component of the series is correct, e.g. the sign alternates and the denominator increases by 2 at each successive step.

If we move to an asynchronous world, we have a couple of problems beyond those described earlier. First, a repetition blocks the event loop for an unknown amount of time, so we must interrupt it. Second, if we interrupt a loop, we cannot resume it with a callback. Therefore, we must write the asynchronous equivalent of the previous code as follows:

function checkTreshold(callback) {
    process.nextTick(function() {
        callback(approx.toString().substring(0, 7) != "3.14159");
    });
}

var approx = 0;
var denominator = 1;
var sign = 1;

(function iteration(callback) {
    checkTreshold(function(result) {
        if(result) {
            approx += 4 * sign / denominator;
            printOnConsole("Current approximation is: "+approx, function() {
                denominator += 2;
                sign *= -1;
                setImmediate(function() {
                    iteration(callback);
                });
            });
        }
    });
})();

In the above code fragment, I have refactored the code into a recursive algorithm. Moreover, for each iteration step, I use setImmediate() to generate an event (I cannot use process.nextTick() in Node.js because it skips processing certain kinds of events) and I suspend the execution. The corresponding callback starts the next iteration step.

So is this the new GOTO? I would still say no! Even though we were forced to discard the while construct and use recursion instead, we can still use mathematical induction to reason about its correctness, although certain statements are wrapped in callbacks, which makes things a bit uglier and harder to maintain.

Luckily, I can also capture the above pattern in an abstraction:

function whilst(conditionFun, statementFun, callback) {
    when(conditionFun, function() {
        sequence([
            statementFun,
            
            function() {
                setImmediate(function() {
                    whilst(conditionFun, statementFun, callback);
                });
            }
        ], callback);
    }, callback);
}

The above function (called: whilst) takes three functions as parameters: the first parameter is a function that returns (through a callback) a boolean representing the conditional expression, the second parameter is a function that has to be executed for each iteration step, and the third parameter is a callback that gets invoked when the repetition has finished.

Using the whilst() function, I can rewrite the earlier example as follows:

var approx = 0;
var denominator = 1;
var sign = 1;

slasp.whilst(checkTreshold, function(callback) {
    slasp.sequence([
        function(callback) {
            approx += 4 * sign / denominator;
            callback();
        },
        
        function(callback) {
            printOnConsole("Current approximation is: "+approx, callback);
        },
        
        function(callback) {
            denominator += 2;
            callback();
        },
        
        function(callback) {
            sign *= -1;
            callback();
        }
    ], callback);
});

The same issues that we have encountered also hold for the other repetition constructs in JavaScript. doWhile is almost the same, except that the conditional expression is evaluated at the end of each iteration step. A for and a for-in loop can be refactored into a while loop, so the same applies to these constructs as well. For all these constructs I have developed corresponding asynchronous abstractions: doWhilst, from and fromEach.
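To sketch what such an abstraction could look like, here is a possible implementation of an asynchronous for loop in the same spirit as whilst() (a simplified illustration without error handling; the actual slasp implementation may differ):

```javascript
// A sketch of an asynchronous for loop: all four loop parts are
// functions taking callbacks, just like whilst()'s parameters.
function from(startFun, condFun, stepFun, statementFun, callback) {
    startFun(function() {
        (function iteration() {
            condFun(function(result) {
                if(result) {
                    statementFun(function() {
                        stepFun(function() {
                            setImmediate(iteration); // yield between iteration steps
                        });
                    });
                } else if(typeof callback == "function") {
                    callback();
                }
            });
        })();
    });
}

// Example: summing the numbers 0..4 asynchronously
var i, sum = 0;

from(function(next) { i = 0; next(); },
    function(next) { next(i < 5); },
    function(next) { i++; next(); },
    function(next) { sum += i; next(); },
    function() { console.log("sum is: " + sum); }); // prints: sum is: 10
```

Since each iteration step is deferred with setImmediate(), the loop does not block the event loop while it runs.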

Exceptions


With all the work done so far, I could already conclude that moving from a synchronous to an asynchronous world (using callbacks) results in a couple of nasty issues, but these issues are definitely not the new GOTO. However, a common extension to structured programming is the use of exceptions, which JavaScript also supports.

What if we expand our earlier example with the generateWord() function to throw an exception if a parameter is given that is not a single positive digit?

function generateWord(num) {
    if(num < 0 || num > 9) {
        throw "Cannot convert "+num+" into a word";
    } else {
        var words = [ "zero", "one", "two", "three", "four", "five",
            "six", "seven", "eight", "nine" ];
        return words[num];
    }
}

try {
    var word = generateWord(1);
    printOnConsole("We have a: "+word);
    word = generateWord(10);
    printOnConsole("We have a: "+word);
} catch(err) {
    printOnConsole("Some exception occurred: "+err);
} finally {
    printOnConsole("Bye bye!");
}

The above code also captures a possible exception and always prints "Bye bye!" on the console regardless of the outcome.

The problem with exceptions in an asynchronous world is basically the same as with the return statement. We cannot simply catch an exception because it may not have been thrown yet. So instead of throwing and catching exceptions, we must simulate them. In Node.js, this is commonly done by introducing an additional callback parameter called err (the first parameter of the callback) that is non-null if an error has been thrown.

Changing the above function definition to report errors through this callback parameter is straightforward:

function generateWord(num, callback) {
    var words;
    process.nextTick(function() {
        if(num < 0 || num > 9) {
            callback("Cannot convert "+num+" into a word");
        } else {
            words = [ "zero", "one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine" ];
            callback(null, words[num]);
        }
    });
}

However, simulating the effects of a throw and the catch and finally clauses is not straightforward. I am not going too much into the details (and it is probably best to just briefly skim over the next code fragment), but this is basically what I ended up writing (which is still partially incomplete):

generateWord(1, function(err, result) {
    if(err) {
        printOnConsole("Some exception occurred: "+err, function(err) {
            if(err) {
                // ...
            } else {
                printOnConsole("Bye bye!");
            }
        });
    } else {
        var word = result;
        printOnConsole("We have a: "+word, function(err) {
            if(err) {
                printOnConsole("Some exception occurred: "+err, function(err) {
                    if(err) {
                        // ...
                    } else {
                        printOnConsole("Bye bye!");
                    }
                });
            } else {
                generateWord(10, function(err, result) {
                    if(err) {
                        printOnConsole("Some exception occurred: "+err, function(err) {
                            if(err) {
                                // ...
                            } else {
                                printOnConsole("Bye bye!");
                            }
                        });
                    } else {
                        word = result;
                        printOnConsole("We have a: "+word, function(err) {
                            if(err) {
                                printOnConsole("Some exception occurred: "+err, function(err) {
                                    if(err) {
                                        // ...
                                    } else {
                                        printOnConsole("Bye bye!");
                                    }
                                });
                            } else {
                                // ...
                            }
                        });
                    }
                });
            }
        });
    }
});

As you may notice, the code now clearly blows up, and there is a lot of repetition because we need to simulate the effects of the throw and finally clauses.

To create an abstraction that copes with exceptions, we must adapt all the abstraction functions shown previously to evaluate the err callback parameter. If the err parameter is set, we must stop the execution and propagate the err parameter to the callback.
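For example, an err-aware variant of the sequence() abstraction could be sketched as follows (a simplified illustration; the actual slasp implementation may differ):

```javascript
// Runs the statement functions one after another. Each statement receives a
// callback and the result of the previous step. As soon as a statement
// reports an error, the remaining statements are skipped and the error is
// propagated to the final callback.
function sequence(statementFuns, callback) {
    (function next(i, result) {
        if(i >= statementFuns.length) {
            callback(null, result);
        } else {
            statementFuns[i](function(err, result) {
                if(err)
                    callback(err); // stop the execution and propagate err
                else
                    next(i + 1, result);
            }, result);
        }
    })(0);
}

// Example: the second statement never runs if the first one reports an error
sequence([
    function(callback) { callback("some error"); },
    function(callback) { callback(null, "never reached"); }
], function(err, result) {
    console.log("Caught: " + err); // prints: Caught: some error
});
```

Because the error is propagated directly to the final callback, a failing statement behaves like a throw that immediately transfers control to the catch-like handler.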

Moreover, I can also define a function abstraction named: attempt, to simulate a try-catch-finally block:

function attempt(statementFun, captureFun, lastlyFun) {
    statementFun(function(err) {
        if(err) {
            if(typeof lastlyFun != "function")
                lastlyFun = function() {};
                    
            captureFun(err, lastlyFun);
        } else {
            if(typeof lastlyFun == "function")
                lastlyFun();
        }
    });
}

and I can rewrite the mess shown earlier as follows:

slasp.attempt(function(callback) {
    slasp.sequence([
        function(callback) {
            generateWord(1, callback);
        },
        
        function(callback, result) {
            word = result;
            printOnConsole("We have a: "+word, callback);
        },
        
        function(callback) {
            generateWord(10, callback);
        },
        
        function(callback, result) {
            word = result;
            printOnConsole("We have a: "+word, callback);
        }
        
    ], callback);
}, function(err, callback) {
    printOnConsole("Some exception occurred: "+err, callback);
}, function() {
    printOnConsole("Bye bye!");
});

Objects


Another extension in JavaScript is the ability to construct objects having prototypes. In JavaScript, constructors and object methods are functions as well. I think the same applies to these kinds of functions as to regular ones -- they cannot return values immediately because they may not have finished their execution yet.

Consider the following example:

function Rectangle(width, height) {
    this.width = width;
    this.height = height;
}

Rectangle.prototype.calculateArea = function() {
    return this.width * this.height;
};

var r = new Rectangle(2, 2);

printOnConsole("Area is: "+r.calculateArea());

The above code fragment simulates a Rectangle class, constructs a rectangle having a width and height of 2, and calculates and displays its area.

When moving to an asynchronous world, we have to take into account all the things we did previously. I ended up writing:

function Rectangle(self, width, height, callback) {
    process.nextTick(function() {
        self.width = width;
        self.height = height;
        callback(null);
    });
}

Rectangle.prototype.calculateArea = function(callback) {
    var self = this;
    process.nextTick(function() {
        callback(null, self.width * self.height);
    });
};

function RectangleCons(width, height, callback) {
    function F() {};
    F.prototype = Rectangle.prototype;
    var self = new F();
    Rectangle(self, width, height, function(err) {
        if(err === null)
            callback(null, self);
        else
            callback(err);
    });
}

RectangleCons(2, 2, function(err, result) {
    var r = result;
    r.calculateArea(function(err, result) {
        printOnConsole("Area is: "+result);
    });
});

As can be observed, all functions -- including the constructor -- now have an interface with a callback, but the constructor additionally takes the object being constructed (self) as its first parameter.

The reason that I had to do something different for the constructor is that functions that are called in conjunction with new cannot propagate this back to the caller without including weird internal properties. Therefore, I had to create a "constructor wrapper" (named: RectangleCons) that first constructs an empty object with the right prototype. After the empty object has been constructed, I invoke the real constructor that does the initialisation work.

Furthermore, the this keyword only works properly within the scope of the constructor function. Therefore, I had to use a helper variable called self to make the properties of this available in the scope of the callbacks.

Writing a "wrapper constructor" is something we ideally do not want to write ourselves. Therefore, I created an abstraction for this:

function novel() {
    var args = Array.prototype.slice.call(arguments, 0);
    
    var constructorFun = args.shift();
    function F() {};
    F.prototype = constructorFun.prototype;
    F.prototype.constructor = constructorFun;
    
    var self = new F();
    args.unshift(self);
    
    var callback = args[args.length - 1];
    args[args.length - 1] = function(err, result) {
        if(err)
            callback(err);
        else
            callback(null, self);
    };
    
    constructorFun.apply(null, args);
}

And using this abstraction, I can rewrite the code as follows:

function Rectangle(self, width, height, callback) {
    process.nextTick(function() {
        self.width = width;
        self.height = height;
        callback(null);
    });
}

Rectangle.prototype.calculateArea = function(callback) {
    var self = this;
    process.nextTick(function() {
        callback(null, self.width * self.height);
    });
};

slasp.novel(Rectangle, 2, 2, function(err, result) {
    var r = result;
    r.calculateArea(function(err, result) {
        printOnConsole("Area is: "+result);
    });
});

When using novel() instead of new, we can conveniently construct objects asynchronously.

As a sidenote: if you want to use simulated class inheritance, you can still use my inherit() function (described in an earlier blog post) that takes two constructor functions as parameters. It should also work with "asynchronous" constructors.
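To illustrate how such inheritance could interact with asynchronous constructors, consider the following sketch (the inherit() function below is a hypothetical stand-in for the one from the earlier post, whose actual signature may differ):

```javascript
// Hypothetical inherit() helper: gives subCons the prototype chain of superCons
function inherit(superCons, subCons) {
    function F() {};
    F.prototype = superCons.prototype;
    subCons.prototype = new F();
    subCons.prototype.constructor = subCons;
}

// An asynchronous parent constructor
function Shape(self, name, callback) {
    process.nextTick(function() {
        self.name = name;
        callback(null);
    });
}

// An asynchronous child constructor that invokes its parent constructor first
function Square(self, size, callback) {
    Shape(self, "square", function(err) {
        if(err) {
            callback(err);
        } else {
            self.size = size;
            callback(null);
        }
    });
}

inherit(Shape, Square);

// Construction would normally be done through novel(), e.g.:
// novel(Square, 2, function(err, result) { ... });
```

An instance constructed this way is both an instanceof Square and an instanceof Shape, while errors from the parent constructor propagate through the child's callback.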

Discussion


In this blog post, I have shown that in an asynchronous world, functions have to be defined and used differently. As a consequence, most of JavaScript's language constructs are either unusable or have to be used in a different way. So basically, we have to forget about most common concepts that we normally intend to use in a synchronous world, and learn different ones.

The following table summarizes the synchronous programming language concepts and their asynchronous counterparts for which I have directly and indirectly derived patterns or abstractions:

Function interface
  Synchronous:
    function f(a) { ... }
  Asynchronous:
    function f(a, callback) { ... }

Return statement
  Synchronous:
    return val;
  Asynchronous:
    callback(null, val);

Sequence
  Synchronous:
    a; b; ...
  Asynchronous:
    slasp.sequence([
        function(callback) {
            a(callback);
        },

        function(callback) {
            b(callback);
        }
        ...
    ]);

if-then-else
  Synchronous:
    if(condFun())
        thenFun();
    else
        elseFun();
  Asynchronous:
    slasp.when(condFun,
        thenFun,
        elseFun);

switch
  Synchronous:
    switch(condFun()) {
        case "a":
            funA();
            break;
        case "b":
            funB();
            break;
        ...
    }
  Asynchronous:
    slasp.circuit(condFun,
        function(result, callback) {
            switch(result) {
                case "a":
                    funA(callback);
                    break;
                case "b":
                    funB(callback);
                    break;
                ...
            }
        });

Recursion
  Synchronous:
    function fun() { fun(); }
  Asynchronous:
    function fun(callback) {
        setImmediate(function() {
            fun(callback);
        });
    }

while
  Synchronous:
    while(condFun()) {
        stmtFun();
    }
  Asynchronous:
    slasp.whilst(condFun, stmtFun);

doWhile
  Synchronous:
    do {
        stmtFun();
    } while(condFun());
  Asynchronous:
    slasp.doWhilst(stmtFun, condFun);

for
  Synchronous:
    for(startFun();
        condFun();
        stepFun()
    ) {
        stmtFun();
    }
  Asynchronous:
    slasp.from(startFun,
        condFun,
        stepFun,
        stmtFun);

for-in
  Synchronous:
    for(var a in arrFun()) {
        stmtFun();
    }
  Asynchronous:
    slasp.fromEach(arrFun,
        function(a, callback) {
            stmtFun(callback);
        });

throw
  Synchronous:
    throw err;
  Asynchronous:
    callback(err);

try-catch-finally
  Synchronous:
    try {
        funA();
    } catch(err) {
        funErr();
    } finally {
        funFinally();
    }
  Asynchronous:
    slasp.attempt(funA,
        function(err, callback) {
            funErr(callback);
        },
        funFinally);

constructor
  Synchronous:
    function Cons(a) {
        this.a = a;
    }
  Asynchronous:
    function Cons(self, a, callback) {
        self.a = a;
        callback(null);
    }

new
  Synchronous:
    new Cons(a);
  Asynchronous:
    slasp.novel(Cons, a, callback);

To answer the question whether callbacks are the new GOTO: my conclusion is that they are not. Although they have drawbacks, such as making code harder to read, maintain and adapt, they do not affect our ability to use enumeration or mathematical induction.

However, if we start using exceptions, things become much more difficult, and developing abstractions becomes unavoidable -- but that has little to do with callbacks themselves. Simulating exception behaviour is complicated in general, and the nasty side effects of callbacks merely aggravate it.

Another funny observation is that it has become quite common to use JavaScript for asynchronous programming, even though it was developed for synchronous programming, which means that most of its constructs are unusable in that setting. Fortunately, we can cope with that by implementing useful abstractions ourselves (or through third-party libraries), but IMHO it would be better if a programming language had all the relevant facilities suitable for the domain in which it is going to be used.

Conclusion


In this blog post, I have explained that moving from a synchronous to an asynchronous world requires forgetting certain programming language concepts and using different asynchronous equivalents.

I have made a JavaScript library out of the abstractions in this blog post (yep, that is yet another abstraction library!), because I think they might come in handy at some point. It is named slasp (SugarLess Asynchronous Structured Programming), because it implements abstractions that stay close to the bare bones of JavaScript. It provides no sugar, such as abstractions borrowed from functional programming languages, which most other libraries do.

The library can be obtained from my GitHub page and through NPM and used under the terms and conditions of the MIT license.

Sunday, March 16, 2014

Implementing consistent layouts for websites

Recently, I have wiped the dust off an old dormant project and I have decided to put it on GitHub, since I have found some use for it again. It is a personal project I started a long time ago.

Background


I got the inspiration for this project while working on my bachelor thesis project internship at IBM in 2005. I was developing an application usage analyzer system which included a web front-end implementing their intranet layout. I observed that it was a bit tedious to get it implemented properly. Moreover, I noticed that I had to repeat the same patterns over and over again for each page.

I saw some "tricks" that other people did to cope with these issues, but I considered all of them workarounds -- they were basically a bunch of includes in combination with a bit of iteration to make it work, but looked overly complicated and had all kinds of issues.

Some time before my internship, I learned about the Model-view-controller architectural pattern and I was looking into applying this pattern to the web front-end I was developing.

After some searching on the web using the keywords MVC and Java Enterprise Edition (the underlying technology used to implement the system), I stumbled upon a JavaWorld article titled: 'Understanding JavaServer Pages Model 2 architecture'. Although the article was specifically about the Model 2 architecture, I considered the Model 1 variant -- also described in the same article -- good enough for what I needed.

I observed that every page of an intranet application looks quite similar to others. For example, they had the same kinds of sections, same style, same colors etc. The only major differences were the selected menu item in the menu section and the contents (such as text) that is being displayed.

I created a model of the intranet layout that basically encodes the structure of the menu section that is being displayed on all pages of the web application. Each item in the menu redirects the user to the same page which -- based on the selected menu option -- displays different contents and a different "active" link. To cite my bachelor's thesis (which was written in Dutch):

De menu instantie bevat dus de structuur van het menu en de JSP zorgt ervoor dat het menu in de juiste opmaak wordt weergegeven. Deze aanpak is gebaseerd is op het Model 1 [model1] architectuur:

which I could translate into something like:

Hence, the menu instance contains the structure of the menu and the JSP is responsible for properly displaying the menu structure. This approach is based on the Model 1 [model1] architecture.

(As a sidenote: The website I am referring to calls "JSP Model 1" an architecture, which I blindly adopted in my thesis. These days, MVC is not something I would call an architecture, but rather an architectural pattern!)

I was quite satisfied with my implementation of the web front-end and some of my coworkers liked the fact that I was capable of implementing the intranet layout completely on my own and to be able to create and modify pages so easily.

Creating a library


After my internship, I was not too satisfied with the web development work I did prior to it. I had developed several websites and web applications that I still maintained, but all of them were implemented in an ad-hoc way -- one web application had a specific aspect implemented in a better way than others. Moreover, I kept reimplementing similar patterns over and over again including layout elements. I also did not reuse code effectively apart from a bit of copying and pasting.

From that moment on, I wanted everything that I had to develop to have the same (and the best possible) quality and to reuse as much code as possible so that every project would benefit from it.

I started a new library project from scratch. In fact, there were two library projects for two different programming languages. Initially, I started implementing a Java Servlet/JSP version, since I became familiar with it during my internship at IBM and I considered it to be good and interesting technology to use.

However, all my past projects were implemented in PHP and also most of the web applications I maintained were hosted at shared webhosting providers only supporting PHP. As a result, I also developed a PHP version which became the version that I actually used for most of the time.

I could not use any code from my internship. Apart from the fact that it was IBM's property, it was also too specific for IBM intranet layouts. Moreover, I needed something that was even more general and more flexible, so that I could encode all the layouts that I had implemented myself in the past. However, I kept in mind the ideas behind the Model-1 and Model-2 architectural patterns that I had discovered.

Moreover, I also studied some usability heuristics (provided by the Nielsen-Norman Group) which I tried to implement in the library:

  • Visibility of system status. I tried supporting this aspect, by ensuring that the selected links in the menu section were explicitly marked as such so that users always know where they are in the navigation structure.
  • The "Consistency and standards" aspect was supported by the fact that every page has the same kinds of sections with the same purposes. For example, the menu sections have the same behavior as well as the result of clicking on a link.
  • I tried to support "Error prevention" by automatically hiding menu links that were not accessible.

I kept evolving and improving the libraries until early 2009. The last thing I did with it was implementing my own personal homepage, which is still up and running today.

Usage


So how can these libraries be used? First, a model has to be created which captures common layout properties and the sub pages of which the application consists. In PHP, a simple application model could be defined as follows:

<?php
$application = new Application(
    /* Title */
    "Simple test website",

    /* CSS stylesheets */
    array("default.css"),

    /* Sections */
    array(
        "header" => new StaticSection("header.inc.php"),
        "menu" => new MenuSection(0),
        "submenu" => new MenuSection(1),
        "contents" => new ContentsSection(true)
    ),

    /* Pages */
    new StaticContentPage("Home", new Contents("home.inc.php"), array(
        "page1" => new StaticContentPage("Page 1", new Contents("page1.inc.php"), array(
            "page11" => new StaticContentPage("Subpage 1.1",
                new Contents("page1/subpage11.inc.php")),
            "page12" => new StaticContentPage("Subpage 1.2",
                new Contents("page1/subpage12.inc.php")),
            "page13" => new StaticContentPage("Subpage 1.3",
                new Contents("page1/subpage13.inc.php")))),
            ...
    )))
);

The above code fragment specifies the following:

  • The title of the entire web application is: "Simple test website", which will be visible in the title bar of the browser window for every sub page.
  • Every sub page of the application uses a common stylesheet: default.css
  • Every sub page has the same kinds of sections:
    • The header section always displays the same (static) content, the code of which resides in a separate PHP include (header.inc.php)
    • The menu section displays a menu navigation section displaying links reachable from the entry page.
    • The submenu section displays a menu navigation section displaying links reachable from the pages in the previous menu section.
    • The contents section displays the actual dynamic contents (usually text) that makes the page unique based on the link that has been selected in one of the menu sections.
  • The remainder of the code defines the sub pages of which the web application consists. Sub pages are organised in a tree-like structure: the first object is the entry page, the entry page has zero or more sub pages, and each sub page may have sub pages of its own, and so on.

    Every sub page provides its own contents to be displayed in the contents section that has been defined earlier. Moreover, the menu sections automatically display links to the sub pages reachable from the page currently being displayed.

By calling the following view function with the application model as parameter, we can display any of its sub pages:
displayRequestedPage($application);
?>

The above function generates a basic HTML page. The title of the page is composed of the application's title and the selected page title. Moreover, the sections are translated to div elements having an id attribute set to their corresponding array key. Each of these divs contains the contents of the include operations. The sub page selection is done by taking the last few path components of the URL that come after the script component.

If I create a "fancy" stylesheet, a bit of basic artwork and some actual contents for each include, something like this could appear on your screen:


Although the HTML generated by displayRequestedPage() is usually sufficient, I could also implement a custom view function if I want to do more advanced stuff. I have decomposed most of its aspects into sub functions that can easily be invoked from a custom function that does something different.

I also created a Java version of the same thing, which predates the PHP version. In the Java version, the model looks like this:

package test;

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
import io.github.svanderburg.layout.model.*;
import io.github.svanderburg.layout.model.page.*;
import io.github.svanderburg.layout.model.page.content.*;
import io.github.svanderburg.layout.model.section.*;

public class IndexServlet extends io.github.svanderburg.layout.view.IndexServlet
{
    private static final long serialVersionUID = 6641153504105482668L;

    private static final Application application = new Application(
        /* Title */
        "Test website",

        /* CSS stylesheets */
        new String[] { "default.css" },

        /* Pages */
        new StaticContentPage("Home", new Contents("home.jsp"))
            .addSubPage("page1", new StaticContentPage("Page 1", new Contents("page1.jsp"))
                .addSubPage("subpage11", new StaticContentPage("Subpage 1.1",
                    new Contents("page1/subpage11.jsp")))
                .addSubPage("subpage12", new StaticContentPage("Subpage 1.2",
                    new Contents("page1/subpage12.jsp")))
                .addSubPage("subpage13", new StaticContentPage("Subpage 1.3",
                    new Contents("page1/subpage13.jsp"))))
        ...
    )
    /* Sections */
    .addSection("header", new StaticSection("header.jsp"))
    .addSection("menu", new MenuSection(0))
    .addSection("submenu", new MenuSection(1))
    .addSection("contents", new ContentsSection(true));

    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
        throws ServletException, IOException
    {
        dispatchLayoutView(application, req, resp);
    }

    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
        throws ServletException, IOException
    {
        dispatchLayoutView(application, req, resp);
    }
}

As may be observed, since Java is a statically typed language, more code is needed to express the same thing. Furthermore, Java has no associative arrays in the language itself, so I decided to use fluent interfaces instead.

Moreover, the model is also embedded in a Java Servlet that dispatches the requests to a JSP page (WEB-INF/index.jsp) representing the view. This JSP page could be implemented as follows:

<%@ page language="java" contentType="text/html; charset=UTF-8"
    pageEncoding="UTF-8" import="io.github.svanderburg.layout.model.*"
    import="io.github.svanderburg.layout.model.page.*,test.*"%>
<%
Application app = (Application)request.getAttribute("app");
Page currentPage = (Page)request.getAttribute("currentPage");
%>
<%@ taglib uri="http://svanderburg.github.io" prefix="layout" %>
<layout:index app="<%= app %>" currentPage="<%= currentPage %>" />

The above page takes the application model and the current page (determined by the URL used to call it) as request parameters. It invokes the index taglib (instead of a function, as in PHP) to compose an HTML page from them. Moreover, I have also encoded sub parts of the index page as reusable taglibs.

Other features


Besides the simple usage scenario shown earlier, the libraries support a collection of other interesting features, such as:

  • Multiple content section support
  • Per-page style and script includes
  • Error pages
  • Security handling
  • Controller sections to handle GET or POST parameters. In Java, you can invoke Java Servlets to do this, making the new library technically compliant with the JSP Model-2 architectural pattern.
  • Using path components as parameters
  • Internationalised sub pages

Conclusion


In this blog post, I have described an old dormant project that I revived and released. I always had the intention to release it as free/open-source software in the past, but never actually did it until now.

These days, some people do not really consider me a "web guy". I was very active in this domain a long time ago, but I (sort of) put that interest into the background, although I am still very much involved with web application development today (in addition to software deployment techniques and several other interests).

This interesting Oatmeal comic clearly illustrates one of the major reasons why I have put my web technology interests into the background. This talk about web technology by Zed Shaw overlaps with my other major reason.

Today, I am not so interested anymore in making web sites for people or in heavily promoting this library, but I don't mind sharing code. The only thing I care about at this moment is using it to please myself.

Availability


The Java (java-sblayout) as well as the PHP (php-sblayout) versions of the libraries can be obtained from my GitHub page and used under the terms and conditions of the Apache Software License version 2.0.

Friday, February 28, 2014

Reproducing Android app deployments (or playing Angry Birds on NixOS)

Some time ago, I did a couple of fun experiments with my Android phone and the Android SDK. Moreover, I have developed a function that can be used to automate Android builds with Nix.

Not so long ago, somebody asked me if it would be possible to run arbitrary Android apps in NixOS. I realised that this was exactly the goal of my fun experiments. Therefore, I think it would be interesting to report about it.

Obtaining Android apps from a device


Besides development versions of apps that can be built with the Android SDK and deployed to a device or emulator through a USB connection, the major source of acquiring Android apps is the Google Playstore.

Although most devices (such as my phone and tablet) bundle the Google Playstore app as part of their software distributions, the system images that come with the Android SDK do not seem to have the Google Playstore app included.

Despite the fact that emulator system images do not have the Google Playstore app installed, we can still get most of the apps we want deployed to an emulator instance. What I typically do is install an app on my phone through the Google Playstore, then download it from my phone and install it in an emulator instance.

If I attach my phone to the computer and enable USB debugging on my device, I can run the following command to open a shell session:

$ adb -d shell

While navigating through the filesystem, I discovered that my phone stores apps in two locations. The system apps are stored in /system/app. All other apps reside in /data/app. One of the annoying things about the latter folder is that root access to my phone is restricted and I'm not allowed to read the contents of it:

$ cd /data/app
$ ls
opendir failed, Permission denied

Later I discovered that Android distributions use a tool called pm to deploy Android packages. Running the following command-line instruction gives me an overview of all the installed packages and the locations where they reside on the filesystem:

$ pm list packages -f
package:/system/app/GoogleSearchWidget.apk=android.googleSearch.googleSearchWidget
package:/data/app/com.example.my.first.app-1.apk=com.example.my.first.app
package:/system/app/KeyChain.apk=com.android.keychain
package:/data/app/com.appcelerator.kitchensink-1.apk=com.appcelerator.kitchensink
package:/system/app/Shell.apk=com.android.shell
package:/data/app/com.capcom.smurfsandroid-1.apk=com.capcom.smurfsandroid
package:/data/app/com.rovio.angrybirds-2.apk=com.rovio.angrybirds
package:/data/app/com.rovio.BadPiggies-1.apk=com.rovio.BadPiggies
package:/data/app/com.android.chrome-2.apk=com.android.chrome
...

As can be seen, the package manager shows me the locations of all installed apps, including those that reside in the folder that I could not inspect. Moreover, downloading the actual APK files through the Android debugger does not seem to be restricted either. For example, I can run the following Android debugger instruction to obtain the Angry Birds app that I have installed on my phone:

$ adb -d pull /data/app/com.rovio.angrybirds-2.apk
5688 KB/s (45874906 bytes in 7.875s)
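Instead of picking the path out of the listing by hand, it can also be extracted programmatically. The following small Python sketch shows the idea; the sample line reproduces the pm output format shown earlier, so the script runs without a device attached (on a real device, the listing would come from adb -d shell pm list packages -f):

```python
import re

# Sample line in the format produced by `pm list packages -f`:
# package:<path on device>=<app id>
listing = "package:/data/app/com.rovio.angrybirds-2.apk=com.rovio.angrybirds"

def find_apk_path(listing, app_id):
    """Return the on-device APK path for the given app id, or None."""
    for line in listing.splitlines():
        m = re.match(r'package:(.*)=' + re.escape(app_id) + r'$', line)
        if m:
            return m.group(1)
    return None

print(find_apk_path(listing, 'com.rovio.angrybirds'))
# prints: /data/app/com.rovio.angrybirds-2.apk
```

The returned path can then be fed directly to adb pull.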

Running arbitrary APKs in the emulator


In my earlier blog posts on automating Android builds with Nix, I have described how I implemented a Nix function (called androidenv.emulateApp { }) that generates scripts spawning emulator instances in which a development app is automatically deployed and started.

I have adapted this function to make it more convenient to deploy existing APKs and to make it more suitable for running apps for other purposes than development:

  • The original script stores the state files of the emulator instance in a temp folder, which gets discarded afterwards. For test automation this is usually desirable. However, we don't want to lose our savegames while playing games. Therefore, I added a parameter called avdHomeDir that allows someone to store the state files in a non-volatile location on the filesystem, such as the user's home directory. If this parameter is not provided, the script still uses a temp directory.
  • Since we want to keep the state of the emulator instance around, there is also no need to create it every time we launch the emulator. I have adapted the script in such a way that it only creates the AVD if it does not exist yet. Running the following instruction seems to be sufficient to check whether the AVD exists:

    $ android list avd | grep "Name: device"
    

  • The same thing applies to the app that gets deployed to the emulator instance. It's only supposed to be deployed if it is not installed yet. Running the following command-line instruction did the trick for me:

    $ adb -s emulator-5554 shell pm list packages | \
        grep package:com.rovio.angrybirds
    package:com.rovio.angrybirds
    

    It shows me the name of the package if it is installed already.
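The deployment check in the last bullet essentially boils down to a membership test on the pm output. A minimal sketch in Python (the sample output is a shortened, hypothetical version of a real listing, so the sketch runs without an emulator):

```python
# Decides whether an app still needs to be installed, given the output of:
# adb -s emulator-5554 shell pm list packages
def is_installed(pm_output, app_id):
    return ('package:' + app_id) in pm_output.splitlines()

# Hard-coded sample output, hypothetical but in the real format
pm_output = '''package:com.android.shell
package:com.rovio.angrybirds'''

print(is_installed(pm_output, 'com.rovio.angrybirds'))   # True: skip install
print(is_installed(pm_output, 'com.rovio.BadPiggies'))   # False: run adb install
```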

Automatically starting apps in the emulator


As described in my earlier blog post, the script that launches the emulator can also automatically start the app. To do this, we need the Java package identifier of the app and the name of the start activity. While developing apps, these properties can be found in the manifest file that is part of the development repository. However, it's a bit trickier to obtain these attributes if you only have a binary APK.

I have discovered that the aapt tool (that comes with the Android SDK) is quite useful to find what I need. Running the following command-line instruction on the Angry Birds APK revealed the following:

$ aapt l -a com.rovio.angrybirds-2.apk
...
A: package="com.rovio.angrybirds" (Raw: "com.rovio.angrybirds")

E: application (line=47)
      A: android:label(0x01010001)="Angry Birds" (Raw: "Angry Birds")
      A: android:icon(0x01010002)=@0x7f020001
      A: android:debuggable(0x0101000f)=(type 0x12)0x0
      A: android:hardwareAccelerated(0x010102d3)=(type 0x12)0x0
      E: activity (line=48)
        A: android:theme(0x01010000)=@0x1030007
        A: android:name(0x01010003)="com.rovio.fusion.App" (Raw: "com.rovio.fusion.App")
        A: android:launchMode(0x0101001d)=(type 0x10)0x2
        A: android:screenOrientation(0x0101001e)=(type 0x10)0x0
        A: android:configChanges(0x0101001f)=(type 0x11)0x4a0
        E: intent-filter (line=49)
          E: action (line=50)
            A: android:name(0x01010003)="android.intent.action.MAIN" (Raw: "android.intent.action.MAIN")
          E: category (line=51)
            A: android:name(0x01010003)="android.intent.category.LAUNCHER" (Raw: "android.intent.category.LAUNCHER")

Near the end of the output, the package name (com.rovio.angrybirds) is shown, together with the app's activities. The activity that supports the android.intent.action.MAIN intent is the one we are looking for. According to the information we have collected, the start activity that we have to call is named com.rovio.fusion.App.
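The same attributes can also be extracted from the aapt dump mechanically. A sketch (the sample dump is a trimmed fragment of the output shown above; on a real APK it would come from aapt l -a com.rovio.angrybirds-2.apk):

```python
import re

# Trimmed fragment of an `aapt l -a` dump, as shown earlier
dump = '''A: package="com.rovio.angrybirds" (Raw: "com.rovio.angrybirds")
E: activity (line=48)
  A: android:name(0x01010003)="com.rovio.fusion.App" (Raw: "com.rovio.fusion.App")
  E: intent-filter (line=49)
    E: action (line=50)
      A: android:name(0x01010003)="android.intent.action.MAIN" (Raw: "android.intent.action.MAIN")'''

# The package id is a top-level attribute of the dump
package = re.search(r'A: package="([^"]+)"', dump).group(1)

# The first android:name following an activity element is the activity class
activity = re.search(r'E: activity.*?android:name[^=]*="([^"]+)"',
                     dump, re.S).group(1)

print('%s/%s' % (package, activity))
# prints: com.rovio.angrybirds/com.rovio.fusion.App
```

A more thorough version would also verify that the MAIN action occurs inside the matched activity element, but for most APKs the launcher activity comes first.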

Writing a Nix expression


Now that we have retrieved the Angry Birds APK and discovered the attributes to automatically start it, we can automate the process that sets up an emulator instance. I wrote the following Nix expression to do this:

with import <nixpkgs> {};

androidenv.emulateApp {
  name = "angrybirds";
  app = ./com.rovio.angrybirds-2.apk;                                                                                                                                              
  platformVersion = "18";                                                                                                                                                          
  useGoogleAPIs = false;
  enableGPU = true;
  abiVersion = "x86";
  
  package = "com.rovio.angrybirds";
  activity = "com.rovio.fusion.App";

  avdHomeDir = "$HOME/.angrybirds";
}

The above Nix expression sets the following parameters:
  • The name parameter is simply used to make the Nix store path more readable.
  • The app parameter points to the Angry Birds APK that I just downloaded from my phone. It gets automatically installed in the spawned emulator instance.
  • platformVersion refers to the API-level of the system image that the emulator runs. API-level 18 corresponds to Android version 4.3.
  • If we need Google-specific functionality (such as Google Maps), we need a Google API-enabled system image. Angry Birds does not seem to require it.
  • To allow games to run smoothly, it's better to enable hardware GPU acceleration through the enableGPU parameter.
  • The abiVersion sets the CPU architecture of the emulator. Most apps are actually developed for armeabi-v7a, and this is usually the safest or the only option that works (unless the app uses no native code or supports other architectures). Angry Birds also supports x86, which can be emulated much faster.
  • The package and activity parameters are used to automatically start the app.
  • We use the avdHomeDir parameter to persistently store the state of the emulator in the .angrybirds folder of my home directory, so that progress is retained.

I can build the earlier Nix expression with the following command:

$ nix-build angrybirds.nix

And then play Angry Birds, by running:

$ ./result/bin/run-test-emulator

The above script starts the emulator, installs Angry Birds, and starts it. This is the result (to rotate the screen I used the 7 and 9 keys on the numpad):


Isn't it awesome? ;)

Transferring state


I also discovered how to transfer the state of apps (such as settings and savegames) from a device to an emulator instance and vice-versa. For some games, you can obtain these through Android's backup functionality. The following instruction makes a backup from my phone of the state of a particular app:

$ adb -d backup com.rovio.angrybirds -f state
Now unlock your device and confirm the backup operation.

When running the above instruction, you'll be asked to confirm the backup and optionally provide a password to encrypt it.

With the following instruction, I can restore the captured state in the emulator:

$ adb -s emulator-5554 restore com.rovio.angrybirds -f state
Now unlock your device and confirm the backup operation.

While running the latter operation, you'll also be asked for confirmation.

Conclusion


In this blog post, I have described how we can automatically deploy existing Android APKs in an emulator instance using the Nix package manager. I have used it to play Angry Birds and a couple of other Android games in NixOS.

There are a few caveats that you have to keep in mind:

  • I have observed that quite a few apps, especially games, have native dependencies. Most of these games only seem to work on ARM-based systems. Although x86 images are much faster to emulate, you will not benefit from the speed boost they may give you if this CPU architecture is not supported.
  • Some apps use Google API-specific functionality. Unfortunately, the Android SDK does not provide non-ARM based system images that support them. In a previous blog post, I have shown a Nix expression that can be used to create x86 Google API-enabled system images from the ARM-based ones, although it may be a bit tricky to set them up.
  • Some apps may install additional files besides the APK when they are installed through the Google Playstore. Running adb logcat and inspecting the error messages in the logs helped me out a few times.

Availability


The androidenv.emulateApp { } function is part of Nixpkgs.

It's also important to point out that the Nixpkgs repository does NOT contain any prepackaged Android games or apps. You have to obtain and deploy these apps yourself!

Tuesday, January 28, 2014

Building Appcelerator Titanium apps with Nix

Last month, I have been working on quite a lot of things. One of the things I did was improving the Nix function that builds Titanium SDK applications. In fact, it was in Nixpkgs for quite a while already, but I have never written about it on my blog, apart from a brief reference in an earlier blog post about Hydra.

The reason that I have decided to write about this function is that the process of getting Titanium applications deployable with Nix is quite painful (although I have managed to do it), and I want to report about my experiences so that these issues can hopefully be resolved in the future.

Although I have a strong opinion on certain aspects of Titanium, this blog post is not meant to discuss the development aspects of the Titanium framework. Instead, the focus is on getting the builds of Titanium apps automated.

What is Titanium SDK?


Titanium is an application framework developed by Appcelerator, whose purpose is to enable rapid development of mobile apps for multiple platforms. Currently, Titanium supports iOS, Android, Tizen, Blackberry and mobile web applications.

With Titanium, developers use JavaScript as an implementation language. The JavaScript code is packaged along with the produced app bundles, deployed to an emulator or device, and interpreted there. For example, Google's V8 JavaScript runtime is used on Android, and Apple's JavaScriptCore on iOS.

Besides using JavaScript code, Titanium also provides an API supporting database access and (fairly) cross platform GUI widgets that have a (sort of) native look on each platform.

Titanium is not a write-once-run-anywhere approach when it comes to cross-platform support, but Appcelerator claims that 60-90% of the app code can be reused among platforms.

Finally, the Titanium Studio software distribution is proprietary software, but most of its underlying components (including the Titanium SDK) are free and open-source software available under the Apache Software License. As far as I can see, the Nix function that I wrote does not depend on any proprietary components, besides the Java Development Kit.

Packaging the Titanium CLI


The first thing that needs to be done to automate Titanium builds is being able to build stuff from the command-line. Appcelerator provides a command-line utility (CLI) that is specifically designed for this purpose and is provided as a Node.js package that can be installed through the NPM package manager.

Packaging NPM stuff in Nix is actually quite straightforward and probably the easiest part of getting the builds of Titanium apps automated. Simply adding titanium to the list of node packages (pkgs/top-level/node-packages.json) in Nixpkgs and running npm2nix, a utility developed by Shea Levy that automatically generates Nix expressions for any node package and all its dependencies, did the job for me.

Packaging the Titanium SDK


The next step is packaging the Titanium SDK, which contains API libraries, templates and build script plugins for each target platform. The CLI supports multiple SDK versions at the same time and requires at least one SDK version to be installed.

I've obtained an SDK version from Appcelerator's continuous builds page. Since the SDK distributions are ZIP files containing binaries, I have to use the patching/wrapping tricks I have described in a few earlier blog posts again.

The Nix expression I wrote for the SDK basically unzips the 3.2.1 distribution, copies the contents into the Nix store and makes the following changes:

  • The SDK distribution contains a collection of Python scripts that execute build and debugging tasks. However, to be able to run them in NixOS, the shebangs must be changed so that the Python interpreter can be found:

    find . -name \*.py | while read i
    do
        sed -i -e "s|#!/usr/bin/env python|#!${python}/bin/python|" $i
    done
    
  • The SDK contains a subdirectory (mobilesdk/3.2.1.v20140206170116) with a version number and timestamp in it. However, the timestamp is a bit inconvenient, because the Titanium CLI explicitly checks for SDK folders that correspond to a Titanium SDK version number in a Titanium project file (tiapp.xml). Therefore, I strip it out of the directory name to make my life easier:

    $ cd mobilesdk/*
    $ mv 3.2.1.v20140206170116 3.2.1.GA
    
  • The Android builder script (mobilesdk/*/android/builder.py) packages certain files into an APK bundle (which is technically a ZIP file).

    However, the script throws an exception if it encounters files with timestamps below January 1, 1980, which are not supported by the ZIP file format. This is a problem, because Nix automatically resets timestamps of deployed packages to one second after January 1, 1970 (a.k.a. UNIX-time: 1) to make builds more deterministic. To remedy the issue, I had to modify several pieces of the builder script.

    What I basically did to fix this is search for invocations of ZipFile.write() that add a file from the filesystem to a zip archive, such as:

    apk_zip.write(os.path.join(lib_source_dir, 'libtiverify.so'), lib_dest_dir + 'libtiverify.so')
    

    I refactored such invocations into a code fragment using a file stream:

    info = zipfile.ZipInfo(lib_dest_dir + 'libtiverify.so')
    info.compress_type = zipfile.ZIP_DEFLATED
    info.create_system = 3
    tf = open(os.path.join(lib_source_dir, 'libtiverify.so'), 'rb')
    apk_zip.writestr(info, tf.read())
    tf.close()
    

    The above code fragment ignores the timestamps of the files to be packaged (the freshly created ZipInfo object carries a valid post-1980 default date), thus fixing the issue with files that reside in the Nix store.
  • There were two ELF executables (titanium_prep.{linux32,linux64}) in the distribution. To be able to run them under NixOS, I had to patch them so that the dynamic linker can be found:

    $ patchelf --set-interpreter ${stdenv.gcc.libc}/lib/ld-linux-x86-64.so.2 \
        titanium_prep.linux64
    
  • The Android builder script (mobilesdk/*/android/builder.py) requires the sqlite3 python module and the Java Development Kit. Since dependencies do not reside in standard locations in Nix, I had to wrap the builder script to allow it to find them:

    mv builder.py .builder.py
    cat > builder.py <<EOF
    #!${python}/bin/python
        
    import os, sys
        
    os.environ['PYTHONPATH'] = '$(echo ${python.modules.sqlite3}/lib/python*/site-packages)'
    os.environ['JAVA_HOME'] = '${jdk}/lib/openjdk'
        
    os.execv('$(pwd)/.builder.py', sys.argv)
    EOF
    

    Although the Nixpkgs collection has a standard function (wrapProgram) to easily wrap executables, I could not use it, because this function turns any executable into a shell script. The Titanium CLI expects the builder to be a Python script and will fail if there is shell code around it.
  • The iOS builder script (mobilesdk/osx/*/iphone/builder.py) invokes ditto to do a recursive copy of a directory hierarchy. However, this executable cannot be found in a Nix builder environment, since the PATH environment variable is set to only the dependencies that are specified. The following command fixes it:

    $ sed -i -e "s|ditto|/usr/bin/ditto|g" \
        $out/mobilesdk/osx/*/iphone/builder.py
    
  • When building IPA files for iOS devices, the Titanium CLI invokes xcodebuild, that in turn invokes the Titanium CLI again. However, it does not seem to propagate all parameters properly, such as the path to the CLI's configuration file. The following modification allows me to set an environment variable called: NIX_TITANIUM_WORKAROUND providing additional parameters to work around it:

    $ sed -i -e "s|--xcode|--xcode '+process.env['NIX_TITANIUM_WORKAROUND']+'|" \
        $out/mobilesdk/osx/*/iphone/cli/commands/_build.js
    
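To see the timestamp problem (and the writestr() remedy) from the bullets above in isolation, the following self-contained Python sketch mimics a Nix store file by resetting its timestamp to one second after the epoch; the file and archive names are made up:

```python
import os, tempfile, zipfile

# Create a dummy file and give it a Nix-store-like timestamp (UNIX time 1)
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, 'libtiverify.so')
with open(src, 'wb') as f:
    f.write(b'dummy contents')
os.utime(src, (1, 1))

# ZipFile.write() chokes on timestamps before 1980, which the ZIP format
# cannot represent
broken = zipfile.ZipFile(os.path.join(tmpdir, 'broken.apk'), 'w')
try:
    broken.write(src, 'lib/armeabi/libtiverify.so')
    print('write() accepted the file')
except Exception:
    print('write() rejected the pre-1980 timestamp')

# The remedy: writestr() with a fresh ZipInfo, which ignores the file's
# own timestamp and carries a valid default date instead
archive = os.path.join(tmpdir, 'app.apk')
apk_zip = zipfile.ZipFile(archive, 'w')
info = zipfile.ZipInfo('lib/armeabi/libtiverify.so')
info.compress_type = zipfile.ZIP_DEFLATED
info.create_system = 3
with open(src, 'rb') as tf:
    apk_zip.writestr(info, tf.read())
apk_zip.close()

print(zipfile.ZipFile(archive).namelist())
```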

Building Titanium Apps


Besides getting the Titanium CLI and SDK packaged in Nix, we must also be able to build Titanium apps. Apps can be built for various target platforms and come in several variants.

For some unknown reason, the Titanium CLI (in contrast to the old Python build script) forces people to log in with their Appcelerator account before any build task can be executed. However, I discovered that after logging in, a file is written into the ~/.titanium folder indicating that the system has logged in. I can simulate logins by creating this file myself:

export HOME=$TMPDIR
    
mkdir -p $HOME/.titanium
cat > $HOME/.titanium/auth_session.json <<EOF
{ "loggedIn": true }
EOF

We also have to tell the Titanium CLI where the Titanium SDK can be found. The following command-line instruction updates the config to provide the path to the SDK that we have just packaged:

$ echo "{}" > $TMPDIR/config.json
$ titanium --config-file $TMPDIR/config.json --no-colors \
    config sdk.defaultInstallLocation ${titaniumsdk}

The Titanium SDK also contains a collection of prebuilt modules, such as one to connect to Facebook. To allow them to be found, I run the following command-line instruction to adapt the module search path:

$ titanium --config-file $TMPDIR/config.json --no-colors \
    config paths.modules ${titaniumsdk}

I have also noticed that if the SDK version specified in a Titanium project file (tiapp.xml) does not match the version of the installed SDK, the Titanium CLI halts with an exception. Of course, the version number in a project file can be adapted, but in my opinion it's more flexible to just be able to take any version. The following instruction replaces the version inside tiapp.xml with something else:

$ sed -i -e "s|<sdk-version>[0-9a-zA-Z\.]*</sdk-version>|<sdk-version>${tiVersion}</sdk-version>|" tiapp.xml

Building Android apps from Titanium projects


For Android builds, we must tell the Titanium CLI where to find the Android SDK. The following command-line instruction adds its location to the config file:

$ titanium config --config-file $TMPDIR/config.json --no-colors \
    android.sdkPath ${androidsdkComposition}/libexec/android-sdk-*

The androidsdkComposition variable refers to an Android SDK plugin composition provided by the Android SDK Nix expressions that I have developed earlier.

After performing the previous operation, the following command-line instruction can be used to build a debug version of an Android app:

$ titanium build --config-file $TMPDIR/config.json --no-colors --force \
    --platform android --target emulator --build-only --output $out

If the above command succeeds, an APK bundle called app.apk is placed in the Nix store output folder. This bundle contains all the project's JavaScript code and is signed with a developer key.

The following command produces a release version of the APK (meant for submission to the Play Store) in the Nix store output folder, with a given key store, key alias and key store password:

$ titanium build --config-file $TMPDIR/config.json --no-colors --force \
    --platform android --target dist-playstore --keystore ${androidKeyStore} \
    --alias ${androidKeyAlias} --password ${androidKeyStorePassword} \
    --output-dir $out

Before the JavaScript files are packaged along with the APK file, they are first passed through Google's Closure Compiler, which performs some static checking, removes dead code, and minifies all the source files.

Building iOS apps from Titanium projects


Apart from Android, we can also build iOS apps from Titanium projects.

I have discovered that while building for iOS, the Titanium CLI invokes xcodebuild which in turn invokes the Titanium CLI again. However, it does not propagate the --config-file parameter, causing it to fail. The earlier hack that I made in the SDK expression with the environment variable can be used to circumvent this:

export NIX_TITANIUM_WORKAROUND="--config-file $TMPDIR/config.json"

After applying the workaround, building an app for the iPhone simulator is straightforward:

$ cp -av * $out
$ cd $out
            
$ titanium build --config-file $TMPDIR/config.json --force --no-colors \
    --platform ios --target simulator --build-only \
    --device-family universal --output-dir $out

After running the above command, the simulator executable is placed into the output Nix store folder. It turns out that the JavaScript files of the project folder are symlinked into the folder of the executable. However, after the build has completed these symlink references will become invalid, because the temp folder has been deleted. To allow the app to find these JavaScript files, I simply copy them along with the executable into the Nix store.

Finally, the most complicated task is producing IPA bundles to deploy an app to a device for testing or to the App Store for distribution.

Like native iOS apps, they must be signed with a certificate and mobile provisioning profile. I used the same trick described in an earlier blog post on building iOS apps with Nix to generate a temporary keychain in the user's home directory for this:

export HOME=/Users/$(whoami)
export keychainName=$(basename $out)
            
security create-keychain -p "" $keychainName
security default-keychain -s $keychainName
security unlock-keychain -p "" $keychainName
security import ${iosCertificate} -k $keychainName -P "${iosCertificatePassword}" -A

provisioningId=$(grep UUID -A1 -a ${iosMobileProvisioningProfile} | grep -o "[-A-Z0-9]\{36\}")
        
if [ ! -f "$HOME/Library/MobileDevice/Provisioning Profiles/$provisioningId.mobileprovision" ]
then
    mkdir -p "$HOME/Library/MobileDevice/Provisioning Profiles"
    cp ${iosMobileProvisioningProfile} \
        "$HOME/Library/MobileDevice/Provisioning Profiles/$provisioningId.mobileprovision"
fi

I also discovered that builds fail because some file (the facebook module) from the SDK cannot be read (Nix makes deployed packages read-only). I circumvented this issue by making a copy of the SDK in my temp folder, fixing the file permissions, and configuring the Titanium CLI to use the copied SDK instance:

cp -av ${titaniumsdk} $TMPDIR/titaniumsdk
            
find $TMPDIR/titaniumsdk | while read i
do
    chmod 755 "$i"
done

titanium --config-file $TMPDIR/config.json --no-colors \
    config sdk.defaultInstallLocation $TMPDIR/titaniumsdk

Because I cannot use the temp folder as a home directory, I also have to simulate a login again:

$ mkdir -p $HOME/.titanium
$ cat > $HOME/.titanium/auth_session.json <<EOF
{ "loggedIn": true }
EOF

Finally, I can build an IPA by running:

$ titanium build --config-file $TMPDIR/config.json --force --no-colors \
    --platform ios --target dist-adhoc --pp-uuid $provisioningId \
    --distribution-name "${iosCertificateName}" \
    --keychain $HOME/Library/Keychains/$keychainName \
    --device-family universal --output-dir $out

The above command-line invocation minifies the JavaScript code, builds an IPA file with a given certificate, mobile provisioning profile and authentication credentials, and puts the result in the Nix store.

Example: KitchenSink


I have encapsulated all the build commands shown in the previous sections into a Nix function called: titaniumenv.buildApp {}. To test the usefulness of this function, I took KitchenSink, an example app provided by Appcelerator to demonstrate Titanium's abilities. The app can be deployed to all target platforms that the SDK supports.

To package KitchenSink, I wrote the following expression:

{ titaniumenv, fetchgit
, target, androidPlatformVersions ? [ "11" ], release ? false
}:

titaniumenv.buildApp {
  name = "KitchenSink-${target}-${if release then "release" else "debug"}";
  src = fetchgit {
    url = https://github.com/appcelerator/KitchenSink.git;
    rev = "d9f39950c0137a1dd67c925ef9e8046a9f0644ff";
    sha256 = "0aj42ac262hw9n9blzhfibg61kkbp3wky69rp2yhd11vwjlcq1qc";
  };
  tiVersion = "3.2.1.GA";
  
  inherit target androidPlatformVersions release;
  
  androidKeyStore = ./keystore;
  androidKeyAlias = "myfirstapp";
  androidKeyStorePassword = "mykeystore";
}

The above function fetches the KitchenSink example from GitHub and builds it for a given target, such as iphone or android, and supports building a debug version for an emulator/simulator, or a release version for a device or for the Play store/App store.

By invoking the above function as follows, a debug version of the app for Android is produced:

import ./kitchensink {
  inherit (pkgs) fetchgit titaniumenv;
  target = "android";
  release = false;
}

The following function invocation produces an iOS executable that can be run in the iPhone simulator:

import ./kitchensink {
  inherit (pkgs) fetchgit titaniumenv;
  target = "iphone";
  release = false;
}

As may be observed, building KitchenSink through Nix is a straightforward process for most targets. However, the target producing an IPA version of KitchenSink that we can deploy to a real device is a bit complicated to use, because of some restrictions made by Apple.

Since all apps that are deployed to a real device have to be signed and the mobile provisioning profile should match the app's app id, this poses a problem. Luckily, I can apply a renaming trick comparable to the one I have described earlier in a blog post about improving the testability of iOS apps. Simply executing the following commands in the KitchenSink folder was sufficient:

sed -i -e "s|com.appcelerator.kitchensink|${newBundleId}|" tiapp.xml
sed -i -e "s|com.appcelerator.kitchensink|${newBundleId}|" manifest

The above commands change the com.appcelerator.kitchensink app id into any other specified string. If this app id is changed to the corresponding id in a mobile provisioning profile, then you should be able to deploy KitchenSink to a real device.

I have added the above renaming procedure to the KitchenSink expression. The following example invocation of the earlier Nix function shows how we can rename the app's id to com.example.kitchensink and how to use a certificate and mobile provisioning profile for an existing app:

import ./kitchensink {
  inherit (pkgs) stdenv fetchgit titaniumenv;
  target = "iphone";
  release = true;
  rename = true;
  newBundleId = "com.example.kitchensink";
  iosMobileProvisioningProfile = ./profile.mobileprovision;
  iosCertificate = ./certificate.p12;
  iosCertificateName = "Cool Company";
  iosCertificatePassword = "secret";
}


By using the above expressions, KitchenSink can be built for both Android and iOS. The left picture above shows what it looks like on iOS; the right picture shows what it looks like on Android.

Discussion


With the Titanium build function described in this blog post, I can automatically build Titanium apps for both iOS and Android using the Nix package manager, although it was quite painful to get it done and tedious to maintain.

What bothers me the most about this process is the fact that Appcelerator has crafted their own custom build tool with lots of complexity (in terms of code size), flaws (e.g. not propagating the CLI's arguments properly from xcodebuild) and weird issues (e.g. an odd way of detecting the presence of the JDK, and invoking the highly complicated legacy Python scripts), while there are already many more mature build solutions available that can do the same job.

A quick inspection of Titanium CLI's git repository shows me that it consists of 8174 lines of code. However, not all of their build stuff is there. Some common stuff, such as the JDK and Android detection stuff, resides in the node-appc project. Moreover, the build steps are performed by plugin scripts that are distributed with the SDK.

A minor annoyance is that the new Node.js-based Titanium CLI requires Oracle's Java Development Kit to make Android builds work, while the old Python-based build script worked fine with OpenJDK. I have no idea yet how to fix this. Since we cannot provide a Nix expression that automatically downloads Oracle's JDK (due to license restrictions), Nix users are forced to manually download and import it into the Nix store first, before any of the Titanium stuff can be built.

So how did I manage to figure all this mess out?

Besides knowing that I have to patch executables, fix shebangs and wrap certain executables, the strace command on Linux helps me out a lot (since it shows me things like files that cannot be opened), as do the error traces with line numbers that Python and Node.js print when something goes wrong, so that I can easily debug what's going on.

However, since I also have to do builds on Mac OS X for iOS devices, I observed that there is no strace on that particular operating system to ease my pain. Fortunately, I discovered that there is a similar tool called dtruss that provides comparable data regarding system calls.

There is one minor annoyance with dtruss -- it requires super-user privileges to work. Fortunately, thanks to this MacWorld article, I can fix this by setting the setuid bit on the dtrace executable:

$ sudo chmod u+s /usr/sbin/dtrace

Now I can conveniently use dtruss in unprivileged build environments on Mac OS X to investigate what's going on.

Availability


The Titanium build environment as well as the KitchenSink example are part of Nixpkgs.

The top-level expression for the KitchenSink example, as well as the build operations described earlier, is located in pkgs/development/mobile/titaniumenv/examples/default.nix. To build a debug version of KitchenSink for Android, you can run:

$ nix-build -A kitchensink_android_debug

The release version can be built by running:

$ nix-build -A kitchensink_android_release

The iPhone simulator version can be built by running:

$ nix-build -A kitchensink_ios_development

Building an IPA is slightly more complicated. You have to provide a certificate and mobile provisioning profile, and some renaming trick settings as parameters to make it work (which should of course match what's inside the mobile provisioning profile that is actually used):

$ nix-build --arg rename true \
    --argstr newBundleId com.example.kitchensink \
    --arg iosMobileProvisioningProfile ./profile.mobileprovision \
    --arg iosCertificate ./certificate.p12 \
    --argstr iosCertificateName "Cool Company" \
    --argstr iosCertificatePassword secret \
    -A kitchensink_ipa

There are also a couple of emulator jobs to easily spawn an Android emulator or iPhone simulator instance.

Currently, iOS and Android are the only target platforms supported. I did not investigate Blackberry, Tizen or Mobile web applications.