Thursday, April 30, 2015

An evaluation and comparison of Snappy Ubuntu

A few months ago, I noticed that somebody was referring to my "On Nix and GNU Guix" blog post from the Ask Ubuntu forum. The person who started the topic wanted to know how Snappy Ubuntu compares to Nix and GNU Guix.

Unfortunately, he did not read my blog post (or possibly one of the three Nix and NixOS explanation recipes) in detail.

Moreover, I was hoping that somebody else involved with Snappy Ubuntu would do a more in-depth comparison and write a response, but this has not happened -- as of today, there is still no answer.

Because of these reasons, I have decided to take a look at Snappy Ubuntu Core and do an evaluation myself.

What is Snappy Ubuntu?

Snappy is Ubuntu's new mechanism for delivering applications and system upgrades. It is used as the basis of their upcoming cloud and mobile distributions, and is supposed to be offered alongside the Debian package manager (which Ubuntu currently uses for installing and upgrading software) in their next generation desktop distribution.

Besides the ability to deploy packages, Snappy also has a number of interesting non-functional properties. For example, the website says the following:

The snappy approach is faster, more reliable, and lets us provide stronger security guarantees for apps and users -- that's why we call them "snappy" applications.

Snappy apps and Ubuntu Core itself can be upgraded atomically and rolled back if needed -- a bulletproof approach that is perfect for deployments where predictability and reliability are paramount. It's called "transactional" or "image-based" systems management, and we’re delighted to make it available on every Ubuntu certified cloud.

The text listed above contains a number of interesting quality aspects that have a significant overlap with Nix -- reliability, atomic upgrades and rollbacks, predictability, and being "transactional" are features that Nix also implements.

Package organization

The Snappy Ubuntu Core distribution uses a mostly FHS-compliant filesystem layout. One notable deviation is the folder in which applications are installed.

For application deployment the /apps folder is used, in which the files belonging to a specific application version reside in a separate sub folder. Application folders follow the naming convention /apps/<name>.<developer>/<version>/ -- each application is identified by its name, a version identifier and, optionally, a developer identifier, as shown below:

$ ls -l /apps
drwxr-xr-x 2 root ubuntu 4096 Apr 25 20:38 bin
drwxr-xr-x 3 root root   4096 Apr 25 15:56 docker
drwxr-xr-x 3 root root   4096 Apr 25 20:34 go-example-webserver.canonical
drwxr-xr-x 3 root root   4096 Apr 25 20:31 hello-world.canonical
drwxr-xr-x 3 root root   4096 Apr 25 20:38 webcam-demo.canonical
drwxr-xr-x 3 root ubuntu 4096 Apr 23 05:24 webdm.sideload

For example, /apps/webcam-demo.canonical/1.0.1 refers to version 1.0.1 of the webcam-demo package delivered by Canonical.

There are almost no requirements on the contents of an application folder. I have observed that the example packages seem to follow some conventions though. For example:

$ cd /apps/webcam-demo.canonical/1.0.1
$ find . -type f ! -iname ".*"

Binaries are typically stored inside the bin/ sub folder, while libraries are stored inside the lib/ sub folder. Moreover, the above example also ships binaries for two kinds of system architectures (x86_64 and ARM) that reside inside the bin/x86_64-linux, bin/arm-linux-gnueabihf, lib/x86_64-linux and lib/arm-linux-gnueabihf sub folders.

The only sub folder that has a specific purpose is meta/, which is supposed to contain at least two files -- the readme.md file contains documentation in which the first heading and the first paragraph have a specific meaning, and the package.yaml file contains various meta attributes related to the deployment of the package.

Snappy's package storing convention also makes it possible to store multiple versions of a package next to each other, as shown below:

$ ls -l /apps/webcam-demo.canonical
drwxr-xr-x 7 clickpkg clickpkg 4096 Apr 24 19:38 1.0.0
drwxr-xr-x 7 clickpkg clickpkg 4096 Apr 25 20:38 1.0.1
lrwxrwxrwx 1 root     root        5 Apr 25 20:38 current -> 1.0.1

Moreover, every application folder contains a symlink named: current/ that refers to the version that is currently in use. This approach makes it possible to do atomic upgrades and rollbacks by merely flipping the target of the current/ symlink. As a result, the system always refers to an old or new version of the package, but never to an inconsistent mix of the two.
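The atomicity of the symlink flip can be illustrated with a small shell sketch. The paths and version numbers are made up for the example; this is not Snappy's actual implementation:

```shell
# Set up two hypothetical package versions next to each other
mkdir -p /tmp/demo-app/1.0.0 /tmp/demo-app/1.0.1
cd /tmp/demo-app
rm -f current current.tmp
ln -s 1.0.1 current            # the system currently uses version 1.0.1

# Roll back to 1.0.0: create the new link under a temporary name,
# then rename it over the old one. rename(2) is atomic, so observers
# always see either the old target or the new one, never a broken link.
ln -s 1.0.0 current.tmp
mv -T current.tmp current

readlink current               # now points at: 1.0.0
```

The important property is that there is no intermediate state in which current/ is missing or points at a half-installed version.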

Apart from storing packages in isolation, they also must be made accessible to end users. For each binary that is declared in the package.yaml file, e.g.:

binaries:
 # for debugging convenience we also make the binary available as a command
 - name: bin/webcam-webui

a wrapper script is placed inside /apps/bin that is globally accessible by the users of a system through the PATH environment variable.
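Such a wrapper could look roughly like this -- a sketch only, not the script Snappy actually generates; the file name and paths are illustrative:

```shell
# Hypothetical wrapper, as it might appear in /apps/bin
# (written to /tmp here so the example is self-contained):
cat > /tmp/webcam-demo.webcam-webui <<'EOF'
#!/bin/sh
# Resolve the currently active version of the package through the
# current/ symlink and exec the real binary from there, so that
# upgrades and rollbacks only need to flip that symlink.
APP_DIR=/apps/webcam-demo.canonical/current
exec "$APP_DIR/bin/webcam-webui" "$@"
EOF
chmod +x /tmp/webcam-demo.webcam-webui

head -n 1 /tmp/webcam-demo.webcam-webui    # prints: #!/bin/sh
```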

Each wrapper script is prefixed by the package name. For example, the webcam-webui binary (shown earlier) must be started as follows:

$ webcam-demo.webcam-webui
Besides binaries, package configurations can also declare services from which systemd jobs are composed. The corresponding configuration files are put into the /etc/systemd/system folder and also use a naming convention containing the package name.

Unprivileged users can also install their own packages. The corresponding application files are placed inside $HOME/apps and are organized in the same way as the global /apps folder.

Snappy's package organization has many similarities with Nix's package organization -- Nix also stores the files belonging to a package in an isolated folder inside a special purpose directory called the Nix store.

However, Nix uses a more powerful way of identifying packages. Whereas Snappy only identifies packages with their names, version numbers and vendor identifiers, Nix package names are prefixed with unique hash codes (such as /nix/store/wan65mpbvx2a04s2s5cv60ci600mw6ng-firefox-with-plugins-27.0.1) that are derived from all build time dependencies involved to build the package, such as compilers, libraries and the build scripts themselves.

The purpose of using hash codes is to make a distinction between any variant of the same package. For example, when a package is compiled with a different version of GCC, linked against a different library dependency, when debugging symbols are enabled or disabled or certain optional features enabled or disabled, or the build procedure has been modified, a package with a different hash is built that is safely stored next to existing variants.
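The effect can be mimicked with a few lines of shell. This is a toy illustration only -- Nix's real hashing works on store derivations, not on a flat string of input names:

```shell
# Derive an identifier from a description of all build inputs.
# Change any input (compiler, library, build script) and the hash,
# and thus the store path, changes with it.
hash_of() { printf '%s' "$1" | sha256sum | cut -c1-32; }

h1=$(hash_of "gcc-4.9 glibc-2.19 builder.sh")
h2=$(hash_of "gcc-4.8 glibc-2.19 builder.sh")   # only the compiler differs

echo "/nix/store/$h1-hello-2.9"
echo "/nix/store/$h2-hello-2.9"
```

Because the two hashes differ, both variants can be stored next to each other without overwriting one another.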

Moreover, Nix also uses symlinking to refer to specific versions of packages, but the corresponding mechanism is more powerful. Nix generates so-called Nix profiles which synthesize the contents of a collection of installed packages in the Nix store into a symlink tree so that their files (such as executables) can be referenced from a single location. A second symlink indirection refers to Nix profile containing the desired versions of the packages.
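Conceptually, a profile is just a synthesized symlink tree. The idea can be sketched as follows, with toy paths rather than Nix's actual algorithm:

```shell
# Two "store" packages, each with their own bin/ folder
store=/tmp/store
mkdir -p "$store/abc123-hello-2.9/bin" "$store/def456-cowsay-3.0/bin"
touch "$store/abc123-hello-2.9/bin/hello" "$store/def456-cowsay-3.0/bin/cowsay"

# Synthesize a profile: one bin/ directory containing symlinks to every
# executable of every installed package, so PATH needs only one entry.
profile=/tmp/profile-1
rm -rf "$profile"; mkdir -p "$profile/bin"
for f in "$store"/*/bin/*; do
  ln -s "$f" "$profile/bin/$(basename "$f")"
done

ls "$profile/bin"    # both executables, referenced from a single location
```

Installing or removing a package amounts to generating a new symlink tree and flipping a second symlink to it.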

Nix profiles also allow unprivileged users to manage their own set of private packages that do not conflict with other users' private packages or the system wide installed packages. Moreover, thanks to Nix's package naming convention, the packages of unprivileged users can also be safely stored in the global Nix store, so that common dependencies can be shared among users in a secure way.

Dependency management

Software packages are rarely self contained -- they typically have dependencies on other packages, such as shared libraries. If a dependency is missing or incorrect, a package may not work properly or not at all.

I observed that in the Snappy example packages, all dependencies are bundled statically. For example, in the webcam-demo package, the lib/ sub folder contains all the library dependencies, including the libraries' own dependencies (even libc). When running an executable or starting a systemd job, a container (essentially an isolated/restricted environment) is composed in which the process runs and where it can find its dependencies in the "common FHS locations", such as /lib.

Besides static bundling, there seems to be a primitive mechanism that provides some form of sharing. According to the packaging format specification, it is also possible to declare dependencies on frameworks in the package.yaml file:

frameworks: docker, foo, bar # list of frameworks required

Frameworks are managed like ordinary packages in /apps, but they specify additional required system privileges and require approval from Canonical before they may be redistributed.

Although it is not fully clear to me from the documentation how these dependencies are addressed, I suspect that the contents of the frameworks are made available to packages inside the containers in which they run.

Moreover, I noticed that dependencies are only addressed by their names and refer to the current versions of the corresponding frameworks. The documentation does not (yet) describe a way to refer to other versions or variants of frameworks.

The Nix-way of managing dependencies is quite different -- Nix packages are constructed from source and the corresponding build procedures are executed in isolated environments in which only the specified build-time dependencies can be found.

Moreover, when constructing Nix packages, runtime dependencies are bound statically to executables, for example by modifying the RPATH of an ELF executable or by wrapping executables in scripts that set environment variables (such as CLASSPATH or PERL5LIB) allowing them to find their dependencies. Nix identifies a subset of the build-time dependencies as runtime dependencies by scanning for hash occurrences in the build result.
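The wrapper flavour of static binding can be sketched in shell. The store paths below are hypothetical; real Nix wrappers are generated during the build:

```shell
# Generate a wrapper that bakes the exact dependency locations into
# environment variables before exec'ing the real (renamed) binary.
cat > /tmp/myprog <<'EOF'
#!/bin/sh
export PERL5LIB='/nix/store/aaaa-perl-libs/lib/perl5'
exec '/nix/store/bbbb-myprog-1.0/bin/.myprog-wrapped' "$@"
EOF
chmod +x /tmp/myprog

# The dependency references are now a static part of the executable:
grep -c '/nix/store/' /tmp/myprog    # prints: 2
```

Because the full store paths (including their hashes) are embedded, the program finds exactly the dependency versions it was built against, regardless of what else is installed.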

Because dependencies are statically bound to executables, there is no need to compose containers to allow executables to find them. Furthermore, a package can refer to different versions or variants of its library dependencies without conflicting with other packages' dependencies. Sharing is also supported, because two packages can refer to the same dependency with the same hash prefix in the Nix store.

As a sidenote: with Nix you can also use a containerized approach by composing isolated environments (e.g. a chroot environment or container) in which packages can find their dependencies in common locations. A prominent Nix package that uses this approach is Steam, because Steam is basically a deployment tool itself, whose deployment properties conflict with those of Nix. Although such an approach is possible, it is only used in very exceptional cases.

System organization

Besides applications and frameworks, the base system of the Snappy Ubuntu Core distribution can also be upgraded and downgraded. However, a different mechanism is used to accomplish this.

According to the filesystem layout & updates guide, the Snappy Ubuntu Core distribution follows a specific partition layout:

  • boot partition. This is a very tiny partition used for booting and should be big enough to contain a few kernels.
  • system-a partition. This partition contains a minimal working base system. This partition is mounted read-only.
  • system-b partition. An alternative partition containing a minimal working base system. This partition is mounted read-only as well.
  • writable partition. A writable partition that stores everything else including the applications and frameworks.

Snappy uses an "A/B system partitions mechanism" to allow a base system to be updated as a single unit by applying a new system image. It is also used to roll back to the "other" base system in case of problems with the most recently-installed system by making the bootloader switch root filesystems.
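The A/B switch boils down to a bootloader flag that selects which of the two system partitions becomes the root filesystem on the next boot. A toy sketch of the idea (not the actual Snappy or bootloader logic):

```shell
# Simulate the bootloader's "which system partition do I boot?" flag
flag=/tmp/boot-flag
rm -f "$flag"

current=$(cat "$flag" 2>/dev/null || echo system-a)   # default: system-a
if [ "$current" = system-a ]; then next=system-b; else next=system-a; fi

# Applying an update writes the new image to the *other* (inactive)
# partition and flips the flag; rolling back is flipping it again.
echo "$next" > "$flag"
cat "$flag"    # prints: system-b
```

Since the running system is never modified in place, a failed update leaves the previous partition untouched and bootable.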

NixOS (the Linux distribution built around the Nix package manager) approaches system-level upgrades in a different way and is much more powerful. In NixOS, a complete system configuration is composed from packages residing in isolation in the Nix store (like ordinary packages) and these are safely stored next to existing versions. As a result, it is possible to roll back to any previous system configuration that has not been garbage collected yet.

Creating packages

According to the packaging guide, creating Snap files is very simple. It comes down to creating a directory, putting some files in it, creating a meta/ sub folder with a readme.md and a package.yaml file, and running:

$ snappy build .

The above command generates a Snap file, which is basically just a tarball containing the contents of the folder.

In my opinion, creating Snap packages is not that easy -- the above process demonstrates that delivering files from one machine to another is straightforward, but getting a package right is another thing.

Many packages on Linux systems are constructed from source code. To properly do that, you need to have the required development tools and libraries deployed first, a process that is typically easier said than done.

Snappy does not provide facilities to make that process manageable. With Snappy, it is the packager's own responsibility.

In contrast, Nix is a source package manager and provides a DSL that can be used to construct isolated environments in which builds are executed; it automatically deploys all build-time dependencies that are required to build a package.

The build facilities of Nix are quite accessible. For example, you can easily construct your own private set of Nix packages or a shell session containing all development dependencies.

Moreover, Nix also implements transparent binary deployment -- if a particular Nix package with an identical hash exists elsewhere, we can download it from a remote location instead of building it from source ourselves.
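The substitution idea can be sketched with a mock cache. This is purely illustrative -- real Nix queries metadata files in a binary cache rather than a local directory:

```shell
# A mock binary cache: presence of <path>.narinfo means a prebuilt
# binary with exactly that hash prefix is available for download.
cache=/tmp/binary-cache
mkdir -p "$cache"
touch "$cache/wan65-firefox.narinfo"

fetch_or_build() {
  if [ -e "$cache/$1.narinfo" ]; then
    echo "substitute $1"      # identical hash exists: download it
  else
    echo "build $1"           # no substitute available: build from source
  fi
}

fetch_or_build wan65-firefox    # prints: substitute wan65-firefox
fetch_or_build zzz99-newpkg     # prints: build zzz99-newpkg
```

The hash-based naming makes this safe: a substitute with the same hash is guaranteed to have been built from the same inputs.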

Isolation

Another thing the Snappy Ubuntu Core distribution does with containers (besides using them to let a package find its dependencies) is restricting the things programs are allowed to do, such as the TCP/UDP ports they are allowed to bind to.

In Nix and NixOS, it is not a common practice to restrict the runtime behaviour of programs by default. However, it is still possible to impose restrictions on running programs, by composing a systemd job for a program yourself in a system's NixOS configuration.

Discussion

The following table summarizes the conceptual differences between the Snappy Ubuntu Core and Nix/NixOS covered in this blog post:

                                  Snappy Ubuntu Core              Nix/NixOS
Concept                           Binary package manager          Source package manager (with transparent binary deployment)
Dependency addressing             By name                         Exact (using hash codes)
Dependency binding                Container composition           Static binding (e.g. by modifying RPATH or wrapping executables)
Systems composition management    "A/B" partitions                Package compositions
Construction from source          Unmanaged                       Managed
Unprivileged user installations   Supported without sharing       Supported with sharing
Runtime isolation                 Part of package configuration   Supported optionally, by manually composing a systemd job


Snappy shares some interesting traits with Nix that provide a number of huge deployment benefits -- by deviating from the FHS and storing packages in isolated folders, it becomes easier to store multiple versions of packages next to each other and to perform atomic upgrades and rollbacks.

However, something that I consider a huge drawback of Snappy is the way dependencies are managed. In the Snappy examples, all library dependencies are bundled statically, which consumes more disk space (and RAM at runtime) than necessary.

Moreover, packaging shared dependencies as frameworks is not very convenient and requires approval from Canonical before they may be distributed. As a consequence, I think this will not encourage people to modularize systems, which is generally considered a good practice.

According to the framework guide, the purpose of frameworks is to extend the base system, not to serve as a sharing mechanism. The guide also says:

Frameworks exist primarily to provide mediation of shared resources (eg, device files, sensors, cameras, etc)

So it appears that sharing in general is discouraged. In many common Linux distributions (including Debian and derivatives such as Ubuntu), the degree of reuse is pushed almost to the maximum. For example, each library is packaged individually, and sometimes libraries are even split into binary, development and documentation sub packages. I am not sure how Snappy is going to cope with such a fine granularity of reuse. Is Snappy going to be improved to support this kind of reuse as well, or is it now considered a good practice to package huge monolithic blobs?

Also, Snappy only does binary deployment and does not really help to alleviate the problem of constructing packages from source, which is also quite a challenge in my opinion. I see lots of room for improvement in this area as well.

Another funny observation is the fact that Snappy Ubuntu Core relies on advanced concepts such as containers to make programs work, while simpler solutions, such as static linking, are also available.

Finally, something that Nix/NixOS could learn from the Snappy approach is out-of-the-box runtime isolation of programs. Currently, doing this with Nix/NixOS is not as convenient as with Snappy.


This is not the only comparison I have done between Nix/NixOS and another deployment approach. A few years ago while I was still a PhD student, I also did a comparison between the deployment properties of GoboLinux and NixOS.

Interestingly enough, GoboLinux addresses packages in a similar way as Snappy, supports sharing, does not provide runtime isolation of programs, but does have a very powerful source construction mechanism that Snappy lacks.

Friday, March 13, 2015

Disnix 0.3 release announcement

In the previous blog post, I have explained the differences between Disnix (which does service deployment) and NixOps (which does infrastructure deployment), and shown that both tools can be used together to address both concerns in a deployment process.

Furthermore, I raised a couple of questions and intentionally left one question unanswered, namely: "Is Disnix still alive or is it dead?".

The answer is that Disnix's development was progressing at a very low pace for some time after I left academia -- I made minor changes once in a while, but nothing really interesting happened.

However, for the last few months, I have been using it on a daily basis and have made many big improvements. Moreover, I have reached a stable point and decided that this is a good moment to announce the next release!

New features

So what is new in this release?

Visualization tool

I have added a new tool to the Disnix toolset named: disnix-visualize that generates Graphviz images to visualize a particular deployment scenario. An example image is shown below:

The above picture shows a deployment scenario of the StaffTracker Java example in which services are divided over two machines in a network and have all kinds of complex dependency relationships as denoted by the arrows.

The tool was already included in the development versions for a while, but has never been part of any release.

Dysnomia

I also made a major change in Disnix's architecture. As explained in my old blog post about Disnix, activating and deactivating services cannot be done generically and I have developed a plugin system to take care of that.

This plugin system (formerly known as the Disnix activation scripts) has now become an independent tool named Dysnomia, which can be applied in different contexts and also be used as a standalone tool.

For example, a running MySQL DBMS instance (called a container in Dysnomia terminology) could be specified in a configuration file (such as ~/mysql-production) as follows:

type=mysql-database
mysqlUsername=root
mysqlPassword=verysecret
A database can be encoded as an SQL file (~/test-database/createdb.sql) creating the schema:

create table author
( AUTHOR_ID  INTEGER       NOT NULL,
  FirstName  VARCHAR(255)  NOT NULL,
  LastName   VARCHAR(255)  NOT NULL,
  PRIMARY KEY(AUTHOR_ID)
);

create table books
( ISBN       VARCHAR(255)  NOT NULL,
  Title      VARCHAR(255)  NOT NULL,
  AUTHOR_ID  INTEGER       NOT NULL,
  PRIMARY KEY(ISBN),
  FOREIGN KEY(AUTHOR_ID) references author(AUTHOR_ID)
    on update cascade on delete cascade
);

We can use the following command-line instruction to let Dysnomia deploy the database to the MySQL DBMS container we have just specified earlier:

$ dysnomia --operation activate --component ~/test-database \
  --container ~/mysql-production

When Disnix has to execute deployment operations, two external tools are consulted -- Nix takes care of all deployment operations of the static parts of a system, and Dysnomia takes care of performing the dynamic activation and deactivation steps.

Concurrent closure transfers

In the previous versions of Disnix, only one closure (of a collection of services and their intra-dependencies) is transferred to a target machine at a time. If a target machine has more network bandwidth than the coordinator, this is usually fine, but in all other cases, it slows the deployment process down.

In the new version, two closures are transferred concurrently by default. The number of concurrent closure transfers can be adjusted as follows:

$ disnix-env -s services.nix -i infrastructure.nix \
  -d distribution.nix --max-concurrent-transfers 4

The last command-line argument states that 4 closures should be transferred concurrently.

Concurrent service activation and deactivation

In the old Disnix versions, the activation and deactivation steps of a service on a target machine were executed sequentially, i.e. one service on a machine at a time. In all my old test cases these steps were quite cheap/quick, but now that I have encountered systems that are much bigger, I noticed that there is a lot of deployment time to be saved.

In the new implementation, Disnix tries to activate or deactivate one service per machine concurrently. The number of services that can be concurrently activated or deactivated per machine can be raised in the infrastructure model:

{
  test1 = {
    hostname = "test1";
    numOfCores = 2;
  };
}
In the above infrastructure model, the numOfCores attribute states that two services can be concurrently activated/deactivated on machine test1. If this attribute is omitted, it defaults to 1.

Multi connection protocol support

By default, Disnix uses an SSH protocol wrapper to connect to the target machines. There is also an extension available, called DisnixWebService, that uses SOAP + MTOM instead.

In the old version, changing the connection protocol meant that every target machine had to be reached with it. In the new version, you can also specify the target property and client interface per machine in the infrastructure model, to support deployments that use multiple connection protocols:

{
  test1 = {
    hostname = "test1";
    targetProperty = "hostname";
    clientInterface = "disnix-ssh-client";
  };

  test2 = {
    hostname = "test2";
    targetEPR = http://test2:8080/DisnixWebService/services/DisnixWebService;
    targetProperty = "targetEPR";
    clientInterface = "disnix-soap-client";
  };
}

The above infrastructure model states the following:

  • To connect to machine: test1, the hostname attribute contains the address and the disnix-ssh-client tool should be invoked to connect to it.
  • To connect to machine: test2, the targetEPR attribute contains the address and the disnix-soap-client tool should be invoked to connect to it.

NixOps integration

As described in my previous blog post, Disnix does service deployment and can integrate NixOS' infrastructure deployment features with an extension called DisnixOS.

DisnixOS can now also be used in conjunction with NixOps -- NixOps can be used to instantiate and deploy a network of virtual machines:

$ nixops create ./network.nix ./network-ec2.nix -d ec2
$ nixops deploy -d ec2

and DisnixOS can be used to deploy services to them:

$ disnixos-env -s services.nix -n network.nix -d distribution.nix

Omitted features

There are also a couple of features described in some older blog posts, papers, and my PhD thesis, which have not become part of the new Disnix release.

Dynamic Disnix

This is an extended framework built on top of Disnix that supports self-adaptive redeployment. Although I promised to make it part of the new release a long time ago, this did not happen. However, I did update the prototype to work with the current Disnix implementation, but it still needs refinements, documentation and other small improvements to make it usable.

Brave people who are eager to try it can pull the Dynamic Disnix repository from my GitHub page.

Snapshot/restore features of Dysnomia

In a paper I wrote about Dysnomia I also proposed state snapshotting/restoring facilities. These have not become part of the released versions of Dysnomia and Disnix yet.

The approach I have described is useful in some scenarios, but also has a couple of very big drawbacks. Moreover, it significantly alters the behavior of Disnix. I need to find a way to integrate these features properly, in such a way that they do not break the standard approach, and they must be selectively applicable as well.

Conclusion

In this blog post, I have announced the availability of the next release of Disnix. Perhaps I should give it the codename: "Disnix Forever!" or something :-). Also, the release date (Friday the 13th) seems to be appropriate.

Moreover, the previous release was considered an advanced prototype. Although I am using Disnix on a daily basis now to eat my own dogfood, and the toolset has become much more usable, I would not classify this release as something that is very mature yet.

Disnix can be obtained by installing NixOS, through Nixpkgs or from the Disnix release page.

I have also updated the Disnix homepage a bit, which should provide you with more information.

Monday, March 9, 2015

On NixOps, Disnix, service deployment and infrastructure deployment

I have written many software deployment related blog posts covering tools that are part of the Nix project. However, there is one tool that I have not elaborated about so far, namely: NixOps, which has become quite popular these days.

Although NixOps is quite popular, its availability also leads to a bit of confusion with another tool. Some Nix users, in particular newbies, are suffering from this. The purpose of this blog post is to clear up that confusion.

NixOps

NixOps is advertised as the "NixOS-based cloud deployment tool". It basically extends NixOS' (a Linux distribution built around the Nix package manager) approach of deploying a complete system configuration from a declarative specification to networks of machines, and it automatically instantiates and provisions the required machines (e.g. in an IaaS cloud environment, such as Amazon EC2) if requested.

A NixOps deployment process is driven by one or more network models that encapsulate multiple (partial) NixOS configurations. In a standard NixOps workflow, network models are typically split into a logical network model capturing settings that are machine independent and a physical network model capturing machine specific properties.

For example, the following code fragment is a logical network model consisting of three machines capturing the configuration properties of a Trac deployment, a web-based issue tracker system:

{
  network.description = "Trac deployment";

  storage =
    { pkgs, ... }:

    { services.nfs.server.enable = true;
      services.nfs.server.exports = ''
        /repos ...
      '';
      services.nfs.server.createMountPoints = true;
    };

  postgresql =
    { pkgs, ... }:

    { services.postgresql.enable = true;
      services.postgresql.package = pkgs.postgresql;
      services.postgresql.enableTCPIP = true;
      services.postgresql.authentication = ''
        local all all               trust
        host  all all 127.0.0.1/32  trust
        host  all all ::1/128       trust
        host  all all ...           trust
      '';
    };

  webserver =
    { pkgs, ... }:

    { fileSystems = [
        { mountPoint = "/repos";
          device = "storage:/repos";
          fsType = "nfs";
        }
      ];
      services.httpd.enable = true;
      services.httpd.adminAddr = "root@localhost";
      services.httpd.extraSubservices = [
        { serviceType = "trac"; }
      ];
      environment.systemPackages = [ ... ];
    };
}
The three machines in the above example have the following purpose:

  • The first machine, named storage, is responsible for storing Subversion source code repositories in the following folder: /repos and makes the corresponding folder available as a NFS mount.
  • The second machine, named postgresql, runs a PostgreSQL database server storing the tickets.
  • The third machine, named webserver, runs the Apache HTTP server hosting the Trac web application front-end. Moreover, it mounts the /repos folder as a network file system connecting to the storage machine so that the Trac web application can view the Subversion repositories stored inside it.

The above specification can be considered a logical network model, because it captures the configuration we want to deploy without any machine specific characteristics. Regardless of what kind of machines we intend to deploy to, we want these services to be available.

However, a NixOS configuration cannot be deployed without any machine specific settings. These remaining settings can be specified by writing a second model, the physical network model, which captures them:

{
  storage =
    { pkgs, ... }:

    { boot.loader.grub.version = 2;
      boot.loader.grub.device = "/dev/sda";

      fileSystems = [
        { mountPoint = "/";
          label = "root";
        }
      ];

      swapDevices = [
        { label = "swap"; }
      ];

      networking.hostName = "storage";
    };

  postgresql = ...

  webserver = ...
}

The above partial network model specifies the following physical characteristics for the storage machine:

  • GRUB version 2 should be used as bootloader and should be installed on the MBR of the hard drive: /dev/sda.
  • The hard drive partition with label: root should be mounted as root partition.
  • The hard drive partition with label: swap should be mounted as swap partition.
  • The hostname of the system should be: 'storage'

By invoking NixOps with the two network models shown earlier as parameters, we can create a NixOps deployment -- an environment containing a set of machines that belong together:

$ nixops create ./network-logical.nix ./network-physical.nix -d test

The above command creates a deployment named: test. We can run the following command to actually deploy the system configurations:

$ nixops deploy -d test

What the above command does is invoking the Nix package manager to build all the machine configurations, then it transfers their corresponding package closures to the target machines and finally activates the NixOS configurations. The end result is a collection of machines running the new configuration, if all previous steps have succeeded.

If we adapt any of the network models, and run the deploy command again, the system is upgraded. In case of an upgrade, only the packages that have been changed are built and transferred, making this phase as efficient as possible.

We can also replace the physical network model shown earlier with the following model:

{
  storage = {
    deployment.targetEnv = "virtualbox";
    deployment.virtualbox.memorySize = 1024;
  };

  postgresql = ...

  webserver = ...
}

The above physical network configuration states that the storage machine is a VirtualBox Virtual Machine (VM) requiring 1024 MiB of RAM.

When we instantiate a new deployment with the above physical network model and deploy it:

$ nixops create ./network-logical.nix ./network-vbox.nix -d vbox
$ nixops deploy -d vbox

NixOps does an extra step before doing the actual deployment of the system configurations -- it first instantiates the VMs by consulting VirtualBox and populates them with a basic NixOS disk image.

Similarly, we can also create a physical network model like this:

  let
    region = "us-east-1";
    accessKeyId = "ABCD..."; # symbolic name looked up in ~/.ec2-keys

    ec2 =
      { resources, ... }:
      { deployment.targetEnv = "ec2";
        deployment.ec2.accessKeyId = accessKeyId;
        deployment.ec2.region = region;
        deployment.ec2.instanceType = "m1.medium";
        deployment.ec2.keyPair = resources.ec2KeyPairs.my-key-pair;
        deployment.ec2.securityGroups = [ "my-security-group" ];
      };
  in
  {
    storage = ec2;

    postgresql = ec2;

    webserver = ec2;

    resources.ec2KeyPairs.my-key-pair = {
      inherit region accessKeyId;
    };
  }

The above physical network configuration states that all three machines are virtual machines residing in the Amazon EC2 cloud.

Running the following commands:

$ nixops create ./network-logical.nix ./network-ec2.nix -d ec2
$ nixops deploy -d ec2

automatically instantiates the virtual machines in EC2, populates them with basic NixOS AMI images and finally deploys the machines to run our desired Trac deployment.

(In order to make the EC2 deployment work, you need to create the security group (e.g. my-security-group) through the Amazon EC2 console first, and you must set the AWS_SECRET_ACCESS_KEY environment variable to contain the secret access key needed to connect to the Amazon services).
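
For example, before running the deploy command we could export the secret access key in the shell (the key material is deliberately left out):

$ export AWS_SECRET_ACCESS_KEY="..."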

Besides physical machines, VirtualBox, and Amazon EC2, NixOps also supports the Google Compute Engine (GCE) and Hetzner. Moreover, preliminary Azure support is available in the development version of NixOps.

With NixOps you can also do multi-cloud deployment -- it is not required to deploy all VMs in the same IaaS environment. For example, you could also deploy the first machine to Amazon EC2, the second to Hetzner and the third to a physical machine.

In addition to deploying system configurations, NixOps can be used to perform many other kinds of system administration tasks that operate at machine level.
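
For example, assuming the test deployment created earlier, a few of NixOps' sub-commands can be used as follows (this is only a small selection -- consult nixops --help for the full list):

$ nixops info -d test          # shows the deployment state of all machines
$ nixops ssh -d test storage   # opens an SSH session to the storage machine
$ nixops check -d test         # checks whether the machines are still running
$ nixops destroy -d test       # destroys the instantiated VMs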


NixOps and Disnix

Readers who happen to know me a bit may have noticed that many NixOps features are quite similar to things I did in the past -- while working for Delft University of Technology as a PhD student, I was investigating distributed software deployment techniques and developed a tool named: Disnix that also performs distributed deployment tasks using the Nix package manager as underlying technology.

I have received quite a few questions from people asking me things such as: "What is the difference between Disnix and NixOps?", "Is NixOps the successor of/a replacement for Disnix?", "Are Disnix and NixOps in competition with each other?"

The short answer is: while both tools perform distributed deployment tasks and use the Nix package manager as underlying (local) deployment technology, they are designed for different purposes and address different concerns. Furthermore, they can also be effectively used together to automate deployment processes for certain kinds of systems.

In the next sections I will try to clarify the differences and explain how they can be used together.

Infrastructure and service deployment

NixOps does something that I call infrastructure deployment -- it manages configurations that work on machine level and deploys entire system configurations as a whole.

What Disnix does is service deployment -- Disnix is primarily designed for deploying service-oriented systems. What a "service-oriented system" exactly is has always been an open debate, but a definition I have seen in the literature is: "systems composed of platform-independent entities that can be loosely coupled and automatically discovered", etc.

Disnix expects a system to be decomposed into distributable units (called services in Disnix terminology) that can be built and deployed independently to machines in a network. These components can be web services (systems composed of web services typically qualify themselves as service-oriented systems), but this is not a strict requirement. Services in a Disnix-context can be any unit that can be deployed independently, such as web services, UNIX processes, web applications, and databases. Even entire NixOS configurations can be considered a "service" by Disnix, since they are also a unit of deployment, although they are quite big.

Whereas NixOps builds, transfers and activates entire Linux system configurations, Disnix builds, transfers and activates individual services on machines in a network and manages/controls them and their dependencies individually. Moreover, the target machines to which Disnix deploys are neither required to run NixOS nor Linux. They can run any operating system and system distribution capable of running the Nix package manager.

Being able to deploy services to heterogeneous networks of machines is useful for service-oriented systems. Although services might manifest themselves as platform independent entities (e.g. because of their interfaces), they still have an underlying implementation that might be bound to technology that only works on a certain operating system. Furthermore, you might also want to have the ability to experiment with the portability of certain services among operating systems, or effectively use a heterogeneous network of operating systems to use their unique selling points effectively for the appropriate services (e.g. a Linux, Windows, OpenBSD hybrid).

For example, although I have mainly used Disnix to deploy services to Linux machines, I also once did an experiment with deploying services implemented with .NET technology to Windows machines running Windows specific system services (e.g. IIS and SQL server), because our industry partner in our research project was interested in this.

To be able to do service deployment with Disnix one important requirement must be met -- Disnix expects preconfigured machines to be present running a so-called "Disnix service" providing remote access to deployment operations and some other system services. For example, to allow Disnix to deploy Java web applications, a predeployed Servlet container, such as Apache Tomcat, must already be present on the target machine. Also other container services, such as a DBMS, may be required.

Disnix does not automate the deployment of machine configurations (which include the Disnix service and containers), requiring people to deploy a network of machines by other means first and to write an infrastructure model that reflects the machine configurations accordingly.
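
To give an impression: an infrastructure model is an ordinary Nix expression capturing the machines that are already present and their relevant properties. A minimal sketch (the machine names and property names are hypothetical and the exact structure depends on the Disnix version in use):

  {
    test1 = {
      hostname = "test1.example.org";
      tomcatPort = 8080;
    };

    test2 = {
      hostname = "test2.example.org";
      mysqlPort = 3306;
    };
  }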

Combining service and infrastructure deployment

To be able to deploy a service-oriented system into a network of machines using Disnix, we must first deploy a collection of machines running the required system services. In other words: infrastructure deployment is a prerequisite for doing service deployment.

Currently, there are two Disnix extensions that can be used to integrate service deployment and infrastructure deployment:

  • DisnixOS is an extension that complements Disnix with NixOS' deployment features to do infrastructure deployment. With this extension you can do tasks such as deploying a network of machines with NixOps first and then do service deployment inside the deployed network with Disnix.

    Moreover, with DisnixOS you can also spawn a network of NixOS VMs using the NixOS test driver and run automated tests inside them.

    A major difference from a user perspective between Disnix and DisnixOS is that the latter works with network models (i.e. networked NixOS configurations used by NixOps and the NixOS test driver) instead of infrastructure models and does the conversion between these models automatically.

    A drawback of DisnixOS is that service deployment is effectively tied to NixOS, which is a Linux distribution. DisnixOS is not very helpful if a service-oriented system must be deployed in a heterogeneous network running multiple kinds of operating systems.
  • Dynamic Disnix. With this extension, each machine in the network is supposed to publish its configuration and a discovery service running on the coordinator machine generates an infrastructure model from the supplied settings. For each event in the network, e.g. a crashing machine, a newly added machine or a machine upgrade, a new infrastructure model is generated that can be used to trigger a redeployment.

    The Dynamic Disnix approach is more powerful and not tied to NixOS specifically. Any infrastructure deployment approach (e.g. Norton Ghost for Windows machines) that includes the deployment of the Disnix service and container services can be used. Unfortunately, the Dynamic Disnix framework is still a rough prototype and needs to become more mature.

Is service deployment useful?

Some people have asked me: "Is service deployment really needed?", since it is also possible to deploy services as part of a machine's configuration.

In my opinion it depends on the kinds of systems that you want to deploy and the problems you want to solve. For some kinds of distributed systems, Disnix is not really helpful. For example, if you want to deploy a cluster of DBMSes that are specifically tuned for the underlying hardware, you cannot really make a decomposition into "distributable units" that can be deployed independently. The same applies to filesystem services, as shown in the Trac example -- doing an NFS mount is a deployment operation, but not really an independent unit of deployment.

As a sidenote: that does not imply that you cannot do such things with Disnix. With Disnix you could still encapsulate an entire (or partial) machine-specific configuration as a service and deploy that, or do a network mount by deploying a script performing the mount, but that defeats its purpose.

At the same time, service-oriented systems can also be deployed on infrastructure level, but this sometimes leads to a number of inconveniences and issues. Let me illustrate that by giving an example:

The above picture reflects the architecture of one of the toy systems (Staff Tracker Java version) I have created for Disnix for demonstration purposes. The architecture consists of three layers:

  • Each component in the upper layer is a MySQL database storing certain kinds of information.
  • The middle layer encapsulates web services (implemented as Java web applications) responsible for retrieving and modifying data stored in the databases. An exception is the GeolocationService, which retrieves data by other means.
  • The bottom layer contains a Java web application that is consulted by end-users.

Each component in the above architecture is a distributable service and each arrow denotes dependency relationships between them which manifest themselves as HTTP and TCP connections. Because components are distributable, we could, for example, deploy all of them to one single machine, but we can also run each of them on a separate machine.
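
In Disnix, such a mapping of services to machines is captured in a distribution model. A sketch for the example architecture could look as follows (the machine names test1 and test2 are hypothetical):

  {infrastructure}:

  {
    StaffTracker = [ infrastructure.test1 ];
    GeolocationService = [ infrastructure.test2 ];
    ...
  }

Changing the distribution then simply amounts to adapting this model and running the Disnix deployment command again.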

If we want to deploy the example system on infrastructure level, we may end up composing a networked machine configuration that looks as follows:

The above picture shows a deployment scenario in which the services are divided over two machines in a network:

  • The MySQL databases are hosted inside a MySQL DBMS running on the first machine.
  • The web application front-end and one of the web services granting access to the databases are deployed inside the Apache Tomcat Servlet container running on the first machine.
  • The remaining web services are deployed inside an Apache Tomcat container running on the second machine.

When we use NixOps to deploy the above machine configurations, then the entire machine configurations are deployed and activated as a whole, which has a number of implications. For example, the containers have indirectly become dependent on each other, as can be seen in the picture below, in which I have translated the dependencies from service level to container level:

In principle, Apache Tomcat does not depend on MySQL, so under normal circumstances these containers can be activated in any order. However, because we host a Java web application that requires a database, the order in which these services are activated suddenly does matter. If we activate them in the wrong order, then the web service (and indirectly also the web application front-end) will not work. (In extreme cases: if a system has been poorly developed, it may even crash and need to be manually reactivated!)

Moreover, there is another implication -- the web application front-end also depends on services that are deployed to the second machine, and two of these services require access to databases deployed to the first machine. On container level, you could clearly see that this situation leads to two machines having a cyclic dependency on each other. That means that you cannot solve activation problems by translating service-level dependencies to machine-level dependencies.

As a matter of fact: NixOps allows cyclic dependencies between machines and activates their configurations in arbitrary order, and is thus incapable of dealing with temporary or permanent inconsistency issues (because of broken dependencies) while deploying a system such as shown in the example.

Another practical issue with deploying such a system on infrastructure level is that it is tedious to do redeployment, for example when a requirement changes. You need to adapt machine configurations as a whole -- you cannot easily specify a different distribution scenario for services to machines.

As a final note, in some organisations (including a company that I have worked for in the past) it is common practice that infrastructure and service deployment are separated. For example, one department is responsible for providing machines and system services, and another department (typically a development group) is responsible for building the services and deploying them to the provided machines.


Conclusion

In this blog post, I have described NixOps and elaborated on the differences between NixOps and Disnix -- the former tool does infrastructure deployment, while the latter does service deployment. Infrastructure deployment is a prerequisite of doing service deployment and both tools can actually be combined to automate both concerns.

Service deployment is particularly useful for distributed systems that can be decomposed into "distributable units" (such as service-oriented systems), but not all kinds of distributed systems.

Moreover, NixOps is a tool that has been specifically designed to deploy NixOS configurations, while Disnix can deploy services to machines running any operating system capable of running the Nix package manager.

Finally, I have been trying to answer three questions, which I mentioned somewhere in the middle of this blog post. There is another question I have intentionally avoided that obviously needs an answer as well! I will elaborate more on this in the next blog post.


More information about Disnix, service and infrastructure deployment, and in particular: integrating deployment concerns can be found in my PhD thesis.

Interestingly enough, during my PhD thesis defence there was also a question about the difference between service and infrastructure deployment. This blog post is a more elaborate version of the answer I gave earlier. :-)

Saturday, February 7, 2015

A sales pitch explanation of NixOS

Exactly one week ago, I visited FOSDEM for the seventh time. In this year's edition, we had a NixOS stand to promote NixOS and its related sub-projects. Promoting NixOS is a bit of a challenge, because properly explaining its underlying concepts (the Nix package manager) and their benefits is often not that straightforward.

Explaining Nix

Earlier, I have written two recipes explaining the Nix package manager, each having its pros and cons. The first recipe basically explains Nix from a system administrator's perspective -- it starts by explaining what the disadvantages of conventional approaches are and then what Nix does differently: namely, storing packages in isolation in separate directories in the Nix store, using hash codes as prefixes. Usually when I show this to people, there is always a justification process involved, because these hash codes look weird and counter-intuitive. Sometimes it still works out despite the confusion, sometimes it does not.

The other recipe explains Nix from a programming language perspective, since Nix borrows its underlying concepts from purely functional programming languages. In this explanation recipe, I first explain in what way purely functional programming languages are different compared to conventional programming languages. Then I draw an analogy to package managers. I find this the better explanation recipe, because the means used to make Nix purely functional (e.g. using hash codes) make sense in this context. The only drawback is that a large subset of the people using package managers are often not programmers and typically do not understand nor appreciate the programming language aspect.

To summarize: advertising the Nix concepts is tough. While I was at the NixOS stand, I had to convince people passing by in just a few minutes that it is worth giving NixOS (or any of its sub-projects) a try. In the following section, I will transcribe my "sales pitch explanation" of NixOS.

The pitch

NixOS is a Linux distribution built around the Nix package manager solving package and configuration management problems in its own unique way. When installing systems running Linux distributions by conventional means, it is common to perform activities such as installing the distribution itself, installing additional custom packages, modifying configuration files and so on, which is often a tedious, time-consuming and error-prone process.

In NixOS the deployment of an entire system is fully automated. Deployment is driven by a declarative configuration file capturing the desired properties of a system, such as the hard drive partitions, services to run (such as OpenSSH, the Apache webserver), the desktop (e.g. KDE, GNOME or Xfce) and end-user packages (e.g. Emacs and Mozilla Firefox). With just one single command-line instruction, an entire system configuration can be deployed. By adapting the declarative configuration file and running the same command-line instruction again, an existing configuration can be upgraded.
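
To give an impression, such a declarative configuration could look as follows (a simplified sketch using standard NixOS options -- exact option names may vary between NixOS releases):

  { pkgs, ... }:

  {
    boot.loader.grub.device = "/dev/sda";

    services.openssh.enable = true;
    services.httpd.enable = true;
    services.httpd.adminAddr = "admin@example.org";

    environment.systemPackages = [ pkgs.emacs pkgs.firefox ];
  }

The single command-line instruction that deploys (and later upgrades) the above configuration is:

$ nixos-rebuild switch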

NixOS has a couple of other nice properties as well. Upgrading is always safe, so there is no reason to be worried that an interruption will break a system. Moreover, older system configurations are retained by default, and if an upgrade, for example, makes a system unbootable, you can always switch back to any available older configuration. Also configurations can be reproduced on any system by simply providing the declarative configuration file to someone else.
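
Switching back to the previous configuration can be done from the boot menu, or with a single command (assuming an older generation is still available):

$ nixos-rebuild switch --rollback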

Several tools in the Nix project extend this deployment approach to other areas: NixOps can be used to deploy a network of NixOS machines in the cloud, Hydra is the Nix-based continuous integration server, and Disnix deploys services into a network of machines. Furthermore, the Nix package manager -- that serves as the basis for all of these tools -- can also be used on any Linux distribution and a few other operating systems as well, such as Mac OS X.

Concluding remarks

The above pitch does not reveal much about the technical aspects, but simply focuses on NixOS' key aspect -- fully automated deployment and some powerful quality properties. This often leads to more questions from people passing by, but I consider that a good thing.

This year's FOSDEM was a very nice experience. I'd like to thank all the fellow Nixers who did all the organisation work for the stand. As a matter of fact, apart from doing some promotion work at the stand, I was not involved in any of its organizational aspects. Besides having a stand to promote our project, Nicolas Pierron gave a talk about NixOS in the distributions devroom. I also enjoyed Larry Wall's talk about Perl 6 very much.

I'm looking forward to seeing what next year's FOSDEM will bring us!

Thursday, January 29, 2015

Agile software development: my experiences

In a couple of older blog posts, I have reported on my experiences with companies, such as the people I talked to at the LAC conference and my job searching month. One of the things I have noticed is that nearly all of them were doing "Agile software development", or at least claim to do so.

At the LAC conference, Agile seemed to be one of the hottest buzzwords and every company had its own success story in which they explained how much Agile methodologies have improved their business and the quality of the systems that they deliver.

The most popular Agile software development methodology nowadays is probably Scrum. All the companies that I visited in my job searching month claimed that they have implemented it in their organisation. In fact, I haven't recently seen any company that is intentionally not using Scrum or any other Agile software development methodology.

Although many companies claim to be Agile, I still have the impression that the quality of software systems and the ability to deliver software in time and within the budget haven't improved that much in general, although there are some exceptions, of course.

What is Agile?

I'm not an expert in Agile software development. One of the first things I wanted to discover is what "Agile" actually means. My impression is that only a few people have an exact idea, especially non-native English speakers, such as people living in my country -- the Netherlands. To me, it looks like most developers with a Dutch mother tongue use this buzzword as if it's something as common as ordering a hamburger in a restaurant, without consciously thinking about its meaning.

According to the Merriam Webster dictionary, Agile means:
: marked by ready ability to move with quick easy grace <an agile dancer>
: having a quick resourceful and adaptable character <an agile mind>

The above definition is a bit abstract, but contains a number of interesting keywords. To me it looks like if some person or object is Agile, then it has a combination of the following characteristics: quick, easy, resourceful, and adaptable.

Why do we want/need to be Agile in software development?

It's generally known that many software development projects partially or completely fail because of many reasons, such as:
  • The resulting system is not delivered in time or cannot be delivered at all.
  • The resulting system does not do what the customer expects, a.k.a. a mismatch of expectations. Apart from customers, this also happens internally in a development team -- developers may implement something totally different from what a designer intended.
  • There is a significant lack of quality, such as in performance or security.
  • The overall project costs (way) too much.

These issues are caused by many factors, such as:

  • Wrong estimations. It is difficult (or sometimes impossible) to estimate how much time something will take to implement. For example, I have encountered a few cases in my past career in which something took double or even ten times the amount of time that was originally estimated.
  • Unclarity. Sometimes a developer thinks he has a good understanding of what a particular feature should look like, but after implementing it, it turns out that many requirements were incorrectly interpreted (or sometimes even overlooked), requiring many revisions and extra development time.
  • Interaction problems among team members. For example, one particular developer cannot complete his task because of a dependency on another developer's task which has not been completed yet. Also, there could be a mismatch of expectations among team members. For example, a missing feature that has been overlooked by one developer blocking another developer.
  • Changing requirements/conditions. In highly competitive environments, a competitor may implement missing features that a certain class of customers want, making it impossible to sell your product. Another example could be Apple changing its submission requirements for the Apple App Store, making it impossible to distribute an application to iPhone/iPad users unless the new requirements have been met.
  • Unpredictable incidents and problems. For example, a team member gets sick and is unavailable for a while. The weather conditions are so bad (e.g. lots of snowfall) that people can't make it to the office. A production server breaks down and needs to be replaced by a new instance forcing the organisation to invest money to buy a new one and time to get it configured.
  • Lack of resources. There is not enough manpower to do the job. Specialized knowledge is missing. A server application requires much more system resources than expected, e.g. more RAM, more diskspace etc.

Ideally, in a software development project, these problems should be prevented. However, since this ideal is hard to achieve, it is also highly desirable to be able to respond to them as quickly as possible without too much effort, to prevent the corresponding problems to grow out of hand. That is why being Agile (e.g. quick, easy, resourceful, and adaptable) in software development is often not only wanted, but also necessary, in my opinion.

Agile manifesto

The "definition of Agile" has been "translated" to software development by a group of practitioners, into something that is known as the Agile manifesto. This manifesto states the following:

We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more.

The Agile manifesto looks very interesting, but when I compare it to the definition of Agile provided by the Merriam Webster dictionary, I don't see any of its characterizing keywords (such as adaptable and easy) in the text at all, which looks quite funny to me. The only piece that has some kind of connection is Responding to change (that has a connection to adaptable), but that is pretty much everything I can see that it has in common.


This observation makes me wonder: How is the Agile manifesto going to help us to become more agile in software development and more importantly, how should we interpret it?

Because it states that the items on the left have more value than the items on the right, I have seen many people considering the right items not to be relevant at all. As a consequence, I have seen the following things happen in practice:

  • Not thinking about a process. For example, in one of my past projects, it was common to create Git branches for all kinds of weirdly related tasks in an unstructured manner because that nicely integrated with the issue tracker system. Furthermore, merging was also done at unpredictable moments. As a consequence, it often came together with painful merge conflicts that were hard to resolve making the integration process tedious and much more time consuming than necessary.
  • Not documenting anything at all. This is not about writing down every detail per se, but rather about documenting a system from a high level perspective to make the basics clear to everyone involved in a development process, such as a requirements document.

    I have been involved in quite a few projects in which we just started implementing something without writing anything down at all and "trusted" that it would eventually turn out right. So far, it always took us many more iterations than if most of the simple, basic details had been clear from the beginning. For example, some basic domain knowledge that may sound obvious may turn out not to be that obvious at all.
  • Not having an agreement with the customer. Some of the companies I worked for did integration with third party (e.g. customer's) systems. What, for example, if you are developing the front-end and some error occurs because of a bug in the customer's system? Who's going to get blamed? Typically, it's you unless you can prove otherwise. Moreover, unclear communication may also result in wrong expectations typically extending the development time.
  • Not having a plan at all. Of course being flexible with regard to changes is good, but sometimes you also have to stick yourself to something, because it might completely alter the current project's objectives otherwise. A right balance must be found between these two, or you might end up in a situation like this. It also happened to me a few times when I was developing web sites as a teenager.

From my perspective, the Agile manifesto does not say that the emphasis should lie on the left items only. In fact, I think the right items are also still important. However, in situations where things are unclear or when pressure arises, then the item on the left should take precedence. I'm not sure if this is something the authors of the manifesto have intended to communicate though.

For example, while developing a certain aspect of a system, it would still make sense to me to write their corresponding requirements down so that everyone involved knows about it. However, writing every possible detail down often does not make sense because they are typically not known or subject to change anyway. In these kind of situations, it would be better to proceed working on an implementation, validate that with the stakeholders and refine the requirements later.

Same thing, for example, applies to customer collaboration (in my opinion). An agreement should be made, but of course, there are always unforeseen things that both parties did not know of. In such situations it is good to be flexible, but it should not come at any price.

Why agile?

What exactly is Agile about finding the right balance between these items? I think in ideal situations, having a formalized process that exactly describes the procedures, documentation that captures everything, a solid contract that does not have to be changed, and a plan that you know works, is the quickest and easiest path to get software implemented.

However, since unpredictable and unforeseen things always happen, these might get in your way and you have to be flexible. In such cases, you must be adaptable by giving the items on the left precedence. I don't see, however, what's resourceful about all of this. :-)

So is this manifesto covering enough to consider software development "Agile" if it is done properly? Not everybody agrees! For example, there is also the More Agile Manifesto that covers organisations, not teams. Kent Beck, one of the signatories of the Agile manifesto, wrote an evolved version. Zed Shaw considers it all to be nonsense and simply says that people should do programming and nothing should get in their way.

I'm not really a strong believer in anything. I want to focus on facts, rather than on idealism.


Scrum

As I have explained earlier, nearly all the companies that I visited during my job searching month as well as my current employer have implemented (or claim to have implemented) Scrum in their organisation. According to the Scrum guide, Scrum is actually not a methodology, but rather a process framework.

In a process implementing Scrum, development is iterative and divided into so-called sprints (typically taking 2-4 weeks). At the end of each sprint, an increment is delivered that is considered "done". Each sprint has the following activities:

  • The Sprint planning is held at the beginning of each sprint in which the team discusses how and when to implement certain items from the product backlog.
  • Daily scrum is a short meeting held at the beginning of every development day in which team members briefly discuss the progress made the day before, the work that needs to be done the next 24 hours and any potential problems.
  • The Sprint review activity is held at the end of the sprint, in which the team demonstrates what has been done and stakeholders review and reflect on it. Furthermore, future goals are set during this meeting.
  • Finally, the Sprint retrospective meeting is held in which team members discuss what can be improved with regards to people, relationships, process, and tools in future sprints.

In a Scrum process, two kinds of "lists" are used. The product backlog contains a list of items that need to be implemented to complete the product. The sprint backlog contains a list of items reflecting the work that needs to be done to deliver the increment.

Teams typically consist of 3-9 persons. Scrum only defines three kinds of team member roles, namely the product owner (responsible for maintaining and validating the product backlog), the Scrum master (who guards the process and takes away anything that blocks developers) and developers.

The Scrum guide makes no distinction between specific developer roles, because (ideally) every team member should be able to take over each other's work if needed. Moreover, teams are self-organizing meaning that it's up to the developers themselves (and nobody else) to decide who does what and how things are done.

Why agile?

I have encountered quite a few people saying "Hey, we're doing Scrum in our company, so we're Agile!", because they appear to have some sort of a process reflecting the above listed traits. This makes me wonder: How is Scrum going to help and what is so agile about it?

In my opinion, most of its aspects facilitate transparency (such as the four activities) to prevent certain things from going wrong or too much time from being wasted because of misunderstandings. It also facilitates reflection, with the purpose of adapting and optimizing the development process in future sprints.

Since Scrum only loosely defines a process, the activities defined by it (sort of) make sense to me, but it also deliberately leaves some things open. As I have mentioned earlier, a completely predictable process would be the quickest and easiest way to do software development, but since that ideal is hard to achieve because of unpredictable/unforeseen events, we need some flexibility too. We must find a balance, and that is (sort of) what Scrum does by providing a framework that still gives an adopter some degree of freedom.

A few things that came into my mind with regards to a process implementation are:

  • How to specify "items" on the product and sprint backlogs? Although the Scrum guide does not say anything on how to do this, I have seen many people using a so-called "user-story format" in which they describe items in a formalism like "As a <user role> I want to <do some activity / see something etc. >".

    From my point of view, a user story (sort of) reflects a functional or non-functional requirement or a combination of both. However, it is typically only an abstract definition of something that might not cover all relevant details. Moreover, it can also be easily misinterpreted.

    Some people have told me that writing more formal requirements (e.g. by adhering to a standard, such as the IEEE 830-1998 standard for software requirement specifications) is way too formal, too time consuming and "unagile".

    IMHO, it really depends on the context. In some projects, the resulting product only has simple requirements (that do not even have to be met fully) and in others more difficult ones. In the latter case, I think it pays off to think about requirements more thoroughly, rather than having to revise the product many times. Of course, a right balance must be found between specifying and implementing.
  • When is something considered "done"? I believe this is one of the biggest ongoing discussions within the Scrum community, because the Scrum guide intentionally leaves the meaning of this definition open to the implementer.

    Some questions that I sometimes think about are: Should something be demonstrated to stakeholders? Should it also be tested thoroughly (e.g. all automated test cases must pass and the coverage should be acceptable)? Can we simply run a prototype on a development machine or does the increment have to be deployed to a production environment?

    All these questions cannot be uniformly answered. If the sprint goal is a prototype, then the meaning of this definition is probably different than for a mission-critical product. Furthermore, accomplishing all the corresponding tasks to consider something done might be more complicated than expected, e.g. software deployment is often a more difficult problem than people think.
  • How to effectively divide work among team members and how to compose teams? If, for example, people have to work on a huge monolithic code base, then it is typically difficult to compose two teams working on it simultaneously, because they might apply conflicting changes that slow things down and may break the system. This could also happen between individual team members. To counter this, modularizing a big codebase helps, but accomplishing that is anything but trivial.
  • According to the Scrum guide, each developer is considered equal, but how can we ensure that one developer is capable of taking over another developer's work? That person needs to have the right skills and domain specific knowledge. For the latter aspect it is also important to have something documented, I guess.
  • How to respond to unpredictable events during a sprint? Should it be cancelled? Should the scope be altered?

In practice, I have not seen that many people consciously thinking about the implementation of certain aspects in a Scrum process at all. They are either too concerned with the measurable aspects of a process (e.g. is the burndown chart, that reflects the amount of work remaining, looking ok?), or with the tools that are being used (e.g. should we add another user story?).

IMHO, Scrum solves and facilitates certain things that help you to be Agile. But actually being Agile is a much broader and more difficult question to answer. Moreover, this question also needs to be continuously evaluated.


Conclusion

In this blog post, I have written about my experiences with Agile software development. I'm by no means an expert on or a believer in any Agile methodology.

In my opinion, what being agile actually means and how to accomplish it is a difficult question to answer and must be continuously evaluated. There is no catch-all solution.


I gained most of my inspiration for this blog post from the video log of my former colleague Rini van Solingen, named "Groeten uit Delft" (Greetings from Delft), which covers many Scrum and Agile related topics. I used to work in the same research group (SERG) at Delft University of Technology.

Tuesday, December 30, 2014

Fourth annual blog reflection

Today it's exactly four years ago that I started this blog, so again it's an interesting opportunity to reflect over last year's writings.

Software deployment

As usual, the majority of blog posts written this year were software deployment related. In the mobile application area, I have developed a Nix function allowing someone to build Titanium apps for iOS and Android, I revised the Nix iOS build function to use the new simulator facilities of Xcode 6, did some nice tricks to get existing APKs deployed in the Android emulator, and I described an approach allowing someone to do wireless ad-hoc distributions of iOS apps with Hydra, the Nix-based continuous integration server.

A couple of other deployment blog posts were JavaScript related. I have extended NiJS with support for asynchronous package specifications, which can be used both for compilation to Nix expressions and for standalone execution by NiJS directly. I advertised the improved version as the NiJS package manager and successor of the Nix package manager on April Fools Day. I received lots of hilarious comments that day! Some of them included thoughts and comments that I could not possibly think of myself!

The other JavaScript related deployment blog post was about my reengineering effort of npm2nix that generates Nix expressions from NPM package specifications. The original author/maintainer relinquished his maintainership, and I became a co-maintainer of it.

I also did some other deployment stuff such as investigating how Nix and Hydra builds can be backed up and describing how packages can be managed outside the Nixpkgs tree.

Finally, I have managed to get a more theoretical blog post finished earlier today in which I explore some terminology and mappings between them to improve software deployment processes.

IFF file format experiments

I also spent a bit of time on my fun project involving IFF file formats. I have ported the ILBM viewer and 8SVX player applications from SDL 1.2 to 2.0. I was a bit puzzled by one particular aspect -- namely: how to continuously render 8-bit palettized surfaces, so I have decided to write a blog post about it.

Another interesting thing I did is porting the projects to Visual C++ so that they can be run on Windows natively. I wrote a blog post about a porting strategy and an improvement to the Nix build function that can be used to build Visual Studio projects.


Research

Although I have left academia, there is still something interesting to report about research this year. In the past, we have worked on a dynamic build analysis approach to discover license constraints (also covered in Chapter 10 of my PhD thesis). Unfortunately, all our paper submission attempts were rejected and eventually we gave up trying to publish it.

However, earlier in April this year, one of our peers decided to give it another shot and got Shane McIntosh on board. Shane McIntosh and I have put a considerable amount of effort into improving the paper, which we titled: "Tracing software build processes to uncover license compliance inconsistencies". We submitted the improved paper to ASE 2014. The good news is: the paper got accepted! I'm glad to find out that someone can show me that I can be wrong sometimes! :-)

Miscellaneous stuff

I also spent some time on reviving an old dormant project that helps me to consistently organise website layouts, because I had found some use for it, and on releasing it as free and open source software on GitHub.

Another blog post I'm very proud of is about structured asynchronous programming in JavaScript. From my experience with Node.js I observed that to make server applications work smoothly, you must "forget" about certain synchronous programming constructs and replace them by asynchronous alternatives. Besides the blog post, I also wrote a library implementing the abstractions.

Blog posts

As with my previous annual reflections, I will also publish the top 10 of my most frequently read blog posts:

  1. On Nix and GNU Guix. As with the previous two annual reflections, this blog post remains on top and will probably stay at that position for a long time.
  2. An alternative explanation of the Nix package manager. Also this blog post's position remains unchanged since the last two reflections.
  3. Composing FHS-compatible chroot environments with Nix (or deploying Steam in NixOS). This blog post has moved to the third position and that's probably because of the many ongoing discussions on the Internet about Nix and the FHS, and the discussion whether NixOS can run Steam.
  4. Using Nix while doing development. This post also gained a bit of more popularity since last year, but I have no idea why.
  5. Setting up a Hydra build cluster for continuous integration and testing (part 1). A blog post about Hydra from an end user perspective that still remains popular.
  6. Setting up a multi-user Nix installation on non-NixOS systems. This blog post is also over one year old and has entered the all time top 10. This clearly indicates that the instructions in the Nix manual are still unclear and this feature is wanted.
  7. Asynchronous programming with JavaScript. Another older blog post that got some exposure on some discussion sites and entered the all time top 10 as a consequence.
  8. Second computer. Still shows that the good ol' Amiga remains popular! This blog post has been in the all-time top 10 since the first annual blog reflection.
  9. Yet another blog post about Object Oriented Programming and JavaScript. Yet another older blog post that was suddenly referenced by a Stackoverflow article. As a consequence, it entered the all time top 10.
  10. Wireless ad-hoc distributions of iOS applications with Hydra. This is the only blog article I wrote this year that ended up in the all-time top 10. Why it is so popular is a mystery to me. :-)


Conclusion

I'm still not out of ideas and there will be more stuff to report about next year, so stay tuned! The remaining thing I'd like to say is: happy new year!


On the improvement of software deployment processes and some definitions

Some time ago, I wrote a blog post about techniques and lessons to improve software deployment processes. The take-home message of that blog post was that in order to improve deployment processes, you must automate everything from the very beginning of a software development process and properly decompose the process into sub units to make it more manageable and efficient.

In this blog post, I'd like to dive a bit deeper into the latter aspect by exploring some definitions of "decomposition units" in the literature and by deriving mappings between them.

Software projects

The first "definition" that I want to mention is the software project, for which I (interestingly enough) could not find anything in the literature. The reason why I start with this term is that software deployment issues often already appear in the early stages of a software development process.

The term "software project" is hard to define formally, IMHO. To me, a software project typically manifests itself as a directory of files that I can divide into the following categories:

  • Executable code. Files typically containing code implementing a program that performs computation and manipulates data.
  • Resources/data. Files not implementing anything that is executed, which are used or referenced by the program, such as images, configuration files, video, audio, HTML pages, etc.
  • Build configuration files. Configuration files used by a build system that transform or change the files belonging to the earlier two categories.

    For example, executable code is often implemented in higher level programming languages and must be compiled to object code so that the program can be executed. Also many kinds of other processing steps can be executed, such as scaling images to lower resolutions, obfuscating/minifying code, running a style checker, bundling object code and resources etc.

Sometimes it is hard to draw a hard line between executable code and data files. For example, it may be possible that a data artifact (e.g. an HTML page) includes executable code (e.g. embedded JavaScript), or the other way around, such as assembly code containing strings in its code section for efficiency.

Software projects can often be conveniently created by an Integrated Development Environment (IDE) that typically provides useful templates and automatically fills in many boilerplate settings. However, for small projects, people frequently create software projects manually, for example, by manually creating a directory of source files with a Makefile.

It is probably obvious that dealing with software deployment complexity requires automation, which means that files belonging to the third category (build configuration files) must be provided. Yet, I have seen quite a few projects in the past in which nothing is automated and people still rely on manually executing build tasks in an IDE, which is often tedious, time consuming and error prone.

Software modules

An automated build process of a software project provides a basic and typically faster means of (re)producing releases of a software product and is often less error prone than a manual build process.

However, besides build process automation there could still be many other issues. For example, if a software project has a monolithic build structure in which nothing can be built separately, deployment times become unnecessarily long and their configurations often have a huge maintenance complexity. Also, upgrading an existing deployment is typically difficult, expensive and unreliable.

To improve the efficiency of build processes, we need to decompose them into units that can be built separately. An important prerequisite to accomplish build decomposition is functional separation of important aspects of a software project.

A relatively simple concept supporting functional separation is the software module. According to Clemens Szyperski's "Component Software" book, a software module is a unit that has the following characteristics:

  • A module implements an ADT (Abstract Data Type).
  • It encapsulates multiple entities, often classes, but sometimes other kinds of entities, such as functions.
  • It has no concept of instantiation; in other words, there is one and only one instance of a module.

Several programming languages have a notion of modules, such as Modula-2, Ada, C# and Java (since version 9). Sometimes the module concept is named differently in these languages. For example, in Ada modules are called packages and in C# they are called assemblies.

Not all programming languages support modular programming. Sometimes external facilities must be used, such as CommonJS in JavaScript. Moreover, modules can also be "simulated" in various ways, such as with static classes or singleton objects.
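The simulation with a singleton object can be illustrated with a small JavaScript sketch (the counter example and its function names are hypothetical): the object encapsulates private state, and there is exactly one instance, just like a module.

```javascript
// Simulating a module with a singleton object (hypothetical counter example).
// The immediately invoked function encapsulates private state; only the
// returned object is visible to the outside world.
const counter = (function () {
  let count = 0; // private: cannot be accessed from outside

  return {
    increment: function () { return ++count; },
    current: function () { return count; }
  };
})();

counter.increment();
counter.increment();
console.log(counter.current()); // 2
```

As with a real module, there is no way to create a second instance of `counter` or to reach `count` directly.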

Encapsulating functionality into modules also typically imposes a certain filesystem structure for organizing the source code files. In some contexts, a module must correspond to a single file (e.g. in CommonJS) and in others to directories of files following a certain convention (e.g. in Java the names of directories should correspond to the package names, and the names of regular files to the name of the enclosing type in the code). Sometimes files belonging to a module can also be bundled into a single archive, such as a Zip container (e.g. a JAR file) or library file (e.g. *.dll or *.so files).

Refactoring a monolithic codebase into modules in a meaningful way is anything but trivial. According to the paper "On the criteria to be used in decomposing systems into modules" written by David Parnas, it is a good practice to minimize coupling between modules (i.e. the dependencies between modules should be minimized) and maximize cohesion within modules (i.e. strongly related things should belong to the same module).

Software components

The biggest benefit of modularization is that parts of the code can be effectively reused. Reuse of software assets can be improved even further by turning modules (that typically work on code level) into software components that work on system level. Clemens Szyperski's "Component Software" book says the following about them:
The characteristic properties of a component are that it:

  • is a unit of independent deployment
  • is a unit of third-party composition
  • has no (externally) observable state

The above characteristics have several implications:

  • Independent deployment means that a component is well separated from the environment and other components, never deployed partially and third parties should not require access to its construction details.
  • To allow third-party composition a component must be sufficiently self contained and have clear specifications of what it provides and what it requires. In other words, they interact with the environment with well defined interfaces.
  • No externally observable state means that no distinction can be made between multiple copies of components.

So in what way are components different from modules? From my point of view, modularization is a prerequisite for componentization, and some modules may already qualify as minimal components.

However, some notable differences between modules and components are that the former are allowed to have observable state (e.g. global variables that are imperatively modified) and dependencies on implementations rather than interfaces.

Furthermore, to implement software components standardized component models are frequently used, such as CORBA, COM, EJB, or web services (e.g. SOAP, WSDL, UDDI) that provide various kinds of facilities, such as (some sort of) a platform independent interface, lookup and discovery. Modules typically use the interface facilities provided by a programming language.

Build-Level Components

Does functional separation of a monolithic codebase into modules and/or components also improve deployment? According to Merijn de Jonge's IEEE TSE paper titled "Build-Level Components", this is not necessarily true.

For example, it may still be possible that source code files implementing modules or components on a functional level are scattered across directories of source code files: between the directories in a codebase, many references may exist (strong coupling) and directories often contain too many files (weak cohesion).

According to the paper, strong coupling and weak cohesion on the build level have the following disadvantages:
  1. potentially reusable code, contained in some of the entangled modules, cannot easily be made available for reuse;
  2. the fixed nature of directory hierarchies makes it hard to add or to remove functionality;
  3. the build system will easily break when the directory structure changes, or when files are removed or renamed.

In the paper, the author shows that Component-Based Software Engineering (CBSE) principles can be applied to the build level as well. Build-Level components can be formed by directories of source files and serve as a unit of composition. Access occurs via build, configuration, and requires interfaces:

  • The build interface defines which build operations to execute. In a GNU Autotools project following the GNU Coding Standards (used in the paper), these operations correspond to a number of standardized make targets, e.g. make all, make install, make dist.
  • The configuration interface defines which variability points and parameters can be enabled or disabled. In a GNU Autotools project, this interface corresponds to the --enable-foo and --disable-foo parameters passed to the configure script -- each enable or disable parameter defines a certain feature that can be enabled or disabled.
  • The requires interface can be used to bind dependencies to components. In a GNU Autotools project, this interface corresponds to the --with-foo and --without-foo parameters passed to the configure script, which take the paths to the corresponding dependencies as values, allowing the configure script to find them.

Although the paper only uses GNU Autotools-based projects for implementation purposes, build-level components are not restricted to any particular build technology -- the only thing that matters is that the operations for these three interfaces are standardized so that any component can be configured, composed, and built uniformly.
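For a hypothetical build-level component, exercising the three interfaces in a GNU Autotools project could look as follows (the --enable-gui and --with-libbar flags and the paths are made-up examples, not taken from the paper):

```shell
# Configuration interface: enable or disable a compile-time feature.
# Requires interface: bind the libbar dependency to a concrete location.
./configure --enable-gui --with-libbar=/opt/libbar

# Build interface: the standardized make targets.
make all      # compile the component
make install  # install the build result
make dist     # create a distributable source archive
```

Because these operations are the same for every component, a composition tool can configure and build any set of components uniformly.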

The paper describes a collection of smells and some refactor patterns that need to be applied to turn directories of source files into build level components. The rules mentioned in the paper are the following:
  1. Components with directory granularity
  2. Circular dependencies should be prevented
  3. Software building via standardized build interface
  4. Compile-time variability binding via standardized configuration interface
  5. Late binding of dependencies via require interface
  6. Build process definition per component
  7. Configuration process definition per component
  8. Component deployment with build-level packages
  9. Automated component composition

Software packages

As described in the previous sections, functional separation is a prerequisite to compose build level components. One important aspect of build-level components is that build processes of modules and components are separated. But how does build separation affect the overall deployment process (to which the build phase also belongs)?

Deployment processes are typically carried out by tools called package managers, which install units called software packages. According to the paper "Package Upgrades in FOSS Distributions: Details and Challenges" written by Di Cosmo et al (HotSWUp 2008), a software package can be defined as follows:
Packages are abstractions defining the granularity at which users can act (add, remove, upgrade, etc.) on available software.

According to the paper a package is typically a bundle of 3 parts:

  • Set of files. Contains all kinds of files that must be copied somewhere to the host system to make the software work, such as scripts, binaries, resources etc.
  • Set of valued meta-information. Contains various kinds of meta attributes, such as the name of the package, the version, a description and its license. Most importantly, it contains information about the inter-package relationships, which includes a set of dependencies on other packages and a set of conflicts with other packages. Package managers typically install a package's required dependencies automatically and refuse to install it if a conflict is encountered.
  • Executable configuration scripts (also known as maintainer scripts). These are basically scripts that imperatively "glue" files from the package to files already residing on the system. For example, after a certain package has been installed, some configuration files of the host system are automatically adapted so that it can be used properly.
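As an illustration of the meta-information part, it might look like the following sketch in the style of a Debian control file (the package name, version and relationships are made up):

```
Package: hello
Version: 1.0-1
Depends: libc6 (>= 2.19)
Conflicts: hello-classic
Description: An example package printing a friendly greeting
```

From the Depends and Conflicts fields, the package manager derives which other packages to install automatically and which combinations to refuse.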

Getting a software project packaged typically involves defining the meta data (including the dependencies/conflicts on external packages), bundling the build process (for source package managers) or the resulting build artifacts (for binary package managers), and composing maintainer scripts taking care of the remaining bits to make the package work (although I would personally not recommend using these kinds of scripts).

This process already works for big monolithic software projects. However, it has several drawbacks for these kinds of projects. Since it needs to deploy a big project as a whole, deployment is typically an expensive process. Not only does a fresh installation of a package take time, but so does upgrading, since it has to replace an existing installation as a whole instead of only the affected areas.

Moreover, upgrading is also quite dangerous. Many package managers typically replace and remove files belonging to a package that reside in global locations on the filesystem, such as /usr/bin, /usr/lib (on Linux) or C:\WINDOWS\SYSTEM32 (on Windows). If an upgrade process gets interrupted, the system might reach an inconsistent state for which it might be difficult (or impossible) to do a rollback. The bigger a project is the more severe the potential damage becomes.

Packaging smaller units of a software project (e.g. a build-level component) is typically more work, but also has great benefits. It allows certain smaller pieces of a software project to be replaced separately, significantly increasing the efficiency and reliability of upgrades. Moreover, the dependencies of software components and build-level components have already been identified and only need to be translated to the corresponding packages that provide them.

Nix packages

I typically use the Nix package manager (and related tools) for deployment activities. It borrows concepts from purely functional programming languages to make deployment reliable, reproducible and efficient.

In what way do packages deployed by Nix conform to the definition of software package shown earlier?

Deployment in Nix is driven by build recipes (called Nix expressions) that build packages, including all their dependencies, from source. Every package build (indirectly) invokes the derivation {} function that composes an isolated environment in which builds are executed in such a way that only the declared dependencies can be found and nothing else can influence the build. The function arguments include package metadata, such as a description, license and maintainer, as well as the package dependencies.
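A minimal sketch of such a Nix expression may look as follows. stdenv.mkDerivation (a convenience wrapper around derivation {}) and the meta attributes follow Nixpkgs conventions, while the package name, URL and hash are made up:

```nix
{ stdenv, fetchurl, libfoo }:

stdenv.mkDerivation {
  name = "hello-1.0";

  # The source tarball is verified against a fixed hash (made up here)
  src = fetchurl {
    url = "http://example.org/hello-1.0.tar.gz";
    sha256 = "0000000000000000000000000000000000000000000000000000";
  };

  # Only the declared dependencies are visible inside the isolated build
  buildInputs = [ libfoo ];

  meta = {
    description = "An example package";
    license = "MIT";
  };
}
```

The function header makes the dependencies (libfoo and the builder functions themselves) explicit; a caller binds them to concrete builds of other Nix packages.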

References to dependencies in Nix are exact, meaning that they bind to specific builds of other Nix packages. Conventional package managers, software components and build-level components typically use nominal version specifications consisting of the names and version numbers of the packages, which are less strict. Mapping nominal dependencies to exact dependencies is not always trivial. For example, nominal version ranges are unsupported in Nix and must be snapshotted. An earlier blog post that describes how to deploy NPM packages with Nix has more details about this.

Another notable trait of Nix is that it has no notion of conflicts. In Nix, any package can coexist with another, because they are all stored in isolated directories. However, a conflict specification may also indicate a runtime conflict between two packages; these kinds of issues need to be solved by other means.

Finally, Nix packages have no configuration (or maintainer) scripts, because they imperatively modify the system's state, which conflicts with Nix's underlying purely functional deployment model. Many things that configuration scripts typically do are accomplished differently if Nix is used for deployment. For example, configuration files are not adapted, but generated in a Nix expression and deployed as a Nix package. Service activation is typically done by generating a job description file (e.g. an init script or systemd job) that starts and stops it.

NixOS configurations, Disnix configurations, Hydra jobsets

If something is packaged in a Nix expression you could easily broaden the application area of deployment:

  • With a few small modifications (mainly encapsulating several packages into a jobset), a Nix package can be turned into a Hydra jobset, so that a project can be integrated and tested continuously.
  • A package can be referenced from a NixOS module that, for example, automatically starts and stops a package on startup and shutdown. NixOS can be used to deploy entire system configurations from a single declarative specification in which the module is enabled.
  • A collection of NixOS configurations can also be deployed in a network of physical or virtual machines through NixOps.
  • A package can be turned into a service by adding a specification of inter-dependencies (services that may reside on other machines in a network). These services can be used to compose a Disnix configuration that deploys services to machines in a network.
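For example, a NixOS configuration that references a package and enables a service module could be sketched as follows (services.openssh is a real NixOS module; the choice of packages is illustrative):

```nix
{ pkgs, ... }:

{
  # Reference Nix packages to be installed in the system profile
  environment.systemPackages = [ pkgs.hello ];

  # Enable a module that generates a systemd job starting and
  # stopping the corresponding service on boot and shutdown
  services.openssh.enable = true;
}
```

The same declarative specification can then be fed to NixOps to deploy a network of such machines, or extended with inter-dependencies for Disnix.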


I can summarize all the terms described in this blog post and the activities that need to be performed to implement them in the following chart:

Concluding remarks

In this blog post, I have described some terminology and potential mappings between them with the purpose of defining a reengineering process that makes deployment processes more manageable and efficient.

The terms and mappings used in this blog post are quite abstract. However, if we make a number of concrete technology choices, e.g. a programming language (Java), component technology (web services), package manager (Nix), we can define a more concrete process allowing someone to make considerable improvements.

Moreover, the terms described in this blog post are idealistic. In practice, most units that are called modules or components do not fully qualify themselves as such, while it is still possible to package and deploy them individually. Perhaps, it would also be useful to make "weaker" definitions of some of the terms described in this blog post and to look for their corresponding minimum requirements.

Finally, we can also look into more refactor/reengineering patterns for the other terms and possible automation of them.