Presentation on AWS VPC technology

I recently prepared slides for a presentation on AWS VPC technology. The presentation is attached.

Create AMI/Image from scratch for EC2/Xen

This blog captures the steps required to create an image from scratch which can be used on the Xen virtualization platform using the PvGrub boot manager. The latter half of the blog also highlights the steps to convert this image into an EC2 AMI, which can then be used to boot an EC2 instance.

The instructions below have been tested on Ubuntu 13.10 and build an Ubuntu 13.10 image/AMI.

1) Create image file and mount it

# creating 1GB image here
# image size should be bigger than disk required for AMI
# size of root device is chosen when launching instance and not here
dd if=/dev/zero of=linux.img bs=1M count=1024
sudo losetup /dev/loop0 linux.img
sudo mkfs.ext4 /dev/loop0
sudo mount /dev/loop0 /mnt

2) Install base system

sudo apt-get -y install debootstrap
# Installing 64-bit Ubuntu saucy (13.10) here
# Modify for your use case
sudo debootstrap --arch=amd64 saucy /mnt

3) Chroot into new installed system to configure it

sudo chroot /mnt

4) Configure basic system

mount none /proc -t proc
mount none /sys -t sysfs
# Adding root mount point
cat << EOF > /etc/fstab
/dev/xvda1   /       ext4    defaults        0   1
EOF
# Adding saucy specific apt sources here
cat << EOF > /etc/apt/sources.list
deb saucy main
deb-src saucy main
deb saucy-updates main
deb-src saucy-updates main
deb saucy-security main
deb-src saucy-security main
EOF
# setting eth0 for dhcp
cat << EOF >> /etc/network/interfaces
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet dhcp
EOF
apt-get update
# installing ssh-server for headless setup
apt-get -y install openssh-server

5) Setup kernel and system to boot - This was the trickiest part of the setup to learn and get right. PvGrub essentially reads the /boot/grub/menu.lst file and finds the kernel and initrd information in it. There is no need to create a boot record, boot sectors or anything similar when booting in Xen with the PvGrub manager. However, one does have control over which kernel will be running, and a kernel needs to be installed.

# install linux kernel
# don't install grub on any disk/device when prompted
# command requires manual prompt, haven't scripted next step
apt-get -y install linux-image-virtual
# remove grub which was installed during kernel install
# choose to remove grub from /boot/grub when prompted
# command requires manual prompt, haven't scripted next step
apt-get -y purge grub2 grub-pc grub-common
# install grub-legacy-ec2 which is NOT an ec2 specific package
# this package creates /boot/grub/menu.lst file
# this package applies to all PvGrub guests, even outside ec2
apt-get -y install grub-legacy-ec2

6) Do custom configuration

# set up root password
# or create users with sudo access

7) unmount filesystems & loop devices

# exit out of chroot /mnt
exit
sudo umount /mnt/sys
sudo umount /mnt/proc
# if umount fails, you might have to force it with umount -l
# umount will fail if any daemon processes were started in chroot
sudo umount /mnt
# losetup will fail while daemon processes are still running
# kill any chroot daemon processes if needed
sudo losetup -d /dev/loop0

Your new linux.img file is ready for use. The next set of commands is specific to converting this image file into an AMI which can be used on EC2.

a) Transfer the image to a running EC2 instance

b) Attach a new EBS volume to the EC2 instance. The size of the EBS volume should be at least the size of the image file chosen at step (1).

c) Copy image file contents into EBS volume

# assuming that the empty EBS drive is on /dev/xvdf
sudo dd if=linux.img of=/dev/xvdf bs=1M

All the steps listed below can also be done in AWS console

d) Create a snapshot of the EBS volume

ec2addsnap -O AWS_KEY -W AWS_SECRET -d `date +"%Y%m%d%H%M%Z"` VOLUME_ID

e) Register a new AMI from the snapshot

# Find the latest PvGrub kernel offered by Amazon
# hd0 kernels are for unpartitioned images (like the one built above)
# hd00 kernels expect an image with a partition table
# choose 32-bit kernel or 64 bit kernel based on image
# kernel-id looks like "aki-919dcaf8" (1.04-x86_64)
ec2-describe-images -O AWS_KEY -W AWS_SECRET -o amazon --filter "name=pv-grub-*.gz"
# registering a 64 bit AMI here
ec2reg -O AWS_KEY -W AWS_SECRET --kernel KERNEL_ID -a x86_64 -n `date +"%Y%m%d%H%M%Z"` -b "/dev/sda1=SNAPSHOT_ID"

f) Test your new AMI by starting an instance with it


Secure data at rest with encryption

Encryption is critical to protecting data. Data should be encrypted both at rest and in transit. Data in transit can be encrypted using TLS/HTTPS. This blog talks about storing data securely using encryption. The solution listed below uses dm-crypt kernel module.

1) First, we will randomize the data on the disk to ensure that someone who gets access to the actual disk is unable to determine the length/size of the encrypted data. The disk sectors can be randomized by overwriting them with random data. This step is important and shouldn’t be overlooked. Without randomization of device blocks, someone can assert that the disk is encrypted. Once the disk sectors have been randomized, you get plausibly deniable encryption.

Your disk might have been initialized with a specific pattern before you got it (say with zeros). Once you start using the disk, the encrypted regions stand out against that pattern. However, if you randomize the disk space before storing data, no one will be able to tell the difference between encrypted/filled space and randomized/empty space.


dd if=/dev/urandom of=/dev/sdb bs=1M

2) First decision related to encryption is the selection of the algorithm and the key length. For this setup, we will use AES with 256 bit key.

In the next step, we generate a 256 bit (32 byte) key from /dev/random. We are using /dev/random instead of /dev/urandom for more security. However, on headless servers, random data generation can be slow and hence the next command may take some time.

head -c 32 /dev/random > 256bitKey

The key generated here should be kept secure, safe and away from the computer where the encrypted disk is mounted. If this key is lost, there is no way to recover the data stored on the encrypted disk.
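
For readers who script this step, here is a minimal Python sketch of the same idea. It is an illustration under assumptions, not part of the original setup: the function name generate_key is hypothetical, and the key comes from the operating system's CSPRNG via the secrets module rather than from the blocking /dev/random used above. The file is created with owner-only permissions, and an existing key file is never overwritten.

```python
import os
import secrets
import tempfile


def generate_key(path, num_bytes=32):
    """Write num_bytes of OS-provided randomness to path (hypothetical helper)."""
    key = secrets.token_bytes(num_bytes)
    # O_EXCL refuses to overwrite an existing key file; 0o600 keeps it owner-only
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(key)
    return key


if __name__ == "__main__":
    # demo only: write the key into a throwaway directory
    demo_path = os.path.join(tempfile.mkdtemp(), "256bitKey")
    generate_key(demo_path)
    print(os.path.getsize(demo_path) * 8, "bits written to", demo_path)
```

Refusing to overwrite is a deliberate choice here: replacing the key file by accident would make the encrypted disk unrecoverable.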

3) We will use the key generated and create a mapped device such that data is encrypted when stored on /dev/sdb but decrypted when viewed from /dev/mapper/secretFs

sudo cryptsetup -c aes-cbc-essiv:sha256 --key-file 256bitKey create secretFs /dev/sdb

The mapped device can be used like any disk device and can be used directly or as part of LVM etc.

4) Let’s format the device and create an ext4 filesystem on it

sudo mkfs.ext4 /dev/mapper/secretFs

5) The formatted disk can now be mounted so that the applications can write data to the disk

sudo mount /dev/mapper/secretFs /mnt/secure


Here are the consolidated steps to prepare a new disk, set up encryption and mount it

dd if=/dev/urandom of=/dev/sdb bs=1M
head -c 32 /dev/random > 256bitKey
sudo cryptsetup -c aes-cbc-essiv:sha256 --key-file 256bitKey create secretFs /dev/sdb
sudo mkfs.ext4 /dev/mapper/secretFs
sudo mount /dev/mapper/secretFs /mnt/secure

If you reboot the server or unmount the disk, the disk can be re-mounted using the following steps

sudo cryptsetup -c aes-cbc-essiv:sha256 --key-file 256bitKey create secretFs /dev/sdb
sudo mount /dev/mapper/secretFs /mnt/secure

SSH Host Identification and Verification

When connecting to an SSH server, you have often come across a prompt which asks you to confirm the host fingerprint before connecting. Many users have picked up the bad habit of saying yes without understanding the implications of the decision. This blog highlights how an ssh setup can be strengthened to include host identification (similar to HTTPS certificate signing). The benefit of such a setup is that the client can be confident that it is connecting to the right server and not becoming the victim of a MITM (man in the middle) attack.

Often when connecting to a SSH server, you would see a prompt like the following

user@client:~$ ssh user@remoteserver
The authenticity of host 'remoteserver (' can't be established.
ECDSA key fingerprint is dd:30:96:8a:46:78:76:0a:53:7d:9d:0d:23:d6:89:ce.
Are you sure you want to continue connecting (yes/no)?

The presence of the prompt indicates the client can’t confirm the authenticity of the host. This problem can be solved by creating a certificate authority, which will sign the host keys (similar to CA concept in HTTPS). A certificate authority can be created for ssh using the following

user@ca-server:~$ ssh-keygen -f ca

The step above will create 2 files, ca and ca.pub. The contents of file ca are private and should be kept secret. The contents of ca.pub are public and can be distributed freely.

As a part of creating a new server, each server should have a unique set of host identification keys. The keys are usually generated automatically as a part of the install. If not, such keys can be generated using the following steps.

root@ssh-server:~$ ssh-keygen -t rsa -q -N '' -f /etc/ssh/ssh_host_rsa_key
root@ssh-server:~$ ssh-keygen -t dsa -q -N '' -f /etc/ssh/ssh_host_dsa_key
root@ssh-server:~$ ssh-keygen -t ecdsa -q -N '' -f /etc/ssh/ssh_host_ecdsa_key

The above step generates private/public key pair for rsa, dsa and ecdsa algorithms. The public host keys can be copied over to ca-server for signing. ca-server is the server where ca private key is stored securely. The CA can sign the host keys using ca private key with the following command

user@ca-server:~$ ssh-keygen -s ca -I "remoteserver" -n remoteserver,remoteserver.domain, -h

The command signs the host key so that it can represent the server under the names listed with -n (remoteserver, remoteserver.domain and so on). The certificate will not match if it is presented for any other hostname. The command generates certificate files (ssh-keygen names each one after the signed public key, with a -cert.pub suffix) which should be copied back to the ssh server. Further, the sshd configuration (usually /etc/ssh/sshd_config) should be modified to present HostCertificates to the client during the ssh handshake. The SSH configuration file on the ssh server looks like the following

user@ssh-server:~$ cat /etc/ssh/sshd_config
# ssh daemon configuration
HostKey /etc/ssh/ssh_host_rsa_key
HostKey /etc/ssh/ssh_host_dsa_key
HostKey /etc/ssh/ssh_host_ecdsa_key
# new lines added
HostCertificate /etc/ssh/
HostCertificate /etc/ssh/
HostCertificate /etc/ssh/
# further SSH configuration follows ...

Now the ssh server has its keys signed by the CA. In the case of HTTPS, we have certain root certs which the browser trusts by default. However, there are no root certs in the case of SSH. The SSH client needs to be explicitly configured to trust the CA. All ssh clients can be configured to trust the CA by putting the CA public key in the SSH known hosts (/etc/ssh/ssh_known_hosts) configuration file

user@client:~$ cat /etc/ssh/ssh_known_hosts
@cert-authority * ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCm+Bdq1eYGvddoWPRmJ43id7MioLeyRlOPNIJeuScHGMQro6jUYU4JyKx9dpKlQrZmn+hZeDxgx4fbxAQFKfdfgaLrFX3N06dR8uAFk7g+oimNJITWnaUgOuHGJrGEKIpNUqeLboOXm5aaYkiCH1ixx4r8hVIT4J+OM66oUZZmYTwWmxkxjj2Cu+Iuil7rpNzhjz9IVEzJrQA0KdpnfGQqv2KuaAhCCq6reZMoutE60HBX1Cww7Y3O26psp2AnL+xV5BzfhWYEdt98+Bz+WR/3Mt2u3NSv/ABwHZD3qseRFcWXnJGj9PbUAWAO6klMDqk9ok1nlmT0FjLbNk/R/gfh

Once done with all the steps above, you should be able to ssh from client machine to any server without facing the ssh host identification warning. The client trusts the CA and trusts the cert presented by the host when the cert is signed by CA and the cert name matches the hostname client is trying to connect to.

Like with any CA, SSH also has provision to revoke host keys when needed. However, there is no provision for a central revocation list (like in HTTPS). Such revocation information needs to be present in all client machines. The following snippet shows how to revoke public key of a server in a client.

user@client:~$ cat /etc/ssh/ssh_known_hosts
@cert-authority * ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCm+Bdq1eYGvddoWPRmJ43id7MioLeyRlOPNIJeuScHGMQro6jUYU4JyKx9dpKlQrZmn+hZeDxgx4fbxAQFKfdfgaLrFX3N06dR8uAFk7g+oimNJITWnaUgOuHGJrGEKIpNUqeLboOXm5aaYkiCH1ixx4r8hVIT4J+OM66oUZZmYTwWmxkxjj2Cu+Iuil7rpNzhjz9IVEzJrQA0KdpnfGQqv2KuaAhCCq6reZMoutE60HBX1Cww7Y3O26psp2AnL+xV5BzfhWYEdt98+Bz+WR/3Mt2u3NSv/ABwHZD3qseRFcWXnJGj9PbUAWAO6klMDqk9ok1nlmT0FjLbNk/R/gfh
@revoked * ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBHCGEppabLQm/8J8OXzp6VNRAXX/7hXcvsLXD5apKxVT8VY9B8rB6o/1Iyw9qXuRi5k5cPfF29mNEm1XVYz9znU=

Secure SSH with Multi Factor authentication

Logging into a machine is essentially about proving your identity to the server. The human logging in needs to prove to the server that he/she is the holder of the account the session is trying to access.

Identity in general can be proved using the following methods

  • Something you have (like hardware token device, access card, public/private keys)
  • Something you know (like password, social security number, birth date)
  • Something you are (physically defines you, like fingerprint)

Most of the linux servers I deal with are headless and don’t have a biometric device attached to them to assert something you are. So, I will be focusing on the first 2 assertions.

Multi factor authentication essentially means asking the user to verify his/her identity using two different methods, like something you have and something you know.

The rest of the blog focuses on enabling various security mechanisms in a Linux server

a) Public/Private Key Authentication

Public key authentication falls under something you have, but it has a weakness. Something you have should be tamper proof and hard to copy. However, private key files can be copied, often without the user knowing that his/her key has been copied away. Still, keys offer better protection than passwords against attacks since we typically use 2048 bit keys.

A private key can be generated easily using the following linux command

ssh-keygen -f file.rsa -t rsa -b 2048 -q -N ''

The command will produce a 2048 bit private RSA key and store that in a text file (file.rsa here). The private key can be potentially protected with a passphrase by either specifying it at command (-N ‘<passphrase>’) or dropping -N parameter and specifying the passphrase as input in the tty session. The file created above is the private key and must be kept secure. The corresponding public key can be generated using

ssh-keygen -y -f file.rsa

The output is the public key which can be distributed freely. The public key is added/appended to ~<user>/.ssh/authorized_keys file on the server. This directs SSH server to accept the person in possession of the private key to log in as <user>.

Public Key Authentication can be enabled in a linux server by ensuring that the following is present in /etc/ssh/sshd_config

PubkeyAuthentication yes

b) Password Authentication

Passwords are the oldest security mechanism. However, people choose simple passwords. Even a random 8 character case sensitive alphanumeric (a-zA-Z0-9) password has only 48 bits of entropy which isn’t too hard (1 week) for a botnet to crack. However, most of us don’t use random passwords and easy to remember passwords have much less entropy.
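
The 48 bit figure can be checked with a few lines of Python. This is a small sketch, assuming every character is drawn uniformly at random from the alphabet; the function name entropy_bits is hypothetical.

```python
import math


def entropy_bits(alphabet_size, length):
    """Entropy in bits of a uniformly random string over the given alphabet."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    # a-z, A-Z and 0-9 give 26 + 26 + 10 = 62 symbols
    bits = entropy_bits(62, 8)
    print(f"8 random alphanumeric characters: {bits:.1f} bits")
    print(f"search space: about {2 ** bits:.2e} candidates")
```

Each extra random character adds about 6 bits, so length helps far more than swapping a letter for a digit.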

Standard password can be enabled in linux server by making sure that sshd config (/etc/ssh/sshd_config) contains the following

PasswordAuthentication yes

c) TOTP Authentication

A better something you have example is a TOTP hardware device. A phone running TOTP app (like Google Authenticator) is not 100% tamper proof if running on a rooted Android device or if the underlying OS has security related bugs where other application might be able to steal the secret. A hardware key is more tamper proof and is a better choice but phone running TOTP app is also good.

Time based One Time Password (TOTP) algorithm generates a rolling password every X seconds (default 30). The password is based on 2 factors - a secret key which is known to both server and user and time. Given that time is a factor, it is critical that both server and client are using a synchronized clock.
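
To make the two inputs (shared secret and time) concrete, here is a minimal Python sketch of the computation as described in RFC 6238 and RFC 4226. It is for illustration only and is not a replacement for a vetted library. The secret used is the published RFC test secret. Apps like Google Authenticator usually exchange the secret base32 encoded, so it would need decoding first.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret, counter, digits=6):
    """HMAC-based one-time password (RFC 4226) for a given counter value."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret, now=None, step=30, digits=6):
    """Time-based one-time password (RFC 6238): HOTP over the time-step counter."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now) // step, digits)


if __name__ == "__main__":
    shared_secret = b"12345678901234567890"  # RFC test secret, not for real use
    print(totp(shared_secret))
```

Because the counter is just the current time divided by the step, a server and client whose clocks differ by more than a step compute different codes, which is why clock synchronization matters.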

While TOTP is a good something you have example, it has some flaws when it comes to a company wide deployment
a) The algorithm uses a shared secret. This means that the server knows the secret and hence it must be guarded on the server. Anyone who gains root access to the server can find out the secret for all the other users on the system.
b) The one time password is good for X seconds (default 30). This makes a man in the middle (MITM) attack easy. Anyone who intercepts or sees this password can use it for the next 30 seconds.

The flaws can be fixed by having a central TOTP authentication server which all the servers connect to for TOTP validation. The central server can prevent a MITM attack by allowing only one login per verification code. This however limits the user to logging into a single server per X seconds.
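
The one login per verification code rule can be sketched as a small piece of state on the central server. The class below is hypothetical and keeps its state in memory; a real deployment would need shared, persistent storage. It assumes the code itself has already been validated elsewhere.

```python
class ReplayGuard:
    """Accepts at most one login per user and time step (illustrative sketch)."""

    def __init__(self, step=30):
        self.step = step
        self._last_step = {}  # user -> last time-step counter that was accepted

    def accept(self, user, code_is_valid, now):
        counter = int(now) // self.step
        if not code_is_valid:
            return False
        if self._last_step.get(user, -1) >= counter:
            return False  # a code for this time window was already used
        self._last_step[user] = counter
        return True


if __name__ == "__main__":
    guard = ReplayGuard()
    print(guard.accept("alice", True, now=60))  # first use in this window
    print(guard.accept("alice", True, now=75))  # same window, so rejected
    print(guard.accept("alice", True, now=90))  # next window
```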

TOTP authentication can be enabled on a linux server by running the following commands

# install pam module developed by google
# which enables TOTP validation
apt-get install libpam-google-authenticator
# Run the google authenticator command to create new secret
google-authenticator
# copy the QR code or secret to a smartphone running a TOTP app
# (like Google Authenticator)

Further, ssh pam configuration (/etc/pam.d/sshd) should be modified to add the following line

auth required pam_google_authenticator.so

This will require user to enter TOTP token (something you have) along with password (something you know)

Enabling combination of these methods

OpenSSH 6.2 added a new feature where a user can be required to pass multiple validations before a successful login. Here are some scenarios with the configuration

1) PublicKey + TOTP

It’s not a desirable configuration as both public key and TOTP fall under something you have. So, this is not multi factor authentication. Imagine a user storing both of these on a phone and losing the phone. This configuration can be enabled by doing the following

Add the following line to sshd config (/etc/ssh/sshd_config)

AuthenticationMethods publickey,keyboard-interactive

The configuration above requires user to pass public key check and keyboard-interactive check. keyboard-interactive check passes the control to pam module. SSH pam configuration file should have the following changes

# Require TOTP
auth required pam_google_authenticator.so

Default pam ssh configuration requires a password. That can be disabled by removing/commenting out the line indicated below

# @include common-auth

2) PublicKey + Password

This is a good configuration since it mixes something you have and something you know. This can be enabled by adding the following to sshd config

AuthenticationMethods publickey,password

or

AuthenticationMethods publickey,keyboard-interactive

The password authentication method is handled by sshd itself, while keyboard-interactive is handled by pam. So, in a default setting, they both appear the same to the user but they are different under the surface. keyboard-interactive enables the more complex mechanisms possible via pam modules. If the intention is to use just a password along with the public key, it’s desirable to use “AuthenticationMethods publickey,password”

3) PublicKey + Password + TOTP

All 3 authentication methods listed here can be enabled by adding the following line to sshd config file

AuthenticationMethods publickey,keyboard-interactive

Further, pam ssh module should be modified to require totp code like mentioned in (c) above.

Running a process inside a network namespace

I have been reading and playing with Linux containers. I previously covered cgroups which enables process group level cpu/memory allocation. This blog entry is about my understanding of network namespace. And about running a process inside an isolated network namespace. Running a process there allows us to set specific filters on the specific process.

Network namespace allows Linux to clone the network stack and make the new stack available to a limited set of processes. This is used primarily with Linux containers such that each container has a different network stack altogether. There are multiple options for adding network interfaces to a newly created network namespace

Out of the three options above, I haven’t been able to find a lot about venet. I think venet is part of OpenVZ kernel changes and is not available in mainstream linux kernel.

A new network namespace can be created using the following command

ip netns add myspace

Now, we will create a new pair of type veth network interfaces. veth interfaces come in pair and act like a pipe of data. Each packet sent to veth0 shows up at veth1 and each packet to veth1 shows up at veth0.

ip link add veth0 type veth peer name veth1

Now, let’s move veth1 to our newly created namespace

ip link set veth1 netns myspace

Now, we will bring up veth0 (in original namespace) and assign IP address and subnet to it

ifconfig veth0 netmask up

Assigning ip address and netmask to veth1 (inside myspace namespace)

ip netns exec myspace ifconfig veth1 netmask up

The command above is important to look at again. The format of the above command is ip netns exec myspace <command>. The <command> executed here will be running in the myspace network namespace. The command will only see the interfaces, the route table configured inside the myspace network namespace.

Setting up gateway for myspace namespace

ip netns exec myspace route add default gw

Now, we have a namespace myspace which can send network packets to veth1, which will reach the host at the veth0 interface. Here, we have multiple options to connect veth0 to the outside world (via eth0).

  • Bridging
  • NAT

I will be covering NAT setup here. NAT can be enabled by running the following commands

# Enable kernel to forward packets from one interface to another (veth0 <-> eth0)
echo 1 > /proc/sys/net/ipv4/ip_forward
# Each packet coming via 192.168.42.* address space should be sent
# via eth0 after changing the source ip to eth0 ip.
iptables -t nat -A POSTROUTING -s -o eth0 -j MASQUERADE

Now, our network namespace is ready for use. A process can be run inside the network namespace by using the following

ip netns exec myspace <command>

The command will run inside the network namespace. If you want to run the command inside network namespace as an unprivileged user, use the following

# first sudo is optional and only needed if running this command as non-root
sudo ip netns exec myspace sudo -u <user> <command>

This way, all the traffic generated by the process will be inside the network namespace myspace. The traffic can be inspected and accepted or dropped in the parent namespace by using the filter table, forward chain.

Is your application Cloud Ready?

Cloud computing delivers a quick to market development platform which application teams can use to translate business requirements into applications faster than ever. However, the application needs to have certain features to be able to benefit from such a development platform.

  • Cloud enabled Architecture - The application should be able to distribute workload across multiple workers, be able to scale out by adding more resources, have no single point of failure, be able to recover from certain infrastructure failure scenarios like disk failure, datacenter failure, network degradation, database failure. This enables the application to be resilient and scale out when required.
  • Automated Configuration Management - The server configuration required to host the application should be automated to the last mile. This enables the infrastructure provider to patch the Operating System, move application across shared servers in a reliable fashion.
  • Automated deployment - The application should provide an automated deployment workflow for each release. Ideally, each deployment should have an automated rollback workflow defined too.
  • Support shutdown, start, moving - The application should be able to take in commands/signals/notifications when the application is required to shut down, start on the existing server or move to a new server. This enables the infrastructure provider to reduce servers, add servers, move servers when required.
  • Automated Build - The application should work with automated build tools and should be enabled to use automated build on check-in, nightly etc. This enables a quick turnaround for a developer to test their changes, makes the build process clean.
  • Automated QA - The application should have automated test cases which are run after every build. The regression results should be reported to the developer making the changes so that either the code or the test cases can be fixed. This increases the speed of development while keeping quality high. It also avoids last minute QA surprises at the end of the sprint.
  • Support health checks - Health checks are important to make sure that the service is working as intended. Health checks can be at machine level, at load balancer, at cluster level and even at a datacenter level. Health checks are required to make decisions for self healing infrastructure.
  • Metrics & Monitoring - You can’t improve what you don’t measure. Whether it’s uptime, average response time, average failure rate or average pageload time, the metrics should be measured and observed consistently. Obsession with metrics will help us deliver better products with each sprint/release. Metrics should be monitored to observe any problems which would need either automated self healing or manual fixing.
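
As an illustration of the health check item above, here is a minimal Python sketch of a machine level check aggregator. The function name aggregate_health and the example checks are hypothetical. The idea is simply that each check is a small callable, and a load balancer or self healing system acts on the combined result.

```python
import shutil


def aggregate_health(checks):
    """Run named check callables; a check passes if it returns a truthy value without raising."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a crashing check counts as unhealthy
    return {"healthy": all(results.values()), "checks": results}


if __name__ == "__main__":
    # example checks, chosen only for the demo
    report = aggregate_health({
        "disk_has_free_space": lambda: shutil.disk_usage("/").free > 0,
        "always_ok": lambda: True,
    })
    print(report)
```

Reporting the individual results alongside the overall flag lets the same endpoint serve both the load balancer (which only needs healthy or not) and the person debugging a failure.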

There is an API for that

The API revolution is upon us. Many new age companies are embracing APIs and enabling ecosystem of apps/plugins/enhancement made by partners using the APIs. While there are benefits in exposing APIs to external customers, there are numerous benefits to exposing APIs to internal customers as well. Services Oriented Architecture (SOA) is a well understood design pattern. Some of the benefits of SOA design pattern include

  • Services can be measured. What can be measured can be improved. SLAs on uptime and response times can be defined which service must adhere to.
  • Services can be vendor independent. Services can hide the vendor complexity behind the scenes.
  • Services are reusable. Reusable services foster better integration between teams and products.

Ideally, (almost) every technical team should be delivering their product as a web service. There should be a service catalog where all services in the organization can be discovered. Each service should be accompanied by documentation, sample client code and maybe even some client libraries. The services should have well defined SLAs and the performance graphs should be transparent to the organization. Services should have throttling to protect themselves from being abused by a single (internal) client. However, it should be noted that there should not be too many services either. Ideally, each service should be supported by a team of at least 8-10 people to ensure that there is business continuity in the team.

Going for a services oriented architecture also enables teams to automate their processes and tasks. Human based processes are error prone and susceptible to steps not being followed correctly. Automation enables repeatable results and quick turnaround on requests. If each team were to expose their product as a web service, the clients could invoke the business flow at any time of the day without scheduling a meeting or filling in a long form to get the attention of another team.

Here are some rules to consider adopting when looking at Services Oriented Architecture across the organization

  • API is more important than UI. Each product to be procured should be required to have an API interface. UI matters but less than API.
  • API should support client authentication, client throttling.
  • APIs should be used for all technical teams including infrastructure.
  • Teams should publish SLAs for their APIs and ensure that their APIs meet or exceed their SLA
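
Client throttling, as mentioned in the rules above, is often built with a token bucket. The Python sketch below is one possible approach, not a prescribed design. The class name and parameters are hypothetical, and a real service would keep one bucket per authenticated client.

```python
import time


class TokenBucket:
    """Allows bursts up to capacity and a sustained rate of `rate` requests per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # refill in proportion to the elapsed time, never above capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    buckets = {}  # client id -> bucket
    client = "team-a"  # hypothetical client id
    bucket = buckets.setdefault(client, TokenBucket(rate=5, capacity=10))
    print([bucket.allow() for _ in range(12)])
```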

We should be working towards a goal, where for most of the tasks inside the organization, we can say “There is an API for that”.

Intro to websockets

Websocket is an application level protocol providing full duplex communication over a single TCP connection. Websocket is related to HTTP as it uses a HTTP/1.1 compliant handshake. The protocol was primarily designed to enable a continuous interaction between browsers and servers which is not natively supported in HTTP/1.0 or HTTP/1.1 specifications. HTTP/2.0 specification which is currently in progress will enable request/response pipelining and will natively solve the need which led to Websockets development.

To understand the need for websocket, one needs to understand the requirement first. The requirement primarily arises from a chatty client which wants to get regular updates from the web server (like a stock ticker) and/or wants to send regular messages to the web server (like chat). The HTTP request/response model enables clients to send messages to servers, but doesn’t make it easy for servers to send messages back to the client. The Comet solution was devised as a workaround where the client makes a call to the server and the server just holds the request till it has something to say or till a predefined time passes.

Websocket support is built into current browsers, which enables developers to write applications where data can be streamed easily between browser and web server in both directions. Given that websockets use the HTTP/1.1 upgrade header, they should work nicely with HTTP/1.1 compliant proxy servers. However, for greater success, one should use websockets over HTTPS, since that solution works with HTTP/1.0 proxy servers as well. Many client and server libraries allow the application to fall back on the Comet/Request-Response model in case the websocket connection attempt is not successful or not supported due to an overzealous proxy server or unsupported browser.
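
To show what the upgrade handshake involves, here is a short Python sketch of how a server derives the Sec-WebSocket-Accept response header from the client's Sec-WebSocket-Key, following RFC 6455. The function name accept_key is hypothetical, and the sample key is the one used as an example in the RFC.

```python
import base64
import hashlib

# fixed GUID defined by RFC 6455 for the handshake
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key):
    """Value for the Sec-WebSocket-Accept header, given the client's key header."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
```

The exchange does not authenticate anyone. It only proves that the server understood the websocket upgrade, so that a plain HTTP server or cache cannot be mistaken for a websocket endpoint.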

The Websocket specification defines ping/pong frames, which can be used to keep the underlying TCP connection alive, and allows both text and binary frames, so websockets can be used for both text (say json) and binary data transfer. Various load balancers like Haproxy and F5 have built-in support for Websockets, which enables the application team to load balance websocket connections as well. However, AWS Elastic load balancer (ELB) doesn’t support Websockets as of today.
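
The frame types mentioned above differ only in a 4 bit opcode in the frame header. The Python sketch below builds unmasked frames of the kind a server sends, following the RFC 6455 layout. It is a simplified illustration: the function name is hypothetical, fragmentation is not handled, and frames sent by a client would additionally need masking.

```python
import struct

# opcodes from RFC 6455
OP_TEXT, OP_BINARY, OP_CLOSE, OP_PING, OP_PONG = 0x1, 0x2, 0x8, 0x9, 0xA


def encode_frame(opcode, payload=b""):
    """Build a single unmasked (server-to-client) frame with the FIN bit set."""
    header = bytes([0x80 | opcode])
    length = len(payload)
    if length < 126:
        header += bytes([length])
    elif length < 65536:
        header += bytes([126]) + struct.pack(">H", length)  # 16 bit extended length
    else:
        header += bytes([127]) + struct.pack(">Q", length)  # 64 bit extended length
    return header + payload


if __name__ == "__main__":
    print(encode_frame(OP_TEXT, "hello".encode("utf-8")).hex())
    print(encode_frame(OP_PING).hex())  # control frames carry at most 125 bytes
```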

NoSQL Landscape

NoSQL, like Cloud Computing, is a jargon term with no specific meaning. The industry can’t agree on whether NoSQL stands for “No SQL” or “Not only SQL”. SQL databases have traditionally been great at consistency and availability but have performed poorly at partition tolerance. SQL solutions have traditionally been slow, and the nosql effort is essentially about providing speed and scalability at the cost of consistency/referential integrity. The CAP theorem states that no system can be consistent, available and partition tolerant at the same time. Most nosql solutions start with being partition tolerant and provide tunable consistency and availability depending on application needs. Further, many nosql solutions provide low enough latency that the caching layer can be dropped from the architecture, lowering the cost of development and operations.

Since the rise of NoSQL, SQL databases have also made an effort to improve their latency. MySQL, in version 5.6, released APIs to access the InnoDB storage engine directly, bypassing the SQL interface, which provides low latency reads and writes.

Here is an introduction to some of the popular NoSQL solutions in the market. The list below doesn’t capture all of them as there are way too many solutions coming up. Each solution provides different features and interface since there is no standardization on the nosql front.

Redis - Redis is a fast caching solution which can store data structures like string, list, hash and sorted set. It allows operations like set union, intersection etc. which makes it win over traditional key-value caches. Redis doesn’t have cluster capability as of now (work in progress). Redis can be used on multiple nodes using a proxy solution like Twemproxy. Redis allows taking snapshots of the data in cache to disk. Redis has some (basic) disk persistence options as well, which enables one to use Redis as a database.

Couchbase - Couchbase at a very high level is Membase (for caching) + CouchDB (for persistence). Couchbase has a pretty interesting architecture as it combines in-memory caching and database in a single product. Couchbase has 2 distinct modes - memcache compatible and couchbase mode. Memcache compatible doesn’t have all the features and should be avoided in favor of couchbase mode. Couchbase claims to provide consistent reads even in case of failure but the implementation falls short in practical tests. Couchbase with its cache+disk model gives very low latency. The solution can give high write throughput when write is appended to in-memory queue instead of pushed synchronously to disk. The solution ideally should scale well with number of nodes, but practical tests show performance degradation with high number of nodes. Couchbase has the potential to become a very strong NoSQL solution. But its implementation as of today falls behind the promises made by the design architecture.

Cassandra - Cassandra uses a append only architecture which is crash safe. Cassandra is optimized for writes and delivers a predictable write throughput. Cassandra scales very well horizontally. Cassandra read latency is not as low as Couchbase but is fast enough. Cassandra is eventually consistent database and provides tunable consistency for read requests.
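
A common rule of thumb for tunable consistency in replicated stores of this kind is that a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The Python sketch below only encodes that arithmetic; the function and parameter names are hypothetical and do not correspond to any Cassandra API.

```python
def reads_see_latest_write(replicas, write_acks, read_acks):
    """True when every read quorum overlaps every write quorum (R + W > N)."""
    return read_acks + write_acks > replicas


if __name__ == "__main__":
    n = 3  # replication factor assumed for the demo
    for w, r in [(1, 1), (2, 2), (3, 1), (1, 3)]:
        print(f"N={n} W={w} R={r}:", reads_see_latest_write(n, w, r))
```

Lower values on either side trade that guarantee for latency and availability, which is the tuning knob the paragraph above refers to.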

DynamoDB - DynamoDB is database-as-a-service provided by AWS. DynamoDB is eventually consistent. Single key consistent reads are available at higher cost. DynamoDB is highly available (across multiple zones in a single region) and can be scaled up or down using AWS APIs. The database works at low latency with scalable throughput. Given its pricing and features, it gives the lowest total cost of ownership. Find more details about DynamoDB architecture in a research paper.

HBase - HBase is a strongly consistent database and is optimized for reads. HBase is part of the Hadoop family and is ideal for storing large amounts of sparse data. The solution is not as easy to administer as Cassandra or Couchbase. It doesn’t work well when used for a small amount of data or on a small number of nodes.

MongoDB - MongoDB was the first popular NoSQL solution. The product became quite popular among early adopters as they compared the MongoDB solution to the SQL world they were migrating from. However, MongoDB is not the best NoSQL solution out in the market in terms of scalability, partitioning, server usage. If you want to follow the herd, choose this solution. If you can think for yourself, look at other solutions.

Here is a very basic dumbed down flowchart to help you choose your next database. Use this flowchart only to guide you to the best possible option. Database migration is costly. So read a lot more, test and benchmark before making the final selection.

