Planet London Python

February 05, 2016

Sylvain Hellegouarch

dalamb – An AWS Lambda clone (sort of)

I was going through the AWS Lambda main presentation page yesterday and started to wonder whether it would be possible to create your own stack with the same feature set.

First off, AWS Lambda is a stunning programming model: a cross between stateless distributed functional programming at scale (say that fast ten times if you dare) and zero-administration requirements. In other words, you write a function and AWS Lambda runs it at the appropriate scale for you.

What’s more, AWS Lambda can react to events following the event-sourcing pattern. You don’t keep any internal state; AWS Lambda executes your function with the event data.

Our job as software developers has just become almost too simple. The good news is that we can now focus on the value we are aiming for rather than the plumbing and machinery underneath. A whole new world.

With that said, AWS Lambda is brilliant but deeply tied to AWS services and infrastructure. That’s sort of the point, obviously. Still, could we create a similar stack and run it ourselves? Because, why not!

Let’s therefore introduce “dalamb – An AWS Lambda clone (sort of)”.

dalamb properties and features

What would dalamb’s properties be? How could we design such a beast? Interestingly, we have most of the pieces at hand, at least from a technological and architectural perspective.

Let’s rewind a bit. AWS Lambda gives you the following main features:

  • extend AWS services with custom logic
  • perform funky operations triggered by AWS events
  • expose your function as a web service (insert mention of a RESTful URL here)

All of this is backed by the powerful AWS infrastructure ensuring:

  • fault-tolerance
  • automatic scalability
  • a rich security model
  • pricing with precision

These features and properties tell us the stories dalamb should pursue:

  • The event-sourcing story: how to map functions to events
    • The REST story: this one is just a specific case of the event sourcing story where the events are triggered by requests to a given endpoint
  • The delivery story: can we package up any function in any language?
  • The operational story: deploy, update, scale, keep available…

It’s all about events

The heart of the AWS Lambda platform is its event-sourcing feature. Events are everywhere; they actually define the system itself.

As developers, when we forget about events, what we build are fragile stateful monoliths that quickly fail under their own mass and their inability to accept inevitable change. Building discrete functions that react to events means you reduce the friction between changes and the pressures on your system while keeping a comprehensible view of the system.

Events have basic properties:

  • self-contained: the information enclosed within must not depend on moving external resources
  • immutable: events never change once they have been emitted
  • descriptive: events do not command, they describe facts
  • sequenced: ordering of event streams must be supported
  • structured: so that rules and patterns can be matched against them

dalamb is not a re-write of the entire set of AWS services. It focuses on mapping discrete, simple functions to events flowing through the system. For this reason, dalamb will be designed around event-sourcing patterns.

The goal will be to make it simple for external resources to send events and straightforward for functions to be matched to these events following simple rules.

dalamb will likely start using Kafka as a simple event-store backend.
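To make that concrete, here is a minimal sketch of what an external resource pushing an event into such a Kafka-backed store could look like, assuming a kafkacat client and a hypothetical dalamb-events topic; the event carries the properties listed above (a descriptive type, a sequence number and a self-contained payload):

# purely illustrative: the topic name and event schema are not defined by dalamb yet
$ echo '{"type": "image.uploaded", "sequence": 42, "data": {"bucket": "photos", "key": "cat.png"}}' \
    | kafkacat -b [YOUR_KAFKA_BROKER]:9092 -P -t dalamb-events

A dalamb function would then be matched against such events by simple rules, for instance on the value of the type field.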

Change is the nature of any live system

A live system is constantly changing due to internal and external stressors. Your functions are one of those stressors. In order for the system not to collapse in on itself, a certain level of automatic elasticity and draining is required.

Draining means that parts of the system can be considered wasteful and the system will find a way to clean them up. Elasticity means that the system must be able to absorb any load stress in a way that keeps the overall system functional.

AWS Lambda owns the responsibility of managing the system’s constantly changing nature. So will dalamb. However, since dalamb will not own the infrastructure itself, it will be bound, initially at least, by the fixed size of the underlying infrastructure. Within those boundaries, though, dalamb will ensure the system keeps its promises and stays alive and well.

dalamb will likely take advantage of Mesos for resource sharing and Kubernetes or Marathon for managing the lifecycle of each function instance.

Geared towards developers

dalamb is meaningless if developers cannot express themselves through it. To achieve this, dalamb will not make any assumptions regarding the actual code environment that is executed.

Indeed, in spite of being a powerful platform, AWS Lambda is currently limited by its selective runtime targets. Oddly enough, if you are a Go or Ruby developer (for instance), you must call your code as a subprocess of the executed function.

dalamb takes the bet that the doors should be open to any developer, with the same interface for all. To achieve this, dalamb will expect functions to be packaged up into container images, under the control of the developers. What dalamb should provide is the entry point definition for a container to be executed appropriately on the platform as a dalamb function.

Containers will be short-lived and their state will not be kept around for longer than the time required to run the function.
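To give a purely hypothetical feel for the entry point idea, assuming the contract ends up being “read one event on stdin, write the result on stdout”, running a packaged function could boil down to something like:

# hypothetical image name and contract; dalamb would schedule this run, not the developer
$ docker run --rm -i myuser/resize-image < event.json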

Developers will store their functions on platforms like GitHub or Bitbucket and dalamb will simply react to events from those platforms to release new versions of a function to production.

What’s next?

dalamb is by no means properly defined or even designed. It is a story about developers achieving great value by relying on developer-oriented platforms. Whether it is Heroku or AWS Lambda, dealing with the complexity of this world has never been so accessible.

by Sylvain Hellegouarch at February 05, 2016 02:17 PM

February 04, 2016

Sylvain Hellegouarch

Customizing Kubernetes AWS deployment settings

Kubernetes provides a nice script to quickly deploy a small cluster on various infrastructure backends. On AWS, though, the documentation is a tad short on how to customize the installation.

By default, Kubernetes is deployed in the Oregon region (availability zone us-west-2a) with a cluster of four minions driven by a single master. The EC2 instances for the minions and their master have a fairly limited size (t2.micro).

Kubernetes relies on Ubuntu Vivid by default and favors Docker as the container provider. The container storage backend is AUFS, which is the common option on Ubuntu.

From a network perspective, Kubernetes uses the well-known 10.0.0.0/16 subnet for the services while keeping the 10.244.0.0/16 subnet for the cluster. Interestingly, it automatically adds the 10.0.0.0/8 subnet as an insecure registry so that you can more easily deploy Docker images hosted on a private registry.

Kubernetes relies on adding EC2 EBS volumes to extend the instances’ default storage. It will, in fact, use that storage to host container images. The default settings rely on 32GB general-purpose SSD volumes for the minions. The master receives smaller volumes but doesn’t require much in the first place.

Finally, the script enables, by default, node and cluster logging (through Elasticsearch and Kibana) as well as cluster monitoring (via InfluxDB). Once the deployment process completes, the script kindly tells you where to access those services.

All of these settings are kept inside the cluster/aws/config-default.sh resource found in the Kubernetes release. By overriding them before running the deployment script provided by the Kubernetes team, you can easily customize your Kubernetes cluster on AWS.

For instance, this is my development setup:

export KUBERNETES_PROVIDER=aws

# Working from Europe
export KUBE_AWS_ZONE=eu-west-1a
export AWS_S3_REGION=eu-west-1

# Using slightly more capable machines while keeping a tight leash over costs
export MINION_SIZE=t2.small
export NUM_MINIONS=3

I usually leave the other settings as-is because they suit my requirements. I encourage you to review the config-default.sh script to change other values.

Save your settings in a shell script sourced from your ~/.bashrc script and run the installation command given in the Kubernetes documentation.
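Concretely, assuming you saved the overrides above in a file called ~/.kube-aws.sh (a name I am making up here), the whole thing boils down to:

# in ~/.bashrc
source ~/.kube-aws.sh

# then, from the root of the Kubernetes release tarball
$ cluster/kube-up.sh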

by Sylvain Hellegouarch at February 04, 2016 08:38 PM

Running a zookeeper and kafka cluster with Kubernetes on AWS

I have been recently working with Russ Miles on coding microservices that follow the principles he has laid out in the Antifragile Software book. As a demo, we decided to use Kafka as a simple event store to implement an event-sourcing pattern.

The main focus of this article is how to deploy a kafka cluster and manage its lifecycle via Docker containers and Kubernetes on AWS.

Initially, I wanted to quickly see how to get one instance of kafka to be available from outside the AWS world so that I could interact with it. My first move is always to look at the main Docker image repository for official or popular images. Interestingly, as of this writing, there is no official Kafka image. The most popular is wurstmeister/kafka, which is what I decided to use.

However, this was not enough. Indeed, Kafka relies on Zookeeper to work. Spotify offers both services in a single image, but I don’t think that’s a good idea in production, so I decided to forgo it. For Zookeeper, I didn’t use the most popular image because its documentation didn’t indicate any way to pass parameters to the service. Instead, I went with digitalwonderland/zookeeper, which supports some basic parameters like setting the broker id.

Setting up one instance of both services is rather straightforward and can be controlled by a simple Kubernetes replication controller like:

---
apiVersion: v1
kind: ReplicationController
metadata:
  name: kafka-controller
spec:
  replicas: 1
  selector:
    app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka
        image: wurstmeister/kafka
        ports:
        - containerPort: 9092
        env:
        - name: KAFKA_ADVERTISED_HOST_NAME
          value: [AWS_LB_DNS_or_YOUR_DNS_POINTING_AT_IT]
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: zook:2181
      - name: zookeeper
        image: digitalwonderland/zookeeper
        ports:
        - containerPort: 2181

This could be exposed using the following Kubernetes service:

---
apiVersion: v1
kind: Service
metadata:
  name: zook
  labels:
    app: kafka
spec:
  ports:
  - port: 2181
    name: zookeeper-port
    targetPort: 2181
    protocol: TCP
  selector:
    app: kafka
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-service
  labels:
    app: kafka
spec:
  ports:
  - port: 9092
    name: kafka-port
    targetPort: 9092
    protocol: TCP
  selector:
    app: kafka
  type: LoadBalancer

Notice the LoadBalancer type is used here because we need to create an AWS load-balancer to access those services from the outside world. The Kubernetes scripts are clever enough to achieve this for us.

In the replication controller specification, we can see the requirement to let Kafka advertise its hostname. To make it work, this must be the actual domain of the AWS load-balancer. This means that you must create the Kubernetes service first (which is good policy anyway) and then, once it is done, write down its domain into the replication controller spec as the value of the KAFKA_ADVERTISED_HOST_NAME environment variable.
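For reference, one way to find that domain once the service exists is to describe it (a sketch; the exact output layout may vary between kubectl versions):

$ kubectl describe service kafka-service | grep "LoadBalancer Ingress"
LoadBalancer Ingress:   [AWS_LB_DNS]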

This is all good but it is not a cluster. It is merely a single instance for development purposes. Even though Kubernetes promises to look after your pods, it’s not a bad idea to run a cluster of both zookeeper and kafka services. This wasn’t as trivial as I expected.

The reason is mostly due to the way clusters are configured. Indeed, in Zookeeper’s case, each instance must be statically identified within the cluster. This means you cannot just increase the number of pod instances in the replication controller, because they would all have the same broker identifier. This will change in Zookeeper 3.5. Kafka doesn’t have this limitation any more; indeed, it will happily create a broker id for you if none is provided explicitly (though this requires Kafka 0.9+).

What this means is that we now have two specifications. One for Zookeeper and one for Kafka.

Let’s start with the simpler one, kafka:

---
apiVersion: v1
kind: ReplicationController
metadata:
  name: kafka-controller
spec:
  replicas: 1
  selector:
    app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka
        image: wurstmeister/kafka
        ports:
        - containerPort: 9092
        env:
        - name: KAFKA_ADVERTISED_PORT
          value: "9092"
        - name: KAFKA_ADVERTISED_HOST_NAME
          value: [AWS_LB_DNS_or_YOUR_DNS_POINTING_AT_IT]
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: zoo1:2181,zoo2:2181,zoo3:2181
        - name: KAFKA_CREATE_TOPICS
          value: mytopic:2:1

Nothing really odd here: we create a single kafka broker, connected to our zookeeper cluster, which is defined below:

---
apiVersion: v1
kind: ReplicationController
metadata:
  name: zookeeper-controller-1
spec:
  replicas: 1
  selector:
    app: zookeeper-1
  template:
    metadata:
      labels:
        app: zookeeper-1
    spec:
      containers:
      - name: zoo1
        image: digitalwonderland/zookeeper
        ports:
        - containerPort: 2181
        env:
        - name: ZOOKEEPER_ID
          value: "1"
        - name: ZOOKEEPER_SERVER_1
          value: zoo1
        - name: ZOOKEEPER_SERVER_2
          value: zoo2
        - name: ZOOKEEPER_SERVER_3
          value: zoo3
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: zookeeper-controller-2
spec:
  replicas: 1
  selector:
    app: zookeeper-2
  template:
    metadata:
      labels:
        app: zookeeper-2
    spec:
      containers:
      - name: zoo2
        image: digitalwonderland/zookeeper
        ports:
        - containerPort: 2181
        env:
        - name: ZOOKEEPER_ID
          value: "2"
        - name: ZOOKEEPER_SERVER_1
          value: zoo1
        - name: ZOOKEEPER_SERVER_2
          value: zoo2
        - name: ZOOKEEPER_SERVER_3
          value: zoo3
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: zookeeper-controller-3
spec:
  replicas: 1
  selector:
    app: zookeeper-3
  template:
    metadata:
      labels:
        app: zookeeper-3
    spec:
      containers:
      - name: zoo3
        image: digitalwonderland/zookeeper
        ports:
        - containerPort: 2181
        env:
        - name: ZOOKEEPER_ID
          value: "3"
        - name: ZOOKEEPER_SERVER_1
          value: zoo1
        - name: ZOOKEEPER_SERVER_2
          value: zoo2
        - name: ZOOKEEPER_SERVER_3
          value: zoo3

As you can see, unfortunately we cannot rely on a single replication controller with three instances as we can with kafka brokers. Instead, we run three distinct replication controllers, so that we can specify the zookeeper id of each instance, as well as the list of all zookeeper servers in the ensemble.

This is a bit of an annoyance because we therefore rely on three distinct services too:

---
apiVersion: v1
kind: Service
metadata:
  name: zoo1
  labels:
    app: zookeeper-1
spec:
  ports:
  - name: client
    port: 2181
    protocol: TCP
  - name: follower
    port: 2888
    protocol: TCP
  - name: leader
    port: 3888
    protocol: TCP
  selector:
    app: zookeeper-1
---
apiVersion: v1
kind: Service
metadata:
  name: zoo2
  labels:
    app: zookeeper-2
spec:
  ports:
  - name: client
    port: 2181
    protocol: TCP
  - name: follower
    port: 2888
    protocol: TCP
  - name: leader
    port: 3888
    protocol: TCP
  selector:
    app: zookeeper-2
---
apiVersion: v1
kind: Service
metadata:
  name: zoo3
  labels:
    app: zookeeper-3
spec:
  ports:
  - name: client
    port: 2181
    protocol: TCP
  - name: follower
    port: 2888
    protocol: TCP
  - name: leader
    port: 3888
    protocol: TCP
  selector:
    app: zookeeper-3

Doing so means traffic is routed to each zookeeper instance via its service name (internal to the Kubernetes network, that is).

Finally, we have our Kafka service:

---
apiVersion: v1
kind: Service
metadata:
  name: kafka-service
  labels:
    app: kafka
spec:
  ports:
  - port: 9092
    name: kafka-port
    targetPort: 9092
    protocol: TCP
  selector:
    app: kafka
  type: LoadBalancer

That one is simple because we only have a single kafka application to expose.

Now running these in order is the easy part. First, let’s start with the zookeeper cluster:

$ kubectl create -f zookeeper-services.yaml
$ kubectl create -f zookeeper-cluster.yaml

Once the cluster is up, you can check they are all happy bunnies via:

$ kubectl get pods
zookeeper-controller-1-reeww   1/1       Running   0          2h
zookeeper-controller-2-t4zzx   1/1       Running   0          2h
zookeeper-controller-3-e4zmo   1/1       Running   0          2h

$ kubectl logs zookeeper-controller-1-reeww
...

One of them should be LEADING, the other two ought to be FOLLOWERS.
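A rough way to check which is which, assuming the usual ZooKeeper election log lines:

$ kubectl logs zookeeper-controller-1-reeww | grep -E "LEADING|FOLLOWING"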

Now, you can start your Kafka broker, first its service:

$ kubectl create -f kafka-service.yaml

On AWS, you will need to wait for the actual EC2 load balancer to be created. Once that’s done, take its DNS name and edit the kafka-cluster.yaml spec to set it as the value of KAFKA_ADVERTISED_HOST_NAME.

Obviously, if you have set up a DNS route to your load-balancer, simply use that domain instead of the load-balancer’s. In that case, you can set its value once and for all in the spec.

Then, run the following command:

$ kubectl create -f kafka-cluster.yaml

This will start the broker and automatically create the topic “mytopic” with one replica and two partitions.

At this stage, you should be able to connect to the broker and produce and consume messages. You might want to try kafkacat as a simple tool to play with your broker.

For instance, listing the topic on your broker:

$ kafkacat -b [AWS_LB_DNS_or_YOUR_DNS_POINTING_AT_IT]:9092 -L

You can also produce messages:

$ kafkacat -b [AWS_LB_DNS_or_YOUR_DNS_POINTING_AT_IT]:9092 -P -t mytopic

Consuming messages is as simple as:

$ kafkacat -b [AWS_LB_DNS_or_YOUR_DNS_POINTING_AT_IT]:9092 -C -t mytopic
...

At this stage, we still don’t have a kafka cluster. One might expect that running something like this should be enough:

$ kubectl scale --replicas=3 rc/kafka-controller

But unfortunately, this will only create new kafka instances; it will not automatically start replicating data to the new brokers. This has to be done out of band, as I will explain in a follow-up article.

As a conclusion, I would say that using existing images is not ideal because they don’t always provide the level of integration you’d hope for. What I would rather do is build specific images that initially converse with an external configuration server to retrieve the information they need to run. This would likely make things a little smoother. Though, in the case of zookeeper, I am looking forward to its next release, which should support dynamic cluster scaling.

by Sylvain Hellegouarch at February 04, 2016 02:00 PM

February 01, 2016

Peter Bengtsson

Bestest and securest way to handle Python dependencies

pip 8 is out and, with it, the ability to only install dependencies you've vetted. Thank Erik Rose! Now you can be absolutely certain that the dependencies you downloaded and installed locally are absolutely identical to the dependencies you download and install on your production server.

First pipstrap.py

So your server needs pip to install those dependencies safely and securely. Initially you have to trust the pip/virtualenv that is installed globally on the system. If you can trust it but aren't sure it's a good version of pip (version 8 and up), that's where pipstrap.py comes in. It makes sure you get a pip version installed that supports pip install with hashes.

Add pipstrap.py to your git/hg repo and use it to make sure you have a good pip. For example, your deployment script might look like this now:

#!/bin/bash
git pull origin master
virtualenv venv
source venv/bin/activate
python ./tools/pipstrap.py
pip install --require-hashes -r requirements.txt

Then hashin

Thanks to pipstrap we now have a version of pip that really does check the hashes you've put in the requirements.txt file.

(By the way, the --require-hashes on pip install is optional. pip will imply it if the requirements.txt file appears to have hashes defined. But to avoid the risk of accidentally fumbling a bad requirements.txt, it's good to specify --require-hashes to pip install.)

Now that you're up and running and you sleep well at night because you know your production server has exactly the same dependencies you had when you did the development and unit testing, how do you get the hashes in there?

The trick is to install hashin (pip install hashin). It helps you write those hashes.

Suppose you have a requirements.txt file that looks like this:

Django==1.9.1
bgg==0.22.1
html2text==2016.1.8

You can try to run pip install --require-hashes -r requirements.txt and learn from the errors. E.g.:

Hashes are required in --require-hashes mode, but they are missing from some requirements. 
Here is a list of those requirements along with the hashes their downloaded archives actually 
had. Add lines like these to your requirements files to prevent tampering. (If you did not 
enable --require-hashes manually, note that it turns on automatically when any package has a hash.)
    Django==1.9.1 --hash=sha256:9f7ca04c6dbcf08b794f2ea5283c60156a37ebf2b8316d1027f594f34ff61101
    bgg==0.22.1 --hash=sha256:e5172c3fda0e8a42d1797fd1ff75245c3953d7c8574089a41a219204dbaad83d
    html2text==2016.1.8 --hash=sha256:088046f9b126761ff7e3380064d4792279766abaa5722d0dd765d011cf0bb079

But those are just the hashes for your particular environment (and your particular support for Python wheels). Instead, take each requirement and run it through hashin:

$ hashin Django==1.9.1
$ hashin bgg==0.22.1
$ hashin html2text==2016.1.8

Now your requirements.txt will look like this:

Django==1.9.1 \
    --hash=sha256:9f7ca04c6dbcf08b794f2ea5283c60156a37ebf2b8316d1027f594f34ff61101 \
    --hash=sha256:a29aac46a686cade6da87ce7e7287d5d53cddabc41d777c6230a583c36244a18
bgg==0.22.1 \
    --hash=sha256:e5172c3fda0e8a42d1797fd1ff75245c3953d7c8574089a41a219204dbaad83d \
    --hash=sha256:aaa53aea1cecb8a6e1288d6bfe52a51408a264a97d5c865c38b34ae16c9bff88
html2text==2016.1.8 \
    --hash=sha256:088046f9b126761ff7e3380064d4792279766abaa5722d0dd765d011cf0bb079

One Last Note

pip is smart enough to traverse the nested dependencies of packages that need to be installed. For example, suppose you do:

$ hashin premailer

It will only add...

premailer==2.9.7 \
    --hash=sha256:1516cbb972234446660bf7862b28521f0fc8b5e7f3087655f35ae5dd233013a3 \
    --hash=sha256:843e624bdac9d28725b217559904aa5a217c1a94707bc2ecef6c91a8d82f1a23

...to your requirements.txt. But this package has a bunch of dependencies of its own. To find out what those are, let pip "fail for you".

$ pip install --require-hashes -r requirements.txt
Collecting premailer==2.9.7 (from -r r.txt (line 1))
  Downloading premailer-2.9.7-py2.py3-none-any.whl
Collecting lxml (from premailer==2.9.7->-r r.txt (line 1))
Collecting cssutils (from premailer==2.9.7->-r r.txt (line 1))
Collecting cssselect (from premailer==2.9.7->-r r.txt (line 1))
In --require-hashes mode, all requirements must have their versions pinned with ==. These do not:
    lxml from https://pypi.python.org/packages/source/l/lxml/lxml-3.5.0.tar.gz#md5=9f0c5f1eb43ff44d5455dab4b4efbe73 (from premailer==2.9.7->-r r.txt (line 1))
    cssutils from https://pypi.python.org/packages/2.7/c/cssutils/cssutils-1.0.1-py2-none-any.whl#md5=b173f51f1b87bcdc5e5e20fd39530cdc (from premailer==2.9.7->-r r.txt (line 1))
    cssselect from https://pypi.python.org/packages/source/c/cssselect/cssselect-0.9.1.tar.gz#md5=c74f45966277dc7a0f768b9b0f3522ac (from premailer==2.9.7->-r r.txt (line 1))

So apparently you need to hashin those three dependencies:

$ hashin lxml
$ hashin cssutils
$ hashin cssselect

Now your requirements.txt file will look something like this:

premailer==2.9.7 \
    --hash=sha256:1516cbb972234446660bf7862b28521f0fc8b5e7f3087655f35ae5dd233013a3 \
    --hash=sha256:843e624bdac9d28725b217559904aa5a217c1a94707bc2ecef6c91a8d82f1a23
lxml==3.5.0 \
    --hash=sha256:349f93e3a4b09cc59418854ab8013d027d246757c51744bf20069bc89016f578 \
    --hash=sha256:8628cc82957c41be10abce889a1976ceb7b9e3f36ebffa4fcb1a80901bf77adc \
    --hash=sha256:1c9c26bb6c31c3d5b3c104e843211d9c105db60b4df6770ac42673263d55d494 \
    --hash=sha256:01e54511034333f18772c335ec0b33a76bba988135eaf727a075897866d19604 \
    --hash=sha256:2abf6cac9b7952047d8b7265384a9565e419a727dba675e83e4b7f5b7892b6bb \
    --hash=sha256:6dff909020d0c030fb26004626c8f87f9116e0381702fed415caf94f5a9b9493
cssutils==1.0.1 \
    --hash=sha256:78ac48006ac2336b9456e88a75ed35f6a31a030c65162503b7af01a60d78db5a \
    --hash=sha256:d8a18b2848ea1011750231f1dd64fe9053dbec1be0b37563c582561e7a529063
cssselect==0.9.1 \
    --hash=sha256:0535a7e27014874b27ae3a4d33e8749e345bdfa62766195208b7996bf1100682

Ah... Now you feel confident.

Actually, One More Last Note

Sorry for repeating the obvious but it's so important it's worth making it loud and clear:

Use the same pip install procedure and requirements.txt file everywhere

I.e. install the dependencies the same way on your laptop, your continuous integration server, your staging server and your production server. That really makes sure you're running the same process and the same dependencies everywhere.

February 01, 2016 04:34 PM