Installing Zoe

Multiple deployment options are available:

  • Demo install via Docker Compose
  • Deployment scripts
  • Manual install

Please refer to the zoe-deploy repository for deployment scripts.

The section below describes how to install Zoe manually.

Manual install

Zoe components:

  • Master
  • API
  • command-line client

Zoe is written in Python and lists the package dependencies of all its components in the requirements.txt file. Not every dependency is needed in every deployment: for example, the kazoo library is required only if you use Zookeeper to manage Swarm high availability.

Optional components:

Overview

ZApps usually expose a number of interfaces (web, REST and others) to the user. Docker Swarm does not provide an easy way to manage this situation: the port can be statically allocated, but the IP address is chosen arbitrarily by Swarm and no discovery mechanism (DNS) is exposed outside of Swarm.

In the interest of keeping dependencies few and easy to manage, we do not rely on external plugins for networking or volumes. With the functionality built into Docker and Swarm, there is no good, automated way to access services running inside an overlay network from the outside. We decided to leave the network configuration entirely in the hands of whoever is in charge of the deployment: Zoe expects a Docker network name and will connect all containers to that network. How that network is configured is outside Zoe's area of competence.
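Because Zoe only needs the name of an existing network, it can be useful to verify that name before starting Zoe. The following is a minimal sketch and not part of Zoe: it assumes the docker command-line client is installed and configured to talk to your Swarm manager. The function name network_exists and the network name in the usage comment are illustrative.

```python
import subprocess


def network_exists(name, run=subprocess.run):
    """Return True if `docker network inspect NAME` succeeds.

    `run` is injectable so the check can be exercised without Docker.
    """
    try:
        result = run(
            ["docker", "network", "inspect", name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # The docker client is not installed or not on PATH.
        return False
    return result.returncode == 0


# Usage (hypothetical network name):
#   if not network_exists("zoe-overlay"):
#       raise SystemExit("create the overlay network before starting Zoe")
```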

As an example of a simple, robust configuration, we use a standard Swarm configuration with private, closed overlay networks. We create one overlay network for use by Zoe and spawn two containers attached to it: a SOCKS proxy and an SSH gateway. Thanks to LDAP, users can use the SSH gateway to create tunnels and copy files to and from their workspace. These gateway containers are maintained outside of Zoe, in this GitHub repository: https://github.com/DistributedSystemsGroup/gateway-containers

Zoe requires a shared filesystem, visible from all Docker hosts. Each user has a workspace directory visible from all of their running ZApps. The workspace is used to save Jupyter notebooks, copy data to and from HDFS, and provide binaries to MPI and Spark applications. Again, there are several Docker plugins that offer a variety of volume backends; we have chosen the simplest deployment option, a shared filesystem mounted on all hosts to provide the workspaces.

Requirements

  • Python 3. Development happens on Python 3.4, but we also test against Python 3.5 on Travis-CI.
  • Docker Swarm (we have not yet tested the new distributed swarm-in-docker available in Docker 1.12)
  • A shared filesystem, mounted on all hosts that are part of the Swarm. Internally we use CephFS, but NFS is also a valid solution.
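Two of the requirements above can be checked on each host with a short script. This is only a sketch under assumptions: the function name preflight and the workspace path in the usage comment are hypothetical, and the script only checks that the directory exists and is writable, not that it is actually shared between hosts.

```python
import os
import sys


def preflight(workspace_root, version=sys.version_info):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    # Development happens on Python 3.4, so treat that as the minimum.
    if tuple(version[:2]) < (3, 4):
        problems.append("Python 3.4 or newer is required")
    if not os.path.isdir(workspace_root):
        problems.append("workspace directory not found: " + workspace_root)
    elif not os.access(workspace_root, os.W_OK):
        problems.append("workspace directory not writable: " + workspace_root)
    return problems


# Usage (hypothetical mount point of the shared filesystem):
#   for problem in preflight("/mnt/zoe-workspaces"):
#       print("WARNING:", problem)
```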

Optional:

  • A Docker registry containing Zoe images for faster container startup times
  • A logging pipeline able to receive GELF-formatted logs, or a Kafka broker

Swarm/Docker

Install Docker and the Swarm container:

Network configuration

Docker 1.9/Swarm 1.0 multi-host networking can be used in Zoe:

This means that you will also need a key-value store supported by Docker. We use Zookeeper: it is available in Debian and Ubuntu without the need for external package repositories and is very easy to set up.
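Before pointing Docker at Zookeeper, you may want to confirm that the Zookeeper server answers. The sketch below uses Zookeeper's "ruok" four-letter command over a plain TCP socket, so it needs nothing beyond the standard library. It is an illustration, not part of Zoe: the function name and the host in the usage comment are assumptions, 2181 is Zookeeper's usual client port, and some Zookeeper releases require four-letter commands to be enabled explicitly.

```python
import socket


def zookeeper_ok(host, port=2181, timeout=2.0):
    """Return True if the server replies "imok" to the "ruok" command."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
    return reply == b"imok"


# Usage (hypothetical host name):
#   print(zookeeper_ok("zk1.example.com"))
```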

Images: Docker Hub vs local Docker registry

A few sample ZApps have their images available on the Docker Hub. We strongly suggest setting up a private registry, containing your customized Zoe Service images.

Zoe

Currently this is the recommended procedure, once the initial Swarm setup has been done:

  1. Clone the zoe repository
  2. Install Python package dependencies: pip3 install -r requirements.txt
  3. Create new configuration files for the master and the API processes (Zoe configuration). You will also need access to a PostgreSQL database.
  4. Set up supervisor to manage the Zoe processes: the scripts/supervisor/ directory contains the supervisor configuration file. You need to modify the paths to point to where you cloned Zoe, and set the user (Zoe does not need special privileges).
  5. Start running ZApps!
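Steps 1 and 2 of the procedure above can be scripted. The following is a rough sketch, assuming git and pip3 are on the PATH. The function names are illustrative, and the repository URL and destination directory are left as parameters because they depend on your setup.

```python
import os
import subprocess


def install_commands(repo_url, dest):
    """Build the commands for steps 1 and 2 without running them."""
    requirements = os.path.join(dest, "requirements.txt")
    return [
        ["git", "clone", repo_url, dest],
        ["pip3", "install", "-r", requirements],
    ]


def run_install(repo_url, dest, run=subprocess.check_call):
    """Run each command in order; check_call raises on the first failure."""
    for command in install_commands(repo_url, dest):
        run(command)


# Usage (both arguments are placeholders to fill in):
#   run_install("<zoe-repository-url>", "/opt/zoe")
```

Steps 3 and 4 are left manual on purpose, since the configuration files and the supervisor paths are specific to each deployment.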

If you run into trouble, check the logs for errors. Zoe's basic functionality can be tested with the zoe.py stats command: it queries the zoe-api process, which in turn queries the zoe-master process.
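This smoke test can be wrapped in a small script, for example to run it from a monitoring job. The sketch below makes two assumptions that you should verify against your checkout: that zoe.py sits in the directory passed as zoe_dir, and that it exits with a non-zero status when the query fails. The function name zoe_stats is illustrative.

```python
import subprocess
import sys


def zoe_stats(zoe_dir, run=subprocess.run):
    """Run `zoe.py stats` and return (succeeded, combined_output)."""
    result = run(
        [sys.executable, "zoe.py", "stats"],
        cwd=zoe_dir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error messages with the output
        universal_newlines=True,
    )
    # Assumption: a non-zero exit status signals a failed query.
    return result.returncode == 0, result.stdout


# Usage (hypothetical clone location):
#   ok, output = zoe_stats("/opt/zoe")
#   print(output if ok else "zoe.py stats failed:\n" + output)
```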

API Manager

To provide TLS termination, authentication, load balancing, metrics, and other services to the Zoe API, you can use an API manager in front of the Zoe API. For example: