Back-end abstraction

The container back-end Zoe uses is configurable at runtime. Internally there is an API that Zoe, in particular the scheduler, uses to communicate with the container back-end. This document explains the API, so that new back-ends can be created and maintained.

Zoe assumes back-ends are composed of multiple nodes. If the back-end is not clustered or does not expose per-node information, it can be presented to Zoe as a single big node. In that case, however, many of Zoe's smart scheduling features will be unavailable.
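
The single-node fallback described above can be sketched as follows. This is only an illustration: the SingleNodeView class, its constructor argument and the "all-resources" node name are hypothetical and are not part of Zoe.

```python
from typing import Dict, List


class SingleNodeView:
    """Hypothetical helper that presents a whole platform as one big node."""

    NODE_NAME = "all-resources"  # assumed placeholder name, not a Zoe constant

    def __init__(self, machines: List[Dict[str, float]]):
        # Per-machine figures the platform knows internally but cannot
        # (or does not want to) expose as separate nodes.
        self._machines = machines

    def node_list(self) -> List[str]:
        # Only one node name is ever reported.
        return [self.NODE_NAME]

    def totals(self) -> Dict[str, float]:
        # Aggregate resources of all hidden machines into the single node.
        return {
            "cores": sum(m["cores"] for m in self._machines),
            "memory": sum(m["memory"] for m in self._machines),
        }


view = SingleNodeView([{"cores": 8, "memory": 32}, {"cores": 4, "memory": 16}])
print(view.node_list(), view.totals())
```

With a single node the scheduler sees only aggregate figures, so it cannot reason about placement on individual machines.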

Package structure

Back-ends are written in Python and live in the zoe_master/backends/ directory. Inside there is one Python package for each backend implementation.

To let Zoe use a new back-end, its class must be imported in zoe_master/backends/ and the _get_backend() function should be modified accordingly. Then the choices in zoe_lib/ for the configuration file should be expanded to include the new back-end class name.

Additional configuration options can be added to support the new back-end. Use the --<backend name>-<option name> convention for them. If the new options do not fit the zoe.conf format, a separate configuration file can be used, as in the DockerEngine and Kubernetes cases.
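
The naming convention can be illustrated with the standard library argparse module, used here only as a stand-in for Zoe's own configuration parser. The "MyNew" back-end and its two options are hypothetical.

```python
import argparse

parser = argparse.ArgumentParser()

# Back-end selection; the list of choices is illustrative only.
parser.add_argument("--backend", choices=["DockerEngine", "Kubernetes", "MyNew"],
                    default="DockerEngine")

# Options for the hypothetical "MyNew" back-end, named
# --<backend name>-<option name>.
parser.add_argument("--mynew-api-url", default="http://localhost:8080")
parser.add_argument("--mynew-timeout", type=int, default=30)

# Parse an explicit argument list so the example does not depend on sys.argv.
args = parser.parse_args(["--backend", "MyNew", "--mynew-timeout", "10"])

# argparse turns dashes into underscores in attribute names.
print(args.backend, args.mynew_api_url, args.mynew_timeout)
```

Prefixing every option with the back-end name keeps options of different back-ends from colliding in the shared configuration namespace.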


Whenever Zoe needs to access the container back-end it will create a new instance of the back-end class. The class must be a child of zoe_master.backends.base.BaseBackend. The class is not used as a singleton and may be instantiated concurrently, multiple times and in different threads.
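
Because instances are short-lived and may be created from several threads at once, any state that must outlive a single instance has to be kept elsewhere and protected. The sketch below keeps a counter at class level behind a lock; the CountingBackend class and its record_spawn() method are hypothetical, and BaseBackend is a local stand-in for the real base class.

```python
import threading


class BaseBackend:
    """Stand-in for zoe_master.backends.base.BaseBackend."""

    def __init__(self, conf):
        self.conf = conf


class CountingBackend(BaseBackend):
    """Hypothetical back-end sharing a counter across all its instances."""

    _lock = threading.Lock()  # guards the shared class-level state
    _spawned = 0              # shared by every instance, in every thread

    def record_spawn(self):
        # Instances come and go, so the counter lives on the class.
        with CountingBackend._lock:
            CountingBackend._spawned += 1


def worker():
    # Each thread creates its own instance, as Zoe does on each access.
    backend = CountingBackend({})
    for _ in range(1000):
        backend.record_spawn()


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(CountingBackend._spawned)
```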

class zoe_master.backends.base.BaseBackend(conf)

The base class that all back-ends should implement.


Initializes the backend. In general this includes finding the current API endpoint, opening a connection to it, negotiating the API version, etc. Backend-related threads can also be started here. This method will be called only once at Zoe startup.


List the images available on the specified node.

node_list() → List[str]

List node names configured in the back-end.

platform_state() → zoe_master.stats.ClusterStats

Get the platform state. This method should fill in a new ClusterStats object at each call, with fresh statistics on the available nodes and their resource availability. This information will be used for making scheduling decisions.
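
The "new object at each call" requirement can be sketched as follows. The ClusterStats and NodeStats classes here are simplified stand-ins whose fields are invented for the example; the real zoe_master.stats.ClusterStats has its own attributes. The _query_nodes() helper is also hypothetical and returns fixed data instead of contacting a platform.

```python
import time
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class NodeStats:
    """Stand-in for per-node statistics (fields are assumptions)."""
    name: str
    cores_free: float
    memory_free: int


@dataclass
class ClusterStats:
    """Stand-in for zoe_master.stats.ClusterStats (fields are assumptions)."""
    nodes: List[NodeStats] = field(default_factory=list)
    timestamp: float = 0.0


class ExampleBackend:
    def _query_nodes(self) -> List[Tuple[str, float, int]]:
        # Hypothetical helper: a real back-end would query its platform API.
        return [("node-1", 4.0, 8 * 2**30), ("node-2", 2.0, 4 * 2**30)]

    def platform_state(self) -> ClusterStats:
        # Build a brand new object on every call; never return a cached one.
        stats = ClusterStats(timestamp=time.time())
        for name, cores, memory in self._query_nodes():
            stats.nodes.append(NodeStats(name, cores, memory))
        return stats


backend = ExampleBackend()
first = backend.platform_state()
second = backend.platform_state()
print([n.name for n in first.nodes], first is second)
```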

preload_image(image_name: str) → None

Make a service image available.


Performs a clean shutdown of the resources used by the backend. Any threads that were started in the init() method should be terminated here. This method will be called when Zoe shuts down.
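
The pairing between start-up and shutdown can be sketched with a background thread controlled by an event. Writing both methods as class methods, the shutdown() name, and the monitoring thread itself are assumptions made for this example; they are not taken from the BaseBackend definition.

```python
import threading


class MonitoredBackend:
    """Hypothetical back-end that runs one background monitoring thread."""

    _stop = threading.Event()
    _thread = None

    @classmethod
    def init(cls):
        # Called once at start-up: start the background thread.
        cls._stop.clear()
        cls._thread = threading.Thread(target=cls._monitor,
                                       name="backend-monitor", daemon=True)
        cls._thread.start()

    @classmethod
    def _monitor(cls):
        # Wake up periodically until asked to stop.
        while not cls._stop.wait(timeout=0.05):
            pass  # a real back-end would poll its platform here

    @classmethod
    def shutdown(cls):
        # Called once at shutdown: signal the thread and wait for it to exit.
        cls._stop.set()
        if cls._thread is not None:
            cls._thread.join()
            cls._thread = None


MonitoredBackend.init()
started = MonitoredBackend._thread.is_alive()
MonitoredBackend.shutdown()
print(started, MonitoredBackend._thread)
```

Using an event instead of a plain sleep lets the thread react to the stop request immediately rather than after the next polling interval.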

spawn_service(service_instance: zoe_master.backends.service_instance.ServiceInstance)

Create a container for a service.

The backend translates all the configuration parameters given in the ServiceInstance object into backend-specific container options and starts the container.

This function should either:

  • raise ZoeStartExecutionRetryException in case a temporary error is generated
  • raise ZoeStartExecutionFatalException in case a fatal error is generated
  • return a tuple with three elements: a backend-specific ID that Zoe will use later to interact with the running container, the externally reachable IP address of the container, and the port mapping
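
The three outcomes listed above can be sketched as follows. The two exception classes are local stand-ins with the same names as Zoe's, the service instance is reduced to a plain dictionary, and the shape of the port mapping is an assumption; the try_start() caller is hypothetical and only shows how each outcome could be handled.

```python
class ZoeStartExecutionRetryException(Exception):
    """Stand-in: temporary error, starting the execution can be retried."""


class ZoeStartExecutionFatalException(Exception):
    """Stand-in: fatal error, starting the execution should not be retried."""


def spawn_service(service_instance):
    # A dict replaces the real ServiceInstance object in this sketch.
    if service_instance.get("image") is None:
        raise ZoeStartExecutionFatalException("no image specified")
    if service_instance.get("platform_busy"):
        raise ZoeStartExecutionRetryException("not enough resources right now")

    backend_id = "container-" + service_instance["name"]  # backend-specific ID
    ip_address = "192.0.2.10"          # externally reachable address (example)
    ports = {"8888/tcp": 32768}        # port mapping, shape is an assumption
    return backend_id, ip_address, ports


def try_start(service_instance):
    """Hypothetical caller distinguishing the three outcomes."""
    try:
        backend_id, ip_address, ports = spawn_service(service_instance)
    except ZoeStartExecutionRetryException:
        return "retry"
    except ZoeStartExecutionFatalException:
        return "fatal"
    return "started"


print(try_start({"name": "notebook", "image": "jupyter"}))
print(try_start({"name": "notebook", "image": "jupyter", "platform_busy": True}))
print(try_start({"name": "notebook"}))
```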

terminate_service(service: zoe_lib.state.service.Service) → None

Terminate the container corresponding to a service.

update_service(service, cores=None, memory=None)

Update a service reservation.
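
One plausible reading of the default arguments is that a value left at None means "leave this reservation unchanged". The sketch below implements that reading as a free function over a plain dictionary; both the interpretation and the dictionary representation are assumptions, not documented behaviour.

```python
def update_service(service, cores=None, memory=None):
    """Update only the reservations that were explicitly passed."""
    # A dict replaces the real service object in this sketch.
    if cores is not None:
        service["cores"] = cores
    if memory is not None:
        service["memory"] = memory
    return service


service = {"name": "notebook", "cores": 1.0, "memory": 2 * 2**30}
update_service(service, cores=2.0)  # memory is left untouched
print(service)
```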