Monday, August 8, 2016

ElasticSearch Cluster with Docker Swarm 1.12



A couple of months ago, I created my own Docker clustering solution (an ELK cloud). It is a mixture of Consul + Registrator + Docker Swarm + Docker networking + a home-made scheduling system, and it has been working fine. But since Docker 1.12 holds a lot of promise, naturally I want to try it out. 

I have Ansible playbooks ready, so it is a piece of cake to create a few virtual boxes and install Docker on them. Three virtual boxes are created: on the ctrl box, Ansible and docker-engine are installed; on the elk1 and elk2 boxes, docker-engine is installed. 
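The exact playbook does not matter here, but the workflow looks roughly like this (the playbook and inventory names below are made up for illustration):

# bring up the three boxes defined in the Vagrantfile
$ vagrant up ctrl elk1 elk2

# from the ctrl box, install docker-engine on every node
$ ansible-playbook -i inventory docker-engine.yml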

The OS on the virtual boxes is: 
vagrant@ctrl:/elkcloud-swarm$ cat /etc/*-release
PRETTY_NAME="Ubuntu 15.10"
VERSION_ID="15.10"
 

The installed Docker-Engine version is:
vagrant@ctrl:/elkcloud-swarm/ansible$ docker -v
Docker version 1.12.0, build 8eab29e

Set up the swarm cluster

Compared with Docker 1.11, this has now become very easy. 

On node ctrl run:
vagrant@ctrl:/elkcloud-swarm/ansible$ docker swarm init --advertise-addr 192.158.33.10:2377
Swarm initialized: current node (4exm9dm4bvqt86e4mhm4moyoa) is now a manager.

To add a worker to this swarm, run the following command:
    docker swarm join \
    --token SWMTKN-1-31dfdzlx5kzfvc63hxil46mgzzptd3gdjmp5xize1ioktagcq6-0l661rz7lj2qbu85gi4e774qi \
    192.158.33.10:2377

On nodes elk1 and elk2, run:
    docker swarm join \
    --token SWMTKN-1-31dfdzlx5kzfvc63hxil46mgzzptd3gdjmp5xize1ioktagcq6-0l661rz7lj2qbu85gi4e774qi \
    192.158.33.10:2377

And now we have a swarm cluster. 
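As a side note, if you lose the join command, the manager can print it again at any time:

$ docker swarm join-token worker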

On node ctrl, we can list the nodes in the cluster:

$ docker node ls
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
0lhrycydsg3xagi73akd5fiti    elk1      Ready   Active       
4exm9dm4bvqt86e4mhm4moyoa *  ctrl      Ready   Active        Leader
8ve54wd54ika1cy08xz3qrdbl    elk2      Ready   Active      
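A single manager is enough for this experiment; for a more resilient setup, a worker can later be promoted to a manager:

$ docker node promote elk1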

Create a scalable service

Run the following command on node ctrl to create 3 instances of elasticsearch.

$ docker service create --replicas 3 -p 9200:9200 --name es elasticsearch
0o4l67kfwtw7etucv17ehu524

And we can check the status of the service using the following command:

$ docker service ps es
ID                         NAME  IMAGE          NODE  DESIRED STATE  CURRENT STATE            ERROR
3jftwf1vbmo48l5bd1hlzk44x  es.1  elasticsearch  elk1  Running        Preparing 2 seconds ago 
5q8mmmjpmexo82uktp1884d4l  es.2  elasticsearch  ctrl  Running        Preparing 2 seconds ago 
6tpokr3bskdf3zzjoizdfbfuo  es.3  elasticsearch  elk2  Running        Preparing 2 seconds ago

As expected, each node will run one instance of elasticsearch. 
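To see the service definition itself (image, replica count, published ports), docker service inspect gives a readable summary:

$ docker service inspect --pretty es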

We can scale up the es service:
$ docker service scale es=5
es scaled to 5
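(docker service scale is just a convenience; updating the replica count does the same thing:)

$ docker service update --replicas 5 es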

Now the es service is distributed as follows:

$ docker service ps es
ID                         NAME  IMAGE          NODE  DESIRED STATE  CURRENT STATE               ERROR
3jftwf1vbmo48l5bd1hlzk44x  es.1  elasticsearch  elk1  Running        Running about a minute ago 
5q8mmmjpmexo82uktp1884d4l  es.2  elasticsearch  ctrl  Running        Running about a minute ago 
6tpokr3bskdf3zzjoizdfbfuo  es.3  elasticsearch  elk2  Running        Running about a minute ago 
eypcp3bfsk3wxymar9tvusrsd  es.4  elasticsearch  elk1  Running        Running 39 seconds ago     
08p9f29ullgkby940rivojl0n  es.5  elasticsearch  elk2  Running        Running 39 seconds ago   

If I shut down some elasticsearch instances, replacement instances will be started automatically. After I shut down an elasticsearch instance on elk1, watch what happens:

$ docker service ps es
ID                         NAME      IMAGE          NODE  DESIRED STATE  CURRENT STATE           ERROR
e2xaruq2dyq895ynd1pqpj0gi  es.1      elasticsearch  ctrl  Running        Running 5 minutes ago  
3gntph338pxa5spe3ni9tczs6  es.2      elasticsearch  elk2  Running        Running 5 minutes ago  
8vvlk413jnfq9oibv4mz6lrml  es.3      elasticsearch  elk1  Running        Running 5 minutes ago  
b4c09yn2fui75en5q4r1tl8ko  es.4      elasticsearch  elk2  Running        Running 38 seconds ago 
7xmjoqmueze6vfl0caiiz4ox6   \_ es.4  elasticsearch  elk1  Shutdown       Failed 43 seconds ago   "task: non-zero exit (137)"
77aqxc5b12j43ibkfwjzvptkb  es.5      elasticsearch  ctrl  Running        Running 4 minutes ago  

A replacement elasticsearch instance is now created on elk2 (although for load balancing, I would hope that it would be created on elk1).
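If you want to reproduce this kind of failure yourself, you can force-remove a task container on a worker and watch the swarm converge back to the desired state. This is only a sketch; the --filter pattern and the watch utility are assumptions about your environment:

# on elk1: force-remove one of the local es task containers (simulates a crash)
$ docker ps -q --filter name=es | head -n 1 | xargs docker rm -f

# on ctrl: watch the swarm schedule a replacement task
$ watch -n 1 docker service ps es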

Cool!

Not so fast, there are troubles in paradise!

Troubles in paradise

es is not clustered

The elasticsearch instances are created, but they do not form a cluster, which is not how elasticsearch is supposed to work. 

$ curl localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

In my current system, I create a network for elasticsearch and register the elasticsearch instances with Consul. When I start a new elasticsearch instance, I get all active elasticsearch instances from Consul and supply that information in the parameter -E discovery.zen.ping.unicast.hosts.
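For reference, the gist of that approach is something like the sketch below. It assumes Consul is reachable at consul:8500, that the instances are registered under the service name elasticsearch, and that jq is installed; adjust to your own setup:

# ask Consul for the addresses of all registered elasticsearch instances
$ HOSTS=$(curl -s http://consul:8500/v1/catalog/service/elasticsearch | jq -r '[.[].Address] | join(",")')

# hand the list to the new instance so it can join the existing cluster
$ docker run -d elasticsearch -E discovery.zen.ping.unicast.hosts=$HOSTS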

I suppose this issue is not hard to solve: I could write a plugin for elasticsearch that queries Docker for active elasticsearch instances and connects them.

Data storage

Despite the common "cattle vs. pets" analogy, elasticsearch is not really cattle: it has data, lots of data. (Also, this is a cruel analogy; why do we think cattle can be sacrificed and replaced at will?) We need to preserve the data. 

Inspecting one of the elasticsearch containers created above, you will find its data is stored in:
$ docker inspect es.1.e2xaruq2dyq895ynd1pqpj0gi
"Mounts": [
            {
                "Name": "87f2fcd0fb35a47e0b7138194a640d660e2b8d933ece7a4c9c7d9e984c747a40",
                "Source": "/var/lib/docker/volumes/87f2fcd0fb35a47e0b7138194a640d660e2b8d933ece7a4c9c7d9e984c747a40/_data",
                "Destination": "/usr/share/elasticsearch/data",

If I remove the container, the data will be removed too. That is not right. 

Docker 1.12 also has improved volume support, so let us try it out. 

$ docker volume create -d local --name hello
$ docker volume inspect hello
[
    {
        "Name": "hello",
        "Driver": "local",
        "Mountpoint": "/var/lib/docker/volumes/hello/_data",
        "Labels": {},
        "Scope": "local"
    }
]

Local volumes are created under /var/lib/docker/volumes/. To create local volumes elsewhere, we can use the plugin at https://github.com/CWSpear/local-persist

Install the plugin (do this for all nodes):

$ curl -fsSL https://raw.githubusercontent.com/CWSpear/local-persist/master/scripts/install.sh | sudo bash

Create a volume (do this for all nodes):

$ docker volume create -d local-persist -o mountpoint=/elkcloud-swarm/esdata --name es

Now recreate the es service, mounting the volume:

$ docker service create --replicas 3 --name es -p 9200:9200 --mount type=bind,source=es,target=/usr/share/elasticsearch/data elasticsearch

Now you should see that the elasticsearch containers are writing to the local volume:

$ ls /elkcloud-swarm/esdata/elasticsearch/nodes/0
node.lock  _state

But there are still lingering problems. I do not know to which nodes docker swarm will distribute containers; as we saw before, it is possible that docker swarm will create two elasticsearch instances on the same node, and they will write to the same folder.
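One way around that, assuming one data directory per node is enough, would be a global service, which runs exactly one task on every node (note that I use the volume mount type here to refer to the named es volume):

# one es task per node, so each task gets its own node-local data directory
$ docker service create --mode global --name es -p 9200:9200 --mount type=volume,source=es,target=/usr/share/elasticsearch/data elasticsearch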

Overlay network doesn’t work

There are several open issues about this: https://github.com/docker/swarm/issues/2181 and https://github.com/docker/docker/issues/23855. Basically, I cannot connect to service instances running on a different node; I can only connect to instances running on the same node.

$ docker network create -d overlay mynetwork

$ docker service create  --replicas 6 --network mynetwork --name test -p 80:80 tutum/hello-world

$ docker ps -a
CONTAINER ID        IMAGE                      COMMAND                  CREATED             STATUS              PORTS               NAMES
c51a4b0709ee        tutum/hello-world:latest   "/bin/sh -c 'php-fpm "   27 seconds ago      Up 25 seconds       80/tcp              test.4.0akfkaltxubmatvyo6rj4hqn6
3fcf2bc0fa28        tutum/hello-world:latest   "/bin/sh -c 'php-fpm "   27 seconds ago      Up 25 seconds       80/tcp              test.5.689ei9omdraj57uwmxl4d5f18

Connecting to instances on the same node works fine:

$ docker exec -it test.4.0akfkaltxubmatvyo6rj4hqn6 ping test.5.689ei9omdraj57uwmxl4d5f18

PING test.5.689ei9omdraj57uwmxl4d5f18 (10.255.0.8): 56 data bytes
64 bytes from 10.255.0.8: seq=0 ttl=64 time=1.477 ms

Connecting to instances on a different node fails:

$ docker exec -it test.4.0akfkaltxubmatvyo6rj4hqn6 ping test.3.6fsrn76zstvuv1r6es2px0tks

ping: bad address 'test.3.6fsrn76zstvuv1r6es2px0tks'
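Before blaming Docker entirely, it is worth checking that the ports swarm mode needs for overlay networking (7946/tcp, 7946/udp and 4789/udp, plus 2377/tcp for management) are open between the nodes, for example with netcat (the UDP checks are only indicative):

# from elk1, probe the manager's swarm ports
$ nc -zv 192.158.33.10 2377
$ nc -zv 192.158.33.10 7946
$ nc -zvu 192.158.33.10 7946
$ nc -zvu 192.158.33.10 4789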

Docker swarm also automatically load-balances across the service instances (but only those on the same node):

$ curl localhost
<h3>My hostname is c51a4b0709ee</h3>             
$ curl localhost
<h3>My hostname is 3fcf2bc0fa28</h3>

So for now, I am going to live with my Docker 1.11 solution. It is not easy to set up, but with the help of Ansible it is automated, so that does not bother me much.
