Tuesday, June 14, 2016

An ELK Docker cloud



My job involves analyzing massive log files from different customers. To improve my team’s efficiency, I created an ELK Docker cloud. Here is its architecture:



(We like to fantasize that we are Superman, able to solve any kind of customer issue; in reality, we are more like the boy at the dike: always in the water, stemming the tide with our own flesh :-) )



Back to the topic:

  • Moose stores users’ actions in MongoDB.

  • DockerRunner is a Python application, which uses Ansible to start Logstash containers to process users’ uploaded logs. Users can have different log-analyzing algorithms for different log files, so for every user upload action, a new Logstash container is started (a sketch of this flow follows the list).

          Users can view Logstash containers’ status in Moose and can also stop and remove Logstash containers.


  • The Elasticsearch instances form a cluster.

  • Users view logs through Kibana. Kibana has a customized plugin that automatically sets up Discover, Visualize, and Dashboard (see my previous blogs).
          Here is an example of what you can see with Kibana:

  • All web applications are proxied through Nginx (there is actually only one Nginx instance proxying all web applications).

  • Service registration, discovery, and monitoring are done through Consul and Docker Registrator.

  • All operations, such as provisioning a new node and installing/starting/stopping/upgrading applications, are done through Ansible.
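
To make the per-upload container flow concrete, here is a minimal sketch of what DockerRunner does for each upload. The image name, mount paths, and the start_logstash_for_upload function are illustrative assumptions on my part; the real flow drives this through Ansible rather than calling the docker CLI directly.

import subprocess
import uuid

def start_logstash_for_upload(user, log_path, pipeline_conf):
    """Start a dedicated Logstash container for one uploaded log file."""
    # Hypothetical naming scheme: one container per upload
    name = "logstash-%s-%s" % (user, uuid.uuid4().hex[:8])
    subprocess.check_call([
        "docker", "run", "-d",
        "--name", name,
        "-v", "%s:/input/logs:ro" % log_path,                # the uploaded log file
        "-v", "%s:/etc/logstash/conf.d:ro" % pipeline_conf,  # the user's own pipeline
        "logstash",                                          # assumed image name
    ])
    return name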

Here is how these applications are physically located:


  • VMs are managed through Vagrant.
          VM node information is defined in a boxes.yml file, so it is easy to start/stop nodes:
boxes:
  os:
    ubuntu: wily64
    ubuntu_version: wily   
    has_systemd: true
    proxy: http://web-proxy.houston.hp.com:8080
 
  ctrl:
    name: ctrl
    ip: '192.168.33.10'
    memory: '2048'   

  elknodes:
    size: 3
    name: elk
    ip: '192.168.55.1'
    memory: '4096'
   
  kibananodes:
    size: 1
    name: kibana
    ip: '192.168.66.1'
    memory: '4096'   

  ports:
    ctrl:
      - { guest: 8500, host: 8500 }
      - { guest: 80, host: 7500 }
      - { guest: 27017, host: 27017 }    
      - { guest: 3000, host: 6500 }

Each node's name takes the form name#number; for example, the elknodes will be named elk1, elk2, and elk3.

  • Ansible does all the heavy lifting of provisioning nodes and installing/starting/upgrading applications.
          A Vagrant provision Python script updates an Ansible variable file so Ansible knows what nodes it is looking after (a sketch follows the variable file below).

         The Ansible variable file also defines what each node is for:
master_node: 'ctrl'
registry_node: 'ctrl'
proxy_node: 'ctrl'
mongo_node: 'ctrl'
kibana_nodes: [kibana1,]
moose_nodes: [kibana1,]
elk_nodes: [elk1,elk2,elk3,]
logstash_nodes: [elk1,elk2,elk3,]
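
Here is a minimal sketch of how such a provision script might expand boxes.yml into this variable file. The file paths and function names are my assumptions, not the actual script:

import yaml

def expand_nodes(group):
    """elknodes with name 'elk' and size 3 becomes ['elk1', 'elk2', 'elk3']."""
    return ["%s%d" % (group["name"], i) for i in range(1, group["size"] + 1)]

def write_ansible_vars(boxes_path="boxes.yml", vars_path="group_vars/all.yml"):
    boxes = yaml.safe_load(open(boxes_path))["boxes"]
    ansible_vars = {
        "elk_nodes": expand_nodes(boxes["elknodes"]),
        "logstash_nodes": expand_nodes(boxes["elknodes"]),
        "kibana_nodes": expand_nodes(boxes["kibananodes"]),
    }
    with open(vars_path, "w") as f:
        yaml.safe_dump(ansible_vars, f, default_flow_style=False)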
             
  • DockerRunner runs Logstash containers. It distributes Logstash containers evenly onto the VMs. This distribution could be done by Docker Swarm, but in my experiments I found Docker Swarm is not that stable, so I implemented the distribution logic in DockerRunner.
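
A minimal sketch of that distribution logic: pick the VM currently running the fewest Logstash containers. Where the per-node counts come from (e.g. Consul, or a docker ps per host) is an assumption here:

def pick_node(container_counts):
    """Pick the VM running the fewest Logstash containers.

    container_counts maps node name to its number of running containers,
    e.g. {'elk1': 2, 'elk2': 1, 'elk3': 2} -> 'elk2'.
    """
    return min(container_counts, key=container_counts.get)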

  • One Elasticsearch instance runs on each node, and the instances form a cluster. Clustering is done through the runtime parameter discovery.zen.ping.unicast.hosts: when Ansible starts an Elasticsearch container, it first checks Consul for the existing Elasticsearch containers and supplies that information in the discovery.zen.ping.unicast.hosts parameter.
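
A minimal sketch of that Consul lookup, assuming Registrator registered the containers under the service name 'elasticsearch' and Consul runs on the ctrl node (both assumptions on my part):

import requests

def unicast_hosts(consul="http://192.168.33.10:8500"):
    """Build the discovery.zen.ping.unicast.hosts value from Consul's catalog."""
    nodes = requests.get(consul + "/v1/catalog/service/elasticsearch").json()
    addrs = ["%s:%d" % (n["ServiceAddress"] or n["Address"], n["ServicePort"])
             for n in nodes]
    return ",".join(addrs)  # e.g. "192.168.55.1:9300,192.168.55.2:9300"

Ansible then passes the returned string to the new container as the discovery.zen.ping.unicast.hosts setting (the exact flag syntax depends on the Elasticsearch version).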

Because of the way Vagrant and Ansible are designed, starting a new VM and provisioning it is very simple:

  • Change the size of the nodes in boxes.yml; for example, change the size of elknodes to 4.
  • Start up the node:
    vagrant up elk4
  • Provision the node (install necessary software and start an Elasticsearch instance):
    ansible-playbook apps.yml --tags=common,es --limit=elk4


