A few months ago I discovered Kibana. Back then Kibana was still on 3.0 and lacked the aggregation histogram I desperately needed, so I forked it and made a lot of changes. Kibana 4 beta is now out, and it looks like it renders many of my changes unnecessary. Nevertheless, my work on Kibana has been enormously helpful in my analysis of logs. The fork is https://github.com/penguincp/kibana, branch 3.0agg.
Human brains are marvelous, but they are not good at dealing with massive, unorganized data. I use Logstash to organize log entries, send them to ElasticSearch, use Kibana to query and display them, and sometimes export them to CSV for further analysis.
I call a log entry an event, and an event can be divided into subevents. For example, if the log entry is “fail to hear from k1 for over 3 minutes”, it is a “cluster failed” event with the subevent “not heard from k1”.
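To make this concrete, here is a hypothetical example of how one such log line could end up as a document in ElasticSearch. The field names match the download configuration in app/dashboards/logstash.json shown later; the values are made up for illustration:

var doc = {
    "@timestamp": "2014-09-25T20:13:07",
    "message": "fail to hear from k1 for over 3 minutes",
    "event": "cluster failed",
    "subevent": "not heard from k1",
    "node": "k1",
    "level": "ERROR"
};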
Aggregation histogram
The change is mainly in app/panels/histogram/module.js, which originally displayed a facet histogram. In a facet histogram you cannot see events stacked together in the same column; with an aggregation histogram, events are stacked in the same column. Below is a chart showing the events “OOM” and “restart”:
By default, the aggregation histogram aggregates on the field “event”; you can change this default by clicking the “configure” button at the right corner of the chart. Since an event can be divided into subevents, you can further build an aggregation histogram on subevents:
The following chart is a subevent aggregation histogram on the event “cluster failed”:
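Under the hood, this kind of stacking maps naturally onto an ElasticSearch date_histogram aggregation with a terms sub-aggregation, which plain facets cannot express. Here is a sketch of such a request body; the aggregation names and the 12h interval are illustrative, not the fork's exact code:

var aggRequest = {
    size: 0,  // we only need buckets, not hits
    aggs: {
        events_over_time: {
            // one column per time bucket
            date_histogram: { field: "@timestamp", interval: "12h" },
            aggs: {
                // one stacked segment per distinct event value
                by_event: { terms: { field: "event" } }
            }
        }
    }
};
// For the subevent chart, filter on event:"cluster failed"
// and aggregate the inner terms on "subevent" instead.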
If you’d like, you can divide subevents further, but I do not think that is meaningful in a chart. For more detailed analysis, Excel is an old and reliable friend, which is why I also implemented an “export to CSV” function.
Export chart data and raw data to CSV
The result data is the data that makes up the chart, so you will get a table such as:
time                | count | event
--------------------|-------|--------
2014-09-15T20:00:00 | 2     | restart
2014-09-22T08:00:00 | 4     | restart
2014-09-23T20:00:00 | 4     | restart
2014-09-25T20:00:00 | 10    | OOM
2014-09-29T08:00:00 | 32    | OOM
2014-09-29T20:00:00 | 100   | OOM
This kind of export is pretty useless to me for analyzing logs, so I usually choose to download the raw data instead. With this option, you can choose which fields to export:
The fields are defined in app/dashboards/logstash.json:
"download":{
"mode":"Result",
"fields":[
{"name":"timestamp", "selected":true},
{"name":"event", "selected":true},
{"name":"subevent", "selected":true},
{"name":"message", "selected":true},
{"name":"msg", "selected":true},
{"name":"exClass", "selected":true},
{"name":"ex", "selected":true},
{"name":"exMethod", "selected":true},
{"name":"exLine", "selected":true},
{"name":"logClass", "selected":true},
{"name":"logClassLine", "selected":true},
{"name":"level", "selected":true},
{"name":"node", "selected":true},
{"name":"path", "selected":false},
{"name":"customer", "selected":false},
{"name":"@timestamp", "selected":false}
]
}
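For the raw download, a minimal sketch (not the fork's exact code) of how the selected fields could be turned into CSV lines from the ElasticSearch hits:

function toCsv(hits, fields) {
    // keep only the fields the user ticked in the download dialog
    var selected = fields.filter(function (f) { return f.selected; })
                         .map(function (f) { return f.name; });
    var lines = [selected.join(",")];  // header row
    hits.forEach(function (hit) {
        lines.push(selected.map(function (name) {
            var value = hit._source[name];
            var text = value === undefined ? "" : String(value);
            // quote values containing commas, quotes or newlines
            return /[",\n]/.test(text)
                ? '"' + text.replace(/"/g, '""') + '"'
                : text;
        }).join(","));
    });
    return lines.join("\n");
}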
TODO
Right now, the raw data is first retrieved from ElasticSearch in full and then saved to a CSV file; when there is a lot of data, the export fails. Retrieving and saving the data should instead be done in a streaming fashion.
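A sketch of how the streaming could work using ElasticSearch's scroll API, here as a Node.js script with the official elasticsearch client; the index pattern, page size, scroll timeout, and file name are illustrative assumptions:

var elasticsearch = require('elasticsearch');
var fs = require('fs');

var client = new elasticsearch.Client({ host: 'localhost:9200' });
var out = fs.createWriteStream('export.csv');

function writePage(resp) {
    resp.hits.hits.forEach(function (hit) {
        // real code would reuse the CSV formatting above instead of JSON
        out.write(JSON.stringify(hit._source) + '\n');
    });
    if (resp.hits.hits.length > 0) {
        // keep pulling pages while the scroll cursor still returns hits
        client.scroll({ scrollId: resp._scroll_id, scroll: '30s' }).then(writePage);
    } else {
        out.end();
    }
}

client.search({
    index: 'logstash-*',
    scroll: '30s',  // keep the cursor alive between pages
    size: 1000      // hits per page, not in total
}).then(writePage);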
Hierarchical indices
I need to analyze logs from different customers, and each customer deploys the application in a cluster, so the logs have two levels: customer and cluster node. Kibana used to support only indices of the form logstash-YYYY.MM.DD; I changed it to also support indices of the form logstash-customer-node-YYYY.MM.DD, expanded over the selected time range as sketched below.
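A hypothetical sketch of how the two-level pattern could be expanded into concrete daily index names (the function and argument names are mine, not the fork's):

function indicesFor(customer, node, start, end) {
    var names = [];
    var day = new Date(start.getTime());
    while (day <= end) {
        var mm = ('0' + (day.getUTCMonth() + 1)).slice(-2);
        var dd = ('0' + day.getUTCDate()).slice(-2);
        names.push('logstash-' + customer + '-' + node + '-' +
                   day.getUTCFullYear() + '.' + mm + '.' + dd);
        day.setUTCDate(day.getUTCDate() + 1);  // advance one day
    }
    return names;
}
// indicesFor('acme', 'k1', new Date('2014-09-15'), new Date('2014-09-17'))
// → ['logstash-acme-k1-2014.09.15', 'logstash-acme-k1-2014.09.16', 'logstash-acme-k1-2014.09.17']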
Indices can be selected through the custom menu (not a very good place for this feature, but I do not want to change Kibana too much):
The charts of the selected nodes share the same time axis range, so you can easily compare what happened at the same time on each node.
TODO
- When selecting a node, you do not have to move the mouse into the checkbox: clicking anywhere on the same row toggles it. This is confusing, because an innocent click on blank space produces charts that were not desired. (This is done.)
- Timestamps are based on the UTC timezone, which is very inconvenient when the log files were written in a different timezone, as is typically true of the log files I have to analyze. Log entries should be written to ElasticSearch and shown in Kibana using the timezone they were written with. (This is done; see What time is it? -- A cheap solution to make Kibana show histogram in logged timezone.)
You can select multiple nodes and choose how many rows to display them in. You can also change your node selection in the panel's configure dialog: