This post shows how to set up a monitoring system for Java applications using InfluxDB, Grafana,
Telegraf and StatsD-JVM-Profiler. The system collects memory usage, GC
activity and Tomcat sessions from a Java web application. Take a look at a
sample data visualization from Grafana:
This graph says a lot about the Java application. In my
context, it reads as follows: the application went through a period of intensive activity,
during which memory usage was very high and many GCs were triggered.
Worse, after the intensive activity ended, memory
usage kept building up toward the point of OOM until GC kicked
in, which suggests unhealthy memory usage in the application. This graph,
coupled with the graph of Tomcat session activity, can tell you a lot about
your Java application; how to read it depends on your context and your
understanding of the application.
You can find another analysis example in Web request performance analysis charts.
Architecture
The architecture of this system is as follows:
The Java application is
instrumented with StatsD-JVM-Profiler (https://github.com/etsy/statsd-jvm-profiler).
This profiler periodically fetches memory usage and GC activity from the
running JVM and sends them to Telegraf.
Telegraf (https://www.influxdata.com/time-series-platform/telegraf/)
is a data collector with many plugins; one of its input plugins is
StatsD. In this monitoring system, we use the StatsD plugin to accept data from
StatsD-JVM-Profiler and send it to InfluxDB. Another good thing about Telegraf
is that by default it collects the CPU and memory usage of the machine where it is
installed and sends them to InfluxDB, so machine monitoring comes for free.
StatsD-JVM-Profiler can
send data directly to InfluxDB, but I prefer to route it through Telegraf: it is
more robust and flexible, and Telegraf monitors the machine for free.
InfluxDB (https://www.influxdata.com/time-series-platform/influxdb/)
is a time-series DB, perfect for storing monitoring data.
Grafana (http://grafana.org/) is a UI application,
perfect for visualizing time-series data.
Note that StatsD itself is not installed in this architecture.
StatsD is involved only in that Telegraf accepts
data points written in the StatsD format (which is very easy to understand and
use).
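To make the StatsD format concrete, here is a minimal sketch of a shell helper that builds one StatsD datagram. The helper name statsd_line and the metric names are made up for illustration; only the name:value|type layout comes from the StatsD format itself.

```shell
# Build one StatsD datagram of the form <name>:<value>|<type>.
# Common types: c = counter, g = gauge, ms = timing, s = set.
# statsd_line is a hypothetical helper, not part of any tool used here.
statsd_line() {
  printf '%s:%s|%s\n' "$1" "$2" "$3"
}

# Hypothetical metric names, for illustration only.
statsd_line heap.used 52428800 g    # prints heap.used:52428800|g
statsd_line user.logins 1 c         # prints user.logins:1|c
```

Each line is one UDP payload; anything that can send such a line to a StatsD listener can act as a metrics client.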
Installation
I install InfluxDB and Grafana on Ubuntu:
vagrant@exp:~$ cat /etc/*-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=15.10
DISTRIB_CODENAME=wily
Install InfluxDB
sudo apt-get update
sudo apt-get upgrade
curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/lsb-release
echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
sudo apt-get update && sudo apt-get install influxdb
sudo service influxdb start
By default, InfluxDB listens on port 8086.
Now that InfluxDB is installed, you can create a database:
$ influx
Connected to http://localhost:8086 version 1.1.0
InfluxDB shell version: 1.1.0
> create database test
> use test
To verify that InfluxDB is working, write a test point:
curl -i -X POST "http://localhost:8086/write?db=test" --data-binary 'user.logins,service=payroll,region=us-west value=1 1478177907371000064'
Go back to the InfluxDB shell:
> show measurements
name: measurements
name
----
user.logins
> select * from "user.logins"
name: user.logins
time                region  service value
----                ------  ------- -----
1478177907371000064 us-west payroll 1
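The write request above uses InfluxDB's line protocol: the measurement name, comma-separated tags, a space, the fields, a space, and a timestamp in nanoseconds. As a rough sketch, a small helper can assemble such a line. The name influx_line is hypothetical, the helper assumes a single field called value, and the default timestamp relies on GNU date's %N.

```shell
# Build one line-protocol point:
#   <measurement>,<tags> value=<value> <timestamp_ns>
# influx_line is a hypothetical helper; it assumes a single field named "value".
influx_line() {
  # $4 is optional; default to the current time in nanoseconds (GNU date only).
  ts="${4:-$(date +%s%N)}"
  printf '%s,%s value=%s %s\n' "$1" "$2" "$3" "$ts"
}

influx_line user.logins service=payroll,region=us-west 1 1478177907371000064
# prints: user.logins,service=payroll,region=us-west value=1 1478177907371000064

# To write a point, pipe the line to the same endpoint used above:
# influx_line user.logins service=payroll,region=us-west 1 |
#   curl -i -X POST "http://localhost:8086/write?db=test" --data-binary @-
```

Passing the timestamp explicitly keeps the output reproducible; leaving it out stamps the point with the current time.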
Install Grafana
$ wget https://grafanarel.s3.amazonaws.com/builds/grafana_3.1.1-1470047149_amd64.deb
$ sudo apt-get install -y adduser libfontconfig
$ sudo dpkg -i grafana_3.1.1-1470047149_amd64.deb
$ sudo /bin/systemctl daemon-reload
$ sudo /bin/systemctl enable grafana-server
$ sudo systemctl start grafana-server
The default port for Grafana is 3000.
Install Telegraf
To install on Ubuntu:
cd /tmp
wget https://dl.influxdata.com/telegraf/releases/telegraf_1.0.1_amd64.deb
sudo dpkg -i telegraf_1.0.1_amd64.deb
You will likely
install Telegraf on all sorts of operating systems, so for the sake
of completeness, here are the commands to install it on Red Hat and Windows.
To install on Red Hat:
cd /tmp
wget https://dl.influxdata.com/telegraf/releases/telegraf-1.1.1.x86_64.rpm
sudo yum localinstall telegraf-1.1.1.x86_64.rpm
To install on Windows, follow https://github.com/influxdata/telegraf/blob/master/docs/WINDOWS_SERVICE.md:
- Download and extract the windows distribution https://dl.influxdata.com/telegraf/releases/telegraf-1.1.2_windows_amd64.zi
- Create the directory C:\Program Files\Telegraf (if you install in a different location simply specify the -config parameter with the desired location)
- Place the telegraf.exe and the telegraf.conf config file into C:\Program Files\Telegraf
- To install the service into the Windows Service Manager, run the following in PowerShell as an administrator (if necessary, wrap any spaces in the file paths in double quotes ""):
C:\"Program Files"\Telegraf\telegraf.exe --service install
If you install it in a different location, you need to specify the config file location with:
telegraf.exe --service install -config full_path_to_config_file
On Linux, the Telegraf configuration file is /etc/telegraf/telegraf.conf.
Make the following changes to it:
[[outputs.influxdb]]
  ## The full HTTP or UDP endpoint URL for your InfluxDB instance.
  ## Multiple urls can be specified as part of the same cluster,
  ## this means that only ONE of the urls will be written to each interval.
  # urls = ["udp://localhost:8089"] # UDP endpoint example
  urls = ["http://localhost:8086"] # required
  ## The target database for metrics (telegraf will create it if not exists).
  database = "test" # required
…
[[inputs.statsd]]
  ## Address and port to host UDP listener on
  service_address = ":8125"
  ## Delete gauges every interval (default=false)
  delete_gauges = true
  ## Delete counters every interval (default=false)
  delete_counters = true
  ## Delete sets every interval (default=false)
  delete_sets = true
  ## Delete timings & histograms every interval (default=true)
  delete_timings = true
  ## Percentiles to calculate for timing & histogram stats
  percentiles = [90]
  ## separator to use between elements of a statsd metric
  metric_separator = "_"
  ## Parses tags in the datadog statsd format
  ## http://docs.datadoghq.com/guides/dogstatsd/
  parse_data_dog_tags = false
  ## Statsd data translation templates, more info can be read here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md#graphite
  ##templates = [
  ##    "cpu.* measurement*"
  ##]
  ## Number of UDP messages allowed to queue up, once filled,
  ## the statsd server will start dropping packets
  allowed_pending_messages = 10000
  ## Number of timing/histogram values to track per-measurement in the
  ## calculation of percentiles. Raising this limit increases the accuracy
  ## of percentiles but also increases the memory usage and cpu time.
  percentile_limit = 1000
After changing the configuration file, you need to restart Telegraf:
sudo systemctl restart telegraf
To test whether Telegraf can successfully send data to InfluxDB:
echo "user.logins,service=payroll,region=us-west:1|c" | nc -C -w 1 -u localhost 8125
Now go back to the InfluxDB command line. You should
see that a new measurement “user_logins” has been created. Note that
Telegraf automatically converts the dot to an underscore (the metric_separator setting). This is nice, because you do not have to
quote measurement names when selecting from them.
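The renaming can be mimicked in the shell to predict which measurement name a given StatsD metric will end up under. This is only an approximation of the observed behaviour with metric_separator = "_", not Telegraf's actual implementation. The helper name to_measurement is hypothetical, and it handles single-character separators only.

```shell
# Mimic the renaming seen above: replace every "." in a StatsD metric
# name with the separator (default "_"). Single-character separators only.
# to_measurement is a hypothetical helper, not a Telegraf command.
to_measurement() {
  printf '%s\n' "$1" | tr '.' "${2:-_}"
}

to_measurement user.logins      # prints user_logins
to_measurement jvm.heap.used    # prints jvm_heap_used (hypothetical metric name)
```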
Instrument a Java application
Just follow the instructions on https://github.com/etsy/statsd-jvm-profiler.
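As a rough sketch of what those instructions amount to: the profiler is attached with the JVM's -javaagent argument, and, as far as I recall from the project README, it needs at least the StatsD server and port. The helper below only builds that option string. The jar path is hypothetical, and the option names should be checked against the README for the version you use.

```shell
# Build the -javaagent option for statsd-jvm-profiler.
# $1 = path to the profiler jar (the path used below is hypothetical),
# $2 = StatsD host, $3 = StatsD port (Telegraf's listener in this setup).
# The server/port option names are taken from the project README as I
# recall it; verify them before relying on this.
javaagent_opt() {
  printf '%s:%s=server=%s,port=%s\n' -javaagent "$1" "$2" "$3"
}

javaagent_opt /opt/statsd-jvm-profiler.jar localhost 8125
# prints: -javaagent:/opt/statsd-jvm-profiler.jar=server=localhost,port=8125

# For Tomcat, one common place for the option is CATALINA_OPTS, e.g. in bin/setenv.sh:
# CATALINA_OPTS="$CATALINA_OPTS $(javaagent_opt /opt/statsd-jvm-profiler.jar localhost 8125)"
```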
In my case, I wanted to present related data series in one graph and use Grafana’s templating feature to select data series within that graph. For example, all data series of heap usage are shown in one graph (see the previous graph). To do this, I modified StatsD-JVM-Profiler to send data series using InfluxDB tags, and I also added a
profiler to collect active and expired sessions in Tomcat. More on this in A Fork from statsd-jvm-profiler.