How to connect to influxdb

vagrant@exp:~$ influx

Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.

Connected to http://localhost:8086 version 1.1.0

InfluxDB shell version: 1.1.0

How to check DB objects?

> show databases

name

----

_internal

pcoe

test

> use test

Using database influxvm

> show measurements

name

----

cpu

mem

user.logins

How to check tags?

A Tag in InfluxDB is like an indexed field in a relational table. You typically make a field as a tag because you need to use it as a condition in the where clause to improve query performance.

If the name of a measurement contains dot, you need to quote the name. There seems to be a lot of issues with dot in measurement or field names, so avoid having dot in the names.

> show tag keys from "user.logins"

tagKey

------

region

service

> show tag values from "user.logins" with key=region

key value

--- -----

region us-west

How to use tag in the where clause?

Influxdb support regex matching.

> select * from user_logins where region =~ /us.*/

time region service value

---- ------ ------- -----

1478177907371000064 us-west payroll 1

Regex matching can also be used in other places:

> SHOW MEASUREMENTS WITH MEASUREMENT =~ /.*login.*/

name

----

user_logins

How to define a retention policy?

The default retention policy is “autogent”, which is to keep data forever. The syntax for creating a retention policy is:

CREATE RETENTION POLICY <retention_policy_name> ON <database_name> DURATION <duration> REPLICATION <n> [SHARD DURATION <duration>] [DEFAULT]

To keep data for 5 days and purge it automatically after 5 days, use:

CREATE RETENTION POLICY fivedays ON test DURATION 5d REPLICATION 1

To insert data using the default retention policy:

> insert user_logins,service=payroll,region=use-west value=1

To insert data using a specific retention policy:

>insert into fivedays user_logins,service=payroll,region=use-east value=5

To select the data point just inserted, you need to prefix the measurement name with the retention policy name:

> select * from user_logins

time region service value

---- ------ ------- -----

1484127510987890238 use-west payroll 1

> select * from fivedays.user_logins

time region service value

---- ------ ------- -----

1484127751197523399 use-east payroll 5

To make a retention policy default, create it with “default” clause:

> drop retention policy fivedays on test

(data rentented with this policy will be dropped too)

> CREATE RETENTION POLICY fivedays ON test DURATION 5d REPLICATION 1 default

Or use alter command:

alter retention policy fivedays on test default

show shards command will show retention policies on each database:

id database retention_policy shard_group start_time end_time expiry_time owners

-- -------- ---------------- ----------- ---------- -------- ----------- ------

125 test autogen 125 2016-09-19T00:00:00Z 2016-09-26T00:00:00Z 2016-09-26T00:00:00Z

695 test fivedays 695 2017-01-11T00:00:00Z 2017-01-12T00:00:00Z 2017-01-17T00:00:00Z

How to insert a data point with a specific timestamp?

In the previous example, “time” field contains the value when the data point was inserted, how to insert a data point with a specific timestamp?

> insert user_logins,service=payroll,region=use-west value=1 1478177907371000064

> select * from user_logins

time region service value

---- ------ ------- -----

1478177907371000064 use-west payroll 1

Note, there is a whitespace between the last field and the time field, and the time field value should a Unix nanosecond.

> insert user_logins,service=payroll,region=use-west value=1 2016-11-03T12:58:27.371000064Z

ERR: {"error":"unable to parse 'user_logins,service=payroll,region=use-west value=1 2016-11-03T12:58:27.371000064Z': bad timestamp"}

Any timestamp field should be in the format of Unix nanosecond:

> insert user_logins,service=payroll,region=use-west value=1,finishtime=1478177907371000064 1478177907371000064

> select * from user_logins

time finishtime region service value

---- ---------- ------ ------- -----

2016-11-03T12:58:27.371000064Z 1.478177907371e+18 use-west payroll 1

Other timestamp fields seem to stored as integers, it is not difficult to convert them to timestamps after retrieving the data.

What is one-line format?

The example in the InfluxDB documentation makes it clear:

A one-line contains 3-parts, measurement, tags and fields. Tags are separated from the measurement with comma, fields are separated from tags with a whitespace. Tags in the tag set are separated with comma, and fields in the field set are separated with common.

insert user_logins,service=payroll,region=use-west times=9,errors=5,value=9

In this example, user_logins is the metric (a.k.a the table name), tags are service and region, Fields are times, errors and value.

Telegraf StatsD plugin supports multiple tags in a similar one-line format:

echo "user_logins,service=payroll,region=us-south:2|c" | nc -C -w 1 -u localhost 8125

However, it doesn’t support multiple fields, as stated here: https://github.com/influxdata/telegraf/issues/1876. Unfortunately, for unknown reasons, the feature was once included according to https://github.com/influxdata/telegraf/issues/39, then removed.

How to drop all measurements?

drop measurement #NAME# will drop one measurement, but it is not possible to drop all measurements at one time, but this command can achieve the effect:

drop series from /.*/

What is a series?

A series is a collection of data that have the same measurement, tags and retention policy.

> show series from user_logins

key

---

user_logins,host=exp,metric_type=counter,region=us-south,service=payroll

user_logins,region=use-east,service=payroll

user_logins,region=use-west,service=payroll

(the output doesn’t show the retention policy active on the series)

How to output timestamp in a human-readable way?

> precision rfc3339

> select * from user_logins

time region service value

---- ------ ------- -----

2017-01-11T09:38:30.987890238Z use-west payroll 1

To reverse back:

> precision ns

What is a continuous query?

A continuous query creates a measurement, which InfluxDB will insert data into at the group by frequency.

>CREATE CONTINUOUS QUERY "hourly_schedule" on "test"

begin

select count("value") into "hourly_schedule"

from "schedules"

group by time(1h)

end

The above statement creates a measurement "hourly_schedule", as data inserts into "schedules", InfluxDB will down-sample the data and insert it into "hourly_schedule".

This only works for new data inserted into "schedules". For existing data, you just write an ordinary select … into … statement:

select count("value") into "hourly_schedule" from "schedules" where time>0 group by time(1h)

where time>0 is useless here, but InfluxDB throws an error saying “ERR: error parsing query: aggregate functions with GROUP BY time require a WHERE time clause” if you don’t provide a where time clause.

Unlike a SQL in relation DB, InfluxDB “group by” doesn’t support having clause, but with the above created down-sampling measurements, you can achieve the same effect:

select * from "hourly_schedule" where count>5 limit 10

How to check the size of a DB?

> use _internal

Using database _internal

> select count(diskBytes) from "tsm1_filestore" where "database"='pcoe'

time count

---- -----

0 392666

PerfSpy

Wednesday, January 11, 2017

InfluxDB cookbook

How to connect to influxdb

How to check DB objects?

How to check tags?

How to use tag in the where clause?

How to define a retention policy?

How to insert a data point with a specific timestamp?

What is one-line format?

How to drop all measurements?

What is a series?

How to output timestamp in a human-readable way?

What is a continuous query?

How to check the size of a DB?

No comments:

Post a Comment