How to connect to influxdb
vagrant@exp:~$ influx
Visit https://enterprise.influxdata.com to
register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version
1.1.0
InfluxDB shell version: 1.1.0
How to check DB objects?
> show databases
name: databases
name
----
_internal
pcoe
test
> use test
Using database influxvm
> show
measurements
name
----
cpu
mem
user.logins
How to check tags?
A Tag in
InfluxDB is like an indexed field in a relational table. You typically make a
field as a tag because you need to use it as a condition in the where clause to improve query
performance.
If the name
of a measurement contains dot, you need to quote the name. There seems to be a
lot of issues with dot in measurement or field names, so avoid having dot in the
names.
> show tag keys
from "user.logins"
name: user.logins
tagKey
------
region
service
> show tag values from
"user.logins" with key=region
name: user.logins
key value
--- -----
region us-west
How to use tag in the where clause?
Influxdb support
regex matching.
> select * from user_logins
where region =~ /us.*/
name: user.logins
time
region service value
----
------ ------- -----
1478177907371000064
us-west payroll 1
Regex matching
can also be used in other places:
> SHOW MEASUREMENTS WITH MEASUREMENT
=~ /.*login.*/
name: measurements
name
----
user_logins
How to define a retention policy?
The
default retention policy is “autogent”, which is to keep
data forever. The syntax for creating a retention policy is:
CREATE
RETENTION POLICY <retention_policy_name> ON <database_name>
DURATION <duration> REPLICATION <n> [SHARD DURATION
<duration>] [DEFAULT]
To keep data for 5 days and purge it automatically after 5
days, use:
CREATE
RETENTION POLICY fivedays ON test DURATION 5d REPLICATION 1
To insert data using the default retention policy:
> insert
user_logins,service=payroll,region=use-west value=1
To insert data using a specific retention policy:
>insert into fivedays
user_logins,service=payroll,region=use-east value=5
To select the data point just inserted, you need to prefix
the measurement name with the retention policy name:
> select * from user_logins
name:
user_logins
time region service value
---- ------ ------- -----
1484127510987890238 use-west payroll 1
> select * from fivedays.user_logins
name:
user_logins
time region service value
---- ------ ------- -----
1484127751197523399 use-east payroll 5
To make a
retention policy default, create it with “default” clause:
> drop retention policy fivedays on test
(data
rentented with this policy will be dropped too)
> CREATE RETENTION POLICY fivedays ON test DURATION
5d REPLICATION 1 default
Or use alter command:
alter retention policy fivedays on test default
show shards command will show retention policies
on each database:
name: test
id database retention_policy shard_group start_time end_time expiry_time owners
-- -------- ---------------- ----------- ---------- -------- ----------- ------
125 test autogen 125 2016-09-19T00:00:00Z 2016-09-26T00:00:00Z 2016-09-26T00:00:00Z
695 test fivedays 695 2017-01-11T00:00:00Z 2017-01-12T00:00:00Z 2017-01-17T00:00:00Z
How to insert a data point with a specific timestamp?
In the previous example, “time” field contains the value
when the data point was inserted, how to insert a data point with a specific
timestamp?
> insert user_logins,service=payroll,region=use-west value=1
1478177907371000064
> select * from user_logins
name: user_logins
time
region service value
----
------ ------- -----
1478177907371000064
use-west payroll 1
Note, there is a whitespace between the last field and the
time field, and the time field value should a Unix nanosecond.
> insert user_logins,service=payroll,region=use-west value=1
2016-11-03T12:58:27.371000064Z
ERR:
{"error":"unable to parse
'user_logins,service=payroll,region=use-west value=1
2016-11-03T12:58:27.371000064Z': bad timestamp"}
Any timestamp field should be in the format of Unix
nanosecond:
> insert
user_logins,service=payroll,region=use-west
value=1,finishtime=1478177907371000064 1478177907371000064
> select * from user_logins
name: user_logins
time finishtime region service value
---- ---------- ------ ------- -----
2016-11-03T12:58:27.371000064Z 1.478177907371e+18 use-west payroll 1
Other timestamp fields seem to stored as integers, it is not
difficult to convert them to timestamps after retrieving the data.
What is one-line format?
The example in the InfluxDB documentation makes it clear:
A one-line contains 3-parts, measurement, tags and fields. Tags
are separated from the measurement with comma, fields are separated from tags
with a whitespace. Tags in the tag set are separated with comma, and fields in
the field set are separated with common.
insert
user_logins,service=payroll,region=use-west times=9,errors=5,value=9
In
this example, user_logins is the metric (a.k.a the table name),
tags are service and
region, Fields are times, errors and
value.
Telegraf
StatsD plugin supports
multiple tags in a similar one-line format:
echo
"user_logins,service=payroll,region=us-south:2|c" | nc -C -w 1 -u
localhost 8125
However,
it doesn’t support multiple fields, as stated here: https://github.com/influxdata/telegraf/issues/1876.
Unfortunately, for unknown reasons, the feature was once included according to https://github.com/influxdata/telegraf/issues/39,
then removed.
How to drop all measurements?
drop measurement
#NAME# will drop one measurement, but it is not possible to drop all
measurements at one time, but this command can achieve the effect:
drop series from /.*/
What is a series?
A series is a collection of data that have the same
measurement, tags and retention policy.
> show series from user_logins
key
---
user_logins,host=exp,metric_type=counter,region=us-south,service=payroll
user_logins,region=use-east,service=payroll
user_logins,region=use-west,service=payroll
(the output doesn’t show the retention policy active on the
series)
How to output timestamp in a human-readable way?
> precision rfc3339
> select * from user_logins
name: user_logins
time region service value
---- ------ ------- -----
2017-01-11T09:38:30.987890238Z use-west payroll 1
To reverse back:
>
precision ns
What is a continuous query?
A continuous query creates a measurement, which InfluxDB
will insert data into at the group by
frequency.
>CREATE CONTINUOUS QUERY "hourly_schedule" on
"test"
begin
select count("value") into "hourly_schedule"
from "schedules"
group by time(1h)
end
The above statement creates a measurement "hourly_schedule", as data inserts into "schedules", InfluxDB will down-sample the
data and insert it into "hourly_schedule".
This only works for new data inserted into "schedules". For existing data, you just
write an ordinary select … into …
statement:
select count("value") into "hourly_schedule"
from "schedules" where time>0 group by time(1h)
where time>0
is useless here, but InfluxDB throws an error saying “ERR:
error parsing query: aggregate functions with GROUP BY time require a WHERE
time clause” if you don’t
provide a where time clause.
Unlike a SQL in relation DB, InfluxDB “group by” doesn’t support having
clause, but with the above created down-sampling measurements, you can achieve
the same effect:
select * from "hourly_schedule" where count>5 limit 10
How to check the size of a DB?
> use
_internal
Using database
_internal
>
select count(diskBytes) from "tsm1_filestore" where
"database"='pcoe'
name: tsm1_filestore
time count
---- -----
0 392666