All data is fetched from the system by issuing queries from the query webservice. Since queries are very flexible by definition, there is only the one service to retrieve data from instead of several services in previous versions of SNAPP.
tsds/services/query.cgi?method=query
Query Structure
A query is composed of 6 principal parts:
1. The fields to fetch, including any manipulation functions on them
get <meta_field1>,
<meta_field2>,
<value_field1>,
<value_field2>...
2. A time clause
between(“<start_date>", “<end_date>")
3. A grouping by clause
by <meta_field1>,
<meta_field2>...
4. Where to fetch the data from
from <measurement_type OR subquery>
5. A set of filters
where ( (<meta_field1> <operator> “foo” <and|or> <meta_field2> <operator> “bar”) )
6. Any limit or order clauses
limit <number_to_limit_by> offset <number_to_offset_by>
ordered by <field_1> <asc|desc>, <field_2> <asc|desc>
A basic query would look like this:
get intf,
node,
aggregate(values.input, 300, average),
aggregate(values.output, 300, average)
between("04/26/2015 13:00:00 UTC", "04/27/2015 13:00:00 UTC")
by intf,
node
from interface
where ( intf = "ae0" and node = "rtr.foo.net")
The above query gets the interface name, node name, and the input and output bits per second for interface "ae0" on node "rtr.foo.net", arranged as 5 minute averages for a day. The "by" clause here is ensuring that all of the results are arranged into a single series - when performing the most basic queries this may seem a little redundant since the result set only matches a single thing, but the power can be seen by modifying the query slightly.
get intf,
node,
aggregate(values.input, 300, average),
aggregate(values.output, 300, average)
between("04/26/2015 13:00:00 UTC", "04/27/2015 13:00:00 UTC")
by intf,
node
from interface
where ( node = "rtr.foo.net")
By removing the "intf" match in the where clause, the query above now fetches every interface on the node "rtr.foo.net" and break them out into their corresponding series of unique node+interface pairings. By modifying the query slightly again we can further change that behavior.
get intf,
node,
aggregate(values.input, 300, average),
aggregate(values.output, 300, average)
between("04/26/2015 13:00:00 UTC", "04/27/2015 13:00:00 UTC")
by node
from interface
where ( node = "rtr.foo.net")
By removing the "intf" field in our by clause, after the results are fetched they are merged into a single series based on the node name instead of individual series based on node+interface. As a result, the above query is getting the 5 minute buckets averaged across ALL interfaces on the node into a single result set.
Queries may also have subqueries where instead of selecting data from a measurement type it is selected from an inner query. This nesting can be arbitrarily deep. Outer queries do not specify a between time or a grouping clause but are otherwise identical to any other query. Using one of the earlier queries as an example, the added outer query here will calculate the 95th percentile of input and output after it has been aggregated into 5 minute average buckets.
get intf,
node,
percentile(avg_input, 95),
percentile(avg_output, 95)
from (
get intf,
node,
aggregate(values.input, 300, average) as avg_input,
aggregate(values.output, 300, average) as avg_output
between("04/26/2015 13:00:00 UTC", "04/27/2015 13:00:00 UTC")
by node
from interface
where ( node = "rtr.foo.net")
)
Data Manipulation Functions
Below is a listing of the manipulation functions available for value data:
- average(<value_field>) - returns the average of the data points for the specified value
- aggregate(<value_field>, <seconds_to_group_by>, <aggregator_function> - applies the “aggregator_function”, which is most of the other functions noted here, to each series of data points grouped by the “seconds_to_group_by" parameter
- percentile(<value_field>|<aggregate>, <nth_percentile>) - calculates the nth_percentile of the value field or aggregate clause passed in
- count(<value_field>|<aggregate>) - returns the number of data points returned in the series for a value field or an aggregate
- min(<value_field>|<aggregate>) - returns the minimum value in the value_field’s or aggregator function’s series of data points
- max(<value_field>|<aggregate>) - returns the maximum value in the value_field’s or aggregator function’s series of data points
- sum(<value_field>|<aggregate>) - returns the sum of the value_field’s or aggregator function’s series of data points
- histogram(<value_field>, <bin size>) - returns an object representing a histogram of the data fit to the specified bin size
- extrapolate(<field>|<aggregate>, <date>|<number>) - estimates what the value’s or aggregate clause’s value will be at the provided date in the future, or when the value will reach the provided number
Time Specification
Time for a between clause may be specified in the following formats. In the case where abbreviated formats are used it is assumed to be local server timezone for TIMEZONE and 00:00:00 for HMS.
- MM/DD/YYYY HH:MM:SS TIMEZONE
- MM/DD/YYYY HH:MM:SS
- MM/DD/YYYY TIMEZONE
- MM/DD/YYYY
Filter Operators
The possible operators for filters in the where clause are below
- = - returns measurements whose meta field or value equals the specified value exactly
- != - returns measurements whose meta field or value does not equal the specified value exactly
- > - returns measurements whose value is greater than the specified value ( can be a date )
- >= - returns measurements whose value is greater than or equal to the specified value (can be a date)
- < - returns measurements whose value is less than the specified value (can be a date)
- <= - returns measurements whose value is less than or equal to the specified value ( can be a date )
- in - returns measurements whose value or meta field is contained in parenthesis surrounded, list of double quoted values specified, i.e. in (“foo”, “bar”, “biz”)
- between - returns measurements whose values are between the two supplied values as a numeric range, comma separated list of values provided, i.e. (1,10)
- not like - returns measurements whose meta fields do not match the double quoted word or regular expression provided
- like - returns measurements whose meta fields match the double quoted word or regular expression provided