Multi-level terms aggregation in Elasticsearch
If you're looking to generate a cross-tabulation (a frequency table) of terms in Elasticsearch, you can nest terms aggregations inside one another.
Here's an example of a three-level aggregation that will produce a "table" of hostname x login error code x username. This is a query I used to generate a daily report of OpenLDAP login failures.
curl -XGET 'http://localhost:9200/logstash-*/_search?pretty=true' -d '
{
  "aggs": {
    "hostname_by_login_result": {
      "terms": {
        "field": "hostname.raw"
      },
      "aggs": {
        "result_by_user": {
          "terms": {
            "field": "login_code",
            "size": 0,
            "order": { "_term": "desc" }
          },
          "aggs": {
            "username": {
              "terms": {
                "field": "username.raw",
                "size": 0
              }
            }
          }
        }
      }
    }
  }
}
'
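The response nests its buckets the same way the query nests its aggs, so turning it into table rows takes a small recursive walk. Here's a minimal Python sketch of that step. `flatten_buckets` is a hypothetical helper of mine, and the `response` fragment is made-up sample data shaped like a terms aggregation result (`buckets`, `key`, `doc_count`), not output from the real report.

```python
def flatten_buckets(agg, names):
    """Walk nested terms aggregations and yield one row per innermost bucket.

    agg   -- the dict holding the aggregation results at the current level
    names -- aggregation names, ordered from outermost to innermost
    """
    name, rest = names[0], names[1:]
    for bucket in agg[name]["buckets"]:
        if rest:
            # Prefix every row from the deeper levels with this bucket's key.
            for row in flatten_buckets(bucket, rest):
                yield (bucket["key"],) + row
        else:
            # Innermost level: emit the key and its document count.
            yield (bucket["key"], bucket["doc_count"])


# Made-up response fragment for illustration only; not real log data.
response = {
    "aggregations": {
        "hostname_by_login_result": {
            "buckets": [
                {
                    "key": "ldap01",
                    "doc_count": 3,
                    "result_by_user": {
                        "buckets": [
                            {
                                "key": 49,
                                "doc_count": 3,
                                "username": {
                                    "buckets": [
                                        {"key": "alice", "doc_count": 2},
                                        {"key": "bob", "doc_count": 1},
                                    ]
                                },
                            }
                        ]
                    },
                }
            ]
        }
    }
}

names = ["hostname_by_login_result", "result_by_user", "username"]
rows = list(flatten_buckets(response["aggregations"], names))
for hostname, login_code, username, count in rows:
    print(hostname, login_code, username, count)
```

Each row is (hostname, login error code, username, count), which is the "table" described above. The count comes from the innermost bucket only, so the doc_count values of the outer levels are ignored.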
By querying the .raw version of a field, you get the "not analyzed" version, which means the values are not split into tokens on delimiters, so each bucket key is the whole original string.
I also want the output sorted by descending login error code, hence the order option:
...
  "terms": {
    "field": "login_code",
    "size": 0,
    "order": { "_term": "desc" }
  },
...
By default, output is sorted by the number of documents in each bucket, or _count, in descending order. A few other built-in sort keys are available, depending on the type of aggregation you're running.
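Since each level of the query differs only in its name, field, size and order, the request body can be generated instead of written by hand. A minimal Python sketch, assuming you just want the JSON body: `nested_terms` is a hypothetical helper, not part of any Elasticsearch client, and it only builds the body without sending it.

```python
import json


def nested_terms(levels):
    """Build nested terms aggregations from a list of level descriptions.

    Each level is a dict with "name" and "field", plus optional "size"
    and "order" keys that are copied into the terms body unchanged.
    """
    aggs = None
    # Build from the innermost level outwards so each level wraps the next.
    for level in reversed(levels):
        terms = {"field": level["field"]}
        for option in ("size", "order"):
            if option in level:
                terms[option] = level[option]
        node = {"terms": terms}
        if aggs is not None:
            node["aggs"] = aggs
        aggs = {level["name"]: node}
    return {"aggs": aggs}


body = nested_terms([
    {"name": "hostname_by_login_result", "field": "hostname.raw"},
    {"name": "result_by_user", "field": "login_code", "size": 0,
     "order": {"_term": "desc"}},
    {"name": "username", "field": "username.raw", "size": 0},
])
print(json.dumps(body, indent=2))
```

The printed JSON has the same structure as the query above and could be passed to curl with -d. Changing the sort on any level is then a matter of editing that level's "order" entry.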
Written by Jason Ashby