Count based on the value of a field and index filter in Elasticsearch with Elasticsearch-dsl

I’m using Python 3.6 with elasticsearch (7.9.1) and elasticsearch-dsl (7.3.0).

In my index logstash-2020.09.21 I have documents like the one below (showing only the relevant fields):

{
    "subtype": "webfilter",
    "url": "https://play.google.com/",
    ...
}

I can make a curl request and get the information I need this way:

curl -X POST "localhost:9200/logstash-2020.09.21/_search?size=0&pretty" -H 'Content-Type: application/json' -d'{
  "aggs": {
    "urls": {
      "filter": { "term": { "subtype": "webfilter" } },
      "aggs": {
        "count": { "terms": { "field": "url", "size": 100 } }
      }
    }
  }
}'

That returns buckets such as:

"buckets" : [
    {
        "key" : "https://play.google.com/",
        "doc_count" : 30783
    }
]

I would like to know how to make an equivalent request using elasticsearch-dsl, but I’m having difficulty adapting the composition of the query.

Here is what I’ve tried so far:

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Q, A, Search

a = A("filter", Q("term", subtype="webfilter"))

client = Elasticsearch()

s = Search(using=client, index="logstash-2020.09.21")

s.aggs.bucket("urls", a).bucket("count", "terms", field="url", size=100)

s.execute()

Its output is:

{
    "subtype": "local",
    "url": "/",
    ...
}

1 answer

You can apply the query/filters first and then perform the aggregation. It looks like this:

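from elasticsearch import Elasticsearch
from elasticsearch_dsl import A, Search

# es_settings is the answerer's own settings module; substitute your host and port
es = Elasticsearch([{'host': es_settings.ELASTICSEARCH_HOST, 'port': es_settings.ELASTICSEARCH_PORT}])
s = Search(using=es, index="logstash-2020.09.21")

# Keep only the webfilter documents, then skip the hits (like size=0 in the curl request)
s = s.filter('term', subtype='webfilter')
s = s[0:0]

# Terms aggregation on the url field, returning up to 100 buckets
agg_urls = A('terms', field='url', size=100)
s.aggs.bucket('qtd_url', agg_urls)
response = s.execute()

# The counts live under response.aggregations, not in the hits
for bucket in response.aggregations.qtd_url.buckets:
    print(bucket.key, bucket.doc_count)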
