Limit Aggregation by grouped item in Mongo

I have a collection composed as follows, only with much more data.

{
  _id: ObjectId("db759d014f70743495ef1000"),
  tracked_item_origin: "winword",
  tracked_item_type: "Software",
  machine_user: "mmm.mmm",
  organization_id: ObjectId("a91864df4f7074b33b020000"),
  group_id: ObjectId("20ea74df4f7074b33b520000"),
  tracked_item_id: ObjectId("1a050df94f70748419140000"),
  tracked_item_name: "Word",
  duration: 9540,
}

{
  _id: ObjectId("2b769d014f70743495fa1000"),
  tracked_item_origin: "http://www.facebook.com",
  tracked_item_type: "Site",
  machine_user: "gabriel.mello",
  organization_id: ObjectId("a91864df4f7074b33b020000"),
  group_id: ObjectId("3f6a64df4f7074b33b040000"),
  tracked_item_id: ObjectId("6f3466df4f7074b33b080000"),
  tracked_item_name: "Facebook",
  duration: 7920,
}

I have already written an aggregation that returns this data grouped, as follows:

{"_id"=>{"tracked_item_type"=>"Site", "tracked_item_name"=>"Twitter"}, "duration"=>288540},
{"_id"=>{"tracked_item_type"=>"Site", "tracked_item_name"=>"ANoticia"}, "duration"=>237300},
{"_id"=>{"tracked_item_type"=>"Site", "tracked_item_name"=>"Facebook"}, "duration"=>203460},
{"_id"=>{"tracked_item_type"=>"Software", "tracked_item_name"=>"Word"}, "duration"=>269760},
{"_id"=>{"tracked_item_type"=>"Software", "tracked_item_name"=>"Excel"}, "duration"=>204240}

The aggregation code is simple:

AgentCollector.collection.aggregate(
  {'$match' => {group_id: '20ea74df4f7074b33b520000'}},
  {'$group' => {
    _id: {tracked_item_type: '$tracked_item_type', tracked_item_name: '$tracked_item_name'},
    duration: {'$sum' => '$duration'}
  }},
  {'$sort' => {
    '_id.tracked_item_type' => 1,
    duration: -1
  }}
)

My problem is: how can I limit the result to only 2 grouped items of type Site (tracked_item_type: "Site") and 2 of type Software (tracked_item_type: "Software")?

1 answer

I thought a lot about a simpler way to do this, but apparently the only way is to store the result of this aggregation in a collection and then run the filters on top of that stored result.

Storing the report result in a new collection

The aggregate command has no such option here (newer MongoDB releases, 2.6 and later, add a $out stage), so it can be replaced with a mapReduce() that does the same thing while specifying the out option:

db.trackeditems.mapReduce(
  function() {
      var key = {'tracked_item_type': this.tracked_item_type, 'tracked_item_name': this.tracked_item_name};
      emit(key, this.duration);
  },
  function(key, values) { 
      var totalDuration = 0; 
      for (var i = 0; i < values.length; ++i) { 
          totalDuration += values[i] 
      } 
      return totalDuration;
  },
  {
     out: "tracked_items_report",
     query: {group_id: ObjectId('20ea74df4f7074b33b520000')}
  }
)

Note: I don't know the Ruby API, so I am showing the example in the mongo shell only.
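To make the shape of the stored result concrete, here is a minimal plain-JavaScript sketch that mimics what the map and reduce functions above compute. It does not touch MongoDB: the sample documents, their durations, and the helper name simulateMapReduce are made up for illustration. Each output document carries the grouping key under _id and the summed duration under value, which is the shape mapReduce writes to its output collection.

```javascript
// Hypothetical sample documents (made-up durations), not data from the question.
const sampleDocs = [
  { tracked_item_type: "Site", tracked_item_name: "Facebook", duration: 100 },
  { tracked_item_type: "Site", tracked_item_name: "Facebook", duration: 50 },
  { tracked_item_type: "Site", tracked_item_name: "Twitter", duration: 70 },
  { tracked_item_type: "Software", tracked_item_name: "Word", duration: 30 }
];

// Mimics the map/reduce pair: group by (type, name) and sum the durations.
function simulateMapReduce(input) {
  const totals = new Map();
  for (const doc of input) {
    const key = {
      tracked_item_type: doc.tracked_item_type,
      tracked_item_name: doc.tracked_item_name
    };
    // Use the string form of the key so equal keys land in the same bucket.
    const id = JSON.stringify(key);
    const entry = totals.get(id) || { _id: key, value: 0 };
    entry.value += doc.duration;
    totals.set(id, entry);
  }
  return Array.from(totals.values());
}

const simulatedReport = simulateMapReduce(sampleDocs);
console.log(simulatedReport);
```

The real mapReduce call runs on the server and applies the query filter first; this sketch only shows the grouping and the resulting document shape.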

Querying the report

After running this command, your database will contain the tracked_items_report collection with the result of the aggregation. From there you can run any query, sorting and limiting the records to get the desired result.

// Get the top 2 of type 'Site'
db.tracked_items_report.find({'_id.tracked_item_type': 'Site'}).sort({'value':-1}).limit(2)
// Get the top 2 of type 'Software'
db.tracked_items_report.find({'_id.tracked_item_type': 'Software'}).sort({'value':-1}).limit(2)
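If the grouped result is small enough to hold in memory, the same "top 2 per type" selection can also be sketched on the client side in a single pass, instead of one query per type. This is only an illustration in plain JavaScript: the function name topPerType is hypothetical, and the input is the aggregation output quoted in the question, rewritten in the _id/value shape of the report collection.

```javascript
// Aggregated totals quoted in the question, in the {_id, value} shape of the report.
const reportDocs = [
  { _id: { tracked_item_type: "Site", tracked_item_name: "Facebook" }, value: 203460 },
  { _id: { tracked_item_type: "Software", tracked_item_name: "Excel" }, value: 204240 },
  { _id: { tracked_item_type: "Site", tracked_item_name: "Twitter" }, value: 288540 },
  { _id: { tracked_item_type: "Software", tracked_item_name: "Word" }, value: 269760 },
  { _id: { tracked_item_type: "Site", tracked_item_name: "ANoticia" }, value: 237300 }
];

// Hypothetical helper: keep the n largest totals for each tracked_item_type.
function topPerType(rows, n) {
  const byType = {};
  // Sort a copy by value, descending, so the first n rows seen per type are the largest.
  const sorted = rows.slice().sort((a, b) => b.value - a.value);
  for (const row of sorted) {
    const type = row._id.tracked_item_type;
    if (!byType[type]) byType[type] = [];
    if (byType[type].length < n) byType[type].push(row);
  }
  return byType;
}

const top2 = topPerType(reportDocs, 2);
console.log(top2.Site.map(r => r._id.tracked_item_name));     // Twitter, ANoticia
console.log(top2.Software.map(r => r._id.tracked_item_name)); // Word, Excel
```

The trade-off is that every grouped row travels to the client, so the two find() queries above remain the better fit when the report is large.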

Yes, it takes a little more work: it is now 3 database operations instead of just one (which, according to your question, would be ideal). But you also get an additional benefit "for free": the result of the aggregation (possibly an expensive operation) stays stored in the database as a cache for later queries.
