I need to use MongoDB's grouping operator, $group, but every explanation I find is very confusing.
How does it work, and what is the benefit of using this operator?
$group is one of the stages of aggregate. The idea of aggregate is to establish a pipeline of operations on a collection that will produce a certain output. It is an alternative to map-reduce offered by MongoDB. In the MongoDB documentation on aggregation, the use of aggregate is described in pseudo-code as:
db.collection.aggregate([ { <stage> }, ... ])
That is to say, db.collection.aggregate receives an array of stages, the steps of the pipeline (such as $group). Several stages are described in the documentation mentioned above. The simplest is $match, which simply filters the documents as they pass through it on the way to the next stage of the pipeline. For example:
db.collection.aggregate([
{ $match: { nome: 'Wallace' } },
{ $match: { idade: 10 } }
])
This first filters all documents by the field nome and then by the field idade. Note that this could be redundant and slower than just running { $match: { nome: 'Wallace', idade: 10 } }, but MongoDB performs optimizations on the pipeline you define, and one of them combines several $match stages in a row.
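To make the filtering behaviour concrete, here is a minimal sketch in plain JavaScript (no MongoDB involved) that models each $match stage as a filter over an in-memory array. The match helper and the sample docs are hypothetical and only handle simple equality conditions; the real $match supports much more.

```javascript
// Hypothetical sample documents; not from a real collection.
const docs = [
  { nome: 'Wallace', idade: 10 },
  { nome: 'Wallace', idade: 12 },
  { nome: 'Maria', idade: 10 },
];

// Simplified stand-in for $match: keeps the documents whose fields
// are strictly equal to every value in `query`.
function match(input, query) {
  return input.filter((doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value)
  );
}

// Two stages in a row: the output of the first feeds the second.
const twoStages = match(match(docs, { nome: 'Wallace' }), { idade: 10 });

// One combined stage selects the same documents.
const oneStage = match(docs, { nome: 'Wallace', idade: 10 });

console.log(twoStages);
console.log(oneStage);
```

Both calls select the same single document, which is why merging consecutive $match stages is a safe optimization.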
As for $group, the idea is to pass an _id field, which defines how you want to group the results of your pipeline, plus any number of other fields that operate over all the documents in each group to produce a final result. For example:
db.collection.aggregate([
{ $match: { nome: 'Wallace' } },
{ $group: { _id: '$idade', total: { $sum: 1 } } }
])
This will first filter all documents, finding those with doc.nome == 'Wallace', and then group them by idade. Thus, each group of documents with the same age will be represented by a single object, with the format:
{
_id: <some-age>,
total: <0 + 1 for each grouped document (that is, the total number of Wallaces with that age)>
}
The $sum above is an accumulator operator of the $group stage. It takes an expression, which is evaluated for each document, and produces the sum of the results over all documents in the group. If we wrote:
db.collection.aggregate([
{ $group: { _id: '$nome', somaDasIdades: { $sum: '$idade' } } }
])
We would receive the sum of all ages for each group of documents with the same name.
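As a rough mental model, the sketch below emulates in plain JavaScript what a $group stage with a $sum accumulator computes. The groupSum helper and the sample docs are hypothetical names for illustration, and this is not how MongoDB implements the stage internally.

```javascript
// Hypothetical sample documents; not from a real collection.
const docs = [
  { nome: 'Wallace', idade: 10 },
  { nome: 'Wallace', idade: 12 },
  { nome: 'Maria', idade: 10 },
];

// Simplified stand-in for
//   { $group: { _id: '$<keyField>', <outField>: { $sum: ... } } }
// `valueOf` returns what $sum adds for each document:
//   () => 1            models { $sum: 1 }        (a count)
//   (doc) => doc.idade models { $sum: '$idade' } (a sum of a field)
function groupSum(input, keyField, outField, valueOf) {
  const groups = new Map();
  for (const doc of input) {
    const key = doc[keyField];
    if (!groups.has(key)) {
      groups.set(key, { _id: key, [outField]: 0 }); // the sum starts at 0
    }
    groups.get(key)[outField] += valueOf(doc);
  }
  return [...groups.values()];
}

const countByName = groupSum(docs, 'nome', 'total', () => 1);
const ageSumByName = groupSum(docs, 'nome', 'somaDasIdades', (doc) => doc.idade);

console.log(countByName);
console.log(ageSumByName);
```

Each distinct nome yields one output object. The sketch returns groups in first-seen order, but $group itself makes no promise about output order, so add a $sort stage if order matters.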
The complete list of operators that produce results in the $group stage is in the MongoDB documentation for $group.
The value given to _id or to $sum is any valid expression, so it can be:
a literal value, such as 'Wallace'
a path to a field in the documents passing through, such as '$documento.campo'
an object that applies multiple expressions to specific fields
An example using an object as the _id would be:
db.collection.aggregate([
  {
    $group: {
      _id: {
        nome: '$nome',
        idade: '$idade'
      }
    }
  }
])
This will create groups (with no fields other than _id) of all documents with the same idade and the same nome.
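The compound-key grouping described above can be sketched in plain JavaScript as follows. The groupByFields helper and the sample docs are hypothetical, and serialising the key with JSON.stringify is just a convenience of this sketch, not something MongoDB does.

```javascript
// Hypothetical sample documents; not from a real collection.
const docs = [
  { nome: 'Wallace', idade: 10 },
  { nome: 'Wallace', idade: 10 },
  { nome: 'Wallace', idade: 12 },
  { nome: 'Maria', idade: 10 },
];

// Simplified stand-in for
//   { $group: { _id: { nome: '$nome', idade: '$idade' } } }
// Documents fall into the same group only when every key field matches.
function groupByFields(input, fields) {
  const groups = new Map();
  for (const doc of input) {
    const id = {};
    for (const field of fields) {
      id[field] = doc[field];
    }
    // Serialise the compound key so it can be used as a Map key.
    const mapKey = JSON.stringify(id);
    if (!groups.has(mapKey)) {
      groups.set(mapKey, { _id: id });
    }
  }
  return [...groups.values()];
}

const result = groupByFields(docs, ['nome', 'idade']);
console.log(result);
```

The two identical Wallace documents collapse into one group, so the four sample documents produce three groups, each containing only an _id.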
There is also a shell function, db.collection.group, but it is just a helper for doing an aggregate with only a single $group stage. Note that it was deprecated in MongoDB 3.4 and removed in 4.2, so prefer aggregate.
I think this covers the basic notions that can be conveyed quickly. I strongly suggest that you read the aggregate documentation that I linked above.
As for why you would use aggregate, I think it depends heavily on what you are going to do. Just like map-reduce, this is the kind of operation to use when the amount of data you are operating on is large enough that it is not worth processing in your application code. In such cases, using something like aggregate will be more (much more) efficient than pulling a large amount of data into your application and processing it there.