How to generate reports using NoSQL?

Viewed 510 times

12

I have a Message document, and my NoSQL database can hold thousands of messages like the following:

[{"origem":1,"destinatario":1,"created_at":"2016-12-20","content":"P ligula pellentesque ultrices"},
{"origem":1,"destinatario":2,"created_at":"2016-12-21","content":"Vestibulum ante ipsum ."},
{"origem":1,"destinatario":3,"created_at":"2016-12-20","content":"Aliquam sit amet diam in ."},
{"origem":1,"destinatario":3,"created_at":"2016-11-20","content":"Aliquam sit amet diam in ."},
{"origem":1,"destinatario":4,"created_at":"2016-10-20","content":"Aliquam sit amet diam in ."}]

My question is how to generate reports such as: number of messages sent in month 12, or number of new recipients in month 11.

The number of messages in a month is easy: I go through all the messages, check whether each one belongs to that month, and count. But to find the number of new recipients in month 12, I have to go through all the messages and check whether each recipient already appears in earlier messages. That would take a long time, since I can have many recipients and thousands of messages.

I'm not asking anyone to do it for me, but I think someone more experienced with NoSQL could point me to a tool that handles this more easily, or suggest a different way to model my data, perhaps with a relational database.

  • 4

    The question I would ask is why use NoSQL in a case where the information is clearly relational. (That said, the name NoSQL itself causes confusion, because whether or not you use SQL should have nothing to do with being relational: DBF, for example, is relational but does not use SQL.)

  • 1

    @Bacco the system uses NoSQL because of the hundreds of connections per minute. Since a relational database is blocking, this was the best solution we found.

  • 8

    I really suggest reviewing all of these concepts. I don’t know where you read this "blocking" idea, but it may not mean exactly what you think. NoSQL brings no advantage in that respect (the application is yours, of course, so do as you see fit; the suggestion is only meant to keep wrong concepts from taking root).

  • 3

    Related http://answall.com/q/96409/101, http://answall.com/q/122452/101 and http://answall.com/q/14533/101

  • 1

    Thank you guys, I will study the subject. o/

  • 3

    Emilia, I use a great toolkit for this kind of report: ELK, from Elastic.co https://www.elastic.co/webinars/introduction-elk-stack You can use Logstash to read from your database and send the data to Elasticsearch; Kibana is then used to generate your reports. If you have any questions, just ask.

1 answer

0

Your example of messages is JSON with an array containing several objects. I assume the storage unit in your database is each of the objects inside that array, right? Since it’s a document, I also assume the NoSQL database you’re talking about is MongoDB, am I right? In general, with databases like these, the data structure always affects how we read the information, and in fact your problem comes down to this. To solve it I wouldn’t use MongoDB; I would use Cassandra, and I would create more than one "table": one would be the table you mention, and the other would exist only to solve this problem, with the recipient as the key and a single value, the creation date of that recipient’s first message. Quoting an excerpt from Cassandra’s documentation:

Application-side joins can be a performance killer. In general, you should analyze your queries that require joins and consider pre-computing and storing the join results in an additional table. In Cassandra, the goal is to use one table per query for performant behavior.
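To make the idea concrete, here is a minimal Python sketch of that second "table". It is only an illustration, not Cassandra code: the names `messages`, `first_contact`, `save_message`, `messages_per_month` and `new_recipients_per_month` are hypothetical, and plain in-memory structures stand in for the two tables. The point is that the first-message date per recipient is maintained at write time, so the "new recipients" report never has to compare each message against all earlier ones.

```python
from collections import Counter
from datetime import date

# Hypothetical in-memory stand-ins for the two "tables".
messages = []        # main table: one entry per message
first_contact = {}   # lookup table: destinatario -> date of the first message


def save_message(origem, destinatario, created_at, content):
    """Store a message and keep the first-contact table up to date."""
    day = date.fromisoformat(created_at)
    messages.append({"origem": origem, "destinatario": destinatario,
                     "created_at": day, "content": content})
    # Keep the earliest date, so messages arriving out of order still work.
    previous = first_contact.get(destinatario)
    if previous is None or day < previous:
        first_contact[destinatario] = day


def messages_per_month():
    """Count messages grouped by (year, month)."""
    return Counter((m["created_at"].year, m["created_at"].month)
                   for m in messages)


def new_recipients_per_month():
    """Count recipients grouped by the (year, month) of their first message."""
    return Counter((d.year, d.month) for d in first_contact.values())


# Sample data taken from the question.
sample = [
    (1, 1, "2016-12-20", "P ligula pellentesque ultrices"),
    (1, 2, "2016-12-21", "Vestibulum ante ipsum ."),
    (1, 3, "2016-12-20", "Aliquam sit amet diam in ."),
    (1, 3, "2016-11-20", "Aliquam sit amet diam in ."),
    (1, 4, "2016-10-20", "Aliquam sit amet diam in ."),
]
for row in sample:
    save_message(*row)

print(messages_per_month()[(2016, 12)])        # 3
print(new_recipients_per_month()[(2016, 11)])  # 1
```

With the sample from the question, recipient 3 counts as new in month 11 rather than month 12, because the earliest message wins even though it was stored later. In a real database the "keep the earliest date" step would have to use whatever conditional write the store offers; that part is left as an assumption here.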
