How to index and update a user comparison system?

Asked

Viewed 408 times

7

I have 2 tables with user characteristics and interests in a Mysql database with innoDB

Tabela Carateristica

id
nome
sobrenome
cidade
estadocivil
altura
peso
fisico
pele
olhos
filhos
querofilhos
bebo
fumo
denominacao
frequencia
importancia

Table Interest

id
cidade
estadocivil
altura_min
altura_max
peso_min
peso_max
fisico
pele
olhos
filhos
quer_filhos
bebe
fuma
denominacao
frequencia
importancia

in the previous topic, the best option was to create a table keeping users and their compatibility.

Compatibility table

user_id
user_interesse_id
compatibilidade --> Valor de 0 a 100 que representa uma %

The calculation is made like this:

IF($caracteristica==$Interesse)
    $compatibilidade++

After that take the final value and divide by number of characteristics and multiply by 100 thus having a % of compatibility and then storing the percentage INT

($compatibilidade/Num_car).100;

Now the question is: "How to index and update these comparisons in a way that does not generate more requests than necessary?"

@Marceloaymone gave the idea of a Storedprocedure that I still don’t quite know what it is.

Any suggestions?

  • What is the bank’s technology?

  • mysql use - innoDB

  • An SP is a good idea when the execution of a query is recurring

  • BTW the idea of a table to relate the interests x characteristics of people does not seem a good idea because it will grow squared in relation to the number of people. For 1,000 people we will have 1,000,000 compatibilities

  • exactly why I’m looking for alternatives... but what exactly is a Stored Procedure?

3 answers

2

For each n users we will have n-1 comparisons, I did not see the "sex" in the tables, an application of this type seems important to me.

I would make a table where I would save new users or have had some data registered, a batch(1) routine (triggered by Event) would make the comparison against the n-1 users and write to for compatibility User x Compared and Compared x User because what is true for user x >> user y is not true for user y >> user x. SQL itself seems simple : attribute = attribute or attribute between attribute range the question would be to "populate" the relationship tables.

(1) In fact two procedures :

1 Read table row users to process Calls routine saves Attributes passing user as parameter Delete user from table to process

2 Calculates compatibility pair for each other base user Records for

  • It is a little complicated to understand your explanation, but from what I understand you propose to create 2 tables, 1 to record the result of the comparisons of users and another to record the final result of compatibility is this?

  • I think anything online will be heavy because we will have 2X(n-1) comparisons between users. So there would be a "draft" table powered by Rigger with new users and changed users, a Procedure fired by Vent in an hour of machine idleness would populate the relationships between users, 2(N-1) because I believe that the relationship is not symmetrical, a->b <> b->a

  • The idea is good, I will try to apply to the project and see if it is possible

2


I’m guessing this is some kind of social network, right? I can’t see how viable algorithms can compare users one to one. It would be very inefficient to perform this on a Rigger and too "time consuming" in a background process (how many minutes should the user wait to see the result?).

I believe that the ideal solution is a type of pre-classification that, applied to a given user, returns a value that can then be compared to other users.

I’ll try to illustrate that.

First, let us take as a basis the height. Imagine that we want to approach people with similar heights. We can establish ranges of heights, for example, range 1 for persons considered "low", 2 for "medium" and 3 for "high".

To the eye color, we could have the value 1 for bright eyes, 2 for brown and 3 dark.

To marital status, smoke, drinker and some other attributes that may have only two states, one can adopt the values 1 and 2.

After this classification, one can think of an algorithm that, from a set of interests, returns the most suitable profiles.

The simplest could be a query that compares each attribute and returns first those with more similarities.

The query example below sorts the profiles by similarity, whose value is calculated by adding 1 for each common attribute, that is, the more common attributes, the higher the value of the column:

select c.*,
( 
  case when faixa_altura = :faixa_altura_interesse then 1 else 0 end +
  case when tipos_olhos = :tipo_olhos_interesse then 1 else 0 end +
  case when bebo = :bebo_interesse then 1 else 0 end +
  (...)
) semelhanca
from caracteristicas c
order by semelhanca desc

Another numerical way to do this (which I found well explained in this SOEN question) is to consider all these features as multidimensional axes as in a Cartesian chart. Then, we find the similarity between interests and profiles through the "location" of the profile.

Consider the image below (source):

Gráfico Peso x Altura

The characteristics of each profile are represented by a point, right? So, to find similar interests, just recover the points closer to the interest.

The difference is that in your case the graph would have N dimensions, being N the number of attributes.

Another factor to consider is to put weights in these characteristics. For example, the fact of drinking or not may be more important to match the profiles than the height. To do this in the query above, just use instead of 1, a higher value according to the importance of the characteristic.

Going a little deeper, if you need more optimization and a query is not feasible, you can establish profiles of people. When a user fulfills its characteristics, the system classifies it into one of the registered types. This is just one more type of abstraction to simplify the data structure. The more profiles there are, the more refined the result will be. This is the most "thick" but more efficient method. The weakness is that if someone doesn’t fit a pre-established profile, the chances of not finding anyone compatible will be higher.

  • 1

    excellent :D, I liked the idea of the points in the Cartesian plan, but to do this with many dimensions would be too confusing for me, already the idea of pre-established profile patterns could be easier to implement and can be applied later a query to further refine the result, really is something to think and organize well... answering your question, is a kind of Christian dating site... but it relies on the site owner making money... the free user simply can’t do anything...

1

Ideal for your case is to use a Trigger.

In reply to his comment, a Stored Procedure is a set of stored SQL statements that can be executed through a simple call. They can receive parameters and have some dynamic behavior.

More or less on the same line, a Trigger is also a stored set of instructions. The difference from it to the Stored Procedures is that Trigger is called automatically because it is tied to some database operation involving a table (entering data into a table, for example).

The example below will make it clearer what I mean:

DELIMITER $$
CREATE TRIGGER antesDeUpdate_caracteristica 
BEFORE UPDATE ON caracteristica
FOR EACH ROW BEGIN
UPDATE compatibilidade
SET compatibilidade = -- Coloque o valor final aqui
WHERE user_id = OLD.user_id; END$$
DELIMITER ;

That one Trigger ensures that every update in the table caracteristica, the table compatibilidade will be updated.

I don’t know how you calculate it, but if you put the details of the compatibility calculation in the question, I can improve the answer.

  • I updated with the calculation, and I already have several triggers created to automate various things, the problem here is where to do the calculation, if I have to pull all users in php, perform the calculation and record the result this will generate millions of useless querys... this is the problem, would like a way to let SQL calculate per account in each update and Insert

Browser other questions tagged

You are not signed in. Login or sign up in order to post.