How to run a random query in MySQL without repeating data?

Viewed 1,079 times

5

I’m building a Pinterest-style image site that, among other things, needs to combine the following features:

  1. Infinite scroll (I’m applying the code from this video: https://www.youtube.com/watch?v=_D-CPBvqQaU).

  2. A menu to choose in which order to display images.

The problem happens when I choose to display the images randomly: they repeat themselves.

The query I’m using is as follows:

$return = $database->query("

    SELECT 
        lk_post_pic.*, 
        tb_post.head, 
        tb_post.created_datetime, 
        tb_pic.album, 
        tb_pic.file, 
        tb_pic.thumbnail

    FROM lk_post_pic
    JOIN tb_post ON lk_post_pic.fk_post = tb_post.id_post
    JOIN tb_pic  ON lk_post_pic.fk_pic  = tb_pic.id_pic

    WHERE tb_pic.thumbnail = 'default'
    ORDER BY RAND()
    LIMIT 20
    OFFSET $offset

")->fetchAll(PDO::FETCH_ASSOC);

Each time the page is scrolled to the bottom, this query is executed again and the $return data is inserted into an HTML template through echo. The jQuery function receives that output as the response to a GET request and then calls the append() method.

NOTE: I hope I’m not overcomplicating things with these details; they are only here for context, in case they help.

Each time this query runs, $offset is increased by 20 to fetch the next images. But because the whole result set is reshuffled on every call, rows from the first page can show up again in the second, and the images are repeated.

How can I get around this problem?

2 answers

10


A simple solution is to run ORDER BY RAND() over the whole table once, before paging starts, and store only the image ids. It is extra work, but it happens only on the first pass.

With the list of ids in an array, each page then uses:

SELECT campos FROM tabela WHERE id IN ( lista ) ORDER BY RAND();

Here lista is the slice of the stored ids that belongs to the current page (from $id[iniciodapagina] to $id[fimdapagina]).
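A minimal sketch of this first approach in PHP, assuming a PDO connection and the placeholder names campos, tabela and id from the query above. The helper names loadShuffledIds, pageIds and fetchPage are hypothetical, and the id list would typically be kept in the user's session between requests.

```php
<?php
// Runs once, on the first page: shuffle the whole table and keep only the ids.
// "tabela" and "id" are the placeholder names used in the answer.
function loadShuffledIds(PDO $database): array
{
    return $database
        ->query("SELECT id FROM tabela ORDER BY RAND()")
        ->fetchAll(PDO::FETCH_COLUMN);
}

// Returns the slice of stored ids that belongs to the page starting at $offset.
function pageIds(array $shuffledIds, int $offset, int $perPage = 20): array
{
    return array_slice($shuffledIds, $offset, $perPage);
}

// Fetches the rows for one page of ids, using one placeholder per id.
function fetchPage(PDO $database, array $ids): array
{
    if ($ids === []) {
        return []; // "IN ()" is not valid SQL, so stop here when the list is exhausted
    }
    $placeholders = implode(', ', array_fill(0, count($ids), '?'));
    $stmt = $database->prepare(
        "SELECT campos FROM tabela WHERE id IN ($placeholders) ORDER BY RAND()"
    );
    $stmt->execute($ids);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

On the first request the result of loadShuffledIds() could be stored in $_SESSION; every later request would then call fetchPage($database, pageIds($_SESSION['ids'], $offset)). Since each page draws from a disjoint slice of the same stored list, no image can appear twice.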


Mathematical alternative

You can move the random draw to a single step before the pagination, and then order the rows with:

ORDER BY MOD(id * $valor, $quantidade_registros + 1) ;

Here $valor must be a prime number greater than the modulus, $quantidade_registros + 1.

See it working on SQL Fiddle.

This works because a prime number larger than the modulus can never share a factor with it, so no two ids below the modulus get the same remainder, and paging over the total number of records goes through every id exactly once.
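To see the effect outside the database, here is a small PHP sketch that reproduces the sort key of the ORDER BY expression above. It assumes the ids run from 1 to $quantidadeRegistros without gaps; the function names sortKey and shuffledOrder are hypothetical.

```php
<?php
// Same value as MOD(id * $valor, $quantidade_registros + 1) in the query.
function sortKey(int $id, int $valor, int $quantidadeRegistros): int
{
    return ($id * $valor) % ($quantidadeRegistros + 1);
}

// Returns the ids 1..$quantidadeRegistros in the order the query would list them.
function shuffledOrder(int $valor, int $quantidadeRegistros): array
{
    $ids = range(1, $quantidadeRegistros);
    usort(
        $ids,
        fn (int $a, int $b): int =>
            sortKey($a, $valor, $quantidadeRegistros) <=> sortKey($b, $valor, $quantidadeRegistros)
    );
    return $ids;
}

// 10 records, modulus 11, prime 13 (greater than the modulus).
$order = shuffledOrder(13, 10);
echo implode(', ', $order), PHP_EOL; // 6, 1, 7, 2, 8, 3, 9, 4, 10, 5
```

Every id gets a different key, so LIMIT and OFFSET always cut the same fixed sequence into pages. Drawing a different prime for each visitor gives each one a different sequence.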

Since integers take up little space in a database, you can store a table with a good number of primes, taken from here:

https://www.bigprimes.net/archive/prime/

In about 13 KB you can store the first 6,500 prime numbers, which fit in 2 bytes each, giving a good range to use in the query above.

  • I really liked the second alternative! It’s very simple! But there’s a problem: the code you wrote always puts the last item of the list first. Whatever prime I choose for $valor, the first row is always 69. How can I fix this? One more thing: how do I import this list of prime numbers into MySQL?

  • About importing the primes: you can either create a table of primes and draw one from it, or simply keep the primes in the PHP code. It’s hard to say which is the right way.

  • I updated the answer: the modulus is quantity + 1, not quantity.

  • Interesting! But I have another question about this part: "$valor must be a prime number greater than $quantidade_registros". What if, after applying some filters to the table in question, the number of records differs from MAX(id_post)? At first, MAX(id_post) = total number of records, but after applying some joins and some WHERE clauses, I can have a result with 500 records and MAX(id_post) equal to only 50. Or MAX(id_post) = 200 and a total of 16 records. In this situation, how should I interpret and apply your formula?

  • If you take roughly the first 6,500 primes, you get numbers up to about 65,000; just draw one between the first prime greater than COUNT(*) and the last.

  • Note that the second solution pays off for very large numbers of images. If the pagination only takes about 5 selects, the primes approach is probably not worth it.

  • Is it important that the number of records is equal to the number of rows in the table I am looking at? My queries are very complicated and even involve subqueries; computing this count as a parameter, then again in the subquery and again in the main query, will consume a lot of server resources...

  • Jhenry, I saw that you accepted my answer. I suggest taking a good look at the solution by @Pagotti; in your case it may be simpler to implement. If it works well for you, it is also lighter on the server. (But use a larger range for rand(); 1000 is too small. See my comment on his post.)

2

Use a fixed value as a parameter for RAND()

One way to paginate and still keep a random order with RAND() is to pass it a fixed number as a parameter: the order is then the same in every call, so it stays consistent across pages. To get the randomness you need, you draw that number once, when the user loads the first page.

To control this, one option is to store the value in the user’s session. If your system doesn’t use sessions, you need some other mechanism, such as returning the generated number along with the first page, saving it on the browser side, and sending it back when requesting the next page. There are several ways to do this; choose the one that suits your case best.

Using the session, it would look something like:

session_start();
if ($offset == 0) {
   $seed = rand(1, 1000);
   $_SESSION["seed"] = $seed;
} else {
   $seed = $_SESSION["seed"];
}

$return = $database->query("

    SELECT 
        lk_post_pic.*, 
        tb_post.head, 
        tb_post.created_datetime, 
        tb_pic.album, 
        tb_pic.file, 
        tb_pic.thumbnail

    FROM lk_post_pic
    JOIN tb_post ON lk_post_pic.fk_post = tb_post.id_post
    JOIN tb_pic  ON lk_post_pic.fk_pic  = tb_pic.id_pic

    WHERE tb_pic.thumbnail = 'default'
    ORDER BY RAND($seed)
    LIMIT 20
    OFFSET $offset

")->fetchAll(PDO::FETCH_ASSOC);
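For the variant without sessions mentioned above, a rough sketch could look like the following. It assumes the browser sends the seed back as a request parameter (for example $_GET['seed']); the helper names resolveSeed and fetchRandomPage are hypothetical, and the query is the one from the question.

```php
<?php
// Reuses the seed sent back by the browser, or draws a new one on the first page.
function resolveSeed(?string $seedFromRequest): int
{
    if ($seedFromRequest !== null) {
        $seed = filter_var(
            $seedFromRequest,
            FILTER_VALIDATE_INT,
            ['options' => ['min_range' => 1]]
        );
        if ($seed !== false) {
            return $seed;
        }
    }
    return random_int(1, 2147483647); // missing or invalid seed: start a new order
}

// Runs the paginated query; the int type hints keep the interpolated values numeric.
function fetchRandomPage(PDO $database, int $seed, int $offset): array
{
    return $database->query("
        SELECT lk_post_pic.*, tb_post.head, tb_post.created_datetime,
               tb_pic.album, tb_pic.file, tb_pic.thumbnail
        FROM lk_post_pic
        JOIN tb_post ON lk_post_pic.fk_post = tb_post.id_post
        JOIN tb_pic  ON lk_post_pic.fk_pic  = tb_pic.id_pic
        WHERE tb_pic.thumbnail = 'default'
        ORDER BY RAND($seed)
        LIMIT 20
        OFFSET $offset
    ")->fetchAll(PDO::FETCH_ASSOC);
}
```

The first response would include the value returned by resolveSeed(null), and the infinite-scroll request would send it back together with the offset, so every page is cut from the same ordering.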
  • It looks good. You just have to be careful in case MySQL applies some aggressive optimization to the query (the manual has a note about this). For simple queries the fixed seed is a good option, and it is probably the simplest one to implement.

  • Just a suggestion: it would be nice to use $seed = rand(); without specifying the range, because then PHP automatically uses the largest possible range.

  • I agree with you. You always have to check the limitations and run several tests before adopting a solution.

  • I hope the author tries yours. I have a strong impression that it is better suited to his scenario, as I have already commented.
