How to get random results in SQL with different Dbms?

Asked

Viewed 948 times

4

Based on this existing question on Soen and wanting to bring interesting and useful content to Sopt I ask this question:

How to get random results in SQL with different Dbms?

I’m leaving an answer ready, but I’ll leave it open if they mess up better or more performative solutions

  • I would very much like to know the reason for the downvotes, after all as far as I understand Answer your own questions is encouraged by the idealized ones of Stack Overflow. Thank those who explain the motivations.

  • 1

    Debate on the question: https://pt.meta.stackoverflow.com/q/7007/3635

2 answers

3


Some of the examples were obtained in this reply from Soen of @Yaakovellis

I warn you that this has not been done performance testing, which can probably be a little tricky if you need to consult a lot of things. I intend to perform own tests and suggest maybe different ways.

Another detail, the use of LIMIT, TOP 1, rownum, etc is only to indicate that there has been limitation of results, after all the intention is to take random results and not just order randomly

Mysql

Select random in Mysql:

SELECT column FROM table
ORDER BY RAND()
LIMIT 1

Postgresql

Select random in Postgresql:

SELECT column FROM table
ORDER BY RANDOM()
LIMIT 1

SQL Server

Select random in Microsoft SQL Server:

SELECT TOP 1 column FROM table
ORDER BY NEWID()

IBM DB2

Random select on IBM DB2

SELECT column, RAND() as IDX 
FROM table 
ORDER BY IDX FETCH FIRST 1 ROWS ONLY

Oracle

Random Select in Oracle:

SELECT column FROM
( SELECT column FROM table
ORDER BY dbms_random.value )
WHERE rownum = 1

Sqlite

Select random in Sqlite:

SELECT column FROM table
ORDER BY RANDOM() LIMIT 1

0

In SQL Server, if you don’t need to select all the rows, but only a sample of the rows and performance is important, use the following:

SELECT * FROM table
  WHERE (ABS(CAST(
  (BINARY_CHECKSUM(*) *
  RAND()) as int)) % 100) < 10

The idea here is to generate a random number between 0 and 99 and then select all that are smaller than 10. Thus selecting ~10% of the base. You can change the 100 query for another value if you need to take portions smaller than 1% of the base.

Use ORDER BY newid() may be problematic because ORDER BY causes all rows in the table to be copied to the tempdb and that’s slow because it can be very IO and if the base is too big, it can pop the tempdb.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.