What is the best practice for querying large tables?

I have a system made up of several tables. One of them is called Publications and has 15 assorted fields, 3 of which are the focus of my question:

 1. titulo - varchar(100)
 2. subtitulo - varchar(200)
 3. texto - text

Within roughly 18 months this table will have more than 1 million records, and we are concerned about search performance.

What is the best way to query by keywords that users type into a form: LIKE, FULLTEXT, or something else?

We are concerned with searches for multi-word terms such as "house music" and, above all, with speed and processing cost.

Where do I start? I'm used to simpler, less demanding queries and I can't handle all the variables involved.

  • I usually follow this when the going gets tough: 10 techniques for optimizing SQL statements

  • @Marconi the type of database and the language matter for this question, don't you agree?

  • Because this is a historical series, partitions are usually made by date, keeping only the most recent records in the main table and the older ones in secondary tables or partitions. In SQL Server I know there are tools that automate creating and loading these partitions. If MySQL has no such feature, you can do it manually: migrate the old records and split the search into two queries, running the second one only if necessary.
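
A minimal sketch of the manual approach described in the last comment, written in Python against an in-memory SQLite database so it runs on its own. The table names `publications_recent` and `publications_archive`, the `id` column, and the `min_results` threshold are all assumptions made for illustration; they are not part of the question.

```python
import sqlite3


def search_with_archive(conn, keyword, min_results=1):
    """Search recent records first; touch the archive only if needed."""
    pattern = f"%{keyword}%"
    # Table names are hypothetical; they are fixed strings, not user input.
    sql = "SELECT id, titulo FROM {table} WHERE titulo LIKE ? ORDER BY id"
    rows = conn.execute(sql.format(table="publications_recent"), (pattern,)).fetchall()
    if len(rows) >= min_results:
        return rows
    # The second query runs only when the recent table did not return enough.
    rows += conn.execute(sql.format(table="publications_archive"), (pattern,)).fetchall()
    return rows


conn = sqlite3.connect(":memory:")
for table in ("publications_recent", "publications_archive"):
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, titulo TEXT)")
conn.execute("INSERT INTO publications_archive VALUES (1, 'History of house music')")
conn.execute("INSERT INTO publications_archive VALUES (2, 'Jazz classics')")
conn.execute("INSERT INTO publications_recent VALUES (3, 'House music festival')")

print(search_with_archive(conn, "house"))  # served by the recent table alone
print(search_with_archive(conn, "jazz"))   # falls back to the archive
```

Whether this split pays off depends on how often users actually need the old records, so it is worth measuring before committing to it.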

1 answer


"Best practice" is to create a development environment that simulates the situation you will find and create solutions to see what is most suitable.

The simplest solution is a LIKE. Try it and see whether the results are satisfactory. If they are not, move to a dedicated text search system and configure it properly.
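
As a rough illustration of the LIKE approach, here is a sketch in Python using an in-memory SQLite database so it is self-contained. The column names come from the question; the `id` column, the lowercase table name `publications`, and the sample rows are assumptions. Keep in mind that a pattern starting with `%` generally cannot use an ordinary index, so every row is scanned, and that is the cost to measure on a large table.

```python
import sqlite3

SEARCH_COLUMNS = ("titulo", "subtitulo", "texto")


def escape_like(term):
    """Escape LIKE wildcards so user input is matched literally."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def like_search(conn, phrase):
    """Return rows where any searched column contains the phrase."""
    pattern = f"%{escape_like(phrase)}%"
    where = " OR ".join(f"{col} LIKE ? ESCAPE '\\'" for col in SEARCH_COLUMNS)
    sql = f"SELECT id, titulo FROM publications WHERE {where} ORDER BY id"
    return conn.execute(sql, [pattern] * len(SEARCH_COLUMNS)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE publications ("
    "id INTEGER PRIMARY KEY, titulo TEXT, subtitulo TEXT, texto TEXT)"
)
conn.executemany(
    "INSERT INTO publications VALUES (?, ?, ?, ?)",
    [
        (1, "House music night", "Lineup announced", "Doors open at ten."),
        (2, "Weekend guide", "What to do", "A house music set closes the night."),
        (3, "Real estate", "House prices", "Music to sellers ears."),
    ],
)

# Only rows containing the two words together, in that order, are returned.
print(like_search(conn, "house music"))
```

Row 3 contains both words but not the phrase, so it is not returned. LIKE handles the "house music" case as an exact substring and offers no ranking or word-level matching.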

It seems to me that performance should not be a problem on reasonable hardware. I would need to know whether there will be many concurrent accesses or few, and whether access will be local only. Building the application correctly also counts. The right architecture can matter more than choosing this or that database feature.

Depending on the kind of results you want, you may need a specialized text search engine.
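
If the database turns out to be MySQL (the question does not say, so this is an assumption), its built-in FULLTEXT index is a middle step before a separate search engine. The sketch below only builds the query and its parameters in Python; it does not connect to anything. The function name `build_fulltext_query`, the index name `ft_publications`, the `id` column, and the `LIMIT 20` are hypothetical choices. The `%s` placeholders follow the parameter style used by common MySQL drivers such as PyMySQL.

```python
# Assumed one-time setup in MySQL (index name is hypothetical):
#   ALTER TABLE publications
#     ADD FULLTEXT INDEX ft_publications (titulo, subtitulo, texto);


def build_fulltext_query(user_input):
    """Return (sql, params) for a MySQL boolean-mode phrase search."""
    # Replace anything that is not a letter, digit, or space, so characters
    # that act as operators in boolean mode (+ - * " and so on) are dropped.
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in user_input)
    words = cleaned.split()
    if not words:
        raise ValueError("search text is empty")
    # Double quotes ask MySQL to match the words as a phrase.
    phrase = '"' + " ".join(words) + '"'
    sql = (
        "SELECT id, titulo, "
        "MATCH(titulo, subtitulo, texto) AGAINST (%s IN BOOLEAN MODE) AS score "
        "FROM publications "
        "WHERE MATCH(titulo, subtitulo, texto) AGAINST (%s IN BOOLEAN MODE) "
        "ORDER BY score DESC LIMIT 20"
    )
    return sql, (phrase, phrase)


sql, params = build_fulltext_query("house music")
print(sql)
print(params)
```

The columns listed in `MATCH(...)` have to be the same ones covered by the FULLTEXT index. Stopwords and the minimum token length setting can hide short or common words, so test with real search terms before relying on the results.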

  • We set up a test environment with 2.5 million records and the queries take about 3.5 seconds, which is too long. The results are also unsatisfactory. We think we will need full-text search anyway, but we haven't fully mastered it yet...

  • First check whether this can be optimized. Either way, FTS is worth testing.

  • We've done some research around here. We'll shift our focus to full-text search. Besides better performance, we get more powerful results for the user.