PHP and MySQL: how are thousands of connections processed at the same time?

Using PHP and MySQL as an example, let’s assume there are 1000 connections (clients) accessing the site and using functions that open a connection to MySQL and run queries. How do PHP and MySQL process so many connections and queries at the same time?

If there are 1000 clients connected to the site making MySQL requests, does that mean there are 1000 MySQL connections?

3 answers

Answer (20 votes)


Q: "If there are 1000 clients connected to the site making MySQL requests, does that mean there are 1000 MySQL connections?"

A: Only if all 1000 of them navigate, at exactly the same moment, to pages that open a connection to the DB. By the time the page has finished loading on the screen, the connection has already been closed.

Normally connections are opened at the beginning of the script that serves the page and closed right after. It is normal to have 1000 people using your site simultaneously while having, in practice, only 15 or 20 simultaneous connections. Browsers spend far more time displaying pages for people to read than clicking links and submitting data, and it is very unlikely that those clicks will frequently land on the same instant.

I mention this because the nightmare of web hosting providers is the persistent connection, which is opened in one script and kept in some session object so other pages can reuse it. It is not a recommended practice, because it hurts the performance of every application on the DB server, but it is still possible to do it on many hosts.
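
As a rough illustration of that pattern, here is a minimal sketch (assuming mysqli; host, credentials, table and query are placeholders) of a page script that opens a connection, runs its queries and closes the connection long before the user has even started reading the page. A persistent connection would instead be opened with the p: host prefix, which is exactly the practice discouraged above.

    <?php
    // Minimal sketch: one short-lived connection per page request.
    // Host, user, password and database are placeholders.
    $mysqli = new mysqli('localhost', 'app_user', 'secret', 'app_db');
    if ($mysqli->connect_errno) {
        die('Connection failed: ' . $mysqli->connect_error);
    }

    // Run whatever queries this page needs...
    $result = $mysqli->query('SELECT id, title FROM posts ORDER BY id DESC LIMIT 10');
    foreach ($result as $row) {
        echo htmlspecialchars($row['title']), "<br>\n";
    }

    // ...and release the connection right away, well before the user
    // finishes reading the page. new mysqli('p:localhost', ...) would
    // keep it alive between requests instead (the discouraged case).
    $mysqli->close();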


And if I really need many simultaneous connections?

There are many techniques for distributing the load when the application demands many simultaneous accesses, but then we are no longer talking about a mere 1000 simultaneous connections; you can add a few more zeros to that number.

Here are some very common ones:

  • Mirroring / database replication

    In this case you have several DB servers, and when you change something on the main one the data is copied to the slaves (replicas). The advantage of this method is that you split the clients across multiple servers, effectively multiplying capacity by the number of instances serving the DB. The disadvantage is that the storage space used grows in proportion to the number of servers. (A minimal sketch of this read/write split appears right after this list.)

  • Data distribution

    Using techniques like map/reduce, the data is distributed across multiple machines, but without mirroring: part is on one machine, part on another, and a query is sent to several places at the same time, but only the node responsible for the data returns the answer. In this case you already need a more elaborate data-management design. The advantage of this method is that you do not occupy space with redundant data; the downside is the greater architectural complexity. Example: Google.

  • Caches / buffers

    You can keep part of the data in the application, so that queries do not have to hit the server all the time. An example would be paginated data where you load the data once and keep it in the application cache, or even in the browser, so that when you move forward or back through the pages the data does not need to be reloaded. Example: listing the domains of a Registro.br user - your domains are loaded as a single JSON, and when you go forward or back a page you are only changing the view, not reloading the data. Note that this is also a kind of early loading, see below.

  • Prediction / Early loading

    This case is a double-edged sword. You gain performance, but at the risk of loading data unnecessarily. The technique is to load more data than you need, so that when you go to the next page, or scroll a listing, the data is already there. The problem is that you are usually carrying more data than you need; the advantage is that you have done it during the time the user was looking at the previous data.
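
To make the mirroring/replication item above more concrete, here is a minimal sketch of read/write splitting in PHP: writes go to the primary server and reads are spread across the replicas. All host names and credentials are hypothetical, and a real setup would also need MySQL replication configured on the servers themselves.

    <?php
    // Sketch only: writes go to the primary, reads are spread across
    // the replicas. Host names and credentials are hypothetical.
    function db_connect(bool $forWrite): mysqli
    {
        $primary  = 'db-primary.example.com';
        $replicas = ['db-replica1.example.com', 'db-replica2.example.com'];

        $host = $forWrite ? $primary : $replicas[array_rand($replicas)];
        return new mysqli($host, 'app_user', 'secret', 'app_db');
    }

    // Reads can hit any replica, multiplying read capacity...
    $read = db_connect(false);
    $rows = $read->query('SELECT * FROM products WHERE price < 100');

    // ...while writes go only to the primary, which replicates them to the slaves.
    $write = db_connect(true);
    $write->query("INSERT INTO products (name, price) VALUES ('Example', 99)");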

  • Remember that the cache/buffer technique can also be a double-edged sword, since bringing in an entire block of data (for pagination, for example) assumes the user will consume that whole block, which is not necessarily true.

  • @Kazzkiq actually I think I chose a bad example, because the Registro.br case is a mixture of cache and prediction, if you like. I’ll think it over and put a pure cache example in the next edit.

  • I found the pagination example good, it is a really useful thing, but in practice it can be a shot in the foot. Look at Globo.com, for example: the site’s comments all come in a single JSON block. Then you get a news story with up to 10,000 comments and imagine the work it takes for the database to fetch it all, when users will surely not even read 10% of it. Of course it’s an extreme example and they must have n ways to optimize the requests, but it is still worth studying each case before adopting this solution. Anyway, just a small addition to this great answer!

  • @Kazzkiq agreed, except I think I should have used pagination in the example below, which fits it better. For the cache, it would have to be a page that has already been viewed, not loading the following ones (as Globo and Registro.br do). The cache would really keep data that has already been viewed, not necessarily load what comes next (much like a browser history). But I’ll think it over and only move it when I find clearer examples.

Answer (7 votes)

If there are 1000 clients connected to the site making MySQL requests, does that mean there are 1000 MySQL connections?

Not necessarily. Even if your database server supports a thousand connections making requests simultaneously, it would be a huge waste of resources if those requests were all asking for the same data.

For example, if your site receives 1000 data requests over a certain period of time, it is likely that many of those requests are asking for the same data. It is also very likely that during that period the data has not been modified. Given that, it is a waste of resources to query the database for something that has already been queried and has not changed. The solution to this kind of situation is caching the data.

A database server with caching in place stores in memory the response to each new query as it is executed, so that an identical future request does not need to go to the database to retrieve the data. The data is already available in memory, and a new query only happens if previously queried data has been modified, or if the data has not been queried before and is therefore not in memory, not in the cache.

Therefore, on a server with caching deployed, 1000 requests mean 1000 queries only if they all ask for different data.
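
The same idea can also be applied at the application layer rather than inside MySQL. Below is a minimal sketch, assuming the APCu extension is available; cached_query() is a hypothetical helper, not a built-in function, and the credentials and query are placeholders.

    <?php
    // Sketch: cache a query result in shared memory (APCu) so that
    // identical requests within the TTL never reach MySQL.
    function cached_query(mysqli $db, string $sql, int $ttl = 60): array
    {
        $key = 'q_' . md5($sql);

        $hit  = false;
        $data = apcu_fetch($key, $hit);
        if ($hit) {
            return $data;              // served from the cache, no query executed
        }

        $data = $db->query($sql)->fetch_all(MYSQLI_ASSOC);
        apcu_store($key, $data, $ttl); // keep it around for the next requests
        return $data;
    }

    $db   = new mysqli('localhost', 'app_user', 'secret', 'app_db');
    $rows = cached_query($db, 'SELECT id, name FROM categories', 120);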

  • Does MySQL already come with this caching technique enabled by default?

  • 1

    It depends; you can check with the query: mysql> SHOW VARIABLES LIKE 'have_query_cache'; More information (in English): http://www.howtogeek.com/howto/programming/speed-upyour-web-site-with-mysql-query-caching/

Answer (5 votes)

Short answer: it depends a lot on the infrastructure, such as bandwidth and the amount of RAM, among other factors.

The maximum number of connections MySQL can support depends on the quality of the thread library on a given platform, the amount of RAM available, how much RAM is used per connection, the workload of each connection, and the desired response time.

Linux or Solaris should be able to support between 500 and 1000 simultaneous connections routinely, and up to 10,000 connections if you have many gigabytes of RAM available and the workload of each connection is low or the response time requirement is undemanding.

MySQL - Many Connections

As for PHP, I believe it does not impose a limit of its own: you can open n connections, but it is the database server and the application server that have to withstand them.
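
For reference, here is a small sketch (credentials are placeholders) of how those limits can be inspected from PHP: max_connections is the ceiling the server enforces, Threads_connected shows how many connections are currently open, and raising the limit is done with SET GLOBAL max_connections, at the cost of more RAM per connection.

    <?php
    // Sketch: inspect the server-side connection limit and current usage.
    // Credentials are placeholders.
    $db = new mysqli('localhost', 'admin_user', 'secret');

    $limit = $db->query("SHOW VARIABLES LIKE 'max_connections'")->fetch_assoc();
    $used  = $db->query("SHOW STATUS LIKE 'Threads_connected'")->fetch_assoc();

    echo "Limit: {$limit['Value']}, currently in use: {$used['Value']}\n";

    // Raising the limit needs administrative privileges, and every extra
    // connection costs RAM on the server:
    // $db->query('SET GLOBAL max_connections = 500');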

  • If you exceed these numbers, do connections and queries go into some kind of queue, something like blocking thread I/O?

  • 1

    @Gustavopiucco I believe not; an error is returned, something like Too many connections (see this image). To solve a problem like that, you would need to adjust the max_connections variable to set the maximum number of connections.

  • From what I have researched, MySQL creates one thread for each connection and, within each connection, one thread for each query. Strange, wouldn’t it be better to operate with a minimal number of threads (non-blocking I/O), especially on servers that need to support thousands of simultaneous connections? I once read that using more than 10 threads per application is considered bad practice.

  • 1

    You are right: the threads associated with a client connection are created only when needed. A good text to read on the subject is How MySQL Uses Threads for Client Connections.
