What are build cache structures?


Viewed 323 times

11

In that question, where I wanted to know whether I should store settings in the global variable $GLOBALS, I got some very good answers. One of them talked about something I hadn’t even thought about, so I decided to ask this question to see if someone could help me understand it better.

  • What are build cache structures?
  • What are these structures for?
  • Are they always necessary?
  • How can I create and use these structures?

I also searched for some terms and found this on Wikipedia.

I read the rules of the site, and honestly I don’t know if this question is too broad, but I hope it’s within the rules.

  • 1

    Consider "breaking" these questions into several, so it doesn’t get too wide as it is.

  • 5

    Someone will give you an answer, the question is interesting. But you find that what was simple and worked very well, suddenly it becomes complicated, then you start looking for solutions to fix the problems caused by the solutions that were given earlier, and so it goes. Not everyone is Facebook. Which by the way started simple, when it got complex, they started to change the language. KISS. Make the site simple and it will not need the complex solutions. When you start "buying" tools not needed, no more.

  • 1

    Hi, the problem is really isolated, I just want to learn more about it in case I ever need to.

  • 1

    "Make the site simple and it won’t need complex solutions. When you start "buying" tools not needed, no more." + 1111111111

  • 1

    @gustavox it’s a pity that a comment can’t give better context, but you got the main point :)

  • 2

    I didn’t say I’ll use it already, I just want to know what it is and how it works.

  • 2

    No one is qualified?


1 answer

6


The term build cache structures is probably just an expression used in the other question, not really a specific name of some technique or technology.

However, when it comes to the performance of an interpreted language such as PHP, one of the biggest impacts on an application (after solving architectural problems, of course) is the time the interpreter takes to load and analyze the entire code on each request.

Defining the problem

Basically, let’s define the main problem as follows:

By default, a new PHP interpreter is instantiated for each request. It needs to load all the source files the system uses, interpret them, and only then execute what was requested.

Broken down into items, the costs are:

  1. Instantiation of the PHP interpreter
  2. Loading of PHP files
  3. Interpretation and compilation in memory
  4. Execution of the system

While programmers can invest heavily in item #4, which is what matters, the first three are just a side effect of the language and are repeated over and over again.

For a simple request this may not be a problem, but start multiplying the number of users and soon the server will be busier reading and interpreting PHP files than doing anything else.

A startup can easily begin rapid development of a system using PHP. The problem is that, by the time it starts scaling the number of clients, it will be spending more and more on hardware, maintenance and server management to sustain the inherent slowness of the language.

Solutions

#1. PHP interpreter

In the beginning there was CGI. In this model, a server like Apache runs the PHP interpreter on every request, as if it were a command-line call.

Given the problems with this, some techniques were created to keep the PHP interpreter permanently in memory and thus almost completely avoid the first step.

The interpreter can be included as an Apache module, which means that PHP is instantiated along with the web server. Because of some problems with this model, the FastCGI model was created later, where the interpreter no longer lives in the web server's process but remains in memory in a separate process.

I won’t go into detail about the pros and cons here. The important thing is to understand that keeping the interpreter in memory greatly improves performance by avoiding the creation of a new PHP process for each request, which is as if each user ran a new program on the server.

#2 and #3. Loading and interpreting

A great advantage of PHP reading the files on each request is that you simply need to edit or replace them and, voilà, the system has been upgraded "instantly". However, this is also inefficient if it needs to be done thousands of times per second.

Granted, even compiled languages need to have their executables read from disk. But PHP also needs to interpret the files, which means analyzing all the characters and structures, checking for errors and, if all goes well, generating a structure in memory to be executed, which is known as opcode.

To solve this, many tools cache these compiled structures (opcode cache), in order to avoid repeating both the source code loading (#2) and the compilation (#3).

Opcode is an intermediate code between PHP and the code that actually runs, so it’s not as fast as native code, but it avoids interpreting the original source code.

If the opcode is stored on disk, we still have problem #2, because it needs to be loaded on every request, so it is common practice to keep these structures in memory. This way, after the first request that includes a particular file, that file will already be available in shared memory.

The problem with this is that everything has to be done again each time the process is restarted or the system is upgraded.

Managing opcode cache

To put a PHP file in memory without running it, you can use the function opcache_compile_file(file) from the OPcache extension.
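As a minimal sketch of that call, the helper below wraps opcache_compile_file() with a couple of guards. The name precompileFile and the example path are hypothetical and only serve the illustration; the sketch assumes the OPcache extension may be missing or disabled, in which case it simply reports failure.

```php
<?php
// Hypothetical helper: compile one file into the opcode cache without executing it.
function precompileFile(string $path): bool
{
    // OPcache may not be loaded at all, and the file must exist.
    if (!function_exists('opcache_compile_file') || !is_file($path)) {
        return false;
    }

    // opcache_compile_file() compiles and caches the script but does not run it.
    // It returns false (and raises a notice, silenced here) when OPcache is disabled.
    return @opcache_compile_file($path);
}

// Usage with a hypothetical path.
$ok = precompileFile(__DIR__ . '/src/Settings.php');
echo $ok ? "cached\n" : "not cached\n";
```

Keep in mind that the cache belongs to the PHP process group that compiled the file, so a call made from the command line does not normally populate the web server's cache.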

However, to avoid performance impacts after server reboot or update, files should be reloaded after server startup and before the first request.

This is particularly important when you have a cluster of servers, that is, multiple servers running the same system that can serve pages to users alternately.

Suppose you have two servers. To upgrade the system, you take one of them offline outside of peak hours, so that the other one alone handles the demand at that time. You update the system on that server, perform the necessary tests, and "warm up" the system by caching all the files. Finally, you put the server back on the air and move on to the next one. In theory, users should not notice that there has been an update, nor will there be performance degradation in the first requests after the update.
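The "warm up" step could look roughly like the sketch below, which walks a directory and compiles every .php file it finds. The function names findPhpFiles and warmUp and the application path are assumptions made for this example, not something from a particular tool.

```php
<?php
// Hypothetical helper: list every .php file under $root, recursively.
function findPhpFiles(string $root): array
{
    $files = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $entry) {
        if ($entry->isFile() && $entry->getExtension() === 'php') {
            $files[] = $entry->getPathname();
        }
    }
    sort($files);
    return $files;
}

// Hypothetical warm-up routine: returns how many files were compiled into the cache.
function warmUp(string $root): int
{
    if (!function_exists('opcache_compile_file')) {
        return 0; // OPcache extension not loaded
    }
    $compiled = 0;
    foreach (findPhpFiles($root) as $file) {
        // Silenced because the call raises a notice when OPcache is disabled.
        if (@opcache_compile_file($file)) {
            $compiled++;
        }
    }
    return $compiled;
}

// Usage with a hypothetical application directory.
$appDir = __DIR__ . '/app';
if (is_dir($appDir)) {
    echo warmUp($appDir) . " files compiled\n";
}
```

Because the opcode cache lives in the memory of the process group that filled it, a routine like this would have to run inside the web server's PHP (for instance behind an internal-only URL) rather than from a separate command-line process.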

It is important to mention that there are various tools to manage opcode caching, as well as other types of cache. The Alternative PHP Cache (APC), for example, is an open-source library that is part of the PHP project and serves both to store opcode and to store arbitrary values. That brings us to the next topic.

#4. Caching in the application

Going deeper, it is possible to keep some of the most used objects in cache. This way, besides not having to read and interpret the file, the system does not need to instantiate the objects and load their data.

Imagine that the system frequently uses collections such as:

  • List of states in the country that never changes;
  • List of suppliers, which changes maybe every month;
  • Backlog list, which is updated once a day;
  • Internal system settings;
  • So on and so forth...
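Collections like these can be cached with a "fetch or load" helper. The sketch below is an assumption-laden illustration: it uses APCu (the user-data cache that succeeded APC) when it is available and falls back to a per-process array otherwise. The function name cached, the key names and the hard-coded state list are all hypothetical.

```php
<?php
// Hypothetical helper: return the cached value for $key, or build it with $loader.
function cached(string $key, int $ttl, callable $loader)
{
    static $local = []; // fallback that only lasts for the current process

    $useApcu = function_exists('apcu_enabled') && apcu_enabled();

    if ($useApcu) {
        $value = apcu_fetch($key, $hit);
        if ($hit) {
            return $value;
        }
    } elseif (array_key_exists($key, $local)) {
        return $local[$key];
    }

    // Cache miss: do the expensive work once and remember the result.
    $value = $loader();
    if ($useApcu) {
        apcu_store($key, $value, $ttl); // a TTL of 0 means "no expiry"
    } else {
        $local[$key] = $value;
    }
    return $value;
}

// Usage: a list that never changes gets no expiry.
$states = cached('states', 0, function () {
    return ['SP', 'RJ', 'MG']; // stand-in for a database query
});
```

A list that changes monthly or daily would simply get a matching TTL in seconds, or be removed from the cache explicitly when it is updated.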

Putting it all together

Together, all these caching and warm-up techniques keep the system’s files in memory, so that the server’s processing can concentrate on what really matters: the rules of the system.

In systems under high load, that is, with many simultaneous accesses, such techniques, combined with other performance techniques and caching strategies, can save a lot of time and money, both in acquiring new servers and in managing and maintaining them.

Considerations

It doesn’t hurt to remember that every optimization has side effects.

The complexity of managing caches, for example, can be daunting. Bizarre behavior can occur if this is done wrong, especially when thinking about the possibility that cached PHP code is out of date.
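One way to reduce the risk of stale code is to invalidate the cached entries of changed files as part of the deployment. The sketch below assumes OPcache is the cache in use and that timestamp validation (the opcache.validate_timestamps setting) has been turned off, so edited files are not picked up on their own. The function name and the paths are hypothetical.

```php
<?php
// Hypothetical deploy hook: drop cached opcodes for the files that were changed.
function invalidateChangedFiles(array $paths): int
{
    if (!function_exists('opcache_invalidate')) {
        return 0; // OPcache extension not loaded
    }
    $invalidated = 0;
    foreach ($paths as $path) {
        // force = true invalidates the entry even if the timestamp looks unchanged.
        if (@opcache_invalidate($path, true)) {
            $invalidated++;
        }
    }
    return $invalidated;
}

// Usage with hypothetical paths taken from a deployment's list of changed files.
$count = invalidateChangedFiles([
    __DIR__ . '/src/Settings.php',
    __DIR__ . '/src/Suppliers.php',
]);
echo $count . " cache entries invalidated\n";
```

As with warming up, this has to run in the same PHP process group that holds the cache to have any effect.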

One should never apply deep optimization and caching techniques unless necessary; otherwise one will just be throwing money and time away.

  • 2

    Excellent answer! I was beginning to think no one would answer! I’ll wait a few more days before awarding the bounty, to give it more visibility, because this very good answer deserves it, especially for those who have Sopt as their main source of learning. Thanks!

  • 2

    Thanks for the help @gustavox
