How does the logic for finding out the size of an image work?

8

In PHP and other programming languages, special functions let you find out the size of an image.

For example:

list($width, $height) = getimagesize('images/icon.png');

I was curious to know how languages do this. Where does the information that the image has that size come from?

I wanted to understand how this information is read.

My question arose because, when I asked "How to know the number of lines a large file has in PHP?", I noticed in my tests that large files took too long to process.

But when I ran the test with large and small images to find out their size, the response time was the same. So, apparently, the function responsible for getting this information didn’t read the whole file, but returned the information from somewhere.

What I wanted to know is: where does this information come from?

How is it possible to know, for example, the MIME type of an image so quickly? I say "quickly" because even if an image is 20 MB, the time needed to read this information is always the same.

Where is this "information" stored? How is the image size determined?

  • 2

    The key to understanding this is metadata. The answer to the question is not complex, but a complete one would be long...

1 answer

12


Obviously the file has metadata with the relevant information. Normally this is a header holding the data; its layout varies for each image type according to a specification, usually public, so that anyone can write algorithms to read or manipulate the information they want.

In addition to the basic data that forms the image itself, it is common for the file to start with a "signature" so that it can be easily recognized as being of that format. This signature often includes a version.
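To make the "signature" idea concrete, here is a minimal sketch in Python (not what PHP does internally, just an illustration). The function names `sniff_mime` and `sniff_file` are hypothetical; the byte sequences are the well-known leading bytes of PNG, GIF, JPEG and BMP files. Note that GIF carries its version (87a or 89a) inside the signature.

```python
# Known leading bytes ("signatures") of a few common image formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),   # GIF embeds its version in the signature
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"BM", "image/bmp"),
]


def sniff_mime(head: bytes):
    """Guess a MIME type from the first bytes of a file; None if unknown."""
    for magic, mime in SIGNATURES:
        if head.startswith(magic):
            return mime
    return None


def sniff_file(path):
    """Hypothetical helper: reads only 8 bytes, whatever the file size."""
    with open(path, "rb") as f:
        return sniff_mime(f.read(8))
```

Since only the first 8 bytes are read, a 20 MB image costs the same as a 2 KB one, which matches the constant response time observed in the question.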

Several existing libraries already do what you want and are available for various languages, including PHP. The language itself does nothing; the work is done by this code, which may even be part of the standard library. Whoever wrote it certainly studied the specification and wrote the necessary code. Under the hood, it just reads a few bytes at specific positions in the file.

The algorithm probably also checks whether the data is well formed. Some formats help with this kind of verification (with a CRC, for example); others are more susceptible to corruption. And confirming the comment below: yes, it’s tedious to do right, which is why you see people using ready-made solutions.
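As an illustration of the CRC check, here is a rough Python sketch for a single PNG chunk. In PNG, each chunk is laid out as a 4-byte length, a 4-byte type, the data, and a 4-byte CRC computed over the type and data. The helper names `make_chunk` and `chunk_is_intact` are made up for this example; whether a given library actually performs this check when reading the size is an assumption.

```python
import struct
import zlib


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build a PNG-style chunk: length, type, data, CRC (all big-endian)."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def chunk_is_intact(chunk: bytes) -> bool:
    """Recompute the CRC of one chunk and compare it with the stored one."""
    if len(chunk) < 12:
        return False
    (length,) = struct.unpack(">I", chunk[:4])
    if len(chunk) < 12 + length:
        return False
    type_and_data = chunk[4:8 + length]
    (stored_crc,) = struct.unpack(">I", chunk[8 + length:12 + length])
    # The CRC covers the chunk type and the data, not the length field.
    return (zlib.crc32(type_and_data) & 0xFFFFFFFF) == stored_crc
```

A corrupted byte in the chunk makes the recomputed CRC differ from the stored one, so the reader can reject the header instead of returning a bogus size.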

Actually, the technique applies to any type of file, not just images. Whenever you need to retrieve a piece of information in O(1) and it can be computed beforehand, store it somewhere, preferably at the beginning of the file and at a fixed position, to guarantee a "firm" O(1).

Trying to work out the information on your own will likely cost O(N), which is much worse, though not tragic.
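The difference can be sketched with the line-count case from the question. The layout below is entirely hypothetical (an 8-byte big-endian line count at the start of the file, followed by the lines), written in Python just to contrast the two costs:

```python
import io
import struct

# Hypothetical layout: one 8-byte big-endian line count, then the lines.
HEADER = struct.Struct(">Q")


def write_lines(stream, lines):
    """Write the precomputed count at a fixed position, then the body."""
    stream.write(HEADER.pack(len(lines)))
    for line in lines:
        stream.write(line.encode("utf-8") + b"\n")


def count_from_header(stream):
    """O(1): read 8 bytes from the start, whatever the stream size."""
    stream.seek(0)
    (count,) = HEADER.unpack(stream.read(HEADER.size))
    return count


def count_by_scanning(stream):
    """O(N): walk the whole body and count newline bytes."""
    stream.seek(HEADER.size)
    blocks = iter(lambda: stream.read(65536), b"")
    return sum(block.count(b"\n") for block in blocks)
```

Both functions return the same number, but only the second one has to touch every byte of the file.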

In the case of PNG, there is a chunk called IHDR, identified by the byte sequence 73 72 68 82 (the decimal ASCII codes for "IHDR"), followed by this structure containing the data referred to in the question:

Width               4 bytes // size here
Height              4 bytes // size here
Bit depth           1 byte
Colour type         1 byte
Compression method  1 byte
Filter method       1 byte
Interlace method    1 byte
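Putting the structure above to use, here is a minimal Python sketch of reading the dimensions. It relies on the PNG layout: an 8-byte signature, then the IHDR chunk (4-byte length, 4-byte type), so width and height sit at bytes 16 to 23 as big-endian unsigned integers. The function names are hypothetical, and this is not necessarily how getimagesize is implemented.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(head: bytes):
    """Return (width, height) from the first 24 bytes of a PNG file."""
    if len(head) < 24 or head[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Bytes 8-11 hold the chunk length, bytes 12-15 the chunk type.
    if head[12:16] != b"IHDR":
        raise ValueError("IHDR chunk not found where expected")
    # Bytes 16-23: width and height, 4 bytes each, big-endian.
    width, height = struct.unpack(">II", head[16:24])
    return width, height


def png_size_from_file(path):
    """Hypothetical helper: reads 24 bytes, regardless of the file size."""
    with open(path, "rb") as f:
        return png_size(f.read(24))
```

Calling `png_size_from_file('images/icon.png')` would return the same pair that the PHP call in the question unpacks into `$width` and `$height`, assuming the file is a valid PNG.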

There’s a website with file formats. I don’t know if it’s good.

  • 1

    If the link didn’t point where it points, I would say a more detailed answer was called for.

  • 3

    I know bigown, @Magichat: he will edit the answer many times to add more info :D

  • 1

    Nice, @wallacemaxters... It’s a great question; from what I know of him, he will surely give an answer to match.

  • 1

    Great answer. That’s right, image files have the information in a header. I agree with @Magichat, I think it’s worth putting an example of the relevant part of the PNG header in the body of the question. :)

  • 1

    And, who knows, also links to other specifications, such as JPG, GIF and BMP (the messiest one :p ).

  • Who even stops to find out this information, huh

  • I can improve it further; feel free to suggest

  • Well, @bigown, given the style of your answers, I suggest adding a little about "Exif" through PHP, or, for Windows fans, GDI, or both. That’s my suggestion.
