How does the HTTP protocol process requests?

34

HTTP methods are used to send data to and receive data from the server, such as GET (retrieves data) and POST (sends data).

From the definition above I understand the purpose of GET and POST well; however, I have a question about how HTTP processes a request according to the HTTP verb.

Question

How does the HTTP protocol process requests behind the scenes? That is, how does communication between client and server take place through the request methods?

  • 11

    Just in advance: the HTTP protocol does not process anything. It only regulates how the data should be communicated. It is the application that processes HTTP.

  • An answer on SO.pt involving the HTTP protocol: http://answall.com/a/66765/8493 (it also has 2 links on the subject).

3 answers

37


First of all, it helps to understand HTTP as a set of format conventions used over an ordinary TCP connection. In principle it is a stateless protocol in which you basically send one piece of text and get another back.

In other words, HTTP does not process anything; it defines a format. It is the responsibility of the application that serves the request to process the data and provide a response consistent with the protocol.
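To make the "send one text, get another back" idea concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the names `serve_once`, `exchange` and `demo` are made up, the "server" is a toy that answers any request with the same fixed text, and everything runs on the local machine (127.0.0.1) so that no real site is involved.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Toy 'application': accept one connection and answer with a fixed text."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(4096)  # the request arrives as plain bytes of text
        conn.sendall(
            b"HTTP/1.1 200 OK\r\n"
            b"Content-Type: text/plain\r\n"
            b"Content-Length: 2\r\n"
            b"Connection: close\r\n"
            b"\r\n"
            b"ok"
        )


def exchange(host: str, port: int, request: bytes) -> bytes:
    """Open a TCP connection, send the request text, read the reply until EOF."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def demo() -> bytes:
    """Run the toy server on a free local port and talk to it once."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        return exchange("127.0.0.1", port, b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
    finally:
        worker.join()
        listener.close()


reply = demo()
print(reply.decode("ascii"))
```

Nothing in the socket layer knows about HTTP here; both sides simply agree on the text format.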

Wikipedia has a reasonable definition of HTTP, but below I will try to highlight, in a simple way, the points most relevant to the question.

If you want to delve into specifics after understanding the basics, follow the link to the W3C, which took part in officially defining the protocol (the current specifications are published by the IETF as RFCs):

https://www.w3.org/Protocols/


Format

An HTTP message is in principle just a stream of text in the following format (each row of the table is a line of text, terminated by CR + LF):

            | REQUEST                      | RESPONSE
------------+------------------------------+--------------------------
HEADER      | METHOD PATH PROTOCOL/VERSION | PROTOCOL/VERSION STATUS
            | Header 1: value1             | Header 1: value1
            | Header 2: value2 ...         | Header 2: value2 ...
            | Header N: valueN             | Header N: valueN
blank line  |                              |
BODY        | REQUEST DATA                 | RESPONSE DATA
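As a rough sketch of the layout in the table, the helper below assembles a message from its parts: first line, headers, blank line, body. The function name `build_message` is my own invention for illustration, not part of any library.

```python
def build_message(start_line: str, headers: dict[str, str], body: bytes = b"") -> bytes:
    """Assemble an HTTP/1.x message: start line, headers, blank line, body."""
    lines = [start_line]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CR + LF; an empty line closes the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


request = build_message("GET / HTTP/1.1", {"Host": "www.exemplo.com.br"})
print(request)
```

The same helper works for a response; only the first line changes (for example `HTTP/1.1 200 OK`).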

Example

When you open a site such as www.exemplo.com.br in the browser, the browser resolves the address (translates www.exemplo.com.br into an IP address, using DNS).

It then connects via TCP to the IP obtained, by default on port 80 (the HTTP default; 443 is the HTTPS default). Once connected, it sends something like this, verbatim:

GET / HTTP/1.1
Host: www.exemplo.com.br

and that’s it. Assuming there’s a page at the requested address, you’ll get something like this back:

HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Content-Length: 351
Connection: close

<html>...

Note that it’s almost the same thing, except that the first line of the response carries the protocol, the status code and the status description, instead of the method, path and protocol.
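Reading a response is the mirror image of writing a request. The sketch below splits a raw response into status line, headers and body; `parse_response` is a hypothetical helper written for this answer, and it assumes the whole response is already in memory and that the status line has all three parts.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.x response into its parts (simplified sketch)."""
    # The first blank line separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # First line: PROTOCOL/VERSION STATUS DESCRIPTION
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, int(code), reason, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Content-Length: 13\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b"<html></html>"
)
version, code, reason, headers, body = parse_response(raw)
print(version, code, reason, headers, body)
```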


Methods

Now that we understand the basic part of the protocol, let’s see what changes when we have a POST instead of a GET. With POST we have some extra information to send and, as described above, we use a blank line to separate the body from the headers:

POST /formulariodeinscricao.html HTTP/1.1
Host: www.example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 22

nome=gato&senha=secret

Note that the keyword here is POST, on the first line, and that the body of the request (after the blank line) carries the values sent. Here I used the most common format for web forms; the format may vary from case to case.
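For illustration, this is roughly how a client could produce that form body in Python using the standard `urllib.parse.urlencode`. The `Content-Type` and `Content-Length` headers in the sketch are what a real client would normally add so that the server knows how to read the body; treat the exact set of headers as an assumption of this example.

```python
from urllib.parse import urlencode

fields = {"nome": "gato", "senha": "secret"}
body = urlencode(fields).encode("ascii")  # the usual web form encoding

request = (
    b"POST /formulariodeinscricao.html HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    + f"Content-Length: {len(body)}\r\n".encode("ascii")
    + b"\r\n"  # blank line: end of headers, start of body
    + body
)
print(request.decode("ascii"))
```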

If it were a file upload with PUT, it might look like this:

PUT /upload.html HTTP/1.1
Host: www.example.com

FILE_DATA..........


Test tool

There are some online tools that help a lot when investigating and "debugging" HTTP requests. One of them is the API below, whose many endpoints return a variety of information about the request, which makes it easy to test the sending and receiving parts of your application separately:

https://httpbin.org/
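A possible way to try it from Python, using only the standard library. This is a sketch: `build_post` and `send` are names I made up, I am assuming the service's `/post` endpoint (which echoes what it received as JSON), and the actual network call is left commented out so the snippet also runs offline.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_post(url: str, fields: dict[str, str]) -> urllib.request.Request:
    """Prepare a form POST without sending it yet."""
    body = urlencode(fields).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def send(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs network access)."""
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


req = build_post("https://httpbin.org/post", {"nome": "gato", "senha": "secret"})
print(req.get_method(), req.full_url, req.data)
# echoed = send(req)   # uncomment to actually contact the service
```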


More references

What is a "stateless protocol", like HTTP?

https://pt.wikipedia.org/wiki/Hypertext_Transfer_Protocol

21

Protocol

As Bacco already put it well in the comments, a protocol is a specification, so it doesn’t process anything.

pro·to·co·lo |ó| masculine noun

  1. Form.

  2. Minutes of conferences between plenipotentiary ministers of different nations, or between members of an international congress.

  3. Record in which the court clerk reports what happened at the hearing.

  4. Regulation observed in some public acts.

"protocolo", in Dicionário Priberam da Língua Portuguesa [online], 2008-2013, https://www.priberam.pt/dlpo/protocolo [accessed on 29-12-2016].

For our case, the fourth item fits best. It is a set of rules that indicates how computational entities will "talk" to each other. They need to speak the same language and use the right words in the right places in order to understand each other clearly, completely and unambiguously. "HTTP protocol" is a pleonasm, since the acronym stands for Hypertext Transfer Protocol (in Portuguese, Protocolo de Transferência de Hipertexto).

Specifically, HTTP governs the application layer of the network for web technologies. Its rules were developed in cooperation between the IETF and the World Wide Web Consortium, and are published today as IETF RFCs.

HTTP is basically plain text (HTTP/2 uses a binary framing) with a header saying what the message is, followed by the content that is actually being communicated (the headers are text, while the body may carry binary content, such as an image, as raw bytes). That is part of the protocol. What the header should contain, what is required, what is optional, what should be in a request and what should be in a response, even the definition of these terms: all of this is in the protocol. The rules also state what to do in case of errors, the format of each part, etc.

Of course among these rules is what to do with each field, each verb, the error codes, etc.

A list of the standardized fields is available on Wikipedia.

Example of request header:

GET /hello.htm HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko
Host: www.seusite.com
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: Keep-Alive

Note that the protocol does not care how the communication itself is carried out; that is the problem of another protocol, which will probably encapsulate the HTTP message. HTTP only takes care of what is shown above and below.

Example of HTTP response:

HTTP/1.1 200 OK
Date: Thu, 29 Dec 2016 12:02:52 GMT
Server: Apache/2.2.22 (Win32)
Last-Modified: Wed, 28 Dec 2016 13:16:38 GMT
Content-Length: 88
Content-Type: text/html
Connection: close

<html>
<body>
<h1>Hello, World!</h1>
</body>
</html>

Applying

An application that wants to conform to HTTP must follow these rules. In some cases the application and the environment it runs in need to handle other protocols in different layers; this layering is described by the OSI model.

The application takes this "text" and processes it. That is, it parses what it finds there and decides what to do with the information found (which would be the semantic analysis). There is also the sending of "text": there it just has to assemble the message according to the rules, putting everything in the required order, with the line breaks and the expected texts in place, etc. This will then be processed on the other side. In general you have a client that makes the requests and a server that provides the responses.
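A very reduced sketch of that idea on the server side: the application parses the first line, decides what to do based on the verb, and assembles a response according to the rules. `handle_request` and the behaviour chosen for each method are invented here purely for illustration; a real application decides for itself what GET or POST means for each path.

```python
def handle_request(raw: bytes) -> bytes:
    """Parse a raw request, decide what to do with it, and build a response."""
    # Parsing: separate headers from body and read the request line.
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line = head.split(b"\r\n", 1)[0].decode("ascii")
    method, path, version = request_line.split(" ")

    # "Semantic analysis": the application, not HTTP, gives meaning to the verb.
    if method == "GET":
        status, payload = "200 OK", f"you asked for {path}".encode("ascii")
    elif method == "POST":
        status, payload = "200 OK", b"received " + body
    else:
        status, payload = "501 Not Implemented", b""

    # Assembling the reply: status line, headers, blank line, body.
    head_out = (
        f"HTTP/1.1 {status}\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head_out.encode("ascii") + payload


print(handle_request(b"GET /hello.htm HTTP/1.1\r\nHost: www.seusite.com\r\n\r\n"))
```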

How the application does that is its own problem; it just needs to respect the rules.

More specific questions are welcome.

8

HTTP is the theoretical model for hypertext communication systems. By analogy, it is a list of rules that lets two foreigners communicate successfully. Communication as a whole is structured according to a model called OSI (Open Systems Interconnection). I believe HTTP acts on the third layer (correct me if I am wrong).

  • 7 Application layer;
  • 6 Presentation layer;
  • 5 Session layer;
  • 4 Transport layer;
  • 3 Network layer;
  • 2 Data link layer;
  • 1 Physical layer.

Your question implies two paths:

  • 1) How does the hardware process data communication?
  • 2) How does the server receive the data that the operating system receives from requests?

The first question is an electrical engineering matter, and I believe it is not part of the Stack Overflow culture. The second question is about understanding how the operating system works. There is the kernel, which communicates more closely with the hardware, and the auxiliary programs, which actually make the system 'operative'. The kernel receives the data from the hardware and passes it through its auxiliary programs until it reaches the seventh layer of the OSI model. The data is organized in packets, each with a header carrying destination and source information. It is at this point that GET/POST requests are accepted or refused for physical reasons. Acceptance or refusal of data sent via GET/POST for logical reasons should occur from the fifth layer onwards. Because of this delicate and direct relationship, the operating system is relevant to security. If I used any concept loosely from an engineering point of view, it was meant to be didactic. Feel free to edit and correct.

Recommended reading: https://en.wikipedia.org/wiki/OSI_model

  • 1

    No, HTTP is in the application layer... And this mix of hardware and kernel got a little confusing, at least for me...

  • Imagine you are using a program that communicates with the keyboard. Even a simple task like typing has software to interpret the action. Now imagine that another program wants to access the same memory address used by that keyboard, or even erase it. We would have a problem. Therefore, it is desirable that the O.S. has more access to the hardware than the applications do. Some high-level languages use pointers to refer to memory addresses. In assembly and other low-level languages, things can get really hairy.

  • What I meant is that it is unnecessary to mention the hardware, since the communication takes place in the application: both the request and the response occur in the application, and by that point in the process it is assumed that the O.S. has already done its part...

  • 1

    I mentioned the hardware because the author’s question was not very specific. When he says "behind the scenes", he leaves room for multiple interpretations. It suggests he is curious about what goes on in the lower layers. By speaking of "processing" HTTP requests, he may have had a false impression of the role of the HTTP protocol, thinking it works in the lower layers. I myself thought it acted in the fifth layer, and you corrected me. Knowing the role of the hardware and the O.S. can make it easier to understand that HTTP does not process anything.
