Is a route skipped because of a badly formatted parameter a syntax error?

10

For example, I have the route GET /user/{id}, where id is validated against the regular expression [0-9]+. When I make GET requests to the following URLs:

  • /user/17 returns the data of the user with id 17

  • /user/null returns status 404, as if the route did not exist

Would that be a syntax error? Should 400 be returned instead?

And what if, instead of an invalid parameter in the URL, the problem is a parameter in the body? For example, a route that needs two parameters, oldPassword and newPassword, but the request only has newPassword?

  • 5

    If the request is wrong/incomplete, 400. "Bad Request" exists precisely to say "look, the error was on your side, not on the server. Go fix it".

  • @Andersoncarloswoss so in both cases I cited, returning 404 would be incorrect?

  • In my understanding, yes. The route exists and has been found, it makes no sense to answer with 404.

4 answers

8


The URL is an opaque value by definition, meaning that it does not necessarily reflect the structure of your application; so much so that accessing /user/1 does not necessarily mean accessing the file /user/1/index.html. The URL may not represent the folder organization (or it may, as is common for static files).

That said, the final answer to your question is: it depends on the requirements of your application.

We cannot say what is right or wrong, because it can make sense in one application and not in another. Because the URL is opaque, it can represent different resources in different applications and thus require different responses.

Let's start with a counterexample to your question: we have an application where I can access a user's information both with /user/1 and with /user/anderson-carlos-woss. In the first case I gave the user's ID, while in the second I gave the name. In my application I guarantee that both are unique for each user. That is, accessing /user/null, for example, would make my application reach the correct resource (users); there would probably be a condition checking whether the value is a number: if it is, search by id, otherwise search by name. In this case, the user with the name null would be looked up in the database and, if not found, an error response would be generated.

Do you see that in this case the application's resource was identified correctly, the whole search was done, and the record simply was not found in the database? For this situation the appropriate response is 404 Not Found, because your application managed to process the request successfully; it just did not find the resource that the client requested.
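A minimal sketch of the lookup described above, assuming a hypothetical in-memory `users` list in place of the database and a hypothetical `findUser` helper (neither comes from a real application; the names and data are illustrative only):

```javascript
// Hypothetical in-memory data standing in for the database.
const users = [
  { id: 1, name: "anderson-carlos-woss" },
  { id: 17, name: "maria" },
];

// Resolve /user/{key}: an all-digit key is treated as an id, anything else as a name.
function findUser(key) {
  const record = /^[0-9]+$/.test(key)
    ? users.find(u => u.id === Number(key))
    : users.find(u => u.name === key);
  // The route matched and the search ran; a miss here is a genuine 404.
  return record
    ? { status: 200, body: record }
    : { status: 404, body: { error: `User ${key} not found` } };
}
```

Here `findUser("null")` is a perfectly valid request: it is searched by name and simply finds nothing, so 404 is the natural response.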

In your question you quote:

... id is validated against the regular expression [0-9]+

And that's what defines which one to use. Your application's requirements state that the id provided in the URL must be a non-negative integer. If the client requests /user/null, it is more helpful to tell them that the request is wrong and needs to be fixed before trying again. See, it's a problem in the request, not in the application. Errors in the request are reported with the 400 Bad Request response, which basically tells the client "man, your request makes no sense, I don't know what to do with it".

In short:

  • Reply with 404 Not Found when the route exists in that format, but the resource in question was not found;
  • Reply with 400 Bad Request when the request does not fit the format the route requires, that is, when there is no route in that format.
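The two rules above can be sketched as a single handler. This is only an illustration: `knownUsers` and `getUser` are hypothetical names, and a real application would query its database and use its framework's response objects.

```javascript
// Hypothetical data store; only ids 1 and 17 exist.
const knownUsers = new Map([[1, { id: 1 }], [17, { id: 17 }]]);

// Handler for GET /user/{id}, where id must match [0-9]+ in full.
function getUser(rawId) {
  if (!/^[0-9]+$/.test(rawId)) {
    // The request itself is malformed: tell the client to fix it.
    return { status: 400, body: { error: "id must be a non-negative integer" } };
  }
  const user = knownUsers.get(Number(rawId));
  if (!user) {
    // Well-formed request, but there is no such record.
    return { status: 404, body: { error: `User ${rawId} not found` } };
  }
  return { status: 200, body: user };
}
```

With this sketch, /user/17 yields 200, /user/null yields 400, and /user/99 yields 404.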

So is it wrong for me to send the 404 response in this case? No! It depends on your application. Some applications choose to send 404 even when the request is wrong, in order to hide the structure of the application itself. Let's say a user with bad intentions tries to attack the resource /user/null and receives a 400: he will know that the request is wrong and will keep attacking similar resources until he gets a different response; whereas if he gets a 404 he may conclude that the resource does not exist and give up the attack. It depends on what resource we are dealing with, what context it will be used in, and what the application requirements are.

My view is quite simple: the application should be protected against attacks on any resource, and it is not an HTTP response that will change that, so I always try to use the one that makes life easier for the (well-intentioned) client.

And what if, instead of an invalid parameter in the URL, the problem is a parameter in the body? For example, a route that needs two parameters, oldPassword and newPassword, but the request only has newPassword?

Same situation. Will your application know what to do when it only has one of the values? If yes, the request is valid. If not, the application cannot do anything with the request, so respond with 400 Bad Request.

Remember that it is not rude to tell the client that they are wrong. If they made a wrong request, let them know so that they can correct it.
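As a rough sketch of the body check, assuming the application cannot proceed without both fields (the `validatePasswordChange` helper and the error message format are hypothetical):

```javascript
// Hypothetical validator for a change-password request body.
function validatePasswordChange(body) {
  const data = body ?? {};
  const required = ["oldPassword", "newPassword"];
  // A field counts as missing when it is absent or not a non-empty string.
  const missing = required.filter(
    field => typeof data[field] !== "string" || data[field].length === 0
  );
  if (missing.length > 0) {
    // The application has nothing it can do with this request.
    return { status: 400, body: { error: `Missing fields: ${missing.join(", ")}` } };
  }
  return { status: 200, body: { ok: true } };
}
```

Naming the missing fields in the body tells the client exactly what to correct before retrying.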


Another way to analyze the problem is to check the definition of each response:

400 Bad Request

The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.

In other words, the server could not understand the request because its syntax is malformed, and the client should not repeat the request without modifying it.

That is, if I access /user/null today and get a 400, I know that every time I make this same request I will get the same response. If there is a possibility that tomorrow (or at some other time) the resource will exist, then the response should be 404, not 400. Ah, but what if one day I want to create this resource, should I already use 404? No. You respond according to how your application is today. The day you create the new resource, you change the response.

404 Not Found

The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.

In other words, the server did not find anything for the requested URI. No indication is given as to whether the condition is temporary or permanent. The 410 (Gone) response should be used if the server somehow knows that the resource existed and was permanently removed with no alternative address. The 404 response is commonly used when the server does not wish to reveal the actual reason the request was refused.

In this case, replying with 404 when accessing /user/null does not mean that one day this resource will exist. It merely indicates that today it does not exist and that it may exist one day. If the resource existed and was deleted, the application can respond with 410 (this is common in applications that use soft delete).
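A small sketch of the 404 versus 410 distinction with soft delete, assuming a hypothetical `deletedAt` column that is filled in when a record is removed (the `records` map and `statusFor` function are illustrative names):

```javascript
// Hypothetical records; a non-null `deletedAt` marks a soft-deleted row.
const records = new Map([
  [1, { id: 1, deletedAt: null }],
  [2, { id: 2, deletedAt: "2020-01-01" }],
]);

function statusFor(id) {
  const record = records.get(id);
  if (!record) return 404;          // does not exist today, may exist one day
  if (record.deletedAt) return 410; // existed and was permanently removed
  return 200;
}
```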

Also, as mentioned earlier, the 404 response is more generic and can be used on different occasions, either to avoid revealing information about the application or because there was no better response for the situation.


Another quite common question is how to tell, when receiving a 404, whether the route does not exist or the record does not exist. If I access /user/1 and get a 404, does it mean I should try another route, like /usuario/1, or is my route right and it is id 1 that does not exist in the database?

To make this distinction it is common to use the response body, sending a message with details about why the response was generated.

HTTP/1.1 404 Not Found
Content-Type: application/json; charset=utf-8

{"error": "Route not found"}

Or

HTTP/1.1 404 Not Found
Content-Type: application/json; charset=utf-8

{"error": "User 1 not found"}

You can even change the response's reason phrase freely, but do not depend on it to indicate the error, since it is much easier to work with the response body than with the reason phrase. If the client is something more visual for the user, like Postman, it might be interesting to use it:

HTTP/1.1 404 User Not Found
Content-Type: application/json; charset=utf-8

{"error": "User 1 not found"}

This can make things easier for the user, as it does not require them to parse a possible JSON in the response body.

  • Answered 250% of the question :)

  • @woss, in the case of sending only newPassword and not oldPassword, wouldn't it be a 422 Unprocessable Entity?

1

No, this behavior is correct: the route you gave does not exist. I made a quick script in JS to exemplify this, where I test the regular expression you are using to validate the route:

const paths_test = ["/user/17", "/user/null"];

paths_test.forEach(x => {
  // Logs the match result when the path contains digits, otherwise a "route not found" message
  console.log(x.match('[0-9]+') || `Route not found: ${x}`);
});

The path parameter null does not match your route /user/{id}, so the server returns 404: no route was found that satisfies the request /user/null, since the only existing one uses a regular expression to validate that the parameter is numeric.

The second question is related to the first, right? So I think the same logic applies.

  • Your answer seems confusing to me, it doesn’t say whether or not it’s right, it sounds like you’re saying "whatever"

  • @Guilhermecostamilam See if it’s clearer.

  • From what I could understand, you only tested for the existence of digits in the string. I believe a better validation would be x.match('/[0-9]+$'), which guarantees that the numbers must come right after the slash. Even so, the relationship between this match and the question was not very obvious to my layman's mind...
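To illustrate the difference raised in the comment above, here is a short sketch comparing the unanchored pattern with the suggested one; the function names `hasDigits` and `endsWithNumericSegment` are hypothetical:

```javascript
// Unanchored: true whenever the path contains at least one digit anywhere.
function hasDigits(path) {
  return new RegExp("[0-9]+").test(path);
}

// Anchored as suggested: digits must come right after a slash and run to the end of the path.
function endsWithNumericSegment(path) {
  return new RegExp("/[0-9]+$").test(path);
}
```

A path such as /user/a1 passes the first check but fails the second, while /user/null fails both.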

0

I would consider it a syntax error: 400 - Bad Request.

If you are searching for users by a certain identifier number, null is not a valid number. Therefore, this is considered a "bad request", that is, the client did not follow the correct format for the parameter and must make the necessary correction before trying again.

The 404 - Not Found does not make sense here, because the user identifier null is an invalid identifier before it can even be considered a user who does not exist (which in some ways is also true).

And what if, instead of an invalid parameter in the URL, the problem is a parameter in the body? For example, a route that needs two parameters, oldPassword and newPassword, but the request only has newPassword?

Excellent question. If both pieces of information are required, you can also return 400 - Bad Request. If I'm not mistaken, this is the default behavior of the Spring framework when the fields of the request body are annotated with @NotNull and the client did not fill them in.

Besides that, I would also make the following change to your endpoint: instead of /user/, use /users/.

  • 1

    A reason to return 404 would be to attempt some form of security through obscurity

  • @Jeffersonquesado, that could be a reason. Solutions may vary depending on the context, so much so that these discussions about which HTTP status is correct in certain situations exist in droves. What I finally learned is that the most important thing in an API is to keep the same standard, regardless of the decision made.

0

The 404 error is an HTTP response code indicating that the client was able to communicate with the server, but the server could not find what was requested. In your case it was possible to communicate with the server, but it could not find what you were looking for, which is a specific URL (a specific page); that is, the 404 response is correct because the route was not found (page not found).

The behavior is correct but the logic is wrong. An identifier cannot be null, so you should bring only valid IDs to build your URL, because the ID is a parameter used to build your route (at most, validate that the id is not null and that it is numeric). In other words, it makes no sense to build the page URL of a user that does not exist. If you create a table in the database and perform an insert, the ID field, as the primary key, is set to 1 rather than 0 or NULL.

There are contexts where we want to look up values where null fields may appear, but yours is not one of them. If it is a mandatory parameter for building a statement and it comes from a query, an array, or an input field, it cannot be null.

  • What if the URL was built by an attacker? Then it wouldn't be the author who assembled the URL, but it would still be grammatically valid as far as URLs are concerned

  • I don't understand your comment. What does "built by an attacker" mean? I am calling attention to "if there is no user id it is not possible to build the URL". So he has to bring all the valid IDs so that all routes are built without fail.

  • 1

    In this excerpt: "you should bring only valid IDs to build your URL". Here you are saying that the one who builds the URL is the author. However, this is usually not true. There are some more interesting questions from Guilherme, such as HTTP methods in practice, the response to OPTIONS, whether the DELETE method has a body, and getting the textual communication of an HTTP call. [1/3]

  • 1

    In all of them Guilherme goes beyond the typical browser requests (GET/POST) toward a more semantic approach, where the HTTP method matters so that it can be given special treatment. In particular, HTTP methods in practice sleeps at my bedside every day so I can build a good API at work the next day. And, since methods other than GET and POST are not trivially accessible from the browser, his client is an application. Therefore, it does not even make sense to speak of a "page". [2/3]

  • 1

    Since it is accessed by an application, a user who wants to obtain sensitive data can try to bypass the application in question by making the HTTP request directly, for example. [3/3]

  • 1

    On routes, why should it load all of them? Usually this takes more memory. It is enough to recognize that the route is /user/{id} for an arbitrary {id} and return the result of a computation that receives this {id} as an argument

  • All 4xx responses imply that the communication with the server was successful and that the request failed due to a client error, so much so that they are called "client errors". I do not understand how this can be an argument for saying that 404 is the correct one. I also didn't follow your comment about "building the URL"; it seems that you are assuming that only the URLs present on the page will be accessed and that none will be typed manually or generated by an attack script looking for application flaws. Would you disagree with anything in my answer?

  • I get what you mean. There is a script that builds the URL, so much so that it validates whether the ID field is numeric. If he does not want to load all the valid IDs, he only needs to add, in the part of the code that validates whether it is numeric, that it cannot be null either. As for a person entering malicious code, that is why he validates that it is numeric, and to avoid presenting the 404 error he must ensure that the id is not null (so there will be a route). I said so in my answer too.
