For the record, there is no perfect regular expression for validating URLs, and the blame lies partly with the RFCs. The same goes for any data that relies on an RFC.
PHP's filter functions do follow the relevant specifications, but they do not cover every case. For the rest, to avoid false positives, the restrictions can be adjusted to your needs through configuration flags, which gives you the flexibility each case requires.
Just for future reference: if the second argument is omitted, filter_var() by default treats the data as a plain string.
In your case, since you did not show how you are using it, I imagine you are doing this:
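To make the default behaviour concrete, here is a minimal sketch. The input string 'not a url' is just an illustrative value, not something taken from the question:

```php
<?php
// With no second argument, filter_var() falls back to FILTER_DEFAULT,
// so the value is handed back as a string with no URL check at all.
$raw = filter_var('not a url');

// With FILTER_VALIDATE_URL the same input is rejected.
$checked = filter_var('not a url', FILTER_VALIDATE_URL);

var_dump($raw);     // the original string, unchanged
var_dump($checked); // false
```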
filter_var( 'http://www.youtube.com', FILTER_VALIDATE_URL );
The first URL validates because it contains the main elements of a URL: the scheme, the domain and the TLD.
The second one also validates because it has the same three basic components, even though one of them is wrong.
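For the second URL to return FALSE as well, you would need to combine the filter with the FILTER_FLAG_SCHEME_REQUIRED flag (note that this flag was deprecated in PHP 7.3 and removed in PHP 8.0, where a scheme is always required by FILTER_VALIDATE_URL).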
The third URL is valid for the user and for the browser, but not for the RFC, because it lacks one of the basic components required by the specification.
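The three situations can be reproduced roughly as below. The question's actual URLs are not shown here, so the second and third values are my own stand-ins for "broken scheme" and "no scheme":

```php
<?php
// Stand-in values (assumptions): only the first one appears in this answer.
$valid        = filter_var('http://www.youtube.com', FILTER_VALIDATE_URL);
$brokenScheme = filter_var('htp://www.youtube.com', FILTER_VALIDATE_URL);
$noScheme     = filter_var('www.youtube.com', FILTER_VALIDATE_URL);

var_dump($valid);        // the URL string: scheme, domain and TLD are present
var_dump($brokenScheme); // also the URL string: "htp" is syntactically a scheme
var_dump($noScheme);     // false: there is no scheme at all
```

The filter only checks that the pieces are there and well-formed; it does not know that "htp" is a typo for "http".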
What you could do, as with everything that comes from the user, is sanitize the URL before you even validate it. A few things that occur to me:
- Check for broken schemes, as in the second URL, and deal with them, either by removing them or by fixing them when and if possible
- Add the default scheme http:// to the beginning of a URL that lacks one (or whose broken scheme has just been removed); after all, an FTP or HTTPS URL (or ED2K, Magnet, torrent...) that does not carry its specific prefix will not be treated as special anyway.
And always tell the user, through a hint in the GUI, that the expected format is http://domain.com. If they type it wrong and the system cannot fix it, the check fails; they were warned and will have to fill it in again.
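The sanitize-then-validate idea above could be sketched like this. normalizeUrl() is a hypothetical helper of mine, not a PHP function, and treating every unrecognized "something:/" prefix as a broken scheme is a simplifying assumption you may want to relax:

```php
<?php
/**
 * Hypothetical helper: tries to repair the scheme, then validates.
 * Returns the normalized URL as a string, or false if it cannot be fixed.
 */
function normalizeUrl(string $input)
{
    $url = trim($input);
    if ($url === '') {
        return false;
    }

    if (preg_match('~^(?:https?|ftp)://~i', $url)) {
        // Recognized, well-formed scheme: leave it alone.
    } elseif (preg_match('~^[a-z][a-z0-9+.-]*:/+~i', $url)) {
        // Looks like a scheme but is not one we recognize (e.g. "htp://",
        // "http:/"): assume it is broken and swap in the default.
        $url = preg_replace('~^[a-z][a-z0-9+.-]*:/+~i', 'http://', $url, 1);
    } else {
        // No scheme at all: add the default one.
        $url = 'http://' . $url;
    }

    // Only now hand the result to the real validator.
    return filter_var($url, FILTER_VALIDATE_URL);
}

var_dump(normalizeUrl('www.youtube.com'));
var_dump(normalizeUrl('htp://www.youtube.com'));
```

If the function returns false, that is the point where the error goes back to the user.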
Great lesson. I will read it carefully and then comment from the PC. I am already using the flag you mentioned... I forgot to put it in the question. Thanks
– Papa Charlie
+1... I use JS for the input mask, but the validation happens on the server side, which returns the messages... With so many possibilities, wouldn't you have to know all the rules to sanitize the URL input without making mistakes?
– Papa Charlie
As a programmer, I am quite pessimistic. I always think of the worst possible scenario, and at the end of each of those scenarios there is always a user (>.<). If deadlines and small performance losses are not a problem for you, it is worth focusing on RFC 3696 Section 4.2 (if I am not mistaken), and on the others it derives from if need be, and trying to cover as many cases as possible. If you (your program) cannot resolve it on its own then, unfortunately, return the error to the user; what else can you do.
– Bruno Augusto