preg_replace is stripping accented letters

1

$slug = preg_replace('/[^a-z0-9]+/i', '-', trim(strtolower($_POST["titulo"])));

Example:

I send: Esse é um titulo de uma página

It returns: esse-um-titulo-de-uma-p-gina

As you can see, it is stripping the accented letters.

It should be: esse-e-um-titulo-de-uma-pagina. Where is the problem?

2 answers

5

The range inside your [...] covers only a-z and 0-9, and a-z matches only the plain letters of the alphabet, not their accented variants. From my point of view, working with accents in URL slugs is a bad idea. It would be better to associate the slug with an ID and keep the URL text merely illustrative and without accents, which can be done with functions such as PHP's iconv, but this is my opinion.
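A minimal sketch of the accent-free approach mentioned above, assuming the input is UTF-8 and the mbstring and iconv extensions are loaded. The name slug_ascii is hypothetical, not from the original post. What //TRANSLIT produces for a given letter depends on the iconv implementation and the LC_CTYPE locale, so treat the result for accented input as something to check on your own server:

```php
<?php
// Hypothetical helper (not from the original post): builds an ASCII-only slug.
function slug_ascii(string $title): string
{
    // Lowercase with mbstring so multibyte letters such as "É" are handled.
    $text = mb_strtolower(trim($title), 'UTF-8');

    // Ask iconv to approximate non-ASCII characters with ASCII ones.
    // The exact replacement depends on the iconv library and the locale.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
    if ($ascii === false) {
        // Fall back to the original text; the regex below still replaces
        // every non-ASCII byte with a hyphen.
        $ascii = $text;
    }

    // Collapse every run of characters outside a-z and 0-9 into one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $ascii);

    // Drop hyphens left over at the start or end.
    return trim($slug, '-');
}
```

For example, slug_ascii('Hello World') returns 'hello-world'. Whatever iconv does with the accented letters, the returned slug contains only a-z, 0-9 and hyphens.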

I believe modern browsers handle UTF-8 in their URLs, so you could try the range \u00C0-\u00FF (as in Renan's answer: /a/15741/3635):

$slug = preg_replace('/[^a-z0-9\u00C0-\u00FF]+/ui', '-', trim(strtolower($_POST["titulo"])));

Some old browsers do not work with Unicode, so it may be better to use the first suggestion, without accents. That is a separate story, though; maybe in another question I will detail how to do something like this.

You can also use the answer from @bfavaretto: /a/15740/3635, but keep in mind that the accents may not work: if the .php file is saved as ANSI or iso-8859-1/windows-1252, it will certainly fail. If you still want to follow @bfavaretto's answer, I recommend reading this answer carefully first:

After reading and understanding how to use utf-8 properly you can try this:

$slug = preg_replace('/[^a-z0-9áàâãéèêíïóôõöúçñ]+/i', '-', trim(strtolower($_POST["titulo"])));
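Without the u modifier, PCRE reads both the pattern and the subject byte by byte, so an accented letter in the class is seen as two separate bytes, and strtolower does not lowercase multibyte letters reliably. A minimal sketch of a variant that treats everything as UTF-8, assuming the .php file is saved as UTF-8, the input is UTF-8, and the mbstring extension is loaded (the name slug_keep_accents is hypothetical):

```php
<?php
// Hypothetical helper: keeps a fixed set of accented letters in the slug.
function slug_keep_accents(string $title): string
{
    // mb_strtolower understands UTF-8; strtolower works byte by byte.
    $text = mb_strtolower(trim($title), 'UTF-8');

    // The "u" modifier makes PCRE read pattern and subject as UTF-8,
    // so "á" in the class is one character instead of two bytes.
    $slug = preg_replace('/[^a-z0-9áàâãéèêíïóôõöúçñ]+/u', '-', $text);

    // With "u", preg_replace returns null if the subject is not valid UTF-8.
    return $slug === null ? '' : trim($slug, '-');
}
```

With these assumptions, slug_keep_accents('Úm titulo página') returns 'úm-titulo-página'.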
  • I followed all the instructions in the link about utf-8, and they did not resolve this error: Úm titulo página gives the slug -m-titulo-p-?gina. I also tried to use \u00C0-\u00FF but I got an error: Warning: preg_replace(): Compilation failed: PCRE does not support \L, \l, \N{name}, \U, or \u at offset 9 I saw here in the O.R..

  • @Natalie sorry, I forgot the u flag; change /[^a-z0-9\u00C0-\u00FF]+/i to /[^a-z0-9\u00C0-\u00FF]+/ui

  • I will take your advice and use an ID in the URL. I followed all your instructions, but they did not solve the problem.

  • @Natalie didn't solve it how? That is too vague; please describe what failed.

  • I still get the same errors I described in my first comment on your answer.

  • @Natalie, please show me exactly what you did, because I posted more than one snippet. From what you wrote I can't tell which of them failed, whether it was one or all of them, or whether you did something wrong.

  • First I read your answer on utf-8 at this link: https://answall.com/a/43205/3635, and I used your code: $slug = preg_replace('/[^a-z0-9áàâãéèêíïóôõöúçñ]+/i', '-', trim(strtolower($_POST["titulo"])));, but the slug didn't turn out as expected: Úm titulo página = -m-titulo-p-?gina. Then I tried preg_replace('/[^a-z0-9\u00C0-\u00FF]+/ui', but when I submit the input data to the DB it shows an error: Warning: preg_replace(): Compilation failed: PCRE does not support \L, \l, \N{name}, \U, or \u at offset 9.

  • @Natalie but didn't '/[^a-z0-9\u00C0-\u00FF]+/ui' work?

  • No, it showed the error: Warning: preg_replace(): Compilation failed: PCRE does not support \L, \l, \N{name}, \U, or \u at offset 9.

  • @Natalie did you add the u after the / or not?

  • I didn't understand; I used the code that's in your answer: $slug = preg_replace('/[^a-z0-9\u00C0-\u00FF]+/ui', '-', trim(strtolower($_POST["titulo"])));

  • @Natalie two questions: 1. Which version of PHP? 2. Does the "Compilation failed" PCRE error occur with /[^a-z0-9\u00C0-\u00FF]+/ui?

  • 1 - PHP 7.2, 2 - Yes.
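The warning quoted in the comments says that PCRE does not accept the \u escape at all, with or without the u modifier. PCRE writes code points as \x{hhhh}, and that form needs the u modifier. A minimal sketch of the same range in that notation, assuming UTF-8 input and the mbstring extension (the name slug_latin1_range is hypothetical):

```php
<?php
// Hypothetical helper: same range as in the answer, using PCRE escapes.
function slug_latin1_range(string $title): string
{
    // mb_strtolower lowercases multibyte letters such as "Ú" correctly.
    $text = mb_strtolower(trim($title), 'UTF-8');

    // \x{00C0}-\x{00FF} is the PCRE spelling of the range U+00C0..U+00FF
    // and requires the "u" modifier. The pattern itself is pure ASCII,
    // so it does not depend on the encoding the .php file is saved in.
    $slug = preg_replace('/[^a-z0-9\x{00C0}-\x{00FF}]+/u', '-', $text);

    // With "u", preg_replace returns null if the subject is not valid UTF-8.
    return $slug === null ? '' : trim($slug, '-');
}
```

With these assumptions, slug_latin1_range('Úm titulo página') returns 'úm-titulo-página'. Note that the range U+00C0..U+00FF also contains × (U+00D7) and ÷ (U+00F7), so those two symbols would be kept as well.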


-3

preg_replace: performs a regular expression search and replace.

So basically, what '/[^a-z0-9]+/i' does is replace anything that is not within that range with '-'.

The right thing would be '/[^a-zà-ú0-9]+/i':

$slug = preg_replace('/[^a-zà-ú0-9]+/i', '-', trim(strtolower($_POST["titulo"])));
  • Dear Jonathan, please forgive me, the downvote is mine. It is nothing personal: I only downvote when there really is a technical problem in an answer, which is the case with yours. Accents are problematic because there are several encodings, the two most common on the web being utf-8 and windows-1252. The way you did it, there is no guarantee it will work; in fact, it can fail very easily. Before working with accents, one needs to understand how the encodings work [...]

  • [...] I hope you understand that this is a technical rather than a personal assessment and the downvote is an indication that this solution is not ideal and may cause problems for those who come to this question. I wish you a lovely afternoon.

  • 1

    Thanks for the comment. Feedback given this way can't be taken personally; I will try to improve my future answers. Cheers

  • 2

    Thank you Jon. Believe me, I always comment more or less like this, but even so some users have taken it very badly, including certain candidates in the current moderator election.
