Filter filenames in uppercase and with a particular snippet at the end




I have some files named after people, but some names are entirely uppercase, others entirely lowercase, and some are even mixed case.

From a list of filenames, I would like a regex that keeps only the names that are entirely uppercase and that do not contain the snippet - Cópia before the extension.

I can already detect the snippet at the end with the regex from William’s answer to another question I asked. What I need now is to combine a check that the filename is all uppercase with the negation of the regex from that linked answer, so that names containing the snippet are rejected.

To demonstrate what I want to do:

EDsON ARANTEs DO NASCIMENTO.jpg -> does not pass
EDSON ARANTES DO NASCIMENTO - Cópia.jpg -> does not pass
EDSON ARANTES DO NASCIMENTO. - Cópia - Cópia.jpg -> does not pass

The regex I’ve done so far was:

^([A-Z]{2,}+).*( - C[oó]pia\.[^.]+)$

but that passes all the cases above. I also found another answer on Stack Overflow in English (SOen), but I don’t know how to apply it. How do I adapt this regex so that only names that are entirely uppercase and lack the - Cópia snippet pass?

  • Take a look at this question, or this one.

2 answers


The solution I found was this:

^([^a-z]{1,}[A-Z]{2,}+)(?:(?! - C[oó]pia\.[^.]+).)+$

Basically it has two parts. The first group matches one or more characters that are not lowercase letters, followed by at least two uppercase letters in a row (strangely, it only works with this minimum of two; if I remove it, the regex does not work correctly). The second part negates the regex from William’s answer.
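As a rough check of the pattern above, here is a minimal Python sketch. It assumes Python's `re` module rather than whatever engine was used on regex101, so the possessive `{2,}+` is written as a plain `{2,}` (older Python versions reject possessive quantifiers). The first filename is an assumed passing case added for illustration; the other three come from the question. The `RELAXED` variant is also an illustration: it suggests why the minimum of two matters, since with `{1,}` the first group can end on the `C` of ` - Cópia` and the lookahead is then never tested where the snippet starts.

```python
import re

# The answer's pattern, with the possessive `{2,}+` relaxed to `{2,}` so it
# also compiles on older Python versions (an adaptation, not the original).
PATTERN = re.compile(r"^([^a-z]{1,}[A-Z]{2,})(?:(?! - C[oó]pia\.[^.]+).)+$")

# Illustrative variant: minimum of one uppercase letter instead of two.
RELAXED = re.compile(r"^([^a-z]{1,}[A-Z]{1,})(?:(?! - C[oó]pia\.[^.]+).)+$")

names = [
    "EDSON ARANTES DO NASCIMENTO.jpg",  # assumed passing case, not in the question
    "EDsON ARANTEs DO NASCIMENTO.jpg",
    "EDSON ARANTES DO NASCIMENTO - Cópia.jpg",
    "EDSON ARANTES DO NASCIMENTO. - Cópia - Cópia.jpg",
]


def passes(pattern, name):
    """Return True when the whole filename matches the pattern."""
    return pattern.match(name) is not None


results = {name: passes(PATTERN, name) for name in names}

for name, ok in results.items():
    print(f"{name} -> {'passes' if ok else 'does not pass'}")

# With only one uppercase letter required, the copy is wrongly accepted.
copy_name = "EDSON ARANTES DO NASCIMENTO - Cópia.jpg"
print("relaxed variant accepts the copy:", passes(RELAXED, copy_name))
```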

The validation can be checked on regex101.
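One caveat, offered as an observation rather than as part of the original answer: the pattern above only constrains the beginning of the name, so a made-up name such as `EDSon.jpg` still gets through under Python's `re`. Below is a stricter sketch, not the author's method. It assumes the extension is whatever follows the last dot, and that only unaccented ASCII lowercase letters need to be rejected (the asker says accents are not used in the names). The name `STRICT` and the sample filenames are hypothetical.

```python
import re

# The answer's pattern (possessive quantifier relaxed for older Python versions).
ANSWER = re.compile(r"^([^a-z]{1,}[A-Z]{2,})(?:(?! - C[oó]pia\.[^.]+).)+$")

# Stricter sketch: reject a trailing " - Cópia.<ext>" up front, then require
# that everything before the final ".<ext>" contains no ASCII lowercase letter.
STRICT = re.compile(r"^(?!.* - C[oó]pia\.[^.]+$)[^a-z]+\.[^.]+$")


def keep(name):
    """Return True for names that are uppercase and are not a ' - Cópia' copy."""
    return STRICT.match(name) is not None


samples = [
    "EDSON ARANTES DO NASCIMENTO.jpg",
    "EDsON ARANTEs DO NASCIMENTO.jpg",
    "EDSON ARANTES DO NASCIMENTO - Cópia.jpg",
    "EDSON ARANTES DO NASCIMENTO. - Cópia - Cópia.jpg",
    "EDSon.jpg",
]

for name in samples:
    loose = ANSWER.match(name) is not None
    print(f"{name}: answer pattern={loose}, strict sketch={keep(name)}")
```

The design choice here is to anchor the uppercase check to the whole stem instead of only its first characters, at the cost of assuming a single dot-separated extension.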


Regex expression


Substitution expression



Let’s break the regex into its parts:

1 (
2    ([A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ])
3    ([A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ]{3,})
4 )|
5 ([A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ])
  1. [A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ]: the character set, expanded to include the accents used in Portuguese plus a few extra letters.
  2. The first group, defined on lines 1 to 4 above, captures every word with 4 or more characters and splits it into two groups: $2 with the first letter of the word (which should become uppercase) and $3 with the rest of the word (which should become lowercase).
  3. The group $4, defined on line 5 above, captures every character not captured before (these belong to words with 3 or fewer characters).
  4. The substitution expression uses these groups and the special case-conversion markers:

    • \U\E: whatever is between \U and \E is converted to uppercase
    • \L\E: whatever is between \L and \E is converted to lowercase
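The steps above can be sketched in Python. The substitution expression itself did not survive on this page, so this is only an approximation of the described behaviour: Python's `re` has no `\U`/`\L` markers, so a replacement function is used instead, and lowercasing the leftover `$4` characters is an assumption. The names `LETTER`, `WORD_OR_CHAR`, `fix_case` and `normalize` are hypothetical.

```python
import re

# Character set taken from the answer.
LETTER = "A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ"

# Group 1: a word of 4+ letters, split into group 2 (first letter) and
# group 3 (rest). Group 4: any single letter not captured by group 1.
WORD_OR_CHAR = re.compile(rf"(([{LETTER}])([{LETTER}]{{3,}}))|([{LETTER}])")


def fix_case(match):
    """Replacement callback standing in for the \\U...\\E and \\L...\\E markers."""
    if match.group(1) is not None:
        # Long word: first letter uppercase, remainder lowercase.
        return match.group(2).upper() + match.group(3).lower()
    # Assumption: letters from short words (3 or fewer characters) are lowercased.
    return match.group(4).lower()


def normalize(name):
    """Apply the case fix to every letter run in the given name."""
    return WORD_OR_CHAR.sub(fix_case, name)


print(normalize("EDSON ARANTES DO NASCIMENTO"))
print(normalize("joão da silva"))
```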
  • Accents are irrelevant to my problem, because we don’t use them when renaming the files, so I didn’t mention them. And that regex lets through all the examples from the question that should not pass.

  • Articuno, I’m sorry, but it seems you’ve completely changed the question initially asked, I answered your initial question...

  • Your answer was posted after the change. When a question has no answers, changing it completely is perfectly acceptable and allowed. So much so that I have already posted the solution to the new problem. I had already solved the initial problem another way, so I reused the question.

  • Yes I know, I was writing the answer and I didn’t see the change... I’m just explaining why my answer didn’t answer your current question...

  • Ah, sorry then, but since I had already solved it and there were no answers, I decided to reuse the same question for another problem that came up. :/
