Filter filenames in uppercase and with a particular snippet at the end

4

I have some files named after people, but some names are entirely uppercase, others entirely lowercase, and some are mixed case.

I would like to write a regex that, given a list, keeps only the filenames that are entirely uppercase and do not contain the snippet " - Cópia" before the extension.

I can detect the snippet at the end with the regex from William's answer to another question I asked, but now I need to combine it with a regex that checks whether the filename is all uppercase, while negating the regex from the linked answer so that names containing the snippet are rejected.

To demonstrate what I want to do:

EDSON ARANTES DO NASCIMENTO.jpg -> passes
EDsON ARANTEs DO NASCIMENTO.jpg -> does not pass
EDSON ARANTES DO NASCIMENTO - Cópia.jpg -> does not pass
EDSON ARANTES DO NASCIMENTO. - Cópia - Cópia.jpg -> does not pass

The regex I have so far is:

^([A-Z]{2,}+).*( - C[oó]pia\.[^.]+)$

but it passes all the cases above. I also found another answer on SOen, but I don't know how to apply it. How do I adapt this regex so that only the first example passes?

  • Take a look at this question: https://answall.com/q/42172/129 or this https://answall.com/q/143866/129

2 answers

2


The solution I found was this:

^([^a-z]{1,}[A-Z]{2,}+)(?:(?! - C[oó]pia\.[^.]+).)+$

Basically there are two groups. The first does not allow lowercase letters, in any quantity, and then requires at least 2 uppercase letters in a row (strangely, it only works with this limit; if I remove it, the regex does not work correctly). The second group negates the regex from William's answer.

The validation can be checked on regex101.
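For comparison, here is a minimal Python sketch of the same filter. It is not the regex from this answer but an alternative formulation, assuming the name part must contain no `a-z` letters and that the extension is whatever follows the last dot. The pattern `UPPER_NO_COPY` and the function `filter_uppercase_originals` are hypothetical names introduced only for this illustration. A negative lookahead at the start rejects names ending in " - Cópia.<ext>", and the pattern avoids the possessive quantifier `{2,}+`, which Python's `re` module only accepts from version 3.11 onward.

```python
import re

# Hypothetical pattern (not the answer's regex):
#   (?!.* - C[oó]pia\.[^.]+$)  rejects names ending in " - Cópia.<ext>"
#   [^a-z]+                    name part with no unaccented lowercase letters
#   \.[^.]+$                   the extension, which may be lowercase
UPPER_NO_COPY = re.compile(r"^(?!.* - C[oó]pia\.[^.]+$)[^a-z]+\.[^.]+$")


def filter_uppercase_originals(filenames):
    """Return only the all-uppercase names that are not ' - Cópia' duplicates."""
    return [name for name in filenames if UPPER_NO_COPY.match(name)]


names = [
    "EDSON ARANTES DO NASCIMENTO.jpg",
    "EDsON ARANTEs DO NASCIMENTO.jpg",
    "EDSON ARANTES DO NASCIMENTO - Cópia.jpg",
    "EDSON ARANTES DO NASCIMENTO. - Cópia - Cópia.jpg",
]
print(filter_uppercase_originals(names))
# → ['EDSON ARANTES DO NASCIMENTO.jpg']
```

Note that `[^a-z]` only excludes unaccented lowercase letters, so an accented lowercase letter such as `é` in the name part would still pass; that seems acceptable here only because the files are named without accents.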

1

Regex expression

(([A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ])([A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ]{3,}))|([A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ])

Substitution expression

\U$2\E\L$3\E\L$4\E

Explanation

Let's break the regex into its parts:

1 (
2    ([A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ])
3    ([A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ]{3,})
4 )|
5 ([A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ])
  1. [A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ]: character set expanded to cover the accents used in Portuguese, plus a few bonus letters.
  2. The first group, defined on lines 1 to 4 above, captures every word with 4 or more characters and splits it into two groups: $2 with the first letter of the word (which will be converted to uppercase) and $3 with the rest of the word (which will be converted to lowercase).
  3. The group $4, defined on line 5 above, captures every letter not captured before (these belong to words with 3 or fewer characters).
  4. The substitution expression uses the groups and the special case-conversion escapes:

    • \U\E: whatever is between \U and \E is converted to uppercase
    • \L\E: whatever is between \L and \E is converted to lowercase
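Python's `re` module does not support the `\U`, `\L`, and `\E` escapes in replacement strings, so the substitution described above can be sketched there with a callback instead. This is only an illustration under that assumption; the names `LETTER`, `PATTERN`, `_recase`, and `title_case_long_words` are hypothetical and not part of the answer.

```python
import re

# Same character set as in the answer's regex.
LETTER = "A-Za-zÁÀÂÃÉÈÊÍÏÓÔÕÖÚÇÑáàâãéèêíïóôõöúçñ"

# Groups 2 and 3: first letter and rest of a word with 4+ letters.
# Group 4: any single letter left over (words with 3 or fewer letters).
PATTERN = re.compile(rf"(([{LETTER}])([{LETTER}]{{3,}}))|([{LETTER}])")


def _recase(match):
    """Callback standing in for the replacement \\U$2\\E\\L$3\\E\\L$4\\E."""
    if match.group(1) is not None:
        return match.group(2).upper() + match.group(3).lower()
    return match.group(4).lower()


def title_case_long_words(text):
    """Capitalize words of 4+ letters and lowercase shorter ones."""
    return PATTERN.sub(_recase, text)


print(title_case_long_words("EDSON ARANTES DO NASCIMENTO"))
# → Edson Arantes do Nascimento
```

Characters outside the set, such as spaces and dots, are never matched and are therefore left unchanged.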
  • 1

    Accents are irrelevant to my problem, because we don't use them when naming the files, which is why I didn't mention them. And this regex lets through all the examples from the question that I said should not pass.

  • Articuno, I'm sorry, but it seems you've completely changed the question you initially asked. I answered your initial question...

  • 1

    Your answer was posted after the change. When a question has no answers, changing it completely is perfectly acceptable and allowed. So much so that I have already posted the solution to the new problem. And I had already solved the initial problem another way, so I reused the question.

  • Yes, I know. I was writing the answer and didn't see the change... I'm just explaining why my answer doesn't address your current question...

  • Ah, sorry then, but since I had already solved it and there were no answers, I decided to reuse the question for another problem that came up. :/
