Select the desired value with a regex for two different patterns


I have the following input data:

INSTANCE-hostname:Sys
INSTANCE-hostname-INSTANCE_00:Ins

The input will always follow this pattern, and the only part that interests me is the hostname value, so my output should be:

hostname
hostname

I tested the regex [^-]*-([^:]+) on regex101 and Pythex, but it fails.

How can I get the desired output?

1 answer


Your regex fails because it only handles the case where a : comes right after the hostname, so in the second line the group runs all the way to the colon. You also need to handle the case where a - comes after the hostname:

import re

textos = [ 'INSTANCE-hostname:Sys', 'INSTANCE-hostname-INSTANCE_00:Ins' ]

r = re.compile(r'^[^-]+-([^-:]+)[-:]')
for texto in textos:
    m = r.match(texto)
    if m:
        print(m.group(1))

The anchor ^ indicates that the search starts at the beginning of the string.

The regex looks for one or more characters that are not hyphens ([^-]+), followed by a hyphen, followed by one or more characters that are neither a hyphen nor a colon ([^-:]+), followed by a hyphen or a colon ([-:]).

The part I want to capture (right after the first hyphen) is in parentheses to form a capture group. I can then retrieve this group with group(1) (I use 1 because it is the first pair of parentheses in the regex, and therefore the first group).

The output is:

hostname
hostname
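To make the difference concrete, here is a minimal sketch that runs the regex from the question and the one from this answer side by side on the same two strings. The helper first_group is a hypothetical name introduced only for this illustration.

```python
import re

samples = ['INSTANCE-hostname:Sys', 'INSTANCE-hostname-INSTANCE_00:Ins']

# Regex from the question: the group runs up to the first colon
original = re.compile(r'[^-]*-([^:]+)')
# Regex from the answer: the group stops at the first hyphen or colon
fixed = re.compile(r'^[^-]+-([^-:]+)[-:]')

def first_group(pattern, text):
    """Hypothetical helper: return group 1 of the match, or None if no match."""
    m = pattern.match(text)
    return m.group(1) if m else None

for text in samples:
    print(first_group(original, text), '|', first_group(fixed, text))
```

For the second string, the regex from the question captures hostname-INSTANCE_00, because [^:]+ also accepts hyphens, while the corrected one captures only hostname.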

If the part corresponding to "hostname" is always "one or more letters", you can also use:

r = re.compile(r'^[^-]+-([a-zA-Z]+)[-:]')

This can give different results depending on the strings you use, since [^-:]+ matches any character that is not - or : (which includes line breaks, digits, punctuation marks, spaces, etc.). With [a-zA-Z]+, you restrict the match to exactly what you need. The ideal with regex is to state, as far as possible, exactly what you want and what you don't want.
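As a rough illustration of that difference, the sketch below applies both variants to a made-up string whose hostname contains digits. The sample string and the helper extract are assumptions for this example, not part of the original data.

```python
import re

# Variant 1: anything except hyphen and colon
any_chars = re.compile(r'^[^-]+-([^-:]+)[-:]')
# Variant 2: ASCII letters only
letters_only = re.compile(r'^[^-]+-([a-zA-Z]+)[-:]')

def extract(pattern, text):
    """Hypothetical helper: return group 1 of the match, or None if no match."""
    m = pattern.match(text)
    return m.group(1) if m else None

sample = 'INSTANCE-host01:Sys'  # made-up input with digits in the hostname

print(extract(any_chars, sample))     # host01
print(extract(letters_only, sample))  # None
```

The letters-only variant rejects this string entirely, because the letters must be followed directly by - or :, and here they are followed by a digit. Whether that is the desired behavior depends on what your hostnames can contain.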

  • Your answer was very good. I could understand exactly where my regex was failing. Thank you very much for the help and for writing the Python script along with the regex!

  • @Luisv. Just a reminder: if the answer solved your problem, you can accept it. It is not mandatory, but it is good practice on the site, since it tells future visitors that the answer solved the problem.

  • Yes, I will accept it. I'm just waiting for the time limit to pass; I'm still blocked from accepting answers!

  • @Luisv. Oh yeah, I forgot there is a minimum waiting time before you can accept... :-)

  • Done, thank you very much!

  • One question: when I have a case like 'INSTANCE-hostname_INSTANCE_00:Ins', does the regex above still apply?

  • @Luisv. No, you'll have to add _ to the list of characters inside the brackets: https://regex101.com/r/QHgxpD/1/

  • Perfect, understood. Thank you very much, that cleared up all my doubts!
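A minimal sketch of what adding _ to the bracketed lists could look like. The exact pattern behind the regex101 link is not reproduced here, so the regex below is an assumption based on the comment, and extract is a hypothetical helper.

```python
import re

# Assumed variant: _ added to both character classes, so the hostname
# ends at the first hyphen, colon, or underscore
with_underscore = re.compile(r'^[^-]+-([^-:_]+)[-:_]')

def extract(pattern, text):
    """Hypothetical helper: return group 1 of the match, or None if no match."""
    m = pattern.match(text)
    return m.group(1) if m else None

samples = [
    'INSTANCE-hostname:Sys',
    'INSTANCE-hostname-INSTANCE_00:Ins',
    'INSTANCE-hostname_INSTANCE_00:Ins',
]

for text in samples:
    print(extract(with_underscore, text))
```

All three strings yield hostname with this variant. Note that it would also cut off a hostname that itself contains an underscore, so it only fits if _ never appears inside the hostname.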
