Separate a list of names by grade using regex

How can I solve this with a regex?

The file assets/grades.txt contains a list of people, one per line, together with their grades. Create a regex to generate a list of the students who received a grade of B.

File assets/grades.txt:

Ronald Mayr: A
Bell Kassulke: B
Jacqueline Rupp: A 
Alexander Zeller: C
Valentina Denk: C 
Simon Loidl: B 
Elias Jovanovic: B 
Stefanie Weninger: A 
Fabian Peer: C 
Hakim Botros: B
Emilie Lorentsen: B
Herman Karlsen: C
Nathalie Delacruz: C
Casey Hartman: C
Lily Walker : A
Gerard Wang: C
Tony Mcdowell: C
Jake Wood: B
Fatemeh Akhtar: B
Kim Weston: B
Nicholas Beatty: A
Kirsten Williams: C
Vaishali Surana: C
Coby Mccormack: C
Yasmin Dar: B
Romy Donnelly: A
Viswamitra Upandhye: B
Kendrick Hilpert: A
Killian Kaufman: B
Elwood Page: B
Mukti Patel: A
Emily Lesch: C
Elodie Booker: B
Jedd Kim: A
Annabel Davies: A
Adnan Chen: B
Jonathan Berg: C
Hank Spinka: B
Agnes Schneider: C
Kimberly Green: A
Lola-Rose Coates: C
Rose Christiansen: C
Shirley Hintz: C
Hannah Bayer: B

My attempt:

import re
def grades():
    with open ("assets/grades.txt", "r") as file:
        grades = file.read()
        #print(grades)
 
    # YOUR CODE HERE
    B_entities = re.finditer("(?P<name>[\w ]*):\s(?P<grade>B$)", grades)
    counter = 0
    for item in B_entities:
        print(item.groupdict(['name']))
        counter += 1
    print(counter)
 
    return B_entities
 
grades()
  • Hey, this community is in Portuguese. I recommend editing the question to translate it and to include the code you tried, since that is also part of the question.

1 answer

You don't need to read the entire contents of the file at once and then iterate through it with finditer. If each student is on a separate line, you can read one line at a time (for small files it may make no difference, but for larger files it will, because read() loads the entire contents of the file into memory).

To read line by line, simply iterate over the file with a for loop:

import re

r = re.compile('^([^:]+): B')
alunosComNotaB = []
with open('grades.txt') as arquivo:
    for linha in arquivo:
        match = r.match(linha)
        if match:
            alunosComNotaB.append(match.group(1))

print(alunosComNotaB)

In the regex I use the anchor ^ (which indicates the beginning of the line) and then [^:]+ (one or more characters other than :, which guarantees that I take everything up to the colon; I'm assuming the name doesn't contain a :). All of this is in parentheses to form a capturing group, so I can retrieve that part later with the group method.
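To make the capturing group concrete, here is a minimal sketch that applies the same pattern to two lines copied from the file. The variable names `comB` and `comA` are only illustrative and are not part of the code above.

```python
import re

r = re.compile(r'^([^:]+): B')

# A line with grade B (and a trailing space, as in the file)
comB = r.match('Simon Loidl: B \n')
# A line with a different grade: the pattern does not match
comA = r.match('Ronald Mayr: A\n')

print(comB.group(0))  # Simon Loidl: B  (the whole match)
print(comB.group(1))  # Simon Loidl     (only the capturing group)
print(comA)           # None
```

Since match returns None when the pattern does not apply, the `if match:` test in the loop is what filters out the other grades.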

Then I check whether the grade is "B" (here I assume that the : is always followed by a single space and then the grade). I put the B directly in the pattern because, if the grade is different, the regex finds no match and the if is not entered. But I could also have captured the grade in a more generic way:

# assuming the grade can range from "A" to "F"
r = re.compile(r'^([^:]+):\s*([A-F])')
alunosComNotaB = []
with open('grades.txt') as arquivo:
    for linha in arquivo:
        match = r.match(linha)
        if match and match.group(2) == 'B':
            alunosComNotaB.append(match.group(1))

I used \s* (zero or more whitespace characters) in case the file has any number of spaces after the :, and [A-F] to match grades from "A" to "F" (this is just an example; switch to whatever range makes the most sense).
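As a sketch of what the generic version makes possible, the same pattern can collect the names for every grade in one pass. The dictionary `alunosPorNota` and the list `amostra` are illustrative assumptions; with the real file you would iterate over `arquivo` instead. One line in the file ("Lily Walker : A") has a space before the colon, so the captured name is stripped here.

```python
import re
from collections import defaultdict

r = re.compile(r'^([^:]+):\s*([A-F])')

# Sample lines copied from grades.txt (stand-in for iterating over the file)
amostra = ['Bell Kassulke: B\n', 'Lily Walker : A\n', 'Fabian Peer: C \n', 'Jake Wood: B\n']

alunosPorNota = defaultdict(list)
for linha in amostra:
    match = r.match(linha)
    if match:
        # group(2) is the grade, group(1) the name; strip() drops the stray space in "Lily Walker "
        alunosPorNota[match.group(2)].append(match.group(1).strip())

print(alunosPorNota['B'])  # ['Bell Kassulke', 'Jake Wood']
print(alunosPorNota['A'])  # ['Lily Walker']
```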

But since in this case you explicitly want only the "B" grades, the first option is simpler. I would use [A-F] if I wanted to capture the grade whatever its value, but since I only want the ones with a "B", I don't need it.
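If you would still rather read the whole file at once, as in the question, finditer can work too, provided the pattern is compiled with re.MULTILINE so that ^ matches at the start of every line. A minimal sketch, with an inline string standing in for the result of file.read() (the name `conteudo` is illustrative):

```python
import re

# Stand-in for file.read(); note the trailing space after the second B
conteudo = 'Ronald Mayr: A\nBell Kassulke: B\nSimon Loidl: B \nFabian Peer: C \n'

# re.MULTILINE makes ^ match at the start of each line, not only at the start of the string.
# \n is excluded from the name so that a match can never span two lines.
r = re.compile(r'^([^:\n]+): B', re.MULTILINE)

# Build the list right away: the iterator returned by finditer can be consumed only once
alunosComNotaB = [m.group(1) for m in r.finditer(conteudo)]
print(alunosComNotaB)  # ['Bell Kassulke', 'Simon Loidl']
```

Leaving out a trailing $ after the B also avoids missing the lines that have a space after the grade.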


I also think that regex is an unnecessary complication in this case. If the format is always "name: grade", it seems simpler to me to use split:

alunosComNotaB = []
with open('grades.txt') as arquivo:
    for linha in arquivo:
        nome, nota = linha.split(':')
        if nota.strip() == 'B':
            alunosComNotaB.append(nome)

I used strip() to remove whitespace at the beginning and end (I noticed that some lines have a space after the grade); it also removes the newline at the end of each line.
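One caveat: unpacking the result of linha.split(':') raises ValueError if a line has no colon (a blank line at the end of the file, for instance) or more than one. A slightly more defensive sketch uses str.partition instead; the list `amostra` stands in for the file and is only illustrative.

```python
# Sample lines copied from grades.txt, plus a blank line
amostra = ['Bell Kassulke: B\n', 'Simon Loidl: B \n', 'Lily Walker : A\n', '\n']

alunosComNotaB = []
for linha in amostra:
    # partition splits at the first ':' and always returns three parts;
    # the separator comes back empty when the line has no colon
    nome, separador, nota = linha.partition(':')
    if separador and nota.strip() == 'B':
        alunosComNotaB.append(nome.strip())

print(alunosComNotaB)  # ['Bell Kassulke', 'Simon Loidl']
```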

  • Thank you so much for the full explanation; it gave me a good sense of the best approaches to the problem. Cheers.
