Finding students without a class - Python / Excel

1

I have a spreadsheet in Excel with all my students and their respective grades (5th to 9th).

I need to find all the students who aren’t in any class. I can do this with Excel’s filter, but I have almost 5 thousand registered students and I would like a quicker, more automatic way to do this work.

As I am starting to learn Python, I would like the solution to be in that language if possible.

I tried to use a for loop, but I guess I didn’t do it right, because it only returns the first name and the first grade of each student (almost like Excel’s PROCV, i.e. VLOOKUP).

Example

In the example image, the student Gabriela is one of the students I would have to filter out, because she is no longer in any class of any grade. The others have already completed some grades and/or still have others to complete, so each of them has an active enrollment ("Vigente") in some grade.

Thanks in advance for all your help! ;)

--

So far, I’ve written the following code:

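aluno = ["Pedro", "Gabriela", "Aluisio", "Deborah"]
# One enrollment status per student, in the same order as `aluno`
matricula = ["Vigente", "Sem turma", "Vigente", "Vigente"]

# Pair each student with their own status instead of comparing the whole list
for nome, status in zip(aluno, matricula):
    if status == "Vigente":
        print(f"Curso de {nome} em andamento.")
    else:
        print(f"Curso de {nome} concluído.")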

  • Post the Python code you’ve already written so we can help you with your difficulties.

  • Thanks, Augusto! I just edited the original post with the code I wrote so far.

  • This code cannot manipulate data in Excel. At a minimum you would have to use one of the libraries listed on this page. Or, if you’re using IronPython, you’d need the lines import clr and clr.AddReference("Microsoft.Office.Interop.Excel").

2 answers

2

Import the spreadsheet with pandas and filter on the Matricula column:


    # -*- coding: utf-8 -*-
    # import the library
    import pandas as pd
    # read the Excel file
    dados = pd.read_excel('dados.xlsx', sheet_name='Planilha1')
    # keep only the rows whose enrollment is active ("Vigente")
    df1 = dados.loc[dados['Matricula'] == "Vigente"]
    print(df1)

      Aluno  Serie Matricula
2     Pedro      7   Vigente
11  Aluisio      6   Vigente
15  Deborah      5   Vigente

2


In addition to importing the worksheet into a Python structure, preferably a DataFrame, as Isaac suggests in his answer, you also need to iterate over the data to extract only the students without a class. One way to do this:

import pandas as pd

# Load the spreadsheet into a pandas DataFrame
df = pd.read_excel('plan1.xls', sheet_name='Sheet1')

# Collect every enrollment status per student
# (assumes the columns are named 'Aluno' and 'Matricula')
data = {}
for _, row in df.iterrows():
    data.setdefault(row['Aluno'], []).append(row['Matricula'])

# Select the students with no active ("Vigente") enrollment
sem_turma = [aluno for aluno in data if 'Vigente' not in data[aluno]]

print(sem_turma)

Output:

['Gabriela']

Edit: another, more compact way to achieve the same result:

import pandas as pd

# Load the spreadsheet into a pandas DataFrame
df = pd.read_excel('plan1.xls', sheet_name='Sheet1')

# All student names, as a set
alunos = set(df['Aluno'])

# Students with at least one active ("Vigente") enrollment, as a set
vigentes = set(df.loc[df['Matricula'] == "Vigente", 'Aluno'])

# The difference is the students without a class
print(alunos - vigentes)

Output:

{'Gabriela'}
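
For anyone who wants to try the idea without the Excel file, here is a minimal sketch that builds a small DataFrame in memory and uses a groupby instead of sets. The sample rows and the helper name alunos_sem_turma are hypothetical; only the column names (Aluno, Serie, Matricula) and the "Vigente" and "Sem turma" labels come from this thread, and the real spreadsheet may use different status values.

```python
import pandas as pd

# Hypothetical sample rows standing in for the spreadsheet
df = pd.DataFrame({
    "Aluno": ["Pedro", "Pedro", "Gabriela", "Gabriela", "Aluisio", "Deborah"],
    "Serie": [6, 7, 5, 6, 6, 5],
    "Matricula": ["Sem turma", "Vigente", "Sem turma", "Sem turma",
                  "Vigente", "Vigente"],
})


def alunos_sem_turma(df):
    """Return the sorted names of students that have no 'Vigente' row."""
    # One boolean per student: True if any of their rows is "Vigente"
    tem_vigente = df["Matricula"].eq("Vigente").groupby(df["Aluno"]).any()
    # Keep only the students for whom that boolean is False
    return sorted(tem_vigente[~tem_vigente].index)


print(alunos_sem_turma(df))
```

With the real data, the DataFrame would presumably come from pd.read_excel as in the snippets above, and the function could be applied to it unchanged as long as the column names match.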
