Delete certain lines from a .csv file in Python

What would be the easiest way to delete only certain lines from a .csv file while keeping all the others?

In the .csv below, I would like to remove only the lines for the client (C) Cleiton and his films (F), which in this case are Cidade de Deus and Crepusculo.

    C;7;Josue;[email protected]
    F;Matrix ;3.5
    F;Avengers;3.1
    C;8;Cleiton;[email protected]
    F;Cidade de Deus;5.0
    F;Crepusculo;3.2
    C;9;Roberta;[email protected]
    F;Avengers;4.5
    C;10;Romulo;[email protected]
    F;Matrix;4.9
    F;Drive;4.0
    F;Tropa de Elite;3.5

After that I would take the data of all the other clients and overwrite the file with it. So far I have this part of the code:

import csv

contador = 0
# Read the whole file into memory first
with open("arquivo.csv", "r", newline="") as arq:
    reader = csv.reader(arq, delimiter=';')
    data = list(reader)

critico = input("-> ")
# Reopening in "w" mode empties the file before it is rewritten
with open("arquivo.csv", "w", newline="") as arq:
    writer = csv.writer(arq, delimiter=';')
    ident = None
    for line in data:
        if line[0] == 'C' and line[1] == critico:
            identificador = int(line[1])

            if line[0] == 'F':
                contador += 1

The code is not complete yet, but my idea is first to empty the file by opening it in 'w' mode and then write back all clients and movies except those of the user I choose (in this case, Cleiton).

My problem is knowing how many movies (F) a given user has rated, since a user can rate as many movies as they want (1, 2, 3, 4, ...). How can I count how many movies a user has registered?

1 answer


The idea of how to delete lines from a CSV file is fairly easy to understand. However, you are using your CSV file in a way so different from the usual one that I, at least, read the question a couple of times without answering, just thinking about the work of explaining the more appropriate way.

But here we go. First, your idea is right: there is no way to "modify an existing file" by deleting lines from the middle of it; you always have to create a new file and save all the desired content to it. In general it is best to do this in 3 steps: (1) open the old file and read it; (2) write the new data to a file with a different name; (3) finally, remove the original file and rename the new file to take its place. These 3 simple steps ensure that your data is not lost if the program stops midway because of an error of any kind: at any time you have either the original file or the completed new file.
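The three steps above can be sketched roughly as follows. This is only an illustration: `substitui_arquivo` is a hypothetical helper name, and it assumes the new content is already available as a list of text lines. It relies on `tempfile.mkstemp` to create the differently named file and on `os.replace` to do the "remove and rename" in one call.

```python
import os
import tempfile


def substitui_arquivo(nome_do_arquivo, linhas):
    """Write `linhas` to a temporary file, then swap it in for the original."""
    pasta = os.path.dirname(os.path.abspath(nome_do_arquivo))
    # Steps 1 and 2: the new content goes to a differently named file in the
    # same folder, so the final rename stays on a single filesystem.
    descritor, nome_temporario = tempfile.mkstemp(dir=pasta, suffix=".tmp")
    try:
        with os.fdopen(descritor, "w", encoding="utf-8", newline="") as arq:
            arq.writelines(linhas)
        # Step 3: os.replace drops the original and renames in one call.
        os.replace(nome_temporario, nome_do_arquivo)
    except BaseException:
        # Something failed before the swap: the original is still intact.
        if os.path.exists(nome_temporario):
            os.remove(nome_temporario)
        raise
```

If the program dies while writing, only the temporary file is affected; the original is replaced only after the new content has been written and closed.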

From there on, the peculiarities of this file begin: normally a CSV file is a single table, with all lines having the same structure. In this case, you have created two different types of lines that mean different things. The simplest way to keep this data in CSV is to put all the data on each line (i.e., denormalized).
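As a rough sketch of what "all the data on each line" could look like, the code below flattens the two kinds of rows into one row per film, repeating the client's fields each time. The function name `desnormaliza` and the `example.com` addresses are made up for the illustration; they are not from the question.

```python
import csv
import io

# A small sample in the question's layout (placeholder e-mail addresses).
ORIGINAL = """\
C;7;Josue;josue@example.com
F;Matrix;3.5
F;Avengers;3.1
C;9;Roberta;roberta@example.com
F;Avengers;4.5
"""


def desnormaliza(linhas):
    """Turn C/F rows into flat rows: client fields plus one film per row."""
    planas = []
    cliente = None
    for linha in linhas:
        if linha[0] == "C":
            cliente = linha[1:]  # codigo, nome, email
        else:
            planas.append(cliente + linha[1:])  # + titulo, nota
    return planas


linhas = list(csv.reader(io.StringIO(ORIGINAL), delimiter=";"))
for linha in desnormaliza(linhas):
    print(";".join(linha))
```

With rows like these, removing a client is a plain filter, for example `[l for l in planas if l[1] != "Cleiton"]`. One cost of this layout is that a client with no films has no row at all.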

But it is also worth asking: will piling more features on top of a CSV structure solve your problem? If you want to associate "critic" objects with "movie" objects, you have a relationship, and a relational database may well be the best fit for you.
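To give an idea of what the relational route might look like, here is a minimal sketch using the `sqlite3` module from the standard library. The table and column names (`critico`, `filme`, `critico_codigo`) are assumptions chosen for the example, and the database lives only in memory.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # in-memory database, just for the sketch
con.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for the cascade
con.execute("CREATE TABLE critico (codigo INTEGER PRIMARY KEY, nome TEXT)")
con.execute(
    "CREATE TABLE filme ("
    " critico_codigo INTEGER REFERENCES critico(codigo) ON DELETE CASCADE,"
    " titulo TEXT, nota REAL)"
)
con.executemany("INSERT INTO critico VALUES (?, ?)", [(7, "Josue"), (8, "Cleiton")])
con.executemany(
    "INSERT INTO filme VALUES (?, ?, ?)",
    [(7, "Matrix", 3.5), (7, "Avengers", 3.1),
     (8, "Cidade de Deus", 5.0), (8, "Crepusculo", 3.2)],
)

# How many films has each critic rated?
contagem = dict(con.execute(
    "SELECT c.nome, COUNT(f.titulo) FROM critico c"
    " LEFT JOIN filme f ON f.critico_codigo = c.codigo GROUP BY c.codigo"
))

# Deleting the critic also deletes his films, because of the cascade.
con.execute("DELETE FROM critico WHERE nome = ?", ("Cleiton",))
restantes = con.execute("SELECT COUNT(*) FROM filme").fetchone()[0]
```

Here both the counting and the deletion become single statements, which is the kind of work the database takes off your hands.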

If this file is never going to get big compared to the memory of the computer that runs the program, you can most likely have a program that works with all the data in memory the whole time and only saves the content when convenient. For scale: a modern PC has about 2 GB of memory and the entire Bible as text takes about 3 MB, so the equivalent of the Bible (~1000 pages in small print) of textual data would occupy about 0.15% of the computer's memory.

In this case, you have to save in a format that can be read back in a later run of the program: it can be an unconventional CSV like the one you have, or a serialized format such as JSON, whose syntax is similar to the dictionaries we use in Python and which can be read and edited directly in a text editor. JSON has two advantages here: all the data can be read from the file, or written to it, in a single function call; and, more importantly, a JSON file preserves the hierarchical structure of your information, so you can use the same structure in memory.
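A small sketch of those two points, using a hand-written sample rather than the real file: the nested list of dictionaries goes to JSON text and comes back with one call each, and the films stay nested under their critic.

```python
import json

# Sample data typed by hand for the illustration.
dados = [
    {
        "codigo": "7",
        "nome": "Josue",
        "filmes": [
            {"titulo": "Matrix", "nota": 3.5},
            {"titulo": "Avengers", "nota": 3.1},
        ],
    },
]

texto = json.dumps(dados, indent=4)  # the whole structure in one call
de_volta = json.loads(texto)         # and back again in one call
print(texto)
```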

Let's work with a representation of your data as a list of dictionaries in memory: this lets you write a program that does more operations on your objects. To stay on the subject of the question, I include functions that read your current CSV file into this structure and save the structure back in a format like the one you have:

import csv
import json
import sys


def le_dados_csv(nome_do_arquivo):
    dados = []
    # newline="" is what the csv module documentation recommends
    with open(nome_do_arquivo, newline="") as arq:
        leitor = csv.reader(arq, delimiter=";")
        critico = None
        for linha in leitor:
            if not linha:  # skip blank lines
                continue
            if linha[0] == "C":  # Data for a new critic
                critico = {}
                critico["codigo"] = linha[1]
                critico["nome"] = linha[2]
                critico["email"] = linha[3]
                critico["filmes"] = []
                dados.append(critico)
            else:  # This line describes a film of the most recent critic
                filme = {}
                filme["titulo"] = linha[1]
                filme["nota"] = float(linha[2])
                # append this film's information to the
                # most recent critic's list of films
                critico["filmes"].append(filme)
    return dados


def grava_dados_csv(nome_do_arquivo, dados):
    with open(nome_do_arquivo, "wt", newline="") as arq:
        escritor = csv.writer(arq, delimiter=";")

        for critico in dados:
            escritor.writerow(("C", critico["codigo"], critico["nome"], critico["email"]))
            for filme in critico["filmes"]:
                escritor.writerow(("F", filme["titulo"], "{:.01f}".format(filme["nota"])))



def grava_dados_json(nome_do_arquivo, dados):
    """Write the program's data as a JSON structure"""
    with open(nome_do_arquivo, "wt") as arq:
        json.dump(dados, arq, indent=4, separators=(',', ': '))


def le_dados_json(nome_do_arquivo):
    """Read the program's data back from a JSON file"""
    with open(nome_do_arquivo) as arq:
        dados = json.load(arq)
    return dados


def remove_critico(dados, nome_critico):
    for indice, critico in enumerate(dados):
        if critico["nome"] == nome_critico:
            del dados[indice]
            break
    else:
        # The else clause of a "for" in Python runs when the "for"
        # finished without being interrupted by a "break".
        print("Critic {} not found".format(nome_critico), file=sys.stderr)


# And finally, a "main" function that does what you describe
# in the question. I believe it is easy to extend your program
# from here:
def principal():
    dados = le_dados_csv("arquivo.csv")
    remove_critico(dados, "Cleiton")
    grava_dados_csv("arquivo.csv", dados)


# Run the main function only if this Python file
# is executed as the main program.
# This lets other .py files import this file
# and use the reading and writing functions
# normally

if __name__ == "__main__":
    principal()
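
With the data in this structure, the original question (how many films has a given critic rated?) comes down to taking the length of a list. A minimal sketch follows; `conta_filmes` is a hypothetical helper that is not part of the code above, and the sample data is typed inline instead of being read from arquivo.csv.

```python
def conta_filmes(dados, nome_critico):
    """Return how many films the named critic has rated, or None if absent."""
    for critico in dados:
        if critico["nome"] == nome_critico:
            return len(critico["filmes"])
    return None


# Sample data in the same shape that le_dados_csv builds.
dados = [
    {"codigo": "8", "nome": "Cleiton", "filmes": [
        {"titulo": "Cidade de Deus", "nota": 5.0},
        {"titulo": "Crepusculo", "nota": 3.2},
    ]},
    {"codigo": "9", "nome": "Roberta", "filmes": [
        {"titulo": "Avengers", "nota": 4.5},
    ]},
]

print(conta_filmes(dados, "Cleiton"))  # prints 2
```

Because each critic carries his own list of films, there is no need to count "F" lines while scanning the file.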
