The first thing to keep in mind is that: a CSV file is not "modifiable" - your program will have to generate another CSV file, with the same template, and the most entered data.
In your example it is easy to understand why you cannot modify the file: the data lines have 60 characters - which correspond to 60 bytes in the file (may correspond to a little more bytes depending on the encoding and the presence of special characters, but it is not the case now). When adding a fourth column, the row size is 75 characters/bytes - it turns out that in the original file, these 15 extra bytes are already in the next line, So this next line, including the line markers, would be over-written by the extra data. Two lines would merge into one, part of the data would be lost, etc...
On the other hand, creating a new file is very smooth and efficient - you can use CSV files peacefully in operations with up to hundreds of thousands of records, as long as they are not done frequently (e.g. once per request in a web application). Of course from a certain point it is better to keep everything in database.
What is not so direct, from Python, is to generate the CSV file as "cute" as those of the examples, with the data centered on the columns, etc... but it is possible also (a tip in this case is to use the library Rows, and generate "text tables" and not in CSV).
This constraint on formatting is easily manageable as well, if it is part of the requirements - simply format each field with the string functions when writing the file.
Reading and writing the CSV file as direct text is very easy in Python - but it is interesting to note the module csv that can handle "corner-cases" automatically: if the separator character appears inside a text cell, or if some cell contains the character "
, normally used to delimit text, or even the word wrap character \n
.
The algorithm itself, to detect the zeros then is trivial - just keep a counter, zeroed at each occurrence of a non-zero value at the desired position, and incremented at each new line.
As it is easier to understand, I will give example of a version opening the files directly, without the module csv
python. But in production it is interesting to look at the same, especially if at any point in the future the archives may have some of the "pathogenic" cases I quoted in their lines.
from pathlib import Path
def main(filepath):
inp_path = Path(filepath)
out_path = Path(filepath + "_new")
with inp_path.open() as inp_file, out_path.open("wt") as out_file:
headers = next(inp_file).strip().split('|')
new_headers = headers[:] + 'Análise'
# O seu arquivo de exemplo tem os títulos das colunas todos bagunçados:
# em alguns casos é centrado com um espaço de folga, a data
# é centrada com vários espaços de folga, algumas colunas são alinhadas
# a esquerda, outras centradas, e outras com alinhadas a esquerda com
# espaços. Como não é possível colocar isso no código sem as coisas
# específicas de cada coluna, pode valer a pena simplesmetne
# escrever o cabeçalho todo como uma string fixa. Isso é bem pouco elegante,
# mas seria:
# out_file.write("Index | Ponto de Medição | Data | Hora |ZeroApply |Análise\n")
# melhor é abrir mão da formatação maluca para cada coluna, e usar, por exemplo,
# nomes centralizados em 12 espaços (o que permite manter a formatação da data como está)
out_file.write("|".join(header.strip().center(12) for header in headers) + "\n")
zero_sequence_counter = 0
for line in inp_file:
fields = line.split("|")
zero_apply = int(fields[-1].strip())
verify_field = ""
if zero_apply == 0:
zero_sequence_counter += 1
if zero_sequence_counter >= 3:
verify_field = "Verificar"
else:
zero_sequence_counter = 0
fields.append(verify_field)
out_file.write("|".join(field.strip().center(12) for field in fields) + \n)
# opcional: remover o arquivo original e renomear o novo arquivo:
inp_path.unlink()
out_path.rename(inp_path)
main("nome_do_arquivo.csv")
One detail of this example is that I make extensive use of the class pathlib.Path
for all file-related operations. As it is a relatively recent feature in the language, almost every example you find of Python will manipulate arches without using it - using string manipulation functions for the filename, and the open
built-in to open a file.
Other than that, what you have there is the use of methods .split
, .center
, strip
and .join
of strings, and +
to concatenate strings - and int
to make use of the column as a numerical value - but nor would this conversion be necessary. The algorithm itself, as described above, is quite simple.
You can put the csv link?
– Wictor Chaves
Hello Felipe. The problem is clear. But, what have you tried to do? What is your difficulty? You have a better chance of getting help if you explain your difficulty and don’t expect someone to just do it for you.
– Luiz Vieira