How to sum specific character ranges across several lines?

1

I have a file with several lines and I want to sum the values found in a certain character range across all lines. To simplify, take this example:

AAA0000011111000090011
BBB0000011111000080011
CCC0000011111000070011

There are several more lines and fields, but I know the range I want to sum. To add up "00009", "00008" and "00007", which start at position 14 and end at position 18, what is the most efficient way? I also want to add up "0011", "0011" and "0011". I want to write the sums to a new file.

The result would be a new file with the sums:

24
33

I thought as follows:

file_name = teste.GS3
arquivo = open(file_name, "r")
TOTAL_CHARGEABLE_UNITS = 0
DATA_VOLUME_OUTGOING = 0

i = True
for line in arquivo:
    if i:  # to skip the first line
        i = False
        continue
    TOTAL_CHARGEABLE_UNITS = TOTAL_CHARGEABLE_UNITS + sum(line[14:18] + ...)
    DATA_VOLUME_OUTGOING = DATA_VOLUME_OUTGOING + sum(line[19:22] + ...)
arquivo.close()

arquivo = open("/dados/cdrs-roaming/resultado.txt", "w")
arquivo.writelines([TOTAL_CHARGEABLE_UNITS],[DATA_VOLUME_OUTGOING])
arquivo.close

TOTAL_CHARGEABLE_UNITS and DATA_VOLUME_OUTGOING are the desired fields in this case. I am a beginner, so I had no idea how to write this sum. Any ideas?

Detail: I believe I should convert these fields/ranges to float, but I don't know how to do this without converting the entire file.

  • 2

    One thing that can help is to create a high-level object from each line. Then, with a list of these objects, you operate on their properties. It's not the best option if this is something you need to do once and never look at again, but if this data is the basis of a larger project, it might be worth it. Take a look at this answer: https://answall.com/questions/399778/como-extrair-as-informa%C3%A7%C3%b5es-de-um-file-cnab-using-python/400033#400033

2 answers

4


As you are going through one line at a time, you will only have one value each iteration, so it makes no sense to use sum. Just update variables with each iteration:

total_chargeable_units = 0
data_volume_outgoing = 0
with open('arq.txt') as arq:
    # if you want to skip the first line
    next(arq)

    for linha in arq:
        total_chargeable_units += int(linha[13:18])
        data_volume_outgoing += int(linha[18:22])

If you want to skip the first line, just call next on the file before the for loop and discard the return value, so the first line is ignored.
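
To illustrate how next consumes the first line, here is a minimal sketch. It uses io.StringIO as a stand-in for a real file, and the HEADER line is made up for the example:

```python
import io

# io.StringIO stands in for a real file; the HEADER line is hypothetical
arq = io.StringIO("HEADER\nAAA0000011111000090011\nBBB0000011111000080011\n")

next(arq)  # reads the first line and throws it away

# the loop continues from the second line onwards
remaining = [linha.rstrip("\n") for linha in arq]
print(remaining)
```

If the file might be empty, next(arq, None) avoids the StopIteration that plain next(arq) would raise.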

I converted the values to int because they are integers (I saw no need to use float), and inside the loop I update the sums. Note that in the slice I used 13:18: the first position is zero, so the fourteenth character is at index 13, and the end index (18 in this case) is not included.
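
The index arithmetic can be checked on the first sample line from the question. In this small sketch the helper campo is hypothetical, introduced only to show the conversion from 1-based positions to a slice:

```python
linha = "AAA0000011111000090011"

# 1-based positions 14..18 correspond to the slice [13:18]
units = linha[13:18]
# 1-based positions 19..22 correspond to the slice [18:22]
volume = linha[18:22]

print(units, int(units))
print(volume, int(volume))


def campo(linha, primeira, ultima):
    """Hypothetical helper: take 1-based inclusive positions, return the text."""
    return linha[primeira - 1:ultima]


print(campo(linha, 14, 18) == units)
```

int accepts the leading zeros, so there is no need to strip them first.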

Then just write to the results file:

with open('resultado.txt', 'w') as out:
    out.write(f'{total_chargeable_units}\n{data_volume_outgoing}')

For both reading and writing the files I used with, which ensures that the file is closed at the end of the block (so you don't need to call close).
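
A quick way to see this behaviour is to inspect the closed attribute after the block ends. This sketch writes to a temporary directory, chosen only so the example does not touch any real file:

```python
import os
import tempfile

# temporary path used only for this demonstration
caminho = os.path.join(tempfile.mkdtemp(), "resultado.txt")

with open(caminho, "w") as out:
    out.write("24\n33")

# the file object still exists here, but it has already been closed
print(out.closed)

with open(caminho) as arq:
    conteudo = arq.read()
print(conteudo)
```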


If you want, you can also generalize, creating a dictionary with the names of the variables and their respective positions:

# stores the start and end positions of each variable
posicoes = {
    'total_chargeable_units': (13, 18),
    'data_volume_outgoing': (18, 22)
}

results = {}
with open('arq.txt') as arq:
    # if you want to skip the first line
    next(arq)

    for linha in arq:
        # reads all the positions and updates the value of each variable
        for variavel, (inicio, fim) in posicoes.items():
            try:
                results[variavel] = results.get(variavel, 0) + int(linha[inicio:fim])
            except ValueError:
                # if there is no number there, show an error message
                print(f'Value at positions [{inicio}:{fim}] is not a number')

# writes everything to the file
with open('resultado.txt', 'w') as out:
    for qtd in results.values():
        out.write(f'{qtd}\n')

I also included a check for the case where a line does not have a number in those positions.

The result will be another dictionary, containing the names of the variables and their respective totals.
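
As a rough check of the dictionary approach, the sketch below applies it to the three sample lines from the question, without skipping a header. The function name somar_campos is an assumption made for this example, and it leaves out the ValueError handling to stay short:

```python
def somar_campos(linhas, posicoes):
    """Sum the integer found at each (start, end) slice over all lines."""
    totais = {nome: 0 for nome in posicoes}
    for linha in linhas:
        for nome, (inicio, fim) in posicoes.items():
            totais[nome] += int(linha[inicio:fim])
    return totais


# the three sample lines from the question
linhas = [
    "AAA0000011111000090011",
    "BBB0000011111000080011",
    "CCC0000011111000070011",
]
posicoes = {
    "total_chargeable_units": (13, 18),
    "data_volume_outgoing": (18, 22),
}

totais = somar_campos(linhas, posicoes)
print(totais)  # {'total_chargeable_units': 24, 'data_volume_outgoing': 33}
```

The totals match the 24 and 33 expected in the question. Adding another field only requires a new entry in posicoes.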

  • It helped a lot! I did not adopt your second suggestion, but your idea led to the code I posted as an answer.

0

Based on the answer by hkotsubo:

It helped a lot! I did not adopt the second suggestion, but the idea led me to this:

sequencia = ""
imsi = ""
msisdn = ""
operadora = ""
total_chargeable_units = 0
data_volume_outgoing = 0
gprs_apn_point = ""
with open('teste.GS3') as arq:
    # if you want to skip the first line
    next(arq)

    for linha in arq:
        if imsi == linha[3:18]:
            # same IMSI as the previous line: keep accumulating
            msisdn = linha[188:209]
            operadora = linha[25:30]
            total_chargeable_units += int(linha[147:159])
            data_volume_outgoing += int(linha[159:171])
            gprs_apn_point = linha[237:300]
        else:
            # new IMSI: emit the finished record (there is none before the first line)
            if imsi:
                sequencia = sequencia + imsi + ' ' + msisdn + ' ' + operadora + ' ' + str(total_chargeable_units) + ' ' + str(data_volume_outgoing) + ' ' + gprs_apn_point + '\n'
            imsi = linha[3:18]
            msisdn = linha[188:209]
            operadora = linha[25:30]
            # start the new totals from this line's values so they are not lost
            total_chargeable_units = int(linha[147:159])
            data_volume_outgoing = int(linha[159:171])
            gprs_apn_point = linha[237:300]

    # emit the last record, which the loop never flushes
    sequencia = sequencia + imsi + ' ' + msisdn + ' ' + operadora + ' ' + str(total_chargeable_units) + ' ' + str(data_volume_outgoing) + ' ' + gprs_apn_point + '\n'

with open('resultado.txt', 'w') as out:
    out.write(sequencia)
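

The "accumulate while the key stays the same" pattern used above can also be written with itertools.groupby from the standard library. This is a different technique from the code above, shown only as a sketch: the short layout (key in [0:3], value in [3:8]) and the sample lines are made up for the example, and, like the loop above, it assumes that lines with the same key are consecutive:

```python
from itertools import groupby

# hypothetical short layout: key in [0:3], value in [3:8]
linhas = ["AAA00001", "AAA00002", "BBB00005"]

resultado = []
# groupby yields one group per run of consecutive lines with the same key
for chave, grupo in groupby(linhas, key=lambda linha: linha[0:3]):
    total = sum(int(linha[3:8]) for linha in grupo)
    resultado.append(f"{chave} {total}")

print(resultado)  # ['AAA 3', 'BBB 5']
```

Here sum is appropriate, because each group supplies several values at once.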
