TXT OF BANK EXTRACT (RELEASES)

Asked

Viewed 1,290 times

0

You guys, good morning.

I have the following problem.

I have some txt-like bank statements files that I need to play on for a dataframe, until they have a sort of standard layout. I need to take only the lines of the accounting releases (lines that have dt.movement the rest can disregard), I tried a little here and even managed to get the lines with the command line.split(' t') but they are not separated by tabulation (TAB)

I wonder if anyone can help me with this problem.

(Example Bank Statement) Exemplo Extrato Bancario

TXT FILE

================================================================================================================================================================
Extrato conta corrente                                                                                                                       5152152141421124144
                                                                                                                                             11/12/2018 15:57:30
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Cliente - Cliente atual
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Agência                       3214-1
Conta corrente                2341-1 XPTO ENTERPRISE
Período do extrato            
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Lançamentos
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Dt. movimento  Dt. balancete    Ag. origem        Lote     Histórico                                    Documento         Valor R$            Saldo
   29/11/2018                       0000          00000     000 Saldo Anterior                                                            9.104,88 C
   05/12/2018                       0000          13113     435 Tarifa Pacote de Serviços      432.432424.24234.43         128,00 D       8.976,88 C
                              Tarifa referente a 05/12/2018
   10/12/2018                       0000          14175     976 TED-Crédito em Conta                     7.962.643      15.000,00 C      23.976,88 C
                              341 0192 56998438000165 EMPRESA ABC
   10/12/2018                       0000          13013     364 BB Consórcio - Prestação                    12.068       1.264,91 D      22.711,97 C
   10/12/2018                       0000          13013     364 BB Consórcio - Prestação                    23.068       1.264,90 D      21.447,07 C
   10/12/2018                       0000          13013     364 BB Consórcio - Prestação                    23.068       1.264,91 D      20.182,16 C
   10/12/2018                       0000          13013     364 BB Consórcio - Prestação                    51.068       1.264,91 D      18.917,25 C
   10/12/2018                       0000          13013     364 BB Consórcio - Prestação                    21.068       1.264,91 D      17.652,34 C
   10/12/2018                       0000          13013     364 BB Consórcio - Prestação               	   543.068     	 1.264,91 D    	  16.387,43 C
   10/12/2018                       0000          13013     364 BB Consórcio - Prestação                    23.068       1.264,91 D      15.122,52 C
   10/12/2018                       0000          13013     364 BB Consórcio - Prestação                    31.068       1.264,91 D      13.857,61 C
   10/12/2018                       0000          13013     364 BB Consórcio - Prestação                    74.068       1.264,91 D      12.592,70 C
   10/12/2018                       0000          13013     364 BB Consórcio - Prestação                    23.068       1.264,91 D      11.327,79 C
   10/12/2018                       0000          13013     364 BB Consórcio - Prestação                    61.068       1.264,91 D      10.062,88 C
   10/12/2018                       0000          13013     364 BB Consórcio - Prestação                    23.068       1.264,91 D       8.797,97 C
   11/12/2018                       0000          00000     999 S A L D O                                                8.797,97 C       8.797,97 C
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Saldo                                                                 8.797,97 C
Juros                                                                       0,00
Data de Debito de Juros                                               31/12/2018
IOF                                                                         0,00
Data de Debito de IOF                                                 02/01/2019
----------------------------------------------------------------------------------------------------------------------------------------------------------------
               ------------------------------------------------
               ------------------------------------------------
               OBSERVAÇÕES:
               ------------------------------------------------
               
               

----------------------------------------------------------------------------------------------------------------------------------------------------------------
Transação efetuada com sucesso por: 234252 GENESIO MARINHO

  • Most banks in Brazil, if not all, offer extracts in format .ofx (Open Financial Exchange), the best way is to download in this format and use an appropriate tool to extract the information, for example the ofxtools.

  • By the way... The format ofx is an m#rda (an expression of even more macorronic xml) but it is the standard of the banks (perhaps imposed by the Quiken data standard), and since it had a "good soul" to parse for python, because it does not take advantage of it?

1 answer

0


It seems that the columns are fixed. Create a dictionary with the positions of each field within the row:

coordenadas = {
    'dt_movimento': (3, 13),
    'dt_balancete': (18, 28),
    'ag_origem': (36, 40),
    'lote': (50, 55),
    'cod_historico': (60, 63),
    'historico': (64, 100),
    'documento': (101, 114),
    'valor': (115, 129),
    'valor_dc': (130, 131),
    'saldo': (132, 146),
    'saldo_dc': (147, 148),
}

Then it’s easy, just split each line and generate a record:

with open('extrato.txt') as f:
    df = pd.DataFrame({nome_campo: linha[inicio:fim] 
        for nome_campo, (inicio, fim) in coordenadas.items()}
        for linha in f
        if linha[5:6] == '/')

The criterion used here is the bar / date in fifth position.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.