Setting parameters and filtering file . TXT

Asked

Viewed 356 times

0

I have a file with a tab | in file . txt, would like to turn it into file .xls and then work with it by filtering and replacing values in python3.

And possible ?!

Follow a template of the file ". txt":

||0000 005|0|||01072019|31072019|MODELO|15684294000195|SP|5107040||00|2|

  • 1

    You can use the pandas library to do this. You wouldn’t need to change the file type to perform filters and everything else, only at the end you would choose the output.

2 answers

1


To work with tabulated data you can use the pandas libraries.
use the command below to install it, if you do not have.

python3 -m pip install --upgrade pandas

And as an example:

import pandas as pd

tabela = pd.read_fwf("seuArquivo.txt", delimiter="|")
tabela.to_excel('salvarArquivo.xlsx', 'Sheet1')
  • Thanks Murilo, what tells me to play an EXCEL formula in the script so that when converting the file . TXT already come edited in the output file ?

  • @Ivanalmeida did not understand, you speak of putting a formula in your txt file, so at the time that convert to spreadsheet already come with the formula? It’s a good idea, I’ve never particularly needed to use it so, in the theory of right, you could take a test! Note: if the answer helped you do not forget to mark as correct to help others with the same question

  • Follow the step by step I do without the script.

  • 1 - Import the file . TXT to EXCEL defining the delimiter " | ". 2- I check information that is not relevant in the file converted to EXCEL. 3- After the indentification I use formula in excel to sub-specify by relevant data. 4-Save the file and send to a validation system in which it will check if it is in the proper format of reading and compiling. 5- If he is, he will accept and validate without errors. NOTE: Apologies for accents, I use keyboard in USA mode and apologies for forgetting to mark as correct !

  • Therefore, as standard Murilo would like the Pyhton script to do this automatically without having to export/import. Convert, Edit, Check if the invalid data (if any, subistituisse by valid data [ that I would add ] using the formula I have), and present in the output file. If possible Murilo any idea is valid. Thank you !

  • Note: It does not necessarily have to be in Python : )

  • @Ivanalmeida understood, from to search for this invalid data and already treat before converting to Excel, but to create this validation would need you to pass an example of the invalid data and why you replaced them

Show 2 more comments

0

There are a few ways to do Parsing on files, and each one works for a purpose.

i. The simplest is open user o, which is native to Python.

my_file = []
openfile = open('file.txt')
lines = openfile.readlines()
for i in  lines:
    temp = i.split(sep="|")
    my_file.append(temp)

**The readlines method is very good because it returns a Chunk with all lines, from there you can turn into list or any other type. If it is a large file, you will need to read the lines one by one.

ii. Another option is to use Numpy, which returns an array, which enables various functionalities with respect to number manipulation.

import numpy as np
my_np_file = np.loadtxt('file.txt',delimiter = '|', dtype=str)

iii. A third would be the Pandas, which has already been described above. It is very good because you can work with time series and row and column indexing.

**Numpy and Pandas have problems with very large arrays (~ > 2gb ), so if this is the case, you may have to choose from other more specific solutions for this file size.

  • Thank you David, what tells me to manipulate it to stay in EXCEL file and in python script play an EXCEL formula to be edited as soon as converted ?

  • Dude. I would tell you to keep the python as long as you can. If you are going to mess with Excel anyway, it would compensate you to use VBA. Notably, there’s a learning curve using Python to do these, things. However, using excel in the middle you will (i)break the workflow (if you need to redo it will be boring) and (ii)Stop using Python tools, which are very efficient for this. If the learning curve is a problem, I recommend using the Spyder Bugger. It has variable inspection, and saves a lot of patience for those who are developing this type of procedural code.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.