Setting parameters and filtering a .txt file

Question

Asked 5 years, 2 months ago

Viewed 356 times

0

I have a .txt file whose fields are delimited by the pipe character (|). I would like to convert it to .xls and then work with it in Python 3, filtering and replacing values.

Is this possible?

Here is a sample line from the .txt file:

||0000 005|0|||01072019|31072019|MODELO|15684294000195|SP|5107040||00|2|

1

You can use the pandas library to do this. You don't need to change the file type before filtering or doing anything else; you only choose the output format at the end.

– Tmilitino

2019/08/27 at 12:06

2 answers

1

To work with tabular data you can use the pandas library.
Use the command below to install it if you do not have it yet (openpyxl is included because pandas needs an Excel engine to write .xlsx files).

python3 -m pip install --upgrade pandas openpyxl

And as an example:

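import pandas as pd

# The file is pipe-delimited, so use read_csv with sep="|" (read_fwf is for fixed-width columns).
# header=None assumes the file has no header row; dtype=str preserves leading zeros such as "00".
tabela = pd.read_csv("seuArquivo.txt", sep="|", header=None, dtype=str)
# Pass the sheet name by keyword; writing .xlsx requires an engine such as openpyxl.
tabela.to_excel("salvarArquivo.xlsx", sheet_name="Sheet1")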
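The question also asks about filtering and replacing values before saving. The following is only a minimal sketch of that step, under several assumptions: the helper names `load_pipe_file` and `filter_and_replace` are hypothetical, the second sample row is invented, and the column positions (8 and 10) and the replacement `MODELO` to `MODEL` are placeholders for whatever rule actually applies to the data.

```python
import io

import pandas as pd

# First row is the sample from the question; the second row is invented for the demo.
SAMPLE = (
    "||0000 005|0|||01072019|31072019|MODELO|15684294000195|SP|5107040||00|2|\n"
    "||0000 005|0|||01072019|31072019|MODELO|15684294000195|RJ|5107040||00|2|\n"
)


def load_pipe_file(source):
    """Read a pipe-delimited file (hypothetical helper, assumes no header row)."""
    # dtype=str keeps leading zeros; keep_default_na=False keeps empty fields as "".
    return pd.read_csv(source, sep="|", header=None, dtype=str, keep_default_na=False)


def filter_and_replace(df, column, keep_value, replacements):
    """Keep rows where `column` equals `keep_value`, then apply per-column replacements."""
    result = df[df[column] == keep_value].copy()
    for col, mapping in replacements.items():
        result[col] = result[col].replace(mapping)
    return result


tabela = load_pipe_file(io.StringIO(SAMPLE))
# Column positions 8 and 10 and the replacement rule are assumptions for illustration.
filtrada = filter_and_replace(
    tabela, column=10, keep_value="SP", replacements={8: {"MODELO": "MODEL"}}
)
print(filtrada[[8, 10]])

# To save the result as a spreadsheet (requires an engine such as openpyxl):
# filtrada.to_excel("salvarArquivo.xlsx", sheet_name="Sheet1", index=False, header=False)
```

With a real file, pass its path to `load_pipe_file` instead of the in-memory sample.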

Thanks Murilo. What do you think about putting an Excel formula in the script, so that the .txt file is already edited when it is converted to the output file?

– Ivan Almeida

2019/08/27 at 22:33
@Ivanalmeida I didn't quite understand. Do you mean putting a formula in your txt file, so that it is already there when the file is converted to a spreadsheet? It's a good idea. I have never needed to do it that way myself, but in theory it should work, so you could give it a try! Note: if the answer helped you, don't forget to mark it as correct to help others with the same question.

– Murilo Portugal

2019/08/28 at 01:04
Here is the step-by-step process I follow without the script.

– Ivan Almeida

2019/08/28 at 01:27
1 - Import the .txt file into Excel, setting the delimiter to "|". 2 - Check the converted file for information that is not relevant. 3 - After identifying it, use an Excel formula to replace it with the relevant data. 4 - Save the file and send it to a validation system, which checks whether it is in the proper format for reading and compiling. 5 - If it is, the system accepts and validates it without errors. NOTE: Apologies for the missing accents, I use a US keyboard layout, and apologies for forgetting to mark the answer as correct!

– Ivan Almeida

2019/08/28 at 01:34
So, Murilo, I would like the Python script to do all of this automatically, without having to export/import: convert, edit, check for invalid data (and, if there is any, replace it with valid data [that I would supply] using the formula I have), and present the result in the output file. If possible, Murilo, any idea is welcome. Thank you!

– Ivan Almeida

2019/08/28 at 01:41
Note: It does not necessarily have to be in Python :)

– Ivan Almeida

2019/08/28 at 01:43
@Ivanalmeida understood. It is possible to search for this invalid data and fix it before converting to Excel, but to create this validation I would need you to give an example of the invalid data and explain why you replace it.

– Murilo Portugal

2019/08/28 at 02:29

by David Bispo Ferreira • 1 point · Answer 1 · 2019-08-27T13:04:34+00:00

There are a few ways to parse files, and each one suits a different purpose.

i. The simplest is to use open, which is built into Python.

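my_file = []
# Use a context manager so the file is closed automatically.
with open('file.txt') as openfile:
    lines = openfile.readlines()
for i in lines:
    # Drop the trailing newline before splitting on the pipe delimiter.
    temp = i.rstrip("\n").split(sep="|")
    my_file.append(temp)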

**The readlines method is convenient because it returns a list with all the lines at once, which you can then convert to any other type. If the file is large, you should read the lines one by one instead.
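Reading the lines one by one can be sketched by iterating over the file object itself, which yields a single line at a time instead of loading the whole file. This is only an illustration: the generator name `iter_pipe_rows` is hypothetical, and the demo writes the sample line from the question to a temporary file so that the snippet runs on its own.

```python
import os
import tempfile


def iter_pipe_rows(path, encoding="utf-8"):
    """Yield each line of a pipe-delimited file as a list of fields (hypothetical helper)."""
    with open(path, encoding=encoding) as handle:
        # Iterating over the file object reads one line at a time.
        for line in handle:
            yield line.rstrip("\r\n").split("|")


# Demo: write the sample line from the question three times to a temporary file.
sample = "||0000 005|0|||01072019|31072019|MODELO|15684294000195|SP|5107040||00|2|\n"
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(sample * 3)

rows_seen = 0
last_fields = None
for fields in iter_pipe_rows(tmp.name):
    rows_seen += 1
    last_fields = fields  # only the current row is kept in memory

os.remove(tmp.name)
print(rows_seen, last_fields[10])
```

Because the generator never holds more than one row, memory use stays roughly constant regardless of file size.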

ii. Another option is NumPy, which returns an array and offers many features for numerical manipulation.

import numpy as np
my_np_file = np.loadtxt('file.txt',delimiter = '|', dtype=str)

iii. A third option is pandas, which has already been described above. It is very good because it lets you work with time series and with row and column indexing.

**NumPy and pandas can struggle with very large arrays (roughly over 2 GB), so in that case you may need a more specific solution for that file size.
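One option for large files, not mentioned in the original answer, is the `chunksize` parameter of `pandas.read_csv`, which returns an iterator of smaller DataFrames instead of loading everything at once. The sketch below is only an assumption of how that might be used here: the function name `count_rows_matching` is hypothetical, column 10 is assumed to hold the state code, and the tiny chunk size exists only to make the demo visible.

```python
import io

import pandas as pd


def count_rows_matching(source, column, value, chunksize=100_000):
    """Count rows where `column` equals `value`, reading `chunksize` rows at a time."""
    total = 0
    reader = pd.read_csv(
        source,
        sep="|",
        header=None,            # assumes the file has no header row
        dtype=str,              # keep leading zeros
        keep_default_na=False,  # keep empty fields as ""
        chunksize=chunksize,    # yields DataFrames of at most this many rows
    )
    for chunk in reader:
        total += int((chunk[column] == value).sum())
    return total


# Demo with the sample line from the question repeated five times.
sample = "||0000 005|0|||01072019|31072019|MODELO|15684294000195|SP|5107040||00|2|\n" * 5
matches = count_rows_matching(io.StringIO(sample), column=10, value="SP", chunksize=2)
print(matches)
```

Each chunk can be filtered, edited, or appended to an output file before the next one is read, so only one chunk needs to fit in memory at a time.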