There are a few ways to parse files, and each one suits a different purpose.
i. The simplest is the built-in open function, which is native to Python.
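my_file = []
# Open the file in a context manager so it is closed automatically
with open('file.txt') as openfile:
    lines = openfile.readlines()
for line in lines:
    # Drop the trailing newline, then split the line on the pipe delimiter
    temp = line.rstrip('\n').split(sep="|")
    my_file.append(temp)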
**The readlines method is convenient because it returns all the lines at once as a list, which you can then convert to any other type. If the file is large, you will need to read the lines one by one instead.
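For the large-file case, a minimal sketch of reading one line at a time could look like the following. The helper name parse_lines and the small demo file are assumptions made for illustration only; iterating over the file object keeps a single line in memory at a time.

```python
import os
import tempfile

def parse_lines(path, sep="|"):
    """Yield one list of fields per line without loading the whole file."""
    with open(path, encoding="utf-8") as handle:
        # Iterating over the file object reads it lazily, line by line
        for line in handle:
            yield line.rstrip("\n").split(sep)

# Hypothetical demo file standing in for 'file.txt'
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("a|b|c\n1|2|3\n")
    demo_path = tmp.name

rows = list(parse_lines(demo_path))
os.remove(demo_path)
```

With a really large file you would loop over parse_lines(...) and process each row as it arrives, instead of collecting everything with list().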
ii. Another option is NumPy, which returns an array and gives you a range of features for manipulating numbers.
import numpy as np
my_np_file = np.loadtxt('file.txt', delimiter='|', dtype=str)
iii. A third option is pandas, which has already been described above. It is a good choice because it supports time series and row and column indexing.
**NumPy and pandas can have problems with very large arrays (roughly more than 2 GB), so if that is your case you may need a more specialized solution for files of that size.
You can use the pandas library to do this. You wouldn't need to change the file type to apply filters and other operations; you only choose the output format at the end.
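A rough sketch of that workflow, assuming a pipe-delimited file with a header row. The column names and sample rows below are invented for illustration, and an in-memory buffer stands in for the real file:

```python
import io
import pandas as pd

# Hypothetical pipe-delimited sample standing in for 'file.txt'
sample = io.StringIO(
    "name|city|age\n"
    "Ana|Recife|31\n"
    "Bruno|Natal|25\n"
    "Carla|Recife|42\n"
)

# Parse the pipe-delimited text; the first row is used as the header
df = pd.read_csv(sample, sep="|")

# Filter in memory: keep rows whose (assumed) 'city' column is 'Recife'
recife = df[df["city"] == "Recife"]

# Pick the output format only at the end
as_csv = recife.to_csv(index=False)
as_json = recife.to_json(orient="records")
```

With a real file you would pass its path to read_csv and give to_csv or to_json a destination path instead of capturing the result as a string.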
– Tmilitino