How to read a CSV file in Python?

Viewed 33,522 times

12

I need to read a very large CSV file (more than 300,000 lines).

What is the best way to read a CSV file in Python?

4 answers

15


The simplest way to read the file:

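import csv
# Python 3: open in text mode with newline='', as the csv module recommends
ficheiro = open('ficheiro_csv.csv', newline='')
reader = csv.reader(ficheiro)
for linha in reader:
    print(linha)  # each row is a list of strings
ficheiro.close()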

However, it is good practice to open the file with a with statement, so that it is closed automatically even if an error occurs:

import csv
# Python 3: text mode with newline=''; the with block closes the file for us
with open('ficheiro_csv.csv', newline='') as ficheiro:
    reader = csv.reader(ficheiro)
    for linha in reader:
        print(linha)

If the file uses a different format, we must declare the delimiter and whether or not the fields are quoted:

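import csv
# Python 3: text mode with newline=''
with open('ficheiro', newline='') as ficheiro:
    # Fields are separated by ':' and quote characters get no special treatment
    reader = csv.reader(ficheiro, delimiter=':', quoting=csv.QUOTE_NONE)
    for linha in reader:
        print(linha)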
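
To make the effect of these two options concrete, here is a minimal sketch that parses an in-memory string instead of a file. The sample data is made up for illustration; only csv.reader, delimiter and csv.QUOTE_NONE come from the answer above.

```python
import csv
import io

# Hypothetical sample data: ':' separates the fields.
# The second line contains double quotes on purpose.
sample = io.StringIO('ana:31:lisboa\nrui:"45":porto\n')

# With QUOTE_NONE the quote characters are kept as ordinary text.
reader = csv.reader(sample, delimiter=':', quoting=csv.QUOTE_NONE)
rows = list(reader)

for row in rows:
    print(row)
```

With the default quoting, the quotes around 45 would be stripped; with csv.QUOTE_NONE they stay part of the field.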

A more complete version, which also handles possible errors:

import csv
import sys

nome_ficheiro = 'ficheiro.csv'
# Python 3: text mode with newline=''
with open(nome_ficheiro, newline='') as ficheiro:
    reader = csv.reader(ficheiro)
    try:
        for linha in reader:
            print(linha)
    except csv.Error as e:
        # Report the file name and the line where parsing failed, then exit
        sys.exit('file %s, line %d: %s' % (nome_ficheiro, reader.line_num, e))

csv.reader() returns an iterator that yields one row at a time, so the whole file is never loaded into memory. That is why it can handle large files.
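
As a rough illustration of that row-by-row behaviour, the sketch below counts the data rows of a file while keeping only one row in memory at a time. The function name count_rows and the assumption that the first row is a header are mine, not part of the answer above.

```python
import csv

def count_rows(path, has_header=True):
    """Count the data rows of a CSV file without loading it into memory."""
    with open(path, newline='') as f:
        reader = csv.reader(f)
        if has_header:
            next(reader, None)  # skip the header row (assumed to be first)
        # The generator consumes one row at a time and discards it
        return sum(1 for _ in reader)
```

Calling count_rows('ficheiro.csv') on a 300,000-line file should use roughly the same amount of memory as on a 3-line file, since no list of rows is ever built.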

Some of the examples presented can be seen in detail here.

5

If your CSV file is very large, you can use the read_csv function from the pandas library. In principle, it performs better than Python's standard csv module.

import pandas as pd
# Python 3: print is a function
print(pd.read_csv('file.csv'))

The return value is an object of type DataFrame.
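
Note that read_csv loads the whole file into a single DataFrame by default. If that does not fit in memory, one possible approach is the chunksize parameter, which makes read_csv return an iterator of smaller DataFrames. The sketch below is only an illustration: the function name and the chunk size of 100,000 rows are arbitrary choices of mine.

```python
import pandas as pd

def count_rows_in_chunks(path, chunksize=100_000):
    """Count data rows by reading the CSV in pieces of `chunksize` rows."""
    total = 0
    # With chunksize set, read_csv yields one DataFrame per chunk
    for chunk in pd.read_csv(path, chunksize=chunksize):
        total += len(chunk)
    return total
```

Inside the loop, each chunk is an ordinary DataFrame, so any per-chunk processing (filtering, aggregating) can replace the simple row count.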

4

See below for how:

>>> import csv
>>> with open('eggs.csv', newline='') as csvfile:
...     spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
...     for row in spamreader:
...         print(', '.join(row))
Spam, Spam, Spam, Spam, Spam, Baked Beans
Spam, Lovely Spam, Wonderful Spam

Import the csv module, open the "eggs.csv" file for reading, iterate over each row and print it, or do whatever else you need with the data.

The csv module works this way from Python 2.7 onwards; the code above uses Python 3 syntax (in Python 2, open the file with 'rb' and use the print statement).

0

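import csv
import numpy as np

# Load the columns as arrays of strings.
# Assumes pessoas.csv has exactly three comma-separated columns.
nome, idade, email = np.loadtxt('pessoas.csv',
                                delimiter=',',
                                unpack=True,
                                dtype='str')

# Print every row using the csv module; the with block closes the file
with open('pessoas.csv', newline='') as arquivo:
    for linha in csv.reader(arquivo):
        print(linha)

# Print the names, skipping the first entry (presumably the header row)
for n in nome[1:]:
    print(n)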
