Import a CSV file from an S3 bucket

I am trying to import a .csv file from an Amazon S3 bucket using the code below:

import csv
with open('https://s3.amazonaws.com/carto-1000x/data/yellow_tripdata_2016-01.csv', 'rb') as teste:
    reader = csv.reader(teste)
    for linha in reader:
        print linha

I am using the PyCharm IDE with Python 2.7.13, which comes preinstalled on the Mac. I get a message saying that the file does not exist, yet when I open the link in the browser it loads without any problem.

Does anyone know whether I need a library to read this CSV file stored in S3?

Note that I ran a test with a local CSV file and it worked.

1 answer

When you call open('any string'), Python tries to resolve 'any string' as a local file path. Since you passed a URL, the file is not found. Here is a solution that reads the file from the URL:

import codecs
import csv
from contextlib import closing

import requests

url = 'https://s3.amazonaws.com/carto-1000x/data/yellow_tripdata_2016-01.csv'

# stream=True avoids loading the whole file into memory at once.
with closing(requests.get(url, stream=True)) as data:
    # iter_lines() yields bytes, so decode each line before csv.reader parses it.
    reader = csv.reader(codecs.iterdecode(data.iter_lines(), 'utf-8'))
    for row in reader:
        print(row)
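
If installing requests is not an option, the same idea can probably be done with the standard library alone. This is only a minimal sketch assuming Python 3; the helper names iter_csv_rows and print_rows_from_url are hypothetical (they do not come from the answer above). The first helper accepts any binary file-like object, such as the response returned by urllib.request.urlopen.

```python
import csv
import io
import urllib.request


def iter_csv_rows(binary_stream, encoding='utf-8'):
    """Yield parsed CSV rows from a binary file-like object (hypothetical helper)."""
    # csv.reader expects text, so wrap the byte stream in a decoding text layer.
    text_stream = io.TextIOWrapper(binary_stream, encoding=encoding, newline='')
    for row in csv.reader(text_stream):
        yield row


def print_rows_from_url(url):
    """Download the CSV at `url` and print every row (hypothetical helper)."""
    # urlopen returns a binary response object that can be used as a context manager.
    with urllib.request.urlopen(url) as response:
        for row in iter_csv_rows(response):
            print(row)
```

Calling print_rows_from_url(url) with the URL from the question should then print each row as a list of strings, in the same way as the requests version.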
  • Hello Sidon. Thanks for the help. I did it the way shown below and it also worked (I'm still learning, haha). What I need to do now is count the number of rows in the dataset and compute the average of the tip_amount field. The problem is that I can't extract the data into a variable to do the calculation.

  • url = urllib.request.urlopen("https://s3.amazonaws.com/carto-1000x/data/yellow_tripdata_2016-01.csv") reader = csv.reader(url, delimiter=',', quoting=csv.QUOTE_NONE) for row in reader: x = reader.ix[:,1] y = reader.ix[4] ## Calculate the number of rows in this dataset print("QUANTITY:") print(x.count()) # Average of field: tip_amount print("AVERAGE:") print(y.mean())

  • What do you mean? The reader is already the dataset itself. Just iterate over it, as I did to print the lines.

  • Following your advice I did this, and got the error: TypeError: a bytes-like object is required, not 'str'. tip_amount_total = 0 tip_amount_quantity = 0 for row in file: tip_amount_total += l.split(",")[15] # here, add the fifth item of the line to a running total tip_amount_quantity += 1 # increment the tip_amount count, basically a counter media_tip_amount = tip_amount_total/tip_amount_quantity

  • I'm still just crawling when it comes to programming, so I work by trial and error. :)

  • Better to try to learn the right way. The first thing is to try to interpret the error messages, and then to make use of Python's advantages; one of them is that you can "print" variables (and their types) in the middle of the code. Trial and error is perfectly valid in programming, provided you first understand where the error comes from and what the message means.
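
For the follow-up in the comments (row count and average tip), a rough sketch follows. The TypeError most likely comes from calling split(",") with a str separator on lines that are still bytes, since urlopen yields bytes in Python 3; decoding the lines first avoids it. The sketch assumes Python 3, that the first row of the file is a header, and that the header contains a column named tip_amount. The helper name count_and_average and the sample data are made up for illustration.

```python
import csv
import io


def count_and_average(text_lines, column='tip_amount'):
    """Return (row_count, mean of `column`) for CSV text that has a header row."""
    reader = csv.DictReader(text_lines)
    count = 0
    total = 0.0
    for row in reader:
        # csv always yields strings, so convert to float before adding.
        total += float(row[column])
        count += 1
    # Guard against an empty file to avoid dividing by zero.
    average = total / count if count else 0.0
    return count, average


# Made-up sample data standing in for the real file.
sample = io.StringIO("fare_amount,tip_amount\n10.0,2.0\n8.5,1.0\n12.0,3.0\n")
count, average = count_and_average(sample)
print("QUANTITY:", count)
print("AVERAGE:", average)
```

With the answer's code, the decoded line iterator codecs.iterdecode(data.iter_lines(), 'utf-8') could be passed as text_lines in place of the sample, because csv.DictReader accepts any iterable of text lines. The column name would need to match the real header exactly.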
