Writing a CSV file to Google Drive using Colab

0

I’m writing Python code to scrape information from Facebook. I would like to save this information in a file on Google Drive, since I am working with other people and we use Colaboratory.

Problem

The code does not write the file in Google Colab.

Code

import facebook_scraper
import pandas as pd
import csv
from facebook_scraper import get_posts

listaBibliotecas = ["bibliotecafoa"]

for biblioteca in listaBibliotecas:
  print("Biblioteca: " + biblioteca) 
for post in get_posts(biblioteca, pages=300):
  post['title'] = biblioteca
  print(post['title'])    
  print(post['post_id'])
  print(post['time'])
  print(post['text'])
  print(post['image'])
  print(post['video'])
  print(post['likes'])
  print(post['comments'])
  print(post['shares'])
  print(post['link'])

  data = [post['title'],post['post_id'], post['time'], post['text'], post['image'], post['video'], post['likes'], post['comments'], post['shares'], post['link']]
  df=pd.DataFrame(data)    
  
  with open("RedesBibliotecas.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(data)
 
from google.colab import drive
drive.mount ('/drive')
df.to.csv('/drive/My Drive/Colab Notebooks');

What I tried

I followed the Colab tutorial but it didn’t work. I already set up the drive with the code:

from google.colab import drive
drive.mount ('/drive')
df.to.csv('/drive/My Drive/Colab Notebooks');

Does anyone know how to fix this?

  • What does the error message say?

  • Is Colab Notebooks really written like that, with a space? If so, try saving after removing the space.

  • @Lucas hi!! There is no error message; it just does not write the CSV file. The program runs normally.

  • @Lucas Hey, thank you for answering. Yes, it’s with a space, both My Drive and Google Colab. At least that’s how it is in Colab’s own tutorial.

  • Do you have access to the Google Colab terminal? If so, check whether you can transfer your session files to Drive using the gsutil command. Read about it here: https://cloud.google.com/storage/docs/gsutil As a last resort, save the file to Google Storage and send it to Drive from there.

  • Hi Lucas! Thanks for the reading suggestion. I checked the connection and there is nothing wrong with it; I think the problem is in the code.

  • Clara, perhaps you are not having much success with programming yet. When I answered I noticed some details, like swapping "_" for ".", etc. A computer program will not work if we swap things like this: names and syntax leave no ambiguity or room to write things "a little differently". A good approach may be to work with Python in interactive mode, with small examples: create a list, insert an element, access an element in the list. You can even use Colab for this, in a cell with a few lines of code, from 1 to 5, and run it.

1 answer

1


The call to "mount" makes Google Drive available to the Python program as if it were in the "/drive" folder (on Unix, unlike Windows, directories do not have a drive letter in front: "/" indicates the root of the file system, as if it were "C:" on a Windows machine with a single partition).

So your Google Drive content is in the folder /drive/MyDrive (no space between "My" and "Drive").

This means any file you create under /drive/MyDrive/ with Python is persisted in your Google Drive, and behaves like a normal file from Python’s point of view. The third component of the path, "Colab Notebooks", would already be a folder inside your drive.

So, if you put a space between "My" and "Drive" and got no error, the file may have ended up in some "limbo" in your Google Drive, where it cannot be accessed normally because it is not inside "MyDrive".
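
One way to rule out the "limbo" case is to check that the destination folder really exists before writing anything. A minimal sketch, assuming the "/drive" mount point used here; `require_folder` is a hypothetical helper name, not part of Colab or pandas:

```python
from pathlib import Path


def require_folder(folder):
    """Return folder as a Path, or raise if it is not an existing directory."""
    path = Path(folder)
    if not path.is_dir():
        raise FileNotFoundError(f"Folder not found: {path}")
    return path


# In Colab, after drive.mount('/drive'), one would call for example:
# target = require_folder('/drive/MyDrive')
```

If the check raises, the path is misspelled or the drive is not mounted where the code expects it.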

And, the way you wrote it, the CSV file itself would in any case end up with the name "Colab Notebooks", without the ".csv" extension. When you create a file from a computer program you always have to include the extension yourself. Specific programs like Word or Excel know how to complete their extension (docx, xlsx), but only because that is programmed into them.
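
To avoid a file literally named "Colab Notebooks", the destination can be built from the folder plus an explicit file name. A small sketch; `csv_path` is a hypothetical helper and the file name is only an example:

```python
from pathlib import Path


def csv_path(folder, name):
    """Join folder and name, appending '.csv' when the name lacks it."""
    path = Path(folder) / name
    if path.suffix.lower() != ".csv":
        path = path.with_name(path.name + ".csv")
    return path


print(csv_path("/drive/MyDrive/Colab Notebooks", "RedesBibliotecas"))
```

The returned path can be passed straight to to_csv or to open.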

There is yet another error in your program, which would keep it from running even if the rest worked: the DataFrame method for creating a CSV file is to_csv, not to.csv. This alone would raise an error, so I’m puzzled by your comment that the program runs normally. Are you actually running the cell with this code, by pressing "shift + enter" (or through the Runtime/Run all menu)?
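
The difference is easy to check in a cell of its own. A sketch with a made-up one-row dataframe: writing `to.csv` asks for an attribute named `to`, which a DataFrame does not have, while `to_csv` is the real method.

```python
import pandas as pd

df = pd.DataFrame({"post_id": ["1"], "likes": [10]})  # made-up sample data

try:
    df.to.csv("out.csv")  # looks up an attribute called "to" first
except AttributeError as exc:
    error_message = str(exc)
    print("AttributeError:", error_message)

# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = df.to_csv(index=False)
print(csv_text)
```

If the cell with to.csv really runs without any error, it is probably not being executed at all.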

In short, this should work:


...

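from google.colab import drive

# Mount Google Drive at /drive; its contents appear under /drive/MyDrive
drive.mount('/drive')
# Give the file an explicit name, including the .csv extension
df.to_csv('/drive/MyDrive/meu_arquivo_csv.csv')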

The first part of the program is also wrong. Even if the part that collects the Facebook data is correct and the printed output appears, you are telling Python to write the data directly, without going through Pandas, to the file "RedesBibliotecas.csv". So far so good (this file should be visible to the Python script itself, within the Colab environment). But you create a new dataframe from data for each processed record, so what reaches the last part of the program, where the dataframe is saved, is just the last record.
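
One way around this is to collect every record in a list and build a single dataframe after the loop. A sketch under assumptions: `fake_posts` is a hypothetical stand-in for get_posts so the cell runs without scraping, the field list is shortened, and the output goes to a temporary folder rather than to Drive:

```python
import os
import tempfile

import pandas as pd

COLUMNS = ["title", "post_id", "text", "likes"]


def fake_posts(page_name):
    """Hypothetical stand-in for get_posts: yields dicts shaped like posts."""
    for i in range(3):
        yield {"post_id": str(i), "text": f"post {i} from {page_name}", "likes": i * 10}


rows = []
for biblioteca in ["bibliotecafoa"]:
    for post in fake_posts(biblioteca):
        post["title"] = biblioteca
        # Append one row per post instead of creating a dataframe each time
        rows.append([post[column] for column in COLUMNS])

# Build the dataframe once, after all records have been collected
df = pd.DataFrame(rows, columns=COLUMNS)

# In Colab the destination would be a path under /drive/MyDrive instead
out_path = os.path.join(tempfile.mkdtemp(), "RedesBibliotecas.csv")
df.to_csv(out_path, index=False, encoding="utf-8")
print(len(df), "rows written to", out_path)
```

Note that the post loop is nested inside the library loop here, so each post is tagged with the page it came from.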

  • Hello, @jsbueno! First, thanks for the lesson; it was worth working up the courage to write here. It was very enlightening. Yes, the program runs without errors, at least the scraping part, but I could not make progress on the saving part. I can write the CSV using Visual Code by modifying the code to: with open("20201202fbcode.csv", "a", newline="", encoding="utf-8") as f: writer = csv.writer(f) writer.writerow(data)

  • I’m getting a missing attribution error for the df dataframe in the last line of the code. From your reply, I did not understand the following comment: "The first part of the program is also wrong. Even if the part that collects the Facebook data is correct and the printed output appears, you are telling Python to write the data directly, without going through Pandas, to the file "RedesBibliotecas.csv". So far so good. But you create a new dataframe from data for each processed record, so what reaches the last part of the program, where the dataframe is saved, is just the last record."
