How do I write a tabular file in an instance of Azure Data Lake Store with the Python API?

0

Suppose I have an instance of Data Lake Store in my Azure subscription and I would like to create a Python script that writes a tab-delimited tabular file (CSV or similar) to that instance.

Leaving the Spark API aside, how can I write this tabular file to the Data Lake Store instance using Python's own file-writing API?

  • Administrators, I would like to suggest creating the "Azure-data-Lake-store" tag, given that other Azure-specific tags, such as "Azure-devops", already exist. Azure Data Lake Store is perhaps one of the most important Azure features today.

1 answer

0

I don’t use Azure, so forgive me if this is not relevant: I’ll show how to write data to a CSV file using Python 3.x.

It is worth mentioning first that many data libraries, such as pandas and NumPy, include a method that writes CSV directly. Possibly (I’m speculating, yes) you can send your Python data back to the Azure environment and use a native tool there.

A simple example of how to write a CSV file using Python:

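import csv

# Each list becomes one row of the output file
lista = ['Alice','Bob','Carlos']
lista2 = ['Xavier','Yago','Zulmira']

# newline='' lets the csv module control the line endings itself
with open('nomes.csv', 'w', newline='') as csvfile:
    namewriter = csv.writer(csvfile)
    namewriter.writerow(lista)
    namewriter.writerow(lista2)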
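
Since the question asks for a tab-delimited file, here is a minimal sketch of the same idea with `delimiter='\t'`, building the content in memory instead of on the local disk. The helper name `rows_to_tsv_bytes` and the file name `nomes.tsv` are hypothetical, chosen only for illustration. The upload to the Data Lake Store itself is not shown, because it depends on which Azure client is used; the assumption is only that such a client accepts bytes.

```python
import csv
import io


def rows_to_tsv_bytes(rows, encoding="utf-8"):
    """Serialize an iterable of rows to tab-delimited text and return it as bytes."""
    buffer = io.StringIO(newline="")
    # delimiter="\t" switches from commas to tabs; fields that themselves
    # contain a tab are quoted automatically by the csv module.
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue().encode(encoding)


lista = ["Alice", "Bob", "Carlos"]
lista2 = ["Xavier", "Yago", "Zulmira"]
payload = rows_to_tsv_bytes([lista, lista2])

# Written to a local file here only to show that the bytes form a complete
# file; in practice they would be handed to whatever client uploads to the store.
with open("nomes.tsv", "wb") as f:
    f.write(payload)
```

Keeping the serialization separate from the destination means the same bytes can go to a local file, a temporary file, or a remote store without changing the CSV logic.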


  • Rocchi, it’s a valid answer, but Azure Data Lake Store is a distributed file system, so there are some limitations to writing to it directly with the Python API like this: unlike Spark, for example, this API does not understand the distributed file system behind this kind of store.

  • 1

    I understand. Would it be possible to finish running the Python script and carry the data you need out of it? For example, I once had to write a Bash script that needed calculations Bash did not support, so inside the Bash script I ran a Python script and assigned its output to a variable. After execution, the variable held the output of the Python script. In other words: you can try assigning the result of running the Python script to a variable in the other environment (PowerShell?).
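
The idea in this comment, running a Python script from another process and keeping its output in a variable, can be sketched as follows. The parent process is written in Python rather than Bash or PowerShell only to keep the example self-contained, and the inline child script `child_code` is hypothetical.

```python
import subprocess
import sys

# Hypothetical child script, passed inline with -c so no extra file is needed.
child_code = "print(sum(range(10)))"

# Run the child with the same interpreter and capture what it prints.
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,  # raise CalledProcessError if the child exits with an error
)

# The child's standard output is now available as an ordinary string.
output = result.stdout.strip()
print(output)
```

In a shell the equivalent is command substitution, where the text the script prints to standard output becomes the value of the variable.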
