At what stage should the data be edited?

I am currently extracting data, written in English, from a website through web scraping.

If I want, for example, to translate the field names or values into Portuguese, or to expand abbreviations, which approach is more appropriate:

  • Make the change during the web scraping phase?
  • Or just make the change after you have the raw data in a file or database?

1 answer

After a more thorough search, I discovered that the process I was referring to is called data munging (also known as data wrangling). It involves cleaning the extracted data into a format that is more convenient to use, and the term can also cover aggregation, visualization, and training statistical models, among other tasks.

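The clearest and most accessible approach is to separate data acquisition from data munging: first store the raw data exactly as it was scraped, then apply changes such as translation or abbreviation expansion in a later step.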
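As a rough illustration of that separation, here is a minimal Python sketch. The records, field names, and mapping tables are all hypothetical and stand in for whatever the scraper actually returns; the point is only that the raw data is saved untouched and the munging runs on a copy of it.

```python
import json

# Hypothetical raw records, stored exactly as scraped (English field names).
RAW_RECORDS = [
    {"name": "Main St. Clinic", "city": "Lisbon"},
    {"name": "Park Ave. Library", "city": "Porto"},
]

# Hypothetical mappings used only in the munging step.
FIELD_NAMES_PT = {"name": "nome", "city": "cidade"}
ABBREVIATIONS = {"St.": "Street", "Ave.": "Avenue"}


def expand_abbreviations(text):
    """Replace each known abbreviation token with its full form."""
    return " ".join(ABBREVIATIONS.get(word, word) for word in text.split())


def munge(record):
    """Return a new record with translated field names and expanded values."""
    return {
        FIELD_NAMES_PT.get(key, key): expand_abbreviations(value)
        for key, value in record.items()
    }


# Stage 1 (acquisition): persist the raw data untouched.
raw_json = json.dumps(RAW_RECORDS)

# Stage 2 (munging): load the raw data and derive the cleaned version.
cleaned = [munge(record) for record in json.loads(raw_json)]
```

Because the raw data is kept as-is, the mappings can be corrected or extended and the munging step re-run without scraping the website again.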
