Using Values from a Dictionary Column to Create a New Column in a Dataframe

I am trying to extract country names from a string column in order to create a new column in my DataFrame that contains just the country names. The column looks like this:

["[{'iso_3166_1': 'US', 'name': 'United States of America'}]",
 "[{'iso_3166_1': 'US', 'name': 'United States of America'}]",
 "[{'iso_3166_1': 'US', 'name': 'United States of America'}]",
 "[{'iso_3166_1': 'US', 'name': 'United States of America'}]",
 "[{'iso_3166_1': 'US', 'name': 'United States of America'}]",
 "[{'iso_3166_1': 'US', 'name': 'United States of America'}]",
 "[{'iso_3166_1': 'DE', 'name': 'Germany'}, {'iso_3166_1': 'US', 'name': 'United States of America'}]",
 "[{'iso_3166_1': 'US', 'name': 'United States of America'}]"]

Note that some elements of the list contain two dictionaries, such as the 7th line, which has both Germany and the United States.

I thought of turning each value into a dictionary and then extracting the value of the key name, but my loop fails on entries like the 7th line. For example:

Countries_Movie = []
for k in range(0,len(movies_datasets['production_countries'])):
    if (type(movies_datasets.production_countries[k]) == str): 
        mv_inter = movies_datasets.production_countries[k].replace('[',"").replace(']',"")
        mv_inter = ast.literal_eval(mv_inter)
        mv_inter = mv_inter.get('name')
        Countries_Movie.append(mv_inter)

Can you please suggest a more efficient method, or help me understand what is missing from my code?

Sincerely yours.

1 answer


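I managed to extract the names by creating a function. The column has a JSON-like format: each row is a string that can be parsed into a list of dictionaries. Strictly speaking, the single quotes make it Python literal syntax rather than valid JSON, which is why ast.literal_eval is used here instead of a JSON parser.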

For more information access:

What is JSON? What is it for and how it works?
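As a rough illustration of why the loop in the question breaks on the 7th line, the sketch below parses one two-country row in two ways. The variable names are only illustrative. Parsing the whole string keeps the list intact, while stripping the brackets first leaves two dictionaries separated by a comma, which ast.literal_eval reads as a tuple, and a tuple has no .get method.

```python
import ast

# One row copied from the sample data in the question (the 7th line).
row = "[{'iso_3166_1': 'DE', 'name': 'Germany'}, {'iso_3166_1': 'US', 'name': 'United States of America'}]"

# Parsing the whole string yields a list of dictionaries.
parsed = ast.literal_eval(row)
names = [d["name"] for d in parsed]

# Stripping the brackets leaves "{...}, {...}", which evaluates to a tuple
# of dictionaries, so calling .get('name') on it raises AttributeError.
stripped = ast.literal_eval(row.replace("[", "").replace("]", ""))

print(type(parsed).__name__, names)
print(type(stripped).__name__)
```

For rows with a single dictionary, stripping the brackets happens to leave a plain dictionary, which is why the loop only fails on multi-country rows.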

Therefore, I have created the following function:

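import ast

def extract_names(mv_str):
    # Floats (e.g. NaN for missing values) cannot be parsed; return None.
    if isinstance(mv_str, float):
        return None
    # Parse the string into Python objects (expected: a list of dicts).
    parsed = ast.literal_eval(mv_str)
    if isinstance(parsed, list):
        # Collect the 'name' value from each dictionary in the list.
        return [item['name'] for item in parsed]
    return None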

The function parses each row's string into a list of dictionaries, then goes through the list and collects the value of the key name. The result is as below:

0   [United States of America]
1   [United States of America]
2   [United States of America]
3   [United States of America]
4   [United States of America]
5   [United States of America]
6   [Germany, United States of America]
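The answer does not show how the function is applied to the column, so here is a minimal sketch of one way it could be done with Series.apply. The small DataFrame and the new column name country_names are assumptions made for illustration; only movies_datasets and production_countries come from the question.

```python
import ast
import pandas as pd

def extract_names(mv_str):
    # Same logic as the function above, repeated so the sketch runs on its own.
    if not isinstance(mv_str, str):
        return None
    parsed = ast.literal_eval(mv_str)
    if not isinstance(parsed, list):
        return None
    return [item["name"] for item in parsed]

# Hypothetical three-row frame standing in for the real movies_datasets.
movies_datasets = pd.DataFrame({
    "production_countries": [
        "[{'iso_3166_1': 'US', 'name': 'United States of America'}]",
        "[{'iso_3166_1': 'DE', 'name': 'Germany'}, {'iso_3166_1': 'US', 'name': 'United States of America'}]",
        float("nan"),  # a missing value, to exercise the non-string branch
    ]
})

# 'country_names' is an illustrative column name, not one from the question.
movies_datasets["country_names"] = movies_datasets["production_countries"].apply(extract_names)
print(movies_datasets["country_names"])
```

Using apply avoids the manual index loop from the question. If one row per country is preferred over a list per movie, DataFrame.explode on the new column is one option.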

Sincerely yours.
