Transform list items into separate columns or widen the DataFrame so the list is not cut off


Viewed 1,079 times

0

I have a class with an attribute that is a list. I'm trying to display this list in a pandas DataFrame on a single row, to represent the character's inventory.

The items are assigned to the list like this:

if self.wealth == "rich":
    self.inventory = ["dagger", "nobles's clothing", "cloak", "backpack", "rations for a week", "waterskin",
                      "potion of healing", "pouch for coins", "personal servant", "personal guard", "three saddled horses"]

I build the DataFrame like this, but the list gets cut off in the output because it is too long. I would like to display it in a way that does not truncate this row.

inventory = pd.DataFrame({"Inventory": [self.inventory]," ": " "})
inventory.set_index(" ", inplace=True)

display(inventory)

3 answers

3

If you need to merge the elements of the inventory list into a single string, you can do it this way:

inventory = ["dagger","nobles's clothing","cloak",
             "backpack","rations for a week","waterskin",
             "potion of healing","pouch for coins","personal servant",
             "personal guard", "three saddled horses"]

invent = ",".join(inventory)

print(f"Inventory: {invent}") 

Output:

Inventory: dagger,nobles's clothing,cloak,backpack,rations for a week,waterskin,potion of healing,pouch for coins,personal servant,personal guard,three saddled horses
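
If the joined string still has to go into a DataFrame, the truncation described in the question most likely comes from the pandas option `display.max_colwidth`, which defaults to 50 characters. A minimal sketch, assuming pandas 1.0 or later (where `None` means no limit); the shortened list and the name `rendered` are only for illustration:

```python
import pandas as pd

# Shortened sample inventory (illustrative, not the full list from the question)
inventory = ["dagger", "cloak", "backpack", "rations for a week",
             "waterskin", "potion of healing", "pouch for coins"]

# Join with ", " so the items are easier to read
invent = ", ".join(inventory)

# One row, one column; the blank index label mimics the question's layout
df = pd.DataFrame({"Inventory": [invent]}, index=[" "])

# option_context changes the setting only inside the with-block;
# max_colwidth=None turns off truncation of long cell values
with pd.option_context("display.max_colwidth", None):
    rendered = repr(df)

print(rendered)
```

In a notebook, calling `display(df)` inside the same `with` block should have the same effect on the rendered table.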
  • I ended up solving it another way, but I didn't know about this use of join(). Thanks for the info.

  • Glad you got it working, @pydoni. Cheers, and see you next time!

1

I was able to solve it by turning each item in the list into its own column, like this:

    idf = pd.DataFrame({"Inventory": [self.inventory]})
    idf = idf["Inventory"].apply(pd.Series)
    idc = pd.DataFrame({" ": ["Inventory"]})  # I used this to get a nicer-looking index
    idf = idf.rename(columns = lambda x : "item_" + str(x))
    inventory = pd.concat([idc[:],idf[:]], axis=1)
    inventory.set_index(" ", inplace=True)

    display(inventory)
  • Using apply(pd.Series) can cost a lot of processing time, depending on the size of the data you are working with. Take a look at my answer :)

1

A better (faster) alternative is to create a new DataFrame by converting the Inventory column to a NumPy array with .values and then to a list of lists with tolist(), like this:

df = pd.DataFrame(idf["Inventory"].values.tolist())
df.index = ['Inventory']
df.columns = ["item_" + str(x) for x in df.columns]
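
The snippet above assumes `idf` is still the original one-column frame, before `apply(pd.Series)` was applied to it. A self-contained sketch of the same idea, using a shortened sample list (`inventory_items` is a made-up name standing in for `self.inventory`):

```python
import pandas as pd

# Shortened sample list standing in for self.inventory
inventory_items = ["dagger", "cloak", "backpack", "waterskin"]

# One row, one column: the single cell holds the whole list
idf = pd.DataFrame({"Inventory": [inventory_items]})

# .values.tolist() gives a list of lists, so the DataFrame constructor
# spreads each inner item into its own column
df = pd.DataFrame(idf["Inventory"].values.tolist())
df.index = ["Inventory"]
df.columns = ["item_" + str(x) for x in df.columns]

print(df)
```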

Using the %%timeit magic, you can see the difference in runtime:

%%timeit

idf2 = idf["Inventory"].apply(pd.Series)
idc = pd.DataFrame({" ": ["Inventory"]})  # I used this to get a nicer-looking index
idf2 = idf2.rename(columns = lambda x : "item_" + str(x))
inventory = pd.concat([idc[:],idf2[:]], axis=1)
inventory.set_index(" ", inplace=True)

6.36 ms ± 149 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit

df = pd.DataFrame(idf["Inventory"].values.tolist())
df.index = ['Inventory']
df.columns = ["item_" + str(x) for x in df.columns]

3.27 ms ± 88.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Almost 2x faster!
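
Outside a notebook, where `%%timeit` is not available, a similar comparison can be sketched with the standard-library `timeit` module. The function names `with_apply` and `with_tolist` and the sample list are made up for illustration, and the `apply` variant is simplified (it sets the index directly instead of using `pd.concat`), so the numbers will not match the ones above:

```python
import timeit

import pandas as pd

# Sample data (illustrative): one row whose single cell holds a list
items = ["dagger", "cloak", "backpack", "waterskin", "potion of healing"]
idf = pd.DataFrame({"Inventory": [items]})


def with_apply():
    # Expand the list with apply(pd.Series), then label columns and index
    out = idf["Inventory"].apply(pd.Series)
    out = out.rename(columns=lambda x: "item_" + str(x))
    out.index = ["Inventory"]
    return out


def with_tolist():
    # Expand the list by rebuilding the frame from a list of lists
    out = pd.DataFrame(idf["Inventory"].values.tolist())
    out.index = ["Inventory"]
    out.columns = ["item_" + str(x) for x in out.columns]
    return out


n = 50  # number of calls per measurement
t_apply = timeit.timeit(with_apply, number=n) / n
t_tolist = timeit.timeit(with_tolist, number=n) / n

print(f"apply(pd.Series): {t_apply * 1e3:.2f} ms per call")
print(f"values.tolist():  {t_tolist * 1e3:.2f} ms per call")
```

Timings depend on the machine, the pandas version, and the size of the list, so it is worth measuring with your own data.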
