How to expand a dataframe based on a condition

I have the following dataframe:

df = pd.DataFrame({ 
    'left_bound' : ['1', '4', '10', '25'], 
    'right_bound' : ['3', '9', '24', '50'], 
    'code' : ['a', 'b', 'c', 'd'], 
}) 

I want to convert it into one like this:

df2 = pd.DataFrame({ 
    'bound' : ['1', '2','3','4', '5', '6', '7', '8', '9', '10'], 
    'code' : ['a', 'a', 'a', 'b','b','b','b','b','b','c'], 
}) 

and so on, up to 50 in this case.

The idea is to have one row for every integer from the lower bound to the upper bound (inclusive), with the code for that number in the next column.

Thank you!

  • Hi, good afternoon! I think this question needs a clearer explanation. What makes 1, 2, 3 map to a, a, a? What makes 4, 5, 6 map to b, b, b? Cheers!

  • Good afternoon! I start with a dataframe with three columns: left_bound, right_bound and code. Every number from 1 to 3 inclusive has code 'a', every number from 4 to 9 has code 'b', and so on. That is, every number between left_bound and right_bound has the code of that row. What I want is a single column with all the numbers from the smallest left_bound to the largest right_bound, and another column with the associated code. Cheers!

1 answer

Importing the pandas package

import pandas as pd

Creating the data frame

df = pd.DataFrame({ 
    'left_bound' : ['1', '4', '10', '25'], 
    'right_bound' : ['3', '9', '24', '50'], 
    'code' : ['a', 'b', 'c', 'd'], 
})

Converting the bound columns to integers

df['left_bound'] = df['left_bound'].astype('int64')
df['right_bound'] = df['right_bound'].astype('int64')

Creating a new column that holds, for each row, the list of integers from left_bound to right_bound (inclusive)

df['bound'] = df.apply(lambda x : list(range(x['left_bound'],x['right_bound'] + 1)),axis = 1)
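
As a side note, the same list column can be built without a row-wise apply by zipping the two bound columns. This is only a minimal sketch, assuming the bounds are already integers; the name demo and its two-row sample data are made up for illustration.

```python
import pandas as pd

# Hypothetical two-row sample, not the data frame from the question
demo = pd.DataFrame({
    'left_bound': [1, 4],
    'right_bound': [3, 9],
    'code': ['a', 'b'],
})

# One list of consecutive integers per row; + 1 makes the right bound inclusive
demo['bound'] = [
    list(range(left, right + 1))
    for left, right in zip(demo['left_bound'], demo['right_bound'])
]

print(demo['bound'].tolist())
```

On larger frames a list comprehension like this tends to be faster than apply with axis = 1, but the result should be the same.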

Dropping the columns that are no longer needed

df.drop(columns=['left_bound','right_bound'], inplace = True)

Expanding the lists in the bound column into rows

df = df.explode('bound')
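
Two details of explode are worth knowing: it repeats the original row label for every expanded row, and the exploded column comes back with object dtype. The sketch below shows one way to tidy both, assuming a fresh 0..n-1 index and an int64 column are what you want; demo, exploded and tidy are illustrative names only.

```python
import pandas as pd

# Hypothetical sample with a list column already built
demo = pd.DataFrame({'code': ['a', 'b'], 'bound': [[1, 2, 3], [4, 5]]})

exploded = demo.explode('bound')
print(list(exploded.index))      # row labels repeat: one per list element
print(exploded['bound'].dtype)   # object, not an integer dtype

# Renumber the rows and restore an integer dtype
tidy = exploded.reset_index(drop=True)
tidy['bound'] = tidy['bound'].astype('int64')
print(tidy)
```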

Changing the order of the columns

df = df[['bound','code']]

Showing the first 5 records of the data frame

df.head()

Output:

    bound   code
0      1    a
0      2    a
0      3    a
1      4    b
1      5    b

Full code (after creating df as above):

df['left_bound'] = df['left_bound'].astype('int64')
df['right_bound'] = df['right_bound'].astype('int64')

df['bound'] = df.apply(lambda x : list(range(x['left_bound'],x['right_bound'] + 1)),axis = 1)
df.drop(columns=['left_bound','right_bound'], inplace = True)
df = df.explode('bound')

df = df[['bound','code']]
df.head()
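
For comparison, here is an explode-free variant that uses NumPy instead. It is a sketch rather than the method used in this answer: the helper name expand_bounds is hypothetical, and it assumes every right_bound is greater than or equal to its left_bound. It leaves the input data frame unchanged.

```python
import numpy as np
import pandas as pd


def expand_bounds(frame):
    """Return one row per integer in [left_bound, right_bound] with its code.

    Hypothetical helper; assumes right_bound >= left_bound on every row.
    """
    left = frame['left_bound'].astype('int64')
    right = frame['right_bound'].astype('int64')

    # Number of integers covered by each row (both ends inclusive)
    counts = (right - left + 1).to_numpy()

    # Repeat each code once per covered integer
    codes = np.repeat(frame['code'].to_numpy(), counts)

    # Concatenate the integer ranges of all rows, in row order
    bounds = np.concatenate(
        [np.arange(lo, hi + 1) for lo, hi in zip(left, right)]
    )
    return pd.DataFrame({'bound': bounds, 'code': codes})


df = pd.DataFrame({
    'left_bound': ['1', '4', '10', '25'],
    'right_bound': ['3', '9', '24', '50'],
    'code': ['a', 'b', 'c', 'd'],
})

result = expand_bounds(df)
print(result.head())
print(len(result))
```

With the data from the question, result should have 50 rows, one for each integer from 1 to 50.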
