我有一个数据框,其中包含一个列中的元组列表。我需要将列表元组拆分为相应的列。我的数据框df如下所示: -
A B
[('Apple',50),('Orange',30),('banana',10)] Winter
[('Orange',69),('WaterMelon',50)] Summer
预期输出应为:
Fruit rate B
Apple 50 winter
Orange 30 winter
banana 10 winter
Orange 69 summer
WaterMelon 50 summer
答案 0 :(得分:1)
这应该有效:
fruits = []
rates = []
seasons = []
def create_lists(row):
tuples = row['A']
season = row['B']
for t in tuples:
fruits.append(t[0])
rates.append(t[1])
seasons.append(season)
df.apply(create_lists, axis=1)
new_df = pd.DataFrame({"Fruit" :fruits, "Rate": rates, "B": seasons})[["Fruit", "Rate", "B"]]
输出:
Fruit Rate B
0 Apple 50 winter
1 Orange 30 winter
2 banana 10 winter
3 Orange 69 summer
4 WaterMelon 50 summer
答案 1 :(得分:1)
您可以将DataFrame
构造函数与numpy.repeat
和numpy.concatenate
一起使用:
df1 = pd.DataFrame(np.concatenate(df.A), columns=['Fruit','rate']).reset_index(drop=True)
df1['B'] = np.repeat(df.B.values, df['A'].str.len())
print (df1)
Fruit rate B
0 Apple 50 Winter
1 Orange 30 Winter
2 banana 10 Winter
3 Orange 69 Summer
4 WaterMelon 50 Summer
chain.from_iterable
的另一种解决方案:
from itertools import chain
df1 = pd.DataFrame(list(chain.from_iterable(df.A)), columns=['Fruit','rate'])
.reset_index(drop=True)
df1['B'] = np.repeat(df.B.values, df['A'].str.len())
print (df1)
Fruit rate B
0 Apple 50 Winter
1 Orange 30 Winter
2 banana 10 Winter
3 Orange 69 Summer
4 WaterMelon 50 Summer
答案 2 :(得分:0)
您可以在链式操作中执行此操作:
(
df.apply(lambda x: [[k,v,x.B] for k,v in x.A],axis=1)
.apply(pd.Series)
.stack()
.apply(pd.Series)
.reset_index(drop=True)
.rename(columns={0:'Fruit',1:'rate',2:'B'})
)
Out[1036]:
Fruit rate B
0 Apple 50 Winter
1 Orange 30 Winter
2 banana 10 Winter
3 Orange 69 Summer
4 WaterMelon 50 Summer