Background
I have the following toy df, which contains lists in the columns Before and After, as shown below:
import pandas as pd
before = [list(['in', 'the', 'bright', 'blue', 'box']),
          list(['because', 'they', 'go', 'really', 'fast']),
          list(['to', 'ride', 'and', 'have', 'fun'])]
after = [list(['there', 'are', 'many', 'different']),
         list(['i', 'like', 'a', 'lot', 'of', 'sports']),
         list(['the', 'middle', 'east', 'has', 'many'])]
df = pd.DataFrame({'Before': before,
                   'After': after,
                   'P_ID': [1, 2, 3],
                   'Word': ['crayons', 'cars', 'camels'],
                   'N_ID': ['A1', 'A2', 'A3']
                   })
Output
                              Before                           After  N_ID  P_ID     Word
0       [in, the, bright, blue, box]   [there, are, many, different]    A1     1  crayons
1  [because, they, go, really, fast]   [i, like, a, lot, of, sports]    A2     2     cars
2         [to, ride, and, have, fun]  [the, middle, east, has, many]    A3     3   camels
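Problem
Using the following code block, taken from "Removing commas and unlisting a dataframe":
df.loc[:, ['After', 'Before']] = df[['After', 'Before']].apply(lambda x: x.str[0].str.replace(',', ''))
produces the following output:
Output that is close to what I want, but not quite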
    Before  After  N_ID  P_ID     Word
0       in  there    A1     1  crayons
1  because      i    A2     2     cars
2       to    the    A3     3   camels
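This output is close, but it is not exactly what I am looking for: the After and Before columns contain only a single word each (e.g. there), whereas I want the output to look like this:
Desired output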
                        Before                     After  N_ID  P_ID     Word
0       in the bright blue box  there are many different    A1     1  crayons
1  because they go really fast    i like a lot of sports    A2     2     cars
2         to ride and have fun  the middle east has many    A3     3   camels
Question
How do I get my desired output?
Answer 0 (score: 4)
Use agg + join. The commas are not in the lists themselves; they are only part of each list's __repr__.
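str_cols = ['Before', 'After']
# Map each list column to ' '.join so every list in it becomes one string
d = {k: ' '.join for k in str_cols}
# columns= replaces the positional axis argument, which pandas 2.0+ rejects
df.agg(d).join(df.drop(columns=str_cols))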
                        Before                     After  P_ID     Word  N_ID
0       in the bright blue box  there are many different     1  crayons    A1
1  because they go really fast    i like a lot of sports     2     cars    A2
2         to ride and have fun  the middle east has many     3   camels    A3
If you want to do it in place (faster):
df[str_cols] = df.agg(d)
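To illustrate the point about __repr__, and why the attempt in the question returned only one word per cell, here is a minimal sketch on a one-row Series. The variable names are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# One cell holding a real Python list, as in the Before column
s = pd.Series([['in', 'the', 'bright', 'blue', 'box']])
cell = s.iloc[0]

# None of the list elements contains a comma ...
has_comma_in_elements = any(',' in word for word in cell)
# ... the commas only show up in the printed representation of the list
has_comma_in_repr = ',' in repr(cell)

# .str[0] indexes into each list, so it keeps only the first element
first_only = s.str[0].iloc[0]

# Joining the elements produces the full phrase
joined = ' '.join(cell)

print(has_comma_in_elements, has_comma_in_repr)
print(first_only)
print(joined)
```

Since there are no commas to strip, str.replace has nothing to do here; the only step needed is the join.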
Answer 1 (score: 3)
applymap
A new copy of the dataframe with the desired result
df.assign(**df[['After', 'Before']].applymap(' '.join))
                        Before                     After  P_ID     Word  N_ID
0       in the bright blue box  there are many different     1  crayons    A1
1  because they go really fast    i like a lot of sports     2     cars    A2
2         to ride and have fun  the middle east has many     3   camels    A3
Modify the existing df
df.update(df[['After', 'Before']].applymap(' '.join))
df
                        Before                     After  P_ID     Word  N_ID
0       in the bright blue box  there are many different     1  crayons    A1
1  because they go really fast    i like a lot of sports     2     cars    A2
2         to ride and have fun  the middle east has many     3   camels    A3
stack and str.join
We can use this result in the same "inline" and "in place" ways shown above.
df[['After', 'Before']].stack().str.join(' ').unstack()
                      After                       Before
0  there are many different       in the bright blue box
1    i like a lot of sports  because they go really fast
2  the middle east has many         to ride and have fun
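As a rough sketch of what using the stack result "inline" and "in place" could look like, the example below builds a smaller two-row frame and feeds the joined result into assign and into a column assignment. The names cols, joined and new_df are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Before': [['in', 'the', 'bright', 'blue', 'box'],
               ['to', 'ride', 'and', 'have', 'fun']],
    'After': [['there', 'are', 'many', 'different'],
              ['the', 'middle', 'east', 'has', 'many']],
    'P_ID': [1, 2],
})

cols = ['After', 'Before']

# Stack the two list columns into one Series, join every list, then unstack
joined = df[cols].stack().str.join(' ').unstack()

# "Inline": build a new frame and leave df untouched at this point
new_df = df.assign(**joined)

# "In place": overwrite the list columns of df itself
df[cols] = joined

print(new_df)
print(df)
```

Both routes keep the original column order of df, because assign and the column assignment only replace existing columns.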
Answer 2 (score: 2)
We can specify which list columns to convert to strings, then use .apply in a for loop:
lst_cols = ['Before', 'After']
for col in lst_cols:
    # Join the words of every list in this column into a single string
    df[col] = df[col].apply(' '.join)
                        Before                     After  P_ID     Word  N_ID
0       in the bright blue box  there are many different     1  crayons    A1
1  because they go really fast    i like a lot of sports     2     cars    A2
2         to ride and have fun  the middle east has many     3   camels    A3
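If the original frame should stay untouched, the same loop can be wrapped in a small function that works on a copy. This is only a sketch: join_list_columns and its sep parameter are hypothetical names introduced here, and it assumes every cell in the given columns is a list of strings.

```python
import pandas as pd


def join_list_columns(frame, columns, sep=' '):
    """Return a copy of frame with each list in the given columns joined into one string."""
    out = frame.copy()
    for col in columns:
        # sep.join is called once per cell, turning ['a', 'b'] into 'a b'
        out[col] = out[col].apply(sep.join)
    return out


df = pd.DataFrame({
    'Before': [['in', 'the', 'bright', 'blue', 'box'],
               ['to', 'ride', 'and', 'have', 'fun']],
    'After': [['there', 'are', 'many', 'different'],
              ['the', 'middle', 'east', 'has', 'many']],
    'Word': ['crayons', 'camels'],
})

result = join_list_columns(df, ['Before', 'After'])
print(result)
```

Here df still holds the lists afterwards, while result holds the joined strings.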