Question

Let's say I have this pandas DataFrame:

    year    text_1                 text_2
0   1999    ['Sunny', 'weather']   ['Foggy', 'weather']
1   2005    ['Rainy', 'weather']   ['Cloudy', 'weather']

I want to convert it to this:

    year    text_1           text_2
0   1999    'Sunny weather'  'Foggy weather'
1   2005    'Rainy weather'  'Cloudy weather'

To do that, I tried this:

df[['text_1', 'text_2']] = df[['text_1', 'text_2']].apply(lambda x: ' '.join(x), axis=1)

But then I get the following error:

TypeError: ('sequence item 0: expected str instance, list found', 'occurred at index 0')

I also tried this:

df = df.apply(lambda x: ' '.join(x['text_1'], x['text_2'],), axis=1)

But then I get the following error:

TypeError: ('join() takes exactly one argument (2 given)', 'occurred at index 0')

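How can I apply this function to multiple columns in a single line?

I say a single line because I could get it working by applying the function to each column separately, or by defining a function and calling it.

However, I am looking for the most concise solution.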

Answer 1

If you need to process each value element-wise, use DataFrame.applymap:

df[['text_1', 'text_2']] = df[['text_1', 'text_2']].applymap(' '.join)
print (df)
   year         text_1          text_2
0  1999  Sunny weather   Foggy weather
1  2005  Rainy weather  Cloudy weather
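
For reference, here is a self-contained sketch of the element-wise approach, with the DataFrame rebuilt from the question. One assumption that is not part of the original answer: pandas 2.1 and later deprecate DataFrame.applymap in favor of DataFrame.map, so the sketch uses whichever of the two is available.

```python
import pandas as pd

# Rebuild the DataFrame from the question: each text cell holds a list of words.
df = pd.DataFrame({
    "year": [1999, 2005],
    "text_1": [["Sunny", "weather"], ["Rainy", "weather"]],
    "text_2": [["Foggy", "weather"], ["Cloudy", "weather"]],
})

cols = ["text_1", "text_2"]

# DataFrame.map exists from pandas 2.1 on; older versions only have applymap.
elementwise = getattr(df[cols], "map", None) or df[cols].applymap

# ' '.join is called once per cell, so it always receives a list of strings.
df[cols] = elementwise(" ".join)
print(df)
```

Because the function runs per cell rather than per row, the "expected str instance, list found" error from the question cannot occur here.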

Or combine DataFrame.apply with Series.str.join:

df[['text_1', 'text_2']] = df[['text_1', 'text_2']].apply(lambda x: x.str.join(' '))
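
A likely reason this works while the first attempt in the question fails: without axis=1, DataFrame.apply hands each column to the lambda as a Series, and Series.str.join joins the list stored in every cell. With axis=1 the lambda receives a whole row, so ' '.join iterates over the row's cells, which are lists rather than strings. A small sketch contrasting the two, using the question's data (the variable names are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    "year": [1999, 2005],
    "text_1": [["Sunny", "weather"], ["Rainy", "weather"]],
    "text_2": [["Foggy", "weather"], ["Cloudy", "weather"]],
})
cols = ["text_1", "text_2"]

# Row-wise: ' '.join iterates over the row's cells, and each cell is a list,
# so str.join rejects the first item with a TypeError.
try:
    df[cols].apply(lambda row: " ".join(row), axis=1)
    row_wise_failed = False
except TypeError as exc:
    row_wise_failed = True
    print("row-wise attempt failed:", exc)

# Column-wise (default axis=0): each column arrives as a Series,
# and .str.join joins the list stored in every cell of that column.
df[cols] = df[cols].apply(lambda col: col.str.join(" "))
print(df)
```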

Answer 2

Sample data

             A             B
0  [asdf, asf]  [eeee, tttt]

df['combined'] = df.apply(lambda x: [' '.join(i) for i in list(x[['A','B']])], axis=1)

Output

             A             B               combined
0  [asdf, asf]  [eeee, tttt]  [asdf asf, eeee tttt]

Update

df[['A','B']] = df.apply(lambda x: pd.Series([' '.join(x['A']),' '.join(x['B'])]), axis=1)

Output

          A          B
0  asdf asf  eeee tttt
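
To make the update reproducible, here is a sketch that builds the sample frame first. It differs from the code above in one respect, which is my own addition: the Series returned for each row carries explicit A and B labels, so the result columns are named rather than numbered 0 and 1.

```python
import pandas as pd

# Sample frame from this answer: each cell holds a list of strings.
df = pd.DataFrame({"A": [["asdf", "asf"]], "B": [["eeee", "tttt"]]})

# Row-wise apply: build one Series per row holding the joined strings.
# The explicit "A"/"B" labels are an addition to the answer's code.
joined = df.apply(
    lambda row: pd.Series({"A": " ".join(row["A"]), "B": " ".join(row["B"])}),
    axis=1,
)

# Write the joined strings back over the original list columns.
df[["A", "B"]] = joined
print(df)
```

This variant names every column inside the lambda, so it grows with the number of columns; the element-wise approach in Answer 1 avoids that.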

Applying a lambda function to multiple columns
