How do I create a new column of zipped list items from two separate columns in a DataFrame?

Time: 2019-11-26 19:34:31

Tags: python pandas dataframe

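This question follows from one I asked earlier, "Pandas groupby make two columns lists separately". This time, I want to create a new column in which each value is a list of tuples, formed by zipping together the values of two other columns. For example: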

# Original DataFrame
      fruit      sport                       weather
0     apple      [baseball, basketball]      [sunny, windy]
1     banana     [swimming, hockey]          [cloudy, windy]
2     orange     [football]                  [sunny]


# Desired DataFrame
      fruit      sport                       weather             pairs
0     apple      [baseball, basketball]      [sunny, windy]      [(baseball, sunny), (basketball, windy)]
1     banana     [swimming, hockey]          [cloudy, windy]     [(swimming, cloudy), (hockey, windy)]
2     orange     [football]                  [sunny]             [(football, sunny)]

I tried the following code, but it gave me something else:

df['pairs'] = list(zip(df['sport'], df['weather']))

# Output DataFrame
      fruit      sport                       weather             pairs
0     apple      [baseball, basketball]      [sunny, windy]      ([baseball, basketball], [sunny, windy])
1     banana     [swimming, hockey]          [cloudy, windy]     ([swimming, hockey], [cloudy, windy])
2     orange     [football]                  [sunny]             ([football], [sunny])

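As you can see, it is the "opposite" of what I want: each row's two lists are paired as a whole instead of element by element. How should I do this? Thanks in advance.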

3 Answers:

Answer 0 (score: 2)

I think you are missing another list(zip()):

df['pairs'] = list(list(zip(a,b)) for a,b in zip(df['sport'], df['weather']))

Output:

    fruit    sport                       weather              pairs
 0  apple    ['baseball', 'basketball']  ['sunny', 'windy']   [('baseball', 'sunny'), ('basketball', 'windy')]
 1  banana   ['swimming', 'hockey']      ['cloudy', 'windy']  [('swimming', 'cloudy'), ('hockey', 'windy')]
 2  orange   ['football']                ['sunny']            [('football', 'sunny')]
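
To make the two levels of zip explicit, here is a minimal self-contained sketch. It rebuilds the example frame from the question (the construction code is an assumption, since the question only shows the printed frame) and uses a list comprehension that should behave the same as the one-liner above.

```python
import pandas as pd

# Rebuild the example frame shown in the question (construction is assumed).
df = pd.DataFrame({
    "fruit": ["apple", "banana", "orange"],
    "sport": [["baseball", "basketball"], ["swimming", "hockey"], ["football"]],
    "weather": [["sunny", "windy"], ["cloudy", "windy"], ["sunny"]],
})

# Outer zip: walks both columns row by row, yielding (sport_list, weather_list).
# Inner zip: pairs up the elements of those two lists.
df["pairs"] = [list(zip(s, w)) for s, w in zip(df["sport"], df["weather"])]

print(df["pairs"].tolist())
```

The attempt in the question only had the outer zip, which is why each row ended up as a tuple holding the two original lists.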

Answer 1 (score: 1)

Use DataFrame.apply with axis=1 together with zip:

df['pairs'] = df.apply(lambda x: list(zip(x['sport'], x['weather'])), axis=1)
    fruit                   sport          weather                                     pairs
0   apple  [baseball, basketball]   [sunny, windy]  [(baseball, sunny), (basketball, windy)]
1  banana      [swimming, hockey]  [cloudy, windy]     [(swimming, cloudy), (hockey, windy)]
2  orange              [football]          [sunny]                       [(football, sunny)]
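
As a rough check, the sketch below compares the row-wise apply with a plain comprehension over the two columns. The frame construction and the helper name zip_row are assumptions made for illustration. Both routes should give the same lists; on large frames the comprehension is often faster because apply with axis=1 builds a Series for every row, though that is worth measuring on your own data.

```python
import pandas as pd

# Example frame from the question (construction is assumed).
df = pd.DataFrame({
    "fruit": ["apple", "banana", "orange"],
    "sport": [["baseball", "basketball"], ["swimming", "hockey"], ["football"]],
    "weather": [["sunny", "windy"], ["cloudy", "windy"], ["sunny"]],
})

def zip_row(row):
    """Hypothetical helper: pair up the two list cells of a single row."""
    return list(zip(row["sport"], row["weather"]))

# axis=1 passes each row to zip_row as a Series indexed by column name.
via_apply = df.apply(zip_row, axis=1)

# The same result without apply.
via_comprehension = [list(zip(s, w)) for s, w in zip(df["sport"], df["weather"])]

print(via_apply.tolist() == via_comprehension)
```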

Answer 2 (score: 1)

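You can take advantage of the fact that map, when given several iterables, steps through them in parallel (in effect a built-in zip), and do the following: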

df['pairs'] = [list(x) for x in map(zip, df['sport'], df['weather'])]
print(df)

Output

    fruit  ...                                     pairs
0   apple  ...  [(baseball, sunny), (basketball, windy)]
1  banana  ...     [(swimming, cloudy), (hockey, windy)]
2  orange  ...                       [(football, sunny)]

[3 rows x 4 columns]
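
Why map(zip, ...) works can be seen with plain lists and no pandas at all. This is only an illustrative sketch; the variable names are made up for the example.

```python
sports = [["baseball", "basketball"], ["football"]]
weather = [["sunny", "windy"], ["sunny"]]

# map(f, xs, ys) calls f(xs[0], ys[0]), then f(xs[1], ys[1]), and so on,
# so map(zip, sports, weather) produces one zip object per row.
via_map = [list(z) for z in map(zip, sports, weather)]

# Equivalent spelling with an explicit outer zip.
via_zip = [list(zip(s, w)) for s, w in zip(sports, weather)]

print(via_map)
# [[('baseball', 'sunny'), ('basketball', 'windy')], [('football', 'sunny')]]
```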

Or you can use itertuples:

df['pairs'] = [list(zip(*x)) for x in df[['sport', 'weather']].itertuples(index=False)]
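
A short sketch of what the itertuples version is doing, again assuming the example frame is built as shown here: each row of the two-column slice comes back as a namedtuple holding the two lists, and zip(*row) unpacks it into zip(sport_list, weather_list).

```python
import pandas as pd

# Example frame from the question (construction is assumed).
df = pd.DataFrame({
    "fruit": ["apple", "banana", "orange"],
    "sport": [["baseball", "basketball"], ["swimming", "hockey"], ["football"]],
    "weather": [["sunny", "windy"], ["cloudy", "windy"], ["sunny"]],
})

# With index=False, each row is a namedtuple with fields sport and weather.
rows = list(df[["sport", "weather"]].itertuples(index=False))
first = rows[0]
print(first.sport, first.weather)

# zip(*row) is the same as zip(row.sport, row.weather).
df["pairs"] = [list(zip(*row)) for row in rows]
print(df["pairs"].tolist())
```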