Question

Suppose I have a pandas DataFrame in which two columns consist of lists of strings, as shown below:

df=pd.DataFrame( {'A' : [ ['a','b','c'], ['d','e','f'] ], 'B':[ ['g','h','i'], ['j','k','l'] ] })

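I want to join the first string in the first list of A with the first string in the first list of B using a hyphen, then the second with the second, and so on. The end result would be another column C such that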

df['C'] = [ ['a-g','b-h','c-i'], ['d-j','e-k','f-l' ] ]

I have tried various functions with apply() and map(), but nothing has produced the desired result. Any help is appreciated.

Answer 1

If the columns hold plain strings, you can use str.cat:

df['C'] = df.A.str.cat(df.B, sep='-')

df
#   A   B   C
#0  a   e   a-e
#1  b   f   b-f
#2  c   g   c-g
#3  d   h   d-h

Or add the two columns directly:

df.A + '-' + df.B

#0    a-e
#1    b-f
#2    c-g
#3    d-h
#dtype: object

Update for the edited data, where each cell holds a list:

df=pd.DataFrame({'A':[['a','b','c'], ['d','e','f']], 'B':[['g','h','i'], ['j','k','l']]})

df['C'] = df.apply(lambda r: [a+'-'+b for a,b in zip(r.A, r.B)], axis=1)

df
#           A           B                 C
#0  [a, b, c]   [g, h, i]   [a-g, b-h, c-i]
#1  [d, e, f]   [j, k, l]   [d-j, e-k, f-l]
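
As a hedged alternative to the row-wise apply above, the same pairing can be sketched with a nested list comprehension over the two columns, which avoids constructing a Series object for every row. The variable names here are illustrative only, and note that zip silently truncates to the shorter list when the lengths differ.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [["a", "b", "c"], ["d", "e", "f"]],
    "B": [["g", "h", "i"], ["j", "k", "l"]],
})

# Outer zip walks the two columns row by row;
# inner zip pairs up the elements of the two lists in that row.
df["C"] = [
    [f"{a}-{b}" for a, b in zip(list_a, list_b)]
    for list_a, list_b in zip(df["A"], df["B"])
]
```
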

Answer 2

I would use apply together with np.core.defchararray.add:

from numpy.core.defchararray import add

df['C'] = df[['A', 'B']].apply(lambda x: add(add(x.A, '-'), x.B).tolist(), 1)
df

           A          B                C
0  [a, b, c]  [g, h, i]  [a-g, b-h, c-i]
1  [d, e, f]  [j, k, l]  [d-j, e-k, f-l]

Keep in mind what I said about storing data in lists.

If the lists in your two columns might not be the same length, you can add an if check:

def foo(x):
    if len(x.A) == len(x.B):
        return add(add(x.A, '-'), x.B).tolist()
    return []

df['C'] = df[['A', 'B']].apply(foo, 1)
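
For reference, here is a self-contained sketch of the same length check. It assumes the public alias np.char.add, which exposes the same element-wise string addition; recent NumPy releases discourage importing from numpy.core directly. The helper name join_lists and the sample data with one mismatched row are hypothetical.

```python
import numpy as np
import pandas as pd

def join_lists(row, sep="-"):
    """Hypothetical helper: join row.A and row.B element-wise, or return [] on a length mismatch."""
    if len(row["A"]) != len(row["B"]):
        return []
    # np.char.add concatenates strings element-wise and broadcasts the scalar separator.
    return np.char.add(np.char.add(row["A"], sep), row["B"]).tolist()

# Illustrative data: the second row has lists of different lengths.
df = pd.DataFrame({
    "A": [["a", "b", "c"], ["d", "e"]],
    "B": [["g", "h", "i"], ["j", "k", "l"]],
})

df["C"] = df.apply(join_lists, axis=1)
```

Returning an empty list for mismatched rows mirrors the foo function above; raising an error instead may be preferable if a mismatch indicates bad data.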

Answer 3

Option 1
Using numpy.core.defchararray.add

import numpy as np
from numpy.core.defchararray import add

a = np.array(df.values.tolist())

df.assign(C=add(add(a[:, 0], '-'), a[:, 1]).tolist())

           A          B                C
0  [a, b, c]  [g, h, i]  [a-g, b-h, c-i]
1  [d, e, f]  [j, k, l]  [d-j, e-k, f-l]
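
Option 1 relies on every list having the same length, so that the nested lists convert into a regular 3-D string array. The sketch below makes that shape explicit; it assumes the public alias np.char.add in place of the numpy.core import path, and the intermediate names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [["a", "b", "c"], ["d", "e", "f"]],
    "B": [["g", "h", "i"], ["j", "k", "l"]],
})

# Shape is (n_rows, n_columns, list_length); this only works for equal-length lists.
a = np.array(df[["A", "B"]].values.tolist())

# a[:, 0] holds all lists from column A, a[:, 1] all lists from column B.
joined = np.char.add(np.char.add(a[:, 0], "-"), a[:, 1])

# assign returns a new DataFrame and leaves df unchanged.
result = df.assign(C=joined.tolist())
```
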

Option 2
The fun way: use a custom subclass of list and redefine +

class list_(list):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def __add__(self, other):
        return list_(map('-'.join, (map(str, t) for t in zip(self, other))))

df.assign(C=df.applymap(list_).sum(1).apply(list))

           A          B                C
0  [a, b, c]  [g, h, i]  [a-g, b-h, c-i]
1  [d, e, f]  [j, k, l]  [d-j, e-k, f-l]
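
To see why summing across a row produces the hyphenated pairs, the overloaded + can be exercised without pandas. This is a minimal sketch: the class name HyphenList and the use of functools.reduce to stand in for the row-wise sum are assumptions made for illustration.

```python
from functools import reduce

class HyphenList(list):
    """Hypothetical list subclass whose + joins paired elements with a hyphen."""

    def __add__(self, other):
        return HyphenList("-".join(map(str, pair)) for pair in zip(self, other))

rows = [
    (["a", "b", "c"], ["g", "h", "i"]),
    (["d", "e", "f"], ["j", "k", "l"]),
]

# Folding + over the cells of a row mimics what a row-wise sum does with these objects.
combined = [
    list(reduce(lambda left, right: left + right, map(HyphenList, row)))
    for row in rows
]
```

Because the fold works pairwise, the same approach extends to three or more list columns, producing strings such as 'a-g-x'.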

One-to-one mapping between two lists of strings in a pandas DataFrame
