Question

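I want to replace a pattern in a pandas Series in which each row contains a list of strings. The idea is to search for the pattern in every string of the list that belongs to a row. The dataset has several rows, and these particular columns consist of lists of strings.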

import pandas as pd

# `data` is used instead of `input` to avoid shadowing the built-in function
data = {'1': [['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd']]}
df = pd.DataFrame(data)
print(df)

Now I want to replace every 'a' with 'e' in each list of strings, in every row.

Answer 1

Here is one way:

In [118]: df['1'].apply(lambda x: ['e' if v=='a' else v for v in x])
Out[118]:
0    [e, b, c, d]
1    [e, b, c, d]
2    [e, b, c, d]
3    [e, b, c, d]
4    [e, b, c, d]
Name: 1, dtype: object

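Another way (in Python 3, map returns a lazy iterator, so it is wrapped in list to get the output shown):

In [119]: df['1'].apply(lambda x: list(map(lambda v: 'e' if v=='a' else v, x)))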
Out[119]:
0    [e, b, c, d]
1    [e, b, c, d]
2    [e, b, c, d]
3    [e, b, c, d]
4    [e, b, c, d]
Name: 1, dtype: object

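Alternatively, use df.applymap(func) to apply the same function to every column, where func is a lambda like the ones above. In pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map.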

Details:

In [120]: df
Out[120]:
              1
0  [a, b, c, d]
1  [a, b, c, d]
2  [a, b, c, d]
3  [a, b, c, d]
4  [a, b, c, d]
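
The element-wise suggestion above can be sketched as follows. This is a minimal sketch, assuming every cell of the frame holds a list of strings; the helper `replace_in_list`, the frame `df_all`, and its second column '2' are hypothetical names added for illustration. Because `applymap` is deprecated in recent pandas releases, the sketch picks `DataFrame.map` when it is available and falls back to `applymap` otherwise.

```python
import pandas as pd


def replace_in_list(cell, old="a", new="e"):
    """Return a new list in which elements equal to `old` become `new`."""
    return [new if v == old else v for v in cell]


# Hypothetical frame with two list-valued columns
df_all = pd.DataFrame({
    "1": [["a", "b", "c", "d"]] * 3,
    "2": [["d", "a"]] * 3,
})

# DataFrame.map exists in pandas >= 2.1; older versions only offer applymap
elementwise = df_all.map if hasattr(df_all, "map") else df_all.applymap
replaced = elementwise(replace_in_list)
print(replaced)
```

The helper builds a new list for each cell, so the original frame is left unchanged.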

Answer 2

Using dict.get with a default value is a fun way to do this, combined with a comprehension:

df['1'] = [[{'a': 'e'}.get(x, x) for x in r] for r in df['1'].values.tolist()]
df

              1
0  [e, b, c, d]
1  [e, b, c, d]
2  [e, b, c, d]
3  [e, b, c, d]
4  [e, b, c, d]
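
The dict.get idea extends naturally to several replacements at once. The following is a minimal sketch under that assumption; the `mapping` dictionary (including the extra 'd' to 'z' entry) and the name `df2` are hypothetical and only serve as illustration.

```python
import pandas as pd

df2 = pd.DataFrame({"1": [["a", "b", "c", "d"]] * 5})

# Hypothetical mapping: strings not listed here are kept unchanged,
# because dict.get falls back to its second argument.
mapping = {"a": "e", "d": "z"}

df2["1"] = [[mapping.get(x, x) for x in row] for row in df2["1"]]
print(df2)
```

Only exact matches are replaced here; a string such as 'ab' would not be touched, since it is not a key of the mapping.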

Answer 3

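The answers above work, but this works too. Note that str.replace substitutes every "a" inside each string, not only the elements that are exactly "a":

import pandas as pd

input = {'1': [['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd']]}
df = pd.DataFrame(input)
# Assign back to the column so that df remains a DataFrame
df['1'] = df['1'].apply(lambda x: [v.replace("a", "e") for v in x])
print(df['1'])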

Output:

0    [e, b, c, d]
1    [e, b, c, d]
2    [e, b, c, d]
3    [e, b, c, d]
4    [e, b, c, d]
Name: 1, dtype: object
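
Since the question asks about a pattern, it may help to compare exact matching, substring replacement, and a regular expression side by side. This is a minimal sketch; the sample series `s` and the pattern `a$` (an "a" at the end of a string) are assumptions chosen only to make the differences visible.

```python
import re

import pandas as pd

s = pd.Series([["a", "banana", "c"], ["cat", "a"]])

# Exact match: only elements equal to "a" change
exact = s.apply(lambda row: ["e" if v == "a" else v for v in row])

# str.replace: every "a" inside every string changes ("banana" -> "benene")
substring = s.apply(lambda row: [v.replace("a", "e") for v in row])

# Regular expression: only an "a" at the end of a string changes
pattern = re.compile(r"a$")
regex = s.apply(lambda row: [pattern.sub("e", v) for v in row])

print(exact.tolist())
print(substring.tolist())
print(regex.tolist())
```

With single-character strings like those in the question, all three approaches give the same result; they diverge as soon as the lists contain longer strings.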

Answer 4

Let's rebuild the DataFrame:

df=pd.DataFrame({'1':df['1'].apply(pd.Series).replace({'a':'e'}).values.tolist()})
df
Out[274]: 
              1
0  [e, b, c, d]
1  [e, b, c, d]
2  [e, b, c, d]
3  [e, b, c, d]
4  [e, b, c, d]
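
One caveat with the rebuild approach: when the lists have different lengths, expanding them with apply(pd.Series) pads the shorter rows with NaN, and those values end up in the rebuilt lists. The sketch below shows one possible way to filter them out; the frame `ragged` is hypothetical, and the filter assumes the original lists contain no missing values of their own.

```python
import pandas as pd

# Hypothetical frame whose lists have unequal lengths
ragged = pd.DataFrame({"1": [["a", "b", "c"], ["a"], ["b", "a"]]})

# One column per list position; shorter lists are padded with NaN
wide = ragged["1"].apply(pd.Series)
wide = wide.replace({"a": "e"})

# Drop the NaN padding while turning each row back into a list
rebuilt = pd.DataFrame({
    "1": [[v for v in row if pd.notna(v)] for row in wide.values.tolist()]
})
print(rebuilt)
```

For lists of equal length, as in the question, the filter changes nothing and the result matches the one-liner above.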

Replacing a pattern in a pandas Series where each row contains a list of strings
