Question

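I have a DataFrame in which one column holds a list of strings in each row. Every string contains a number and a period that I need to remove, but I can't work out how to access the strings inside each row's list. Here is a sample DataFrame: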

df['column_name']
output:
['1.one','2.two','3. three','4.four ']
['1.one','2.two','3. three','4.four ','5.five']
['1.one','2.two','3. three']
...

I tried the following, and this is my output:

df4['column_name'].str[0].str.replace('\d+\.','')
output:
one
one
one
...

But I need output like this:

df4['column_name'].str[0].str.replace('\d+\.','')
output:
'one', 'two', 'three', 'four'

I also have to do this for every row of the DataFrame :(. Any help would be greatly appreciated!

Answer 1

You can try this to get a column of string type:

df['column_name'].str.join(',').str.replace(r'\d+\.|[ ]', '', regex=True).str.replace(',', ', ')

Or this to get a column of list type:

df['column_name'].str.join(',').str.replace(r'\d+\.|[ ]', '', regex=True).str.split(',')

Output:

#first solution:
0          one, two, three, four
1    one, two, three, four, five
2                one, two, three
Name: column_name, dtype: object

#second solution:

0          [one, two, three, four]
1    [one, two, three, four, five]
2                [one, two, three]
Name: column_name, dtype: object
