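I have a DataFrame in which one column holds a list of strings in each row. Each string contains a number and a period that I need to remove, but I can't figure out how to access the strings inside each row's list. Here is a sample DataFrame: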
df['column_name']
output:
['1.one','2.two','3. three','4.four ']
['1.one','2.two','3. three','4.four ','5.five']
['1.one','2.two','3. three']
...
I tried the following, and this is the output I got:
df4['column_name'].str[0].str.replace('\d+\.','')
output:
one
one
one
...
But I need output like this:
df4['column_name'].str[0].str.replace('\d+\.','')
output:
'one', 'two', 'three', 'four'
Likewise, I need to do this for every row of the DataFrame :(. Any help would be greatly appreciated!
Answer 0 (score: 1)
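You can try this to get a column of strings (on pandas 2.0 and later, pass regex=True, because str.replace no longer treats the pattern as a regular expression by default):
df['column_name'].str.join(',').str.replace(r'\d+\.|[ ]', '', regex=True).str.replace(',', ', ')
Or this to get a column of lists:
df['column_name'].str.join(',').str.replace(r'\d+\.|[ ]', '', regex=True).str.split(',')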
Output:
#first solution:
0 one, two, three, four
1 one, two, three, four, five
2 one, two, three
Name: column_name, dtype: object
#second solution:
0 [one, two, three, four]
1 [one, two, three, four, five]
2 [one, two, three]
Name: column_name, dtype: object
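As an alternative to the join/split approach above, here is a minimal sketch that cleans each list element by element, so the column stays a list throughout and strings that happen to contain commas or inner spaces are not affected. The sample frame is rebuilt from the question; the names `PREFIX`, `strip_numbering`, and the `cleaned` column are hypothetical, chosen only for illustration.

```python
import re

import pandas as pd

# Sample data mirroring the question (the frame is reconstructed for illustration).
df = pd.DataFrame({
    "column_name": [
        ["1.one", "2.two", "3. three", "4.four "],
        ["1.one", "2.two", "3. three", "4.four ", "5.five"],
        ["1.one", "2.two", "3. three"],
    ]
})

# Matches a leading number, the period after it, and any whitespace around them.
PREFIX = re.compile(r"^\s*\d+\.\s*")


def strip_numbering(items):
    """Return a new list with the 'N.' prefix and surrounding spaces removed.

    Hypothetical helper, not part of pandas.
    """
    return [PREFIX.sub("", item).strip() for item in items]


# apply() calls the helper once per row, so every list in the column is processed.
df["cleaned"] = df["column_name"].apply(strip_numbering)
print(df["cleaned"].tolist())
```

If a single string per row is preferred, something like `df["cleaned"].str.join(", ")` should give the same shape as the first solution. Note that this sketch only strips a prefix at the start of each string, whereas the one-liners above remove every `digits.` match and every space anywhere in the string.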