I have a DataFrame like this:
import pandas as pd
import numpy as np
df = pd.DataFrame({'col1':['AA_L8_ZZ', 'AA_L08_YY', 'AA_L800_XX', 'AA_L0008_CC']})
df
col1
0 AA_L8_ZZ
1 AA_L08_YY
2 AA_L800_XX
3 AA_L0008_CC
I want to remove all the zeros that come directly after the character "L". My expected output:
col1
0 AA_L8_ZZ
1 AA_L8_YY
2 AA_L800_XX
3 AA_L8_CC
Answer 0 (score: 2)
In [114]: import pandas as pd
...: import numpy as np
...: df = pd.DataFrame({'col1':['AA_L8_ZZ', 'AA_L08_YY', 'AA_L800_XX', 'AA_L0008_CC']})
...: df
Out[114]:
col1
0 AA_L8_ZZ
1 AA_L08_YY
2 AA_L800_XX
3 AA_L0008_CC
In [115]: df.col1.str.replace(r"L0*", "L", regex=True)
Out[115]:
0 AA_L8_ZZ
1 AA_L8_YY
2 AA_L800_XX
3 AA_L8_CC
Name: col1, dtype: object
Answer 1 (score: 1)
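The pandas string replace is enough here. The following code finds any run of 0s that comes directly after an L and replaces those 0s with an empty string: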
df.col1.str.replace(r"(?<=L)0+", "", regex=True)
0 AA_L8_ZZ
1 AA_L8_YY
2 AA_L800_XX
3 AA_L8_CC
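To make the lookbehind pattern easier to try out, here is a minimal self-contained sketch. It assumes a pandas version in which str.replace treats the pattern as a literal string unless regex=True is passed. The second pattern is a hypothetical variant that is not part of the original answer: it adds a lookahead so that a lone zero such as the one in "AA_L0_ZZ" is kept.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['AA_L8_ZZ', 'AA_L08_YY', 'AA_L800_XX', 'AA_L0008_CC']})

# (?<=L) is a lookbehind: an "L" must sit right before the match,
# but it is not part of the match itself.
# 0+ then matches one or more zeros, which are replaced by the empty string.
result = df['col1'].str.replace(r"(?<=L)0+", "", regex=True)

# Hypothetical variant (an assumption, not from the answer):
# (?=\d) only strips zeros that are followed by another digit,
# so a value like "AA_L0_ZZ" keeps its single zero.
safe = df['col1'].str.replace(r"(?<=L)0+(?=\d)", "", regex=True)

print(result.tolist())
print(safe.tolist())
```

Whether the stricter variant is wanted depends on whether "L0" is a meaningful value in the data.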
If you need more speed, you can drop down to plain Python with a list comprehension:
import re
df["cleaned"] = [re.sub(r"(?<=L)0+", "", entry) for entry in df.col1]
df
col1 cleaned
0 AA_L8_ZZ AA_L8_ZZ
1 AA_L08_YY AA_L8_YY
2 AA_L800_XX AA_L800_XX
3 AA_L0008_CC AA_L8_CC
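A possible refinement of the list comprehension, offered as a sketch rather than a measured improvement: the pattern is compiled once up front, and non-string entries such as NaN are passed through unchanged, because re.sub raises a TypeError on them. The name pattern and the isinstance guard are assumptions added for illustration.

```python
import re

import pandas as pd

df = pd.DataFrame({'col1': ['AA_L8_ZZ', 'AA_L08_YY', 'AA_L800_XX', 'AA_L0008_CC']})

# Compile the pattern once instead of looking it up on every call.
pattern = re.compile(r"(?<=L)0+")

# Strings are cleaned; anything else (for example NaN) is left as it is.
df["cleaned"] = [
    pattern.sub("", entry) if isinstance(entry, str) else entry
    for entry in df["col1"]
]

print(df)
```

Any speed difference depends on the data size and the pandas version, so it is worth timing both forms on the real data before choosing one.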