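Hello, I have a list of strings, for example:
liste_to_split=['NW_011625257.1_0','scaffold1_3','scaffold3']
and I would like to split each of them where an underscore sits between two numbers (Number_Number).
I tried: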
for i in liste_to_split:
    print(i.split(r'(?<=[0-9])_'))
I get:
['NW_011625257.1_0']
['scaffold1_3']
['scaffold3']
instead of:
['NW_011625257.1'] ['0']
['scaffold1'] ['3']
['scaffold3']
Does anyone know what the problem is?
Answer 0 (score: 1)
You can use:
>>> import re
>>> liste_to_split=['NW_011625257.1_0','scaffold1_3','scaffold3']
>>>
>>> for i in liste_to_split:
...     re.split(r'(?<=[0-9])_', i)
...
['NW_011625257.1', '0']
['scaffold1', '3']
['scaffold3']
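Note the use of re.split instead of str.split. str.split treats its argument as a literal separator, so the regex text is never found and each string comes back unsplit, while re.split interprets it as a regular expression. The lookbehind assertion (?<=[0-9]) placed before _ requires a digit immediately before the underscore without consuming it, so only the underscore itself is removed and we never split on a zero-width match.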
Based on the OP's comment below, it looks like the OP wants to do this split on a DataFrame column.
In that case, assuming this is your DataFrame:
>>> print(df)
             column
0  NW_011625257.1_0
1       scaffold1_3
2         scaffold3
Then you can use:
>>> print(df['column'].str.split(r'(?<=[0-9])_', expand=True))
                0     1
0  NW_011625257.1     0
1       scaffold1     3
2       scaffold3  None
Answer 1 (score: 1)
l=['NW_011625257.1_0','scaffold1_3','scaffold3']
for i in l:
    f = i.split('_')
    print(f)
Output:
['NW', '011625257.1', '0']
['scaffold1', '3']
['scaffold3']
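For comparison with the plain split('_') above, which breaks NW_011625257.1_0 at both underscores, here is a minimal sketch of a stricter variant that only splits where a digit sits on both sides of the underscore (the Number_Number case from the question). The helper name `split_between_digits`, the added lookahead `(?=[0-9])`, and the extra test string `scaffold1_b` are assumptions for illustration and do not come from either answer.

```python
import re

# Hypothetical stricter pattern: an underscore with a digit before AND after it.
BETWEEN_DIGITS = re.compile(r'(?<=[0-9])_(?=[0-9])')


def split_between_digits(name):
    """Split name only at underscores that sit between two digits."""
    return BETWEEN_DIGITS.split(name)


# 'scaffold1_b' is a made-up example: a digit precedes the underscore,
# but a letter follows it, so the stricter pattern leaves it intact.
names = ['NW_011625257.1_0', 'scaffold1_3', 'scaffold3', 'scaffold1_b']

for name in names:
    # Left: split on every underscore. Right: split only between digits.
    print(name.split('_'), split_between_digits(name))
```

Whether the lookahead is wanted depends on the data: if a name such as `scaffold1_b` should also be split, the one-sided lookbehind pattern from Answer 0 is the better fit.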