I want to split the string mystring = "0G15^GAC0T60T4^AA0C0" and get the following output using Python:
['0','G','15','^GAC','T','60','T','4','^AA','C']
This can be done in R with the following commands:
mystring <- "0G15^GAC0T60T4^AA0C0"
gsub("([\\^]*[ACGT]+)[0]*", " \\1 ", mystring)
How can I translate this R script to Python?
Thanks
Answer 0 (score: 5)
You can use Python's re module:
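import re
mystring = "0G15^GAC0T60T4^AA0C0"
# Surround each base run (with any leading ^) by spaces, dropping zeros
# that directly follow it, then split on whitespace
l = re.sub("([\\^]*[ACGT]+)[0]*", " \\1 ", mystring).split()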
Then l is:
['0', 'G', '15', '^GAC', 'T', '60', 'T', '4', '^AA', 'C']
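To see why this works, it can help to look at the string before `.split()` is applied. Here is a minimal sketch of the same substitution wrapped in a helper; the name `split_tokens` is only for illustration and does not come from the question, and the pattern is written with raw strings so the backslashes need no doubling:

```python
import re

def split_tokens(s):
    # Hypothetical helper name, used only for this illustration.
    # Group 1 captures any leading '^' plus a run of bases; the [0]* after
    # the group consumes zeros that directly follow, so they are dropped.
    padded = re.sub(r"([\^]*[ACGT]+)[0]*", r" \1 ", s)
    # split() with no arguments collapses the repeated spaces.
    return padded, padded.split()

padded, tokens = split_tokens("0G15^GAC0T60T4^AA0C0")
print(repr(padded))  # intermediate string with space-padded base runs
print(tokens)
```

The leading '0' survives because it does not follow a base run; only zeros immediately after a match are swallowed.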
Answer 1 (score: 1)
You can try this:
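import re
mystring = "0G15^GAC0T60T4^AA0C0"
# Raw string so that \^, \d and \w are not treated as invalid string escapes
new_data = re.findall(r'(?<!\^[GAC])\d+|(?<!\^)\w|\^[a-zA-Z]+', mystring)
# Drop any '0' that directly follows a ^-token; [:-1] then removes the
# final token (the trailing '0' in this input)
final_data = [a for i, a in enumerate(new_data) if a != '0' or not new_data[i-1].startswith("^")][:-1]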
Output:
['0', 'G', '15', '^GAC', 'T', '60', 'T', '4', '^AA', 'C']
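As a possible alternative that is not taken from either answer, the zeros can be discarded during matching instead of filtered out afterwards. This is only a sketch, and it assumes the same rule as the R pattern: zeros that directly follow a base run are dropped. The names `pairs` and `tokens` are my own:

```python
import re

mystring = "0G15^GAC0T60T4^AA0C0"

# Group 1: any leading carets plus a run of bases; the zeros right after it
# are matched outside the group and therefore discarded.
# Group 2: any other run of digits.
pairs = re.findall(r"(\^*[ACGT]+)0*|(\d+)", mystring)

# Exactly one of the two groups is non-empty for each match.
tokens = [bases or digits for bases, digits in pairs]
print(tokens)
# ['0', 'G', '15', '^GAC', 'T', '60', 'T', '4', '^AA', 'C']
```

Because the trailing zero is consumed together with the preceding base run, no `[:-1]` slice is needed here.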