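I'm having trouble processing user input.
Input: C6H12O6
Expected output: ["C", 6, "H", 12, "O", 6]
I want to check that each symbol corresponds to a valid atomic element; the valid elements are already stored in my database. But I'm having trouble getting array output like this.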
def createcompound(self, query):
    validatom = False
    query = 'C6H12O6'
    result = []
    firstcompound = query[0:query.find(" ")]
    for char in firstcompound:
        for atom in self.atoms:
            if char == atom.symbol:
                validatom = True
    symbolcount = 0
    if validatom:
        for char in firstcompound:
            if not (char.isdigit()):
                symbolcount += 1
        print (firstcompound[0:symbolcount])
        print (firstcompound[symbolcount::])
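More importantly, the output needs to work for other chemical formulas as well, but so far my O(n^2) approach only works in certain cases.
How can I do this in native Python 3.6?
Answer 0 (score: 1)
This separates the digits and the letters into different elements of a list.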
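import re

str1 = "C6H12O6"
# [A-Za-z] instead of [A-z]: the range A-z also matches [ \ ] ^ _ and `
match = re.findall(r"([A-Za-z]+)([0-9]*)", str1)
lst = []
for item in match:
    x, y = item
    lst.append(x)
    lst.append(y)
# drop the empty strings left behind when a letter run has no trailing digits
print([x for x in lst if x])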
Output
['C', '6', 'H', '12', 'O', '6']
But this isn't perfect. For example, 'CO2' will be split into ['CO', '2'],
NOT ['C', 'O', '2'].
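One possible way around the 'CO2' problem is to match a single element symbol at a time instead of a whole run of letters. This is only a minimal sketch, and it assumes every symbol is written as one uppercase letter followed by optional lowercase letters; `ELEMENT_RE` and `split_formula` are names made up for this example.

```python
import re

# Assumption: a symbol is one uppercase letter plus optional lowercase letters,
# optionally followed by a count written in digits.
ELEMENT_RE = re.compile(r"([A-Z][a-z]*)(\d*)")

def split_formula(formula):
    """Return symbols and integer counts in the order they appear."""
    result = []
    for symbol, count in ELEMENT_RE.findall(formula):
        result.append(symbol)
        if count:  # a missing count is simply left out
            result.append(int(count))
    return result

print(split_formula("CO2"))      # ['C', 'O', 2]
print(split_formula("C6H12O6"))  # ['C', 6, 'H', 12, 'O', 6]
```

Note that this does not check whether a symbol is a real element, so something like 'Xx2' would still be accepted.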
Answer 1 (score: 1)
You can use itertools.groupby with str.isdigit() as the grouping criterion, combined with integer parsing and a list comprehension / generator expression, to get the output:
from itertools import groupby

def tryParseInt(x):
    """Tries to parse and return x as integer, on error returns x as is"""
    try:
        return int(x)
    except ValueError:  # non-numeric groups such as 'C' are returned unchanged
        return x

def split_groupby(text):
    """Splits a text at digit vs. character borders, returns list of characters
    and integers it detects. Uses str.isdigit to differentiate groups:
    'H2SeO4' -> ['H', 2, 'SeO', 4]"""
    groupings = groupby(text, str.isdigit)
    # return it as list or generator - I prefer generator
    # return [ tryParseInt(''.join(grp[1])) for grp in groupings ]
    yield from (tryParseInt(''.join(grp[1])) for grp in groupings)

text = "C6H12O6"
print(list(split_groupby(text)))
Output:
['C', 6, 'H', 12, 'O', 6]
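This works by splitting the string into groups where str.isdigit() == True and groups where str.isdigit() == False, and parsing each group as an integer where possible.
For this to work correctly, elements that occur only once need the count specifier as well: write 'C1H3C1H2O1H1' to get a correct split into "chemical" elements. Without it, the formula is split into ['CH', 3, 'CH', 2, 'OH'].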
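To make the grouping described above visible, this small sketch lists the (is_digit, text) pairs that groupby produces for a formula with and without explicit counts of 1. `show_groups` is a hypothetical helper name used only for this illustration.

```python
from itertools import groupby

def show_groups(text):
    """Return the (is_digit, joined_text) pair for every group groupby finds."""
    return [(is_digit, "".join(chars))
            for is_digit, chars in groupby(text, str.isdigit)]

# Without explicit counts, adjacent symbols end up in the same group.
print(show_groups("CH3CH2OH"))
# With a count after every symbol, each symbol gets its own group.
print(show_groups("C1H3C1H2O1H1"))
```

The first call shows 'CH' and 'OH' staying glued together, which is why the extra 1s (or a post-processing step) are needed.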
To separate "correctly" spelled elements from each other (e.g. 'H2SeO4'), you can post-process the result:
def split_elems(formula):
    """Takes a list and splits strings inside it into title()'d pieces.
    Returns a new list with each string replaced by its split strings:
    ['H', 2, 'SeO', 4] -> ['H', 2, 'Se', 'O', 4]"""
    # Build a new list instead of popping/inserting while iterating:
    # in-place edits shift the indices once more than one string is split.
    result = []
    for item in formula:
        if isinstance(item, str):
            pieces = []
            for c in item:
                if c.isupper() or not pieces:
                    pieces.append(c)   # an uppercase letter starts a new symbol
                else:
                    pieces[-1] += c    # lowercase letters extend the current symbol
            result.extend(pieces)
        else:
            result.append(item)
    return result

text = "H2SeO4"
print(list(split_groupby(text)))               # ['H', 2, 'SeO', 4]
print(split_elems(list(split_groupby(text))))  # ['H', 2, 'Se', 'O', 4]
You can also use regular expressions: a solution using re.split() can be found in the question "Split digit and text by regexp".
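For illustration, here is roughly what a re.split() based version could look like. This is a sketch written for this thread, not a copy of the solution in the linked question, and `split_digits` is a made-up name.

```python
import re

def split_digits(text):
    """Split text at digit runs, converting the digit runs to int."""
    # The capturing group makes re.split keep the digit runs in the result.
    parts = re.split(r"(\d+)", text)
    # re.split can leave empty strings at the ends; drop them.
    return [int(p) if p.isdigit() else p for p in parts if p]

print(split_digits("C6H12O6"))  # ['C', 6, 'H', 12, 'O', 6]
```

Like the groupby version, this keeps adjacent symbols together, so 'H2SeO4' comes out as ['H', 2, 'SeO', 4] and would still need the post-processing step.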
Answer 2 (score: 0)
Here is another solution using itertools.groupby and operator.itemgetter:
from itertools import groupby
from operator import itemgetter

def key_func(x):
    """Gives all digits one shared key and every other character its own key"""
    index, char = x
    # index - int(digit) only groups runs like '12'; a count such as '22' would
    # be split. -1 never collides with an index, so any digit run stays together.
    return -1 if char.isdigit() else index

def map_int(x):
    """Maps digit strings to integers"""
    return int(x) if x.isdigit() else x

def group_chemicals(x):
    """Groups chemicals using groupby"""
    return (
        "".join(map(itemgetter(1), g))
        for _, g in groupby(enumerate(x), key=key_func)
    )

s = "C6H12O6"
print(list(map(map_int, group_chemicals(s))))
# ['C', 6, 'H', 12, 'O', 6]
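None of the answers covers the validity check mentioned in the question, so here is a minimal sketch of how it might be bolted on. `KNOWN_SYMBOLS` is a hypothetical stand-in for the symbols stored in the asker's database (self.atoms in the question), and `unknown_symbols` is a made-up helper name; it assumes symbols are one uppercase letter plus optional lowercase letters.

```python
import re

# Hypothetical stand-in for the element symbols stored in the database.
KNOWN_SYMBOLS = {"H", "C", "O", "Se"}

def unknown_symbols(formula, known=KNOWN_SYMBOLS):
    """Return the symbols in formula that are not in the known set."""
    symbols = re.findall(r"[A-Z][a-z]*", formula)
    # Set membership is O(1), so no nested loop over all atoms is needed.
    return [s for s in symbols if s not in known]

print(unknown_symbols("C6H12O6"))  # []
print(unknown_symbols("Xx2O"))     # ['Xx']
```

An empty result means every symbol was recognised; building the set once from the database avoids the O(n^2) scan over self.atoms.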