Question

I'm using a parser that reads from a text file and returns dicts like the following:

{'m/z array': array([  345.1,   370.2,   460.2,  1673.3,  1674. ,  1675.3]),
'charge array': array([ 3,  2,  1,  1,  1,  1]),
'params': {'username': 'Lou Scene', 'useremail': 'leu@altered-state.edu',
'mods': 'Carbamidomethyl (C)', 'itolu': 'Da', 'title': 'Spectrum 2',
'rtinseconds': '25', 'itol': '1', 'charge': '2+ and 3+',
'mass': 'Monoisotopic', 'it_mods': 'Oxidation (M)',
'pepmass': (1084.9, 1234.0),
'com': 'Based on http://www.matrixscience.com/help/data_file_help.html',
'scans': '3'},
'intensity array': array([  237.,   128.,   108.,  1007.,   974.,    79.])}

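I'm trying to read the whole file (all of the dicts) and store them in a single object that I can pass to a second function, so the script doesn't have to read from the file every time (which is very slow). I want to keep the original structure of the data while passing it around so that it stays easy to access. What is the best way to do this?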

I tried the following code:

print ('enter mgf file name')
mgf_file = str(raw_input())
from pyteomics import mgf
reader = []
with mgf.read(mgf_file) as temp_read:
    for things in temp_read:
        reader.update(things)


compo_reader(reader)

Answer 1

Put them in a list and pass the list around.

Since you didn't show us your code, I can't show you how to change it, but I can show you some made-up code.

Suppose you have a function parser(f) that reads a line from f and returns one of the dicts you showed us, or None when it's done. Then:

with open(filename, 'rb') as f:
    things = []
    while True:
        thing = parser(f)
        if not thing:
            break
        things.append(thing)

Or, more compactly:

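from functools import partial

with open(filename, 'rb') as f:
    # iter(callable, sentinel) keeps calling parser(f) until it returns None
    things = list(iter(partial(parser, f), None))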
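The two-argument form of iter calls the given callable repeatedly and stops as soon as it returns the sentinel value. Here is a minimal, self-contained sketch of that idea. The parser function below is a hypothetical stand-in (it is not part of pyteomics or any other library) that reads one key=value line and returns a dict, or None at end of input, and an in-memory io.StringIO replaces the real file:

```python
import io
from functools import partial


def parser(f):
    """Hypothetical parser: read one 'key=value' line from f.

    Returns a one-item dict, or None once the input is exhausted.
    """
    line = f.readline()
    if not line:
        return None
    key, _, value = line.strip().partition('=')
    return {key: value}


# In-memory stand-in for an open file (assumption for the demo).
f = io.StringIO(u"title=Spectrum 1\ntitle=Spectrum 2\n")

# partial(parser, f) is a zero-argument callable; iter() calls it
# until it returns the sentinel None, and list() collects the results.
things = list(iter(partial(parser, f), None))
print(things)
```

One difference from the explicit loop: the loop stops on any falsy value (such as an empty dict), while iter stops only when the return value equals None.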

If you're using a parser that is already an iterable, such as csv.DictReader, it's even simpler:

import csv

with open(filename, 'rb') as f:
    reader = csv.DictReader(f)
    things = list(reader)

However you do it, once you have a list of these dicts, you can pass that list around, iterate over it, and so on.

For your specific code, it looks like the object returned by mgf.read() is an iterator over dicts, just like csv.DictReader, so it should just be:

with mgf.read(mgf_file) as temp_read:
    reader = list(temp_read)

If that's not the case, you want to do this:

reader = []
with mgf.read(mgf_file) as temp_read:
    for thing in temp_read:
        reader.append(thing)

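In other words, instead of repeatedly calling update with each new dict (a list has no update method, and on a dict it would overwrite the previous entry's keys), just append each one to the list.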
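To illustrate the "read once, pass the list around" idea without needing an MGF file, here is a small sketch. Everything in it is an assumption made for the demo: fake_reader stands in for iterating over mgf.read(...), the two helper functions are hypothetical consumers, and plain Python lists replace the NumPy arrays shown in the question.

```python
def fake_reader():
    """Stand-in for iterating over mgf.read(...): yields one dict per spectrum."""
    yield {'m/z array': [345.1, 370.2, 460.2],
           'params': {'title': 'Spectrum 1'}}
    yield {'m/z array': [1673.3, 1674.0],
           'params': {'title': 'Spectrum 2'}}


def titles(spectra):
    """Hypothetical consumer: collect the title of every spectrum."""
    return [s['params']['title'] for s in spectra]


def peak_counts(spectra):
    """Hypothetical consumer: count the m/z peaks in every spectrum."""
    return [len(s['m/z array']) for s in spectra]


# Read the source exactly once and keep every dict, structure intact.
spectra = list(fake_reader())

# The same list can now be passed to any number of functions
# without going back to the (slow) source.
all_titles = titles(spectra)
counts = peak_counts(spectra)
print(all_titles, counts)
```

Each element of the list is still the original dict, so nested lookups like spectra[0]['params']['title'] work exactly as they do on the parser's output.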
