Question

我试图操纵我的数据而且我遇到了一些问题，我猜你们中的一些人会知道怎么做。

首先，我安排我的数据，如dict的列表：

data = [{'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'top', 'n' : 1, 'result' : 2.5} , {'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'top', 'n' : 2, 'result' : 3.8}, {'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'top', 'n' : 3, 'result' : 2.7}, {'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'bottom', 'n' : 1, 'result' : 34.2} , {'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'bottom', 'n' : 2, 'result' : 38.6}, {'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'bottom', 'n' : 3, 'result' : 27.3}]

如您所见，更改值为方向，重复数字 n 和结果。

我尝试了这个新安排：

arrangeData = [{'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'top', n : [1,2,3], 'result' : [2.5, 3.8, 2.7]}, {'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'bottom', n : [1,2,3], 'result' : [34.2, 38.6, 27.3]}]

正如您可能猜到的，我的dict真实数据列表包含几个复合，时间，温度

我的第一个愚蠢的假设是遍历每个元素：

for d in data:
    if d[0] == 'molecule1':
        if d[1] == 18:
            if d[2] == 20
          ...

但是它的编码很难，完全没有效率。

然后，我尝试使用每个值的列表：

compound = ['molecule1', 'molecule2', 'molecule3]
time = [18, 24]
temp = [20, 37]
orientation = ['top', 'bottom']

并再次循环每个列表：

for d in data:
    for c in compound:
        for t in time: 
            for tp in temp:
                for o in orientation: 
                   if d[0] == c:
                   ...

愚蠢，因为所有数据都在我的dict列表中，所以引入值列表似乎是错误的方法。

以下是问题：

我应该使用其他格式来存储每个条件和结果而不是字典吗？
如何检查dict的值并创建一个新的数据字典（如上面提到的arrangeData）？

编辑1

感谢Hai Vu正是我所寻找的！

Answer 1

从您提供的arrangeData示例中，您似乎想要将变量 n 和结果分组以用于复合的组合，时间， temp 和方向。

我不打算为你编写代码，但解释我会怎么做。我会写两个循环。第一个创建一个字典，其中包含元组（复合，时间， temp 和方向）的关键字，以及值 n 和结果作为增长列表。然后在第二个循环中，我将该数据结构转换为arrangeData的dicts格式列表。

看起来这是更大的代码库的一部分，也许你可以分享更多的上下文。甚至可能有一个更简单的解决方案来实现您的目标。

Answer 2

由于您只能有两个不同的方向值，因此该代码不仅可以工作。

但是如果你有太多的变化，那么这不是一个很好的解决方案。我宁愿制作两个词典列表而不是两个列表列表。

n_list = [[],[]]
result_list = [[],[]]

for i in data:
    if i['orientation'] == 'top':
        n_list[0].append(i['n'])
        result_list[0].append(i['result'])
    elif i['orientation'] == 'bottom':
        n_list[1].append(i['n'])
        result_list[1].append(i['result'])


for i in data:
    if i['orientation'] == 'top':
        i['n'] = n_list[0]
        i['result'] = result_list[0]
    elif i['orientation'] == 'top':
        i['n'] = n_list[1]
        i['result'] = result_list[1]


print data

如果您愿意，可以使用更短的解决方案：

n_list = {}
result_list = {}

for i in data:
    n_list.setdefault(i['orientation'], []).append(i['n'])
    result_list.setdefault(i['orientation'], []).append(i['result'])

for i in data:
    i['n'] = n_list[i['orientation']]
    i['result'] = result_list[i['orientation']]

输出：

[{
    'orientation': 'top',
    'temp': 20,
    'compound': 'molecule1',
    'n': [1, 2, 3],
    'result': [2.5, 3.8, 2.7],
    'time': 18
}, {
    'orientation': 'top',
    'temp': 20,
    'compound': 'molecule1',
    'n': [1, 2, 3],
    'result': [2.5, 3.8, 2.7],
    'time': 18
}, {
    'orientation': 'top',
    'temp': 20,
    'compound': 'molecule1',
    'n': [1, 2, 3],
    'result': [2.5, 3.8, 2.7],
    'time': 18
}, {
    'orientation': 'bottom',
    'temp': 20,
    'compound': 'molecule1',
    'n': 1,
    'result': 34.2,
    'time': 18
}, {
    'orientation': 'bottom',
    'temp': 20,
    'compound': 'molecule1',
    'n': 2,
    'result': 38.6,
    'time': 18
}, {
    'orientation': 'bottom',
    'temp': 20,
    'compound': 'molecule1',
    'n': 3,
    'result': 27.3,
    'time': 18
}]

Answer 3

我假设对于这些数据行，您希望按（复合，时间，温度和方向）对它们进行分组。如果不是这种情况，您可以在下面对我的代码进行更改。

这个想法是创建一个临时字典（out），其键是（复合，时间，温度和方向）的值，值是你所期望的：

{('molecule1', 18, 20, 'bottom'): {'compound': 'molecule1',
                                   'n': [1, 2, 3],
                                   'orientation': 'bottom',
                                   'result': [34.2, 38.6, 27.3],
                                   'temp': 20,
                                   'time': 18},
 ('molecule1', 18, 20, 'top'): {'compound': 'molecule1',
                                'n': [1, 2, 3],
                                'orientation': 'top',
                                'result': [2.5, 3.8, 2.7],
                                'temp': 20,
                                'time': 18}}

以下是代码：

from pprint import pprint

data = [
    {'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'top', 'n' : 1, 'result' : 2.5} ,
    {'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'top', 'n' : 2, 'result' : 3.8},
    {'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'top', 'n' : 3, 'result' : 2.7},
    {'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'bottom', 'n' : 1, 'result' : 34.2} ,
    {'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'bottom', 'n' : 2, 'result' : 38.6},
    {'compound' : 'molecule1', 'time' : 18, 'temp' : 20, 'orientation' : 'bottom', 'n' : 3, 'result' : 27.3}
]

out = {}
for row in data:
    # Group the data by these columns that are the same
    key = (row['compound'], row['time'], row['temp'], row['orientation'])

    # This is the first time we encounter this row of data, copy most
    # values over and create empty lists for the 'n' and 'result'
    # column
    if key not in out:
        out[key] = row.copy()
        out[key]['n'] = []
        out[key]['result'] = []

    # Now we can append the 'n' and 'result' columns
    out[key]['n'].append(row['n'])
    out[key]['result'].append(row['result'])

# After we are done, we can obtain the arranged data
arrangeData = out.values()
pprint(arrangeData)

python new dict如果值匹配dict中的键值

3 个答案: