Question

['10', '0', '1915', '387', '1933', '402']
['10', '0', '3350', '387', '3407', '391']
['10', '0', '842', '505', '863', '521']
['2', '29', '2986', '282', '3112', '300']
['2', '29', '2753', '286', '2809', '297']

My data is a file full of these strings. The first 2 elements, here 10 and 0, identify the sample; for example, 10-1 would be a different sample.

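What I want is a dictionary in which these two elements, in this case 10 and 0, form the key in the format 10-0, and the value stored under 10-0 is the list of lists shown below: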

10-0 = [
    [1915, 387, 1933, 402],
    [3350, 387, 3407, 391],
    [842, 505, 863, 521],
 ]

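The same goes for 2-29: it would be another element of the dictionary, containing 2 lists. I have looked at https://docs.python.org/3/tutorial/datastructures.html but what I am trying to do is more complicated than what that documentation covers.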

Answer 1

Consider:

from collections import defaultdict
from pprint import pprint

str_map = str.maketrans("","", " []'\n")  # Eliminate characters ' ', '[',  ']', ''' and '\n'.
my_complicated_data = []
with open("path/to/my_complicated_file.txt", "r") as my_complicated_file:
    for line in my_complicated_file:
        line = line.translate(str_map)
        line = line.split(",")
        my_complicated_data.append(line)

my_dict = defaultdict(list)
for row in my_complicated_data:
    my_dict["-".join(row[:2])].append(row[2:])
pprint(my_dict)

Output:

defaultdict(<class 'list'>,
        {'10-0': [['1915', '387', '1933', '402'],
                  ['3350', '387', '3407', '391'],
                  ['842', '505', '863', '521']],
         '2-29': [['2986', '282', '3112', '300'],
                  ['2753', '286', '2809', '297']]})

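defaultdict is a dictionary whose default values are produced by calling the function passed when it is created. For example, if you create d = defaultdict(int), d[5] will return 0. If you use list, the default value is an empty list []. For a more complex example, if you write d = defaultdict(lambda: [0, 0]), the default value will be a list of length 2 containing zeros.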
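The default-factory behaviour described above can be seen in a small sketch. The variable names (counts, groups, pairs) are made up for illustration and are not part of the answer's code:

```python
from collections import defaultdict

counts = defaultdict(int)            # missing keys start at int() == 0
groups = defaultdict(list)           # missing keys start at list() == []
pairs = defaultdict(lambda: [0, 0])  # missing keys start at a fresh [0, 0]

counts["a"] += 1                                        # no KeyError
groups["10-0"].append(["1915", "387", "1933", "402"])   # list is created on first access
pairs["x"][1] += 5

print(dict(counts))
print(dict(groups))
print(dict(pairs))
```

Note that merely reading a missing key (for example counts["b"]) also inserts it with the default value, which is usually harmless but worth knowing.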

Answer 2

Try this:

t.txt

['10', '0', '1915', '387', '1933', '402']
['10', '0', '3350', '387', '3407', '391']
['10', '0', '842', '505', '863', '521']
['2', '29', '2986', '282', '3112', '300']
['2', '29', '2753', '286', '2809', '297']

Program:

# Read each line and evaluate it as a Python list literal.
# Note: eval() executes arbitrary code, so only use it on trusted input.
with open('t.txt', 'r') as file:
    l = [eval(line) for line in file]

d = {}

for i in l:
    # Key is "<first>-<second>", value is the rest of the row.
    dkey = str(i[0]) + "-" + str(i[1])
    dValue = i[2:]
    if dkey in d:
        d[dkey].append(dValue)
    else:
        d[dkey] = [dValue]
print(d)

Output:

{'10-0': [['1915', '387', '1933', '402'], ['3350', '387', '3407', '391'], ['842', '505', '863', '521']], '2-29': [['2986', '282', '3112', '300'], ['2753', '286', '2809', '297']]}

See it in action here

Answer 3

You can use something like this (before using eval(), read this):

import ast
import itertools
final_=[]
with open('lifeu','r') as f:
    for line in f:
        final_.append(ast.literal_eval(line))


final__={}
for j,i in itertools.groupby(sorted(final_),lambda x:(x[0],x[1])):
    final__[j]=list(map(lambda x:x[2:],list(i)))

print(final__)

Output:

{('10', '0'): [['1915', '387', '1933', '402'], ['3350', '387', '3407', '391'], ['842', '505', '863', '521']], ('2', '29'): [['2753', '286', '2809', '297'], ['2986', '282', '3112', '300']]}
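Two things differ from what the question asked for in the output above: the keys are tuples rather than strings such as 10-0, and the two 2-29 rows come out in a different order than in the file, because sorted() compares the string elements lexicographically. A minimal sketch of a variant follows, assuming rows for the same sample are always adjacent in the file (an assumption about the data, since groupby only merges consecutive rows with equal keys). The rows list stands in for the parsed file contents:

```python
import itertools

# Stand-in for the rows parsed from the file (same data as the question).
rows = [
    ['10', '0', '1915', '387', '1933', '402'],
    ['10', '0', '3350', '387', '3407', '391'],
    ['10', '0', '842', '505', '863', '521'],
    ['2', '29', '2986', '282', '3112', '300'],
    ['2', '29', '2753', '286', '2809', '297'],
]

grouped = {}
# No sorted() here: groupby merges consecutive rows sharing a key,
# so file order is preserved as long as each sample's rows are adjacent.
for key, group in itertools.groupby(rows, key=lambda row: (row[0], row[1])):
    grouped["-".join(key)] = [row[2:] for row in group]

print(grouped)
```

If rows for one sample can be scattered through the file, sorting first (as in the answer) or a plain dictionary loop is the safer choice.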

Answer 4

Is this what you want to do? (edited)

def get_input():
    with open('input.txt', 'r') as file_input:
        lines = file_input.read().split('\n')
    result = []
    for line in lines:
        # check whether the line is not empty
        if len(line) > 0:
            # for a personal script it's ok, but be careful with this
            result += [eval(line)]
    return result


def format_input(lists_input):
    result = {}
    for data in lists_input:
        if len(data) > 1:
            key = data[0] + '-' + data[1]
            if key not in result:
                result[key] = []
            result[key] += [data[2:]]
    return result


lists_input = get_input()
print(format_input(lists_input))
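The caution about eval in the comment above can be illustrated with ast.literal_eval from the standard library, which accepts only Python literals and refuses anything else. This is a sketch, not part of the answer; the hostile string is a made-up example:

```python
import ast

# A well-formed line from the data file parses into a plain list of strings.
parsed = ast.literal_eval("['10', '0', '1915', '387', '1933', '402']")

# A line containing a function call is rejected instead of being executed.
hostile = "__import__('os').getcwd()"  # hypothetical malicious input
try:
    ast.literal_eval(hostile)
    rejected = False
except (ValueError, SyntaxError):
    rejected = True

print(parsed)
print(rejected)
```

Swapping eval(line) for ast.literal_eval(line) in get_input would keep the same result for this data format while avoiding code execution.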

Answer 5

You can do something like this:

data = [
        ['10', '0', '1915', '387', '1933', '402'],
        ['10', '0', '3350', '387', '3407', '391'],
        ['10', '0', '842', '505', '863', '521'],
        ['2', '29', '2986', '282', '3112', '300'],
        ['2', '29', '2753', '286', '2809', '297'],
        ]


output = dict()

for d in data:
  key = str(d[0]) + "-" + str(d[1])

  if key not in output:
    output[key] = list()

  output[key].append(d[2:])

print(output)
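The question's desired result holds integers (for example [1915, 387, 1933, 402]), while the loop above keeps the values as strings. A compact sketch of the same loop using dict.setdefault with an int conversion added, assuming every value after the first two elements is a valid integer string:

```python
data = [
    ['10', '0', '1915', '387', '1933', '402'],
    ['10', '0', '3350', '387', '3407', '391'],
    ['10', '0', '842', '505', '863', '521'],
    ['2', '29', '2986', '282', '3112', '300'],
    ['2', '29', '2753', '286', '2809', '297'],
]

output = {}
for d in data:
    key = f"{d[0]}-{d[1]}"
    # setdefault returns the existing list for key, or stores and returns a new one.
    # int() raises ValueError if a value is not an integer string.
    output.setdefault(key, []).append([int(x) for x in d[2:]])

print(output)
```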

How do I correctly list my data?

5 answers: