How do I properly list my data?

Time: 2018-01-26 07:58:59

Tags: python python-3.x python-2.7 list dictionary

['10', '0', '1915', '387', '1933', '402']
['10', '0', '3350', '387', '3407', '391']
['10', '0', '842', '505', '863', '521']
['2', '29', '2986', '282', '3112', '300']
['2', '29', '2753', '286', '2809', '297']

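My data is a file full of lines like these. The first two elements, 10 and 0 in the first line, identify the sample; 10-1, for example, would be a different sample.

What I want is a dictionary in which those two elements (10 and 0 in this example) form the key, in the format 10-0, and the value stored under 10-0 is the list of lists shown below: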

10-0 = [
    [1915, 387, 1933, 402],
    [3350, 387, 3407, 391],
    [842, 505, 863, 521],
 ]

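The same goes for 2-29: it would be another element of the dictionary, containing 2 lists. I looked at https://docs.python.org/3/tutorial/datastructures.html but what I am trying to do is much more complicated than what that documentation covers.

5 answers:

Answer 0: (score: 1)

Consider: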

from collections import defaultdict
from pprint import pprint

str_map = str.maketrans("","", " []'\n")  # Eliminate characters ' ', '[',  ']', ''' and '\n'.
my_complicated_data = []
with open("path/to/my_complicated_file.txt", "r") as my_complicated_file:
    for line in my_complicated_file:
        line = line.translate(str_map)
        line = line.split(",")
        my_complicated_data.append(line)

my_dict = defaultdict(list)
for row in my_complicated_data:
    my_dict["-".join(row[:2])].append(row[2:])
pprint(my_dict)

Output:

defaultdict(<class 'list'>,
        {'10-0': [['1915', '387', '1933', '402'],
                  ['3350', '387', '3407', '391'],
                  ['842', '505', '863', '521']],
         '2-29': [['2986', '282', '3112', '300'],
                  ['2753', '286', '2809', '297']]})

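defaultdict is a dictionary whose default value is produced by calling the function passed in when it is created. For example, if you create d = defaultdict(int), then d[5] will output 0. If you use list, the default value is an empty list []. For a more complex example, if you write d = defaultdict(lambda: [0, 0]), the default value will be a list of length 2 containing zeros.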
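The three factories mentioned above can be tried side by side. This is only a minimal sketch; the variable names `counts`, `groups`, and `pairs` are made up for illustration and are not part of the answer's code.

```python
from collections import defaultdict

# int() returns 0, so a missing key starts at 0
counts = defaultdict(int)
counts["a"] += 1

# list() returns [], so a missing key starts as an empty list
groups = defaultdict(list)
groups["10-0"].append([1915, 387, 1933, 402])

# Any zero-argument callable works; each missing key gets its own fresh [0, 0]
pairs = defaultdict(lambda: [0, 0])
pairs["x"][0] += 5

print(counts["a"], counts["missing"])  # 1 0
print(groups["10-0"])                  # [[1915, 387, 1933, 402]]
print(pairs["x"], pairs["y"])          # [5, 0] [0, 0]
```

Note that merely reading a missing key (such as `counts["missing"]`) inserts it into the dictionary with its default value.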

Answer 1: (score: 0)

Try this:

t.txt

['10', '0', '1915', '387', '1933', '402']
['10', '0', '3350', '387', '3407', '391']
['10', '0', '842', '505', '863', '521']
['2', '29', '2986', '282', '3112', '300']
['2', '29', '2753', '286', '2809', '297']

Program:

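# Read every line of the file and turn it into a list.
# Note: eval() executes arbitrary code, so only use it on files you trust.
l = []
with open('t.txt', 'r') as file:
    for line in file:
        l.append(eval(line))

d = {}

for i in l:
    dkey = str(i[0]) + "-" + str(i[1])
    dValue = i[2:]
    if dkey in d:
        # Key already seen: add this row to its existing list
        d[dkey].append(dValue)
    else:
        # First row for this key: start a new list
        d[dkey] = [dValue]
print(d)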

Output:

{'10-0': [['1915', '387', '1933', '402'], ['3350', '387', '3407', '391'], ['842', '505', '863', '521']], '2-29': [['2986', '282', '3112', '300'], ['2753', '286', '2809', '297']]}

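Answer 2: (score: 0)

You can use something like this. It relies on ast.literal_eval rather than eval(), which is worth reading up on before you use it: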

import ast
import itertools
final_=[]
with open('lifeu','r') as f:
    for line in f:
        final_.append(ast.literal_eval(line))


final__={}
for j,i in itertools.groupby(sorted(final_),lambda x:(x[0],x[1])):
    final__[j]=list(map(lambda x:x[2:],list(i)))

print(final__)

Output:

{('10', '0'): [['1915', '387', '1933', '402'], ['3350', '387', '3407', '391'], ['842', '505', '863', '521']], ('2', '29'): [['2753', '286', '2809', '297'], ['2986', '282', '3112', '300']]}
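To illustrate why ast.literal_eval is the safer choice here, the following minimal sketch parses one line in the file's format and then feeds it an expression that is not a plain literal. The sample strings and the `rejected` flag are assumptions made up for this demonstration.

```python
import ast

# A line in the same format as the input file
line = "['10', '0', '1915', '387', '1933', '402']\n"
row = ast.literal_eval(line.strip())
print(row)

# literal_eval only accepts Python literals (strings, numbers, lists, ...).
# A function call is refused with ValueError instead of being executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

With eval(), the second string would have been executed as code, which is the risk when the file's contents are not fully trusted.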

Answer 3: (score: -1)

Is this what you want to do? (Edited)

def get_input():
   with open('input.txt','r') as file_input:
      lines = file_input.read().split('\n')

   result = []
   for line in lines:
      # check whether the line is not empty
      if len(line)>0:
         # for a personal script it's ok, but be careful with this
         result += [ eval(line) ]

   return result

def format_input(lists_input):
   result = {}

   for data in lists_input:
      if len(data)>1:
         key = data[0]+'-'+data[1]
         if key not in result:
            result[ key ] = []
         result[ key ] += [ data[2:] ]

   return result

lists_input = get_input()
print(format_input(lists_input))

Answer 4: (score: -1)

You can do something like this:

data = [
        ['10', '0', '1915', '387', '1933', '402'],
        ['10', '0', '3350', '387', '3407', '391'],
        ['10', '0', '842', '505', '863', '521'],
        ['2', '29', '2986', '282', '3112', '300'],
        ['2', '29', '2753', '286', '2809', '297'],
        ]


output = dict()

for d in data:
  key = str(d[0]) + "-" + str(d[1])

  if key not in output:
    output[key] = list()

  output[key].append(d[2:])

print(output)
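The target structure in the question shows integers rather than strings. A minimal sketch of that variant follows, assuming every value after the first two elements parses as a base-10 integer. It uses dict.setdefault in place of the explicit membership check, which is a swapped-in alternative rather than this answer's own approach.

```python
data = [
    ['10', '0', '1915', '387', '1933', '402'],
    ['10', '0', '3350', '387', '3407', '391'],
    ['10', '0', '842', '505', '863', '521'],
    ['2', '29', '2986', '282', '3112', '300'],
    ['2', '29', '2753', '286', '2809', '297'],
]

output = {}
for row in data:
    key = "-".join(row[:2])
    # setdefault returns the list already stored under key,
    # or stores a new empty list and returns that
    output.setdefault(key, []).append([int(v) for v in row[2:]])

print(output["10-0"])
print(output["2-29"])
```

If a row could contain a non-numeric value, int() would raise ValueError, so real input may need validation first.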