['10', '0', '1915', '387', '1933', '402']
['10', '0', '3350', '387', '3407', '391']
['10', '0', '842', '505', '863', '521']
['2', '29', '2986', '282', '3112', '300']
['2', '29', '2753', '286', '2809', '297']
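My data is a file full of strings like these. The first two elements, here 10 and 0, identify the sample; for example, 10-1 would be a different sample.
What I want is a dictionary in which those two elements, 10 and 0 in this case, form the key in the format 10-0, and the value for 10-0 is the list of lists shown below: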
10-0 = [
[1915, 387, 1933, 402],
[3350, 387, 3407, 391],
[842, 505, 863, 521],
]
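The same goes for 2-29, which would be another element of the dictionary, containing 2 lists. I looked at https://docs.python.org/3/tutorial/datastructures.html, but what I am trying to do is more complicated than what that documentation covers.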
Answer 0 (score: 1)
Consider:
from collections import defaultdict
from pprint import pprint
str_map = str.maketrans("", "", " []'\n")  # Eliminate characters ' ', '[', ']', ''' and '\n'.
my_complicated_data = []
with open("path/to/my_complicated_file.txt", "r") as my_complicated_file:
    for line in my_complicated_file:
        line = line.translate(str_map)
        line = line.split(",")
        my_complicated_data.append(line)
my_dict = defaultdict(list)
for row in my_complicated_data:
    my_dict["-".join(row[:2])].append(row[2:])
pprint(my_dict)
Output:
defaultdict(<class 'list'>,
{'10-0': [['1915', '387', '1933', '402'],
['3350', '387', '3407', '391'],
['842', '505', '863', '521']],
'2-29': [['2986', '282', '3112', '300'],
['2753', '286', '2809', '297']]})
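defaultdict is a dictionary whose default values are produced by calling the function passed when it is created. For example, if you create d = defaultdict(int), then d[5] evaluates to 0. If you use list, the default value is an empty list []. For a more complex example, if you write d = defaultdict(lambda: [0, 0]), the default value is a list of length 2 containing zeros.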
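A quick sketch of those three default factories in action (the variable names here are only illustrative):

```python
from collections import defaultdict

# int() returns 0, so a missing key starts out as 0.
counts = defaultdict(int)
print(counts[5])

# list() returns [], so we can append without creating the key first.
groups = defaultdict(list)
groups["10-0"].append([1915, 387, 1933, 402])
print(groups["10-0"])

# Any zero-argument callable works; each missing key gets its own new list.
pairs = defaultdict(lambda: [0, 0])
pairs["a"][1] += 3
print(pairs["a"])
print(pairs["b"])
```

Note that merely reading a missing key, as in counts[5], also inserts that key into the dictionary.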
Answer 1 (score: 0)
Try this:
t.txt
['10', '0', '1915', '387', '1933', '402']
['10', '0', '3350', '387', '3407', '391']
['10', '0', '842', '505', '863', '521']
['2', '29', '2986', '282', '3112', '300']
['2', '29', '2753', '286', '2809', '297']
Program:
l = []
with open('t.txt', 'r') as file:
    for line in file:
        # eval() executes whatever the line contains; only use it on trusted files
        l += [eval(line)]
d = {}
for i in l:
    dkey = str(i[0]) + "-" + str(i[1])
    dValue = i[2:]
    if dkey in d:
        cList = []
        for item in d[dkey]:
            cList.append(item)
        cList.append(dValue)
        d[dkey] = cList
    else:
        d[dkey] = []
        d[dkey].append(dValue)
print(d)
Output:
{'10-0': [['1915', '387', '1933', '402'], ['3350', '387', '3407', '391'], ['842', '505', '863', '521']], '2-29': [['2986', '282', '3112', '300'], ['2753', '286', '2809', '297']]}
Answer 2 (score: 0)
You can use something like the following (read up on eval() before using it):
import ast
import itertools
final_ = []
with open('lifeu', 'r') as f:
    for line in f:
        final_.append(ast.literal_eval(line))
final__ = {}
for j, i in itertools.groupby(sorted(final_), lambda x: (x[0], x[1])):
    final__[j] = list(map(lambda x: x[2:], list(i)))
print(final__)
Output:
{('10', '0'): [['1915', '387', '1933', '402'], ['3350', '387', '3407', '391'], ['842', '505', '863', '521']], ('2', '29'): [['2753', '286', '2809', '297'], ['2986', '282', '3112', '300']]}
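itertools.groupby only merges consecutive rows that share a key, which is why the rows are sorted first. The sketch below uses a few inline rows instead of a file to show what happens without sorting, and also joins the tuple key into the '10-0' style name from the question. The rows list and the key_of and group helpers are made up for illustration.

```python
import itertools

rows = [
    ['10', '0', '1915', '387', '1933', '402'],
    ['2', '29', '2986', '282', '3112', '300'],
    ['10', '0', '842', '505', '863', '521'],
]

def key_of(row):
    # The first two fields identify the sample.
    return (row[0], row[1])

def group(rows):
    """Hypothetical helper: group rows by their first two fields with groupby."""
    result = {}
    for key, members in itertools.groupby(rows, key_of):
        # A later run with the same key overwrites the earlier one.
        result["-".join(key)] = [row[2:] for row in members]
    return result

unsorted_result = group(rows)
sorted_result = group(sorted(rows, key=key_of))
print(unsorted_result)  # '10-0' keeps only its last run of rows
print(sorted_result)    # '10-0' holds both of its rows
```

Sorting with key=key_of keeps rows of the same sample in their original file order, because Python's sort is stable; sorting the whole rows, as above, also reorders them within each sample.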
Answer 3 (score: -1)
Is this what you want to do? (Edited)
def get_input():
    with open('input.txt', 'r') as file_input:
        lines = file_input.read().split('\n')
    result = []
    for line in lines:
        # check whether the line is not empty
        if len(line) > 0:
            # for a personal script it's ok, but be careful with this
            result += [eval(line)]
    return result
def format_input(lists_input):
    result = {}
    for data in lists_input:
        if len(data) > 1:
            key = data[0] + '-' + data[1]
            if key not in result:
                result[key] = []
            result[key] += [data[2:]]
    return result
lists_input = get_input()
print(format_input(lists_input))
Answer 4 (score: -1)
You can do something like this:
data = [
['10', '0', '1915', '387', '1933', '402'],
['10', '0', '3350', '387', '3407', '391'],
['10', '0', '842', '505', '863', '521'],
['2', '29', '2986', '282', '3112', '300'],
['2', '29', '2753', '286', '2809', '297'],
]
output = dict()
for d in data:
    key = str(d[0]) + "-" + str(d[1])
    if key not in output:
        output[key] = list()
    output[key].append(d[2:])
print(output)
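The lists in the question hold integers, whereas the approaches above keep the values as strings. If integers are wanted, one possible tweak is to convert the remaining fields with int() while grouping. This is a sketch on two sample rows, assuming every field after the first two is a valid integer:

```python
from collections import defaultdict

data = [
    ['10', '0', '1915', '387', '1933', '402'],
    ['2', '29', '2986', '282', '3112', '300'],
]

output = defaultdict(list)
for d in data:
    key = "-".join(d[:2])
    # int() raises ValueError on a non-numeric field (assumed not to occur here).
    output[key].append([int(value) for value in d[2:]])

print(dict(output))
```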