Question

I'm trying to compile monthly data into an existing JSON file that I loaded via import json. Initially, my JSON data has only one property, 'name':

json_data['features'][1]['properties']
>>{'name':'John'}

But the end result I want, with the monthly data included, looks like this:

json_data['features'][1]['properties']

>>{'name':'John',
'2016-01': {'x1':0, 'x2':0, 'x3':1, 'x4':0},
'2016-02': {'x1':1, 'x2':0, 'x3':1, 'x4':0}, ... }

My monthly data is in separate TSV files. They have this format:

John    0    0    1    0
Jane    1    1    1    0

So I loaded them via import csv, looped over a list of file names, and set about placing them in a collective dictionary, like so:

file_strings = ['2016-01.tsv', '2016-02.tsv', ... ]
collective_dict = {}
for i in file_strings:
    with open(i) as f:
        tsv_object = csv.reader(f, delimiter='\t')
        collective_dict[i[:-4]] = rows[0]:rows[1:5] for rows in tsv_object

I checked how things turned out by indexing into collective_dict:

collective_dict['2016-01']['John'][0]
>>'0'

Which is correct; it just needs to be converted to an integer.

For my next feat, I tried assigning all of the monthly data to the corresponding JSON member as part of its outer property:

for i in file_strings:
    for j in range(len(json_data['features'])):
        json_data['features'][j]['properties'][i[:-4]] = {}
        json_data['features'][j]['properties'][i[:-4]]['x1'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][0])
        json_data['features'][j]['properties'][i[:-4]]['x2'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][1])
        json_data['features'][j]['properties'][i[:-4]]['x3'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][2])
        json_data['features'][j]['properties'][i[:-4]]['x4'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][3])

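Here I get an arrow pointing at the last few characters:

SyntaxError: unexpected EOF while parsing

It's a fairly complicated slice, and I wouldn't rule out user error. However, I've double- and triple-checked it. I also looked up the error. It seems to come up in connection with input()-related calls. I'm a bit confused; I don't see where I went wrong (although I'm mentally prepared to accept that I did).

My only guess is that something somewhere isn't a string. When I inspect collective_dict and json_data, everything that should be a string is a string ('John', 'Jane', and so on). So I suppose it's something else.

I've made the problem as simple as I could while keeping the original structure of the data, the loops, and so on. I'm using Python 3.6.

Question

Why am I getting the EOF error? How can I build the outer property data without running into this kind of error?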

Answer 1

Here, I've rewritten your last code block as:

for i in file_strings:
    file_name = i[:-4]
    for j in range(len(json_data['features'])):
        name = json_data['features'][j]['properties']['name']
        file_dict = json_data['features'][j]['properties'][file_name] = {}
        for x in range(4):
            x_string = 'x{}'.format(x+1)
            file_dict[x_string] = int(collective_dict[file_name][name][x])

From:

for i in file_strings:
    for j in range(len(json_data['features'])):
        json_data['features'][j]['properties'][i[:-4]] = {}
        json_data['features'][j]['properties'][i[:-4]]['x1'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][0])
        json_data['features'][j]['properties'][i[:-4]]['x2'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][1])
        json_data['features'][j]['properties'][i[:-4]]['x3'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][2])
        json_data['features'][j]['properties'][i[:-4]]['x4'] = int(collective_dict[i[:-4]][json_data['features'][j]['properties']['name']][3])

This is only to make it more readable; it shouldn't change the behavior.
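To check that the shorter loop still produces the structure you're after, here is a small self-contained sketch. The sample json_data and collective_dict are made up to match the shapes in your question (Jane's 2016-02 row in particular is invented), and I iterate over the features directly and use enumerate instead of range, which is just a stylistic variation on the rewrite above.

```python
# Hypothetical sample data shaped like the examples in the question.
json_data = {
    'features': [
        {'properties': {'name': 'Jane'}},
        {'properties': {'name': 'John'}},
    ]
}
collective_dict = {
    '2016-01': {'John': ['0', '0', '1', '0'], 'Jane': ['1', '1', '1', '0']},
    '2016-02': {'John': ['1', '0', '1', '0'], 'Jane': ['0', '0', '0', '1']},
}
file_strings = ['2016-01.tsv', '2016-02.tsv']

for i in file_strings:
    file_name = i[:-4]  # strip the '.tsv' suffix
    for feature in json_data['features']:
        properties = feature['properties']
        values = collective_dict[file_name][properties['name']]
        # Build {'x1': ..., 'x2': ..., ...} from the row's string values.
        properties[file_name] = {
            'x{}'.format(n): int(value)
            for n, value in enumerate(values, start=1)
        }

print(json_data['features'][1]['properties'])
```

With this sample data, John's properties end up holding 'name' plus one nested dict per month, each with integer values under 'x1' through 'x4'.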

One thing I noticed in another part of your code is:

collective_dict[i[:-4]] = rows[0]:rows[1:5] for rows in tsv_object

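What I'm referring to is the = rows[0]:rows[1:5] for rows in tsv_object part. In my IDE this doesn't work, and I'm not sure whether it's a typo in your question or actually in your code, but I think you want it to be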

collective_dict[i[:-4]] = {rows[0]:rows[1:5] for rows in tsv_object}

or something similar. I'm not sure whether this is what makes the parser think there's an error at the end of the file.
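One way to narrow this down, as a rough sketch: pass each suspicious line to the built-in compile(), which parses the text without running it. The compiles helper and the unclosed example line are my own illustrations, not taken from your code. An unclosed bracket is one common way to get a SyntaxError reported at the end of the input, and the exact message depends on the Python version, so I only print it and don't rely on its wording.

```python
def compiles(source):
    """Return True if `source` parses as Python, False on SyntaxError."""
    try:
        # compile() only parses; undefined names are not a problem here.
        compile(source, '<snippet>', 'exec')
        return True
    except SyntaxError as e:
        print('{!r}: {}'.format(source, e.msg))
        return False

# The comprehension without braces, as written in the question.
unbraced = "d = rows[0]:rows[1:5] for rows in tsv_object"
# The same comprehension wrapped in braces (a dict comprehension).
braced = "d = {rows[0]:rows[1:5] for rows in tsv_object}"
# A hypothetical line whose int( call is never closed.
unclosed = "x = int(collective_dict['2016-01']['John'][0]"

unbraced_ok = compiles(unbraced)
braced_ok = compiles(braced)
unclosed_ok = compiles(unclosed)
```

Only the braced version parses. If one of your real lines behaves like the unclosed example, counting the opening and closing brackets on that line (and the line before it) is probably the quickest check.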

ValueError: invalid literal for int()

If your TSV data is

John    0    0    1    0
Jane    1    1    1    0

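then calling int() on the string values should be fine. For example, int('42') returns an int with the value 42. However, if one or more lines in your files contain an error, use a code block like this to find out which file and line it is: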

import csv

file_strings = ['2016-01.tsv', '2016-02.tsv', ... ]
collective_dict = {}
for file_name in file_strings:
    print('Reading {}'.format(file_name))
    with open(file_name) as f:
        tsv_object = csv.reader(f, delimiter='\t')
        # Number lines from 1 so the messages match what an editor shows.
        for line_no, (name, *x_values) in enumerate(tsv_object, start=1):
            if len(x_values) != 4:
                print('Line {}: expected 4 values, found {}'.format(line_no, len(x_values)))
            try:
                intx = [int(x) for x in x_values]
            except ValueError as e:
                # Catch "invalid literal for int()"
                print('Line {}: {}'.format(line_no, e))
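
The check above only reports problems. A possible next step, sketched here under my own assumptions, is to keep the rows that convert cleanly so that collective_dict already holds integers. The load_month helper and its expected parameter are hypothetical names, and the in-memory sample (including the malformed 'Joe' row) stands in for a real file such as '2016-01.tsv'.

```python
import csv
import io

def load_month(f, expected=4):
    """Parse one TSV file object into {name: [int, ...]}, skipping bad rows."""
    month = {}
    for line_no, row in enumerate(csv.reader(f, delimiter='\t'), start=1):
        if not row:
            continue  # ignore blank lines
        name, *x_values = row
        if len(x_values) != expected:
            print('Line {}: expected {} values, found {}'.format(
                line_no, expected, len(x_values)))
            continue
        try:
            month[name] = [int(x) for x in x_values]
        except ValueError as e:
            # Report the bad row and leave it out of the result.
            print('Line {}: {}'.format(line_no, e))
    return month

# Hypothetical stand-in for '2016-01.tsv'; the last row is malformed.
sample = io.StringIO('John\t0\t0\t1\t0\nJane\t1\t1\t1\t0\nJoe\t1\tx\t0\t0\n')
collective_dict = {'2016-01': load_month(sample)}
```

With real files you would call load_month inside the with open(file_name) block and store the result under file_name[:-4], after which the int() calls in the merge loop are no longer needed.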
