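Afternoon all,
I'm seeing some strange behavior in the for loop marked with asterisks below. The function iterates over the input dictionary patient_features and concatenates several strings to build SVMLight-style vectors. Those vectors are then meant to be written to the deliverable files. For some reason, though, the writes at the end of the function are executed on every iteration of the asterisked for loop, which produces a huge file (along with a few other, smaller problems). Any help with what might be causing this would be greatly appreciated.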
def save_svmlight(patient_features, mortality, op_file, op_deliverable):
    deliverable1 = open(op_file, 'wb') # feature without patient id
    deliverable2 = open(op_deliverable, 'wb') # features with patient id
    d1_line = ''
    d2_line = ''
    count = 0 # VALUE TO TEST IF INCREMENTING
    print count
    for patient_id in patient_features: #**********
        value_tuple_list = patient_features[patient_id]
        value_tuple_list.sort()
        d2_line += str(int(patient_id)) + ' '
        if patient_id in mortality:
            d1_line += str(1) + ' '
            d2_line += str(1) + ' '
        else:
            d1_line += str(0) + ' '
            d2_line += str(0) + ' '
        for value_tuple in value_tuple_list:
            d1_line += str(int(value_tuple[0])) + ":" + str("{:1.6f}".format(value_tuple[1])) + ' '
            d2_line += str(int(value_tuple[0])) + ":" + str("{:1.6f}".format(value_tuple[1])) + ' '
    count += 1
    print count # VALUE INCREMENTS WHEN IT SHOULD NOT
    deliverable1.write(d1_line); # <- BEING WRITTEN TO EACH LOOP :(
    deliverable2.write(d2_line); # <- BEING WRITTEN TO EACH LOOP :(
Answer 0 (score: 0)
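The problem is that the code mixes tabs and spaces for indentation. Python is not a fan when you mix the two :/ Python 2 treats a tab as advancing to the next multiple of eight columns, so lines that look aligned in your editor can end up in different blocks, which would explain why the final writes run inside the for loop.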
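If you want to confirm the diagnosis before changing anything, a small script can report which lines are indented with tabs and which with spaces. The following is only a sketch written for Python 3; the helper name classify_indentation and the sample snippet are made up for illustration and are not part of the code in the question.

```python
def classify_indentation(source):
    """Map each indented line number (1-based) to 'tabs', 'spaces', or 'mixed'."""
    kinds = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.lstrip(" \t")
        indent = line[:len(line) - len(stripped)]
        if not indent or not stripped:
            continue  # skip unindented lines and whitespace-only lines
        has_tab = "\t" in indent
        has_space = " " in indent
        if has_tab and has_space:
            kinds[lineno] = "mixed"
        elif has_tab:
            kinds[lineno] = "tabs"
        else:
            kinds[lineno] = "spaces"
    return kinds


# Hypothetical sample: line 2 is indented with eight spaces, line 3 with one tab.
sample = (
    "for x in items:\n"
    "        total += x\n"
    "\tprint(total)\n"
)
report = classify_indentation(sample)
uses_both = len(set(report.values())) > 1  # True when the styles are not uniform
```

On Python 2 you can also run the script with the -tt flag, which turns inconsistent tab usage into an error, and the standard library's tabnanny module (python -m tabnanny yourfile.py) performs a similar check.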
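For future reference, if you use Sublime Text, select all of the text and go to View > Indentation > Convert Tabs to Spaces in the menu at the top, and the problem will be fixed.
Thought I'd save you from having to hunt down and replace every tab by hand :)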
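If you are not using Sublime Text, the same conversion can be done with a few lines of Python. This is a rough sketch for Python 3, not a polished tool: the function name tabs_to_spaces is hypothetical, and tabsize=4 is an assumption that has to match the tab width your editor was displaying, otherwise the converted blocks may not line up the way you saw them.

```python
def tabs_to_spaces(source, tabsize=4):
    """Expand tabs in the leading whitespace of every line, leaving the rest untouched."""
    converted = []
    for line in source.splitlines(keepends=True):
        stripped = line.lstrip(" \t")
        indent = line[:len(line) - len(stripped)]
        # str.expandtabs pads each tab to the next multiple of tabsize columns
        converted.append(indent.expandtabs(tabsize) + stripped)
    return "".join(converted)


# Hypothetical input: line 2 uses a tab, line 3 uses four spaces.
broken = "if ok:\n\tprint('a')\n    print('b')\n"
fixed = tabs_to_spaces(broken)
```

Only the indentation is touched, so tabs inside string literals or after code on a line are left as they were. Check the result in your editor before overwriting the original file.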