Question

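I am currently working with a dataset that consists of the following fields:

paper_id, word_attributes, class_label

There are 3700 word_attributes columns in total, each holding a binary value. The problem is that the dataset has no column headers yet. How can I assign 3700+ column names in a .csv file? Any suggestions?

Thanks.

Edit: The .csv file looks like this: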

100157,0,0,0,0,0,0,0,0,0,0,0,0,.....,Agents
100598,0,1,0,0,0,0,0,0,0,0,0,0,.....,IR
..............................
..............................

Answer 1

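How are the header names stored?

I would use the csv module in Python (https://docs.python.org/2/library/csv.html), but if you already have all the header names in a list, you can simply join them with commas and prepend that row to the top of the file.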

# header_name_list is assumed to be a list of column-name strings
header_row = ",".join(header_name_list)

# Read the existing file content
with open("your file path", "r") as file_to_read:
    old_content = file_to_read.read()

# Write the header row first, followed by the original content
content_to_write = "%s\n%s" % (header_row, old_content)
with open("your file path", "w") as file_to_write:
    file_to_write.write(content_to_write)
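
If the header names do not exist yet, they could be generated rather than typed by hand. The following is a minimal sketch using the csv module, assuming Python 3, assuming the first column is paper_id and the last is class_label, and assuming generated names such as w1, w2, ... are acceptable. The function names make_header and add_header and the "w" prefix are hypothetical, not part of the original answer. The number of word columns is inferred from the first data row instead of being hard-coded.

```python
import csv


def make_header(n_word_columns, prefix="w"):
    """Build the header: paper_id, <prefix>1 .. <prefix>N, class_label."""
    # The "w" prefix is an illustrative assumption; use any naming scheme.
    words = ["%s%d" % (prefix, i) for i in range(1, n_word_columns + 1)]
    return ["paper_id"] + words + ["class_label"]


def add_header(src_path, dst_path, prefix="w"):
    """Copy src_path to dst_path with a generated header row on top."""
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        first = next(reader, None)
        if first is None:
            return  # empty input: nothing to write
        # Every column except the first (id) and last (label) is a word column.
        writer.writerow(make_header(len(first) - 2, prefix))
        writer.writerow(first)
        writer.writerows(reader)
```

Writing to a separate output file streams the rows one at a time, so the whole dataset never has to be held in memory, and the original file stays untouched if something goes wrong.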
