I have a txt file (tab-separated) with comments, like this:
[header]
This file is generated by Bingbong Kim
user name : user
user gender : male
user age : 45
[data]
user_item color price
item1 red 10
item2 black 20
item3 green 10
item4 blue 15
I want to add one more comment (#user country) and insert a new column in which every value is the same:
[header]
This file is generated by Bingbong Kim
user name : user
user gender : male
user age : 45
user country : US
[data]
user_item color price check
item1 red 10 true
item2 black 20 true
item3 green 10 true
item4 blue 15 true
For now, this is what I do:
import pandas as pd
# save the comments in a list
header_list = []
temp_report = open('report.txt')
for i, line in enumerate(temp_report):
    if i == 6:  # the first six lines, up to and including [data], are the comments
        break
    header_list.append(line)
temp_report.close()
# create a dataframe without the comments
report = pd.read_csv('report.txt', skiprows=6, sep='\t')
# insert a column into the dataframe
report['check'] = "true"  # all values are the same
# write the comments first and then write the changed dataframe
f = open('new_report.txt', 'a')
for comment in header_list:
    f.write(comment)
report.to_csv(f, sep='\t', index=False)
f.close()
I suspect there is a better way to do this. Each txt file takes about 50 seconds because the files are so large. How can I improve this?
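
One direction worth trying is to skip pandas and stream the file line by line, since neither change needs the values to be parsed. The sketch below is only an illustration under some assumptions: the function name `add_header_line_and_column` is hypothetical, the data section is assumed to start right after a line that reads exactly `[data]`, and the first line after it is assumed to hold the column names. Whether this is actually faster on your files is something to measure rather than take for granted.

```python
import io

def add_header_line_and_column(src, dst, extra_header, column, value, sep="\t"):
    """Copy src to dst line by line, adding one header line and one constant column.

    Hypothetical helper: assumes a "[data]" marker line followed by a column-name row.
    """
    in_data = False       # True once the "[data]" marker has been passed
    seen_columns = False  # True once the column-name row has been written
    for line in src:
        text = line.rstrip("\r\n")
        if not in_data:
            if text.strip() == "[data]":
                # Insert the new comment just before the data section starts.
                dst.write(extra_header + "\n")
                in_data = True
            dst.write(text + "\n")
        elif not seen_columns:
            # Column-name row: append the new column's name.
            dst.write(text + sep + column + "\n")
            seen_columns = True
        elif text:
            # Data row: append the constant value (blank lines are skipped).
            dst.write(text + sep + value + "\n")

# Small in-memory demo; with real files, pass open("report.txt") and
# open("new_report.txt", "w") instead of the StringIO objects.
sample = (
    "[header]\n"
    "user age : 45\n"
    "[data]\n"
    "user_item\tcolor\tprice\n"
    "item1\tred\t10\n"
)
out = io.StringIO()
add_header_line_and_column(io.StringIO(sample), out, "user country : US", "check", "true")
print(out.getvalue())
```

Because only one line is held in memory at a time, this also avoids building a DataFrame for a very large file. If you later need real column operations (filtering, arithmetic), pandas is still the more convenient tool.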