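A fairly simple question. I have this market data: http://pastebin.com/HmiMbux5
It is in the format [['data',high,open,low,close,volume, etc.],['data',high,open,low,close,volume, etc.]]
However, when I save the .txt file as .csv and load it into Excel, it puts all of the data in the top row.
What Python code can I use to take each [data],[data] record and save it properly? Since the [data],[data] records are comma-separated, I assumed the file would already be Excel-compatible, but I was wrong.
Any help is appreciated.
Regards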
Answer 0 (score: 0)
Clean it up with something like this:
sed 's/\]\,\[/\n/g' HmiMbux5.txt | tr -d "[]" > HmiMbux5.csv
There is also a stray "Min to Spreed" in the data that you probably don't want.
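Since the question asks for Python, here is a rough Python sketch of the same idea as the sed/tr pipeline: turn every "],[" into a line break, then delete the remaining brackets. The function name brackets_to_rows and the sample values are made up for illustration, and it assumes the records are separated by exactly "],[" with no spaces in between.

```python
def brackets_to_rows(raw: str) -> str:
    """Mimic the sed/tr pipeline: one record per line, no brackets."""
    # sed 's/\]\,\[/\n/g' : put each record on its own line
    text = raw.strip().replace("],[", "\n")
    # tr -d "[]" : delete every remaining bracket character
    return text.translate(str.maketrans("", "", "[]"))


# Hypothetical sample in the format described in the question
sample = "[['2014-01-02',10.5,10.0,9.8,10.2,1500],['2014-01-03',10.7,10.2,10.1,10.6,1800]]"
rows_text = brackets_to_rows(sample)
print(rows_text)
```

To use it on the real file, read HmiMbux5.txt into a string, pass it through the function, and write the result to HmiMbux5.csv. Like the shell version, this leaves the single quotes around the first field and does nothing about the stray "Min to Spreed" text.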
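Answer 1 (score: 0)
Here is a script that reads the file and writes it to another file on disk, this time in proper CSV format. I tried to explain my steps in the code, so I hope it is clear.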
with open("HmiMbux5.txt", "r") as infile:
    data = infile.readlines()[0]  # this is still raw data (the first line of the file)

# now we need to purge the first and the last bracket.
index_lastbr = data.rfind("]")
intermediate_data = data[1:index_lastbr]  # here I remove the encapsulating ->[<- and ->]<-
data_list_unformatted = intermediate_data.split("],[")
for index in range(len(data_list_unformatted)):
    item = data_list_unformatted[index]
    # the first record still starts with "[" and the last one still ends with "]"
    if "[" in item:
        data_list_unformatted[index] = item[1:]
    elif "]" in item:
        data_list_unformatted[index] = item[:-1]
# the data is now ["'data',high, open, low, close, volume, etc","'data',high, open, low, close, volume, etc" ]

splitted_list = []  # this becomes a list of lists where every 'column' is a separate string
for item in data_list_unformatted:
    splitted_list.append(item.split(","))
# so instead of ["'data',high, open, low, close, volume, etc","'data',high, open, low, close, volume, etc"]
# it is now [["'data'","high", "open", "low", "close", "volume", "etc"],["'data'","high", "open", "low", "close", "volume", "etc"]]

with open("corrected_data.csv", "w") as newfile:  # here we start writing it to a file
    newfile.write('"data";high;open;low;close;volume;etc;\n')
    for listitem in splitted_list:
        for item in listitem:
            newfile.write(item + ";")  # semicolon-separated, one record per line
        newfile.write("\n")
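If the file turns out to be a valid Python literal (a list of lists of quoted strings and plain numbers), a shorter route might be to let the standard library do the parsing and quoting. The sketch below is only that, a sketch: rows_to_csv is a made-up helper name, the sample values are invented, and the assumption that ast.literal_eval can parse the real data is untested. Stray text such as "Min to Spreed" would have to be removed first, otherwise parsing will raise an error.

```python
import ast
import csv
import io


def rows_to_csv(raw: str, header: list) -> str:
    """Parse a Python-style list of lists and return it as CSV text."""
    # Assumption: raw is a valid Python literal, e.g. [['a', 1.0], ['b', 2.0]]
    rows = ast.literal_eval(raw.strip())
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)  # one CSV line per inner list
    return buffer.getvalue()


# Hypothetical sample in the format described in the question
sample = "[['2014-01-02',10.5,10.0,9.8,10.2,1500],['2014-01-03',10.7,10.2,10.1,10.6,1800]]"
csv_text = rows_to_csv(sample, ["data", "high", "open", "low", "close", "volume"])
print(csv_text)
```

To write a real file instead of a string, open it with open("corrected_data.csv", "w", newline="") and pass that file object to csv.writer. If Excel in your locale expects semicolons, csv.writer also accepts delimiter=";".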