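I have 3 TSV files whose dates differ slightly. I need to compile the stock values from all 3 into 1 TSV file, keyed by date. The problem is that the dates in the 3 files do not line up exactly. For example: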
Stock1:
23 july 2009 - 10.03
24 july 2009 - 10.07
25 july 2009 - (no value)
Stock2:
23 july 2009 - (no value)
24 july 2009 - 3.07
25 july 2009 - 3.10
Stock3:
23 july 2009 - 5.40
24 july 2009 - (no value)
25 july 2009 - 5.10
As you can see, sometimes no value is available. I would like to end up with:
compiledStocks:
Date: Stock1 Stock2 Stock3
23 july 2009 - 10.03, (no value), 5.40
24 july 2009 - 10.07, 3.07, (no value)
25 july 2009 - (no value), 3.10, 5.10
Using Python, what is the best way to loop over all 3 files and compile them into a single file?
Answer 0 (score: 0)
To answer the question of how to iterate over multiple files, use fileinput.input().
import fileinput

# process() is a placeholder for whatever you do with each line
with fileinput.input(files=('spam.txt', 'eggs.txt')) as f:
    for line in f:
        process(line)
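fileinput only handles the iteration; the merge by date still has to be done. Here is a minimal standard-library sketch of that step, assuming each file is a two-column TSV (date, value) with no header row and dates written like "23 july 2009". The function name compile_stocks, the StockN column labels, and the "(no value)" placeholder are illustrative choices, not something taken from the question's files.

```python
import csv
import fileinput
from collections import defaultdict
from datetime import datetime


def compile_stocks(paths, out_path, missing="(no value)"):
    """Merge two-column (date, value) TSV files into one table keyed by date."""
    paths = list(paths)
    column = {path: i for i, path in enumerate(paths)}
    # Each date maps to one slot per input file, pre-filled with the placeholder.
    table = defaultdict(lambda: [missing] * len(paths))

    with fileinput.input(files=paths) as f:
        for row in csv.reader(f, delimiter="\t"):
            if not row:
                continue  # skip blank lines
            values = table[row[0].strip()]  # registers the date even without a value
            if len(row) > 1 and row[1].strip():
                # f.filename() is the file the current line was read from
                values[column[f.filename()]] = row[1].strip()

    def as_date(text):
        # Assumption: dates look like "23 july 2009" (English month names)
        return datetime.strptime(text, "%d %B %Y")

    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t")
        writer.writerow(["Date"] + [f"Stock{i + 1}" for i in range(len(paths))])
        for date in sorted(table, key=as_date):
            writer.writerow([date] + table[date])
    return dict(table)
```

Calling compile_stocks(['stock1.tsv', 'stock2.tsv', 'stock3.tsv'], 'compiledStocks.tsv') (hypothetical file names) would write one row per date that appears in any of the files, in chronological order.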
Answer 1 (score: 0)
Use pandas, since you mentioned the files are TSV. Hope this helps:
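import pandas as pd

# header=None: the files have no header row, so the columns are labelled 0 and 1
df1 = pd.read_csv('filepath/stock1', sep='\t', header=None)
df1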
Out[31]:
0 1
0 23 july 2009 10.03
1 24 july 2009 10.07
2 25 july 2009 NaN
Likewise for the other two files:
df2 = pd.read_csv('filepath/stock2', sep='\t', header=None)
df2
Out[42]:
0 1
0 23 july 2009 NaN
1 24 july 2009 3.07
2 25 july 2009 3.10
df3 = pd.read_csv('filepath/stock3', sep='\t', header=None)
df3
Out[43]:
0 1
0 23 july 2009 5.4
1 24 july 2009 NaN
2 25 july 2009 5.1
Then use pandas merge:
# how='outer' keeps dates that appear in only some of the files
df4 = (df1.merge(df2, on=0, how='outer')
          .merge(df3, on=0, how='outer')
          .rename(columns={0: 'Date', '1_x': 'Stock1', '1_y': 'Stock2', 1: 'Stock3'})
          .fillna('No value'))
df4
Out[57]:
Date Stock1 Stock2 Stock3
0 23 july 2009 10.03 No value 5.4
1 24 july 2009 10.07 3.07 No value
2 25 july 2009 No value 3.1 5.1
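The same merge can be written so it does not depend on the automatic 1_x / 1_y suffixes and works for any number of files. This is a rough sketch under the same assumptions as above (two-column, header-less TSV files, dates like "23 july 2009"); the function name merge_stock_files and the column names passed in are hypothetical.

```python
from functools import reduce

import pandas as pd


def merge_stock_files(paths, names):
    """Outer-join two-column (date, value) TSV files on the date column."""
    frames = [
        # Name the columns on read so each file contributes one clearly labelled column
        pd.read_csv(path, sep="\t", header=None, names=["Date", name])
        for path, name in zip(paths, names)
    ]
    # An outer join keeps every date that occurs in at least one file
    merged = reduce(
        lambda left, right: left.merge(right, on="Date", how="outer"), frames
    )
    # Sort chronologically rather than alphabetically; the format is an assumption
    return merged.sort_values(
        "Date",
        key=lambda s: pd.to_datetime(s, format="%d %B %Y"),
        ignore_index=True,
    )
```

Missing values stay as NaN in the returned frame, so the placeholder can be chosen when writing the result, for example with merged.to_csv('compiledStocks.tsv', sep='\t', index=False, na_rep='(no value)').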