Python - Remapping values based on date

Time: 2016-07-18 13:02:38

Tags: python

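I have 3 TSV files, one per stock, and I need to compile the values from all 3 into a single TSV file keyed by date. The problem is that the dates in the 3 files do not line up exactly. For example: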

Stock1:
23 july 2009 - 10.03
24 july 2009 - 10.07
25 july 2009 - (no value)

Stock2:
23 july 2009 - (no value)
24 july 2009 - 3.07
25 july 2009 - 3.10

Stock3:
23 july 2009 - 5.40
24 july 2009 - (no value)
25 july 2009 - 5.10

As you can see, sometimes no value is available. I would like to end up with:

compiledStocks:
Date:          Stock1       Stock2       Stock3 
23 july 2009 - 10.03,       (no value),  5.40
24 july 2009 - 10.07,       3.07,        (no value)
25 july 2009 - (no value),  3.10,        5.10

What is the best way to loop through all 3 files in Python and compile them into a single file?

2 answers:

Answer 0: (score: 0)

To answer the question of how to iterate over multiple files, use fileinput.input():

import fileinput

# process() is a placeholder for whatever you do with each line
with fileinput.input(files=('spam.txt', 'eggs.txt')) as f:
    for line in f:
        process(line)
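Iterating over the files is only half of the job: the values still have to be lined up by date. Below is a minimal standard-library sketch of one way to do that. It swaps fileinput for csv.reader plus a dictionary keyed by date. The function name compile_stocks_csv, the two-column layout (date, tab, value, no header row) and the "23 july 2009" date format are assumptions drawn from the example data in the question, not something the question states about the real files.

```python
import csv
from collections import defaultdict
from datetime import datetime


def compile_stocks_csv(paths, out_path, missing="(no value)"):
    """Merge two-column TSV files (date, value) into one TSV keyed by date.

    `paths` maps a column name (e.g. "Stock1") to a file path.
    Hypothetical helper: assumes no header row in the input files.
    """
    # table[date][stock_name] = value
    table = defaultdict(dict)
    for name, path in paths.items():
        with open(path, newline="") as f:
            for row in csv.reader(f, delimiter="\t"):
                if len(row) < 2 or not row[1].strip():
                    continue  # skip blank lines and dates that carry no value
                table[row[0].strip()][name] = row[1].strip()

    # Sort chronologically; assumes dates look like "23 july 2009"
    # and that month names match the current (English) locale.
    dates = sorted(table, key=lambda d: datetime.strptime(d, "%d %B %Y"))

    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["Date", *paths])
        for date in dates:
            # Fill in the placeholder wherever a stock has no value for a date
            writer.writerow([date] + [table[date].get(name, missing) for name in paths])
    return dates
```

It could be called as compile_stocks_csv({'Stock1': 'stock1.tsv', 'Stock2': 'stock2.tsv', 'Stock3': 'stock3.tsv'}, 'compiledStocks.tsv'), where the file names are placeholders. A date that appears in any one file ends up in the output, so no file has to contain every date.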

Answer 1: (score: 0)

Since you mentioned TSV, you can use pandas. Hope this helps:

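import pandas as pd

# header=None keeps the first row as data and names the columns 0 and 1
df1=pd.read_csv('filepath/stock1',sep='\t',header=None)
df1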
Out[31]: 
              0      1
0  23 july 2009  10.03
1  24 july 2009  10.07
2  25 july 2009    NaN

Likewise for the other two files:

df2=pd.read_csv('filepath/stock2',sep='\t',header=None)
df2
Out[42]: 
              0     1
0  23 july 2009   NaN
1  24 july 2009  3.07
2  25 july 2009  3.10

df3=pd.read_csv('filepath/stock3',sep='\t',header=None)
df3
Out[43]: 
              0    1
0  23 july 2009  5.4
1  24 july 2009  NaN
2  25 july 2009  5.1

Then use pandas merge:

In [56]: df4 = (df1.merge(df2, on=0, how='left')
    ...:           .merge(df3, on=0, how='left')
    ...:           .rename(columns={0: 'Date', '1_x': 'Stock1', '1_y': 'Stock2', 1: 'Stock3'})
    ...:           .fillna('No value'))

df4
Out[57]: 
           Date    Stock1    Stock2    Stock3
0  23 july 2009     10.03  No value       5.4
1  24 july 2009     10.07      3.07  No value
2  25 july 2009  No value       3.1       5.1
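One caveat about the merge above: how='left' keeps only the dates that appear in the first file, so a date that is missing from stock1 but present in stock2 or stock3 would be dropped. Since the question says the dates differ between files, an outer join may be a better fit. The following is a sketch under assumptions, not a drop-in replacement: the function name merge_stocks is hypothetical, and the files are assumed to have two tab-separated columns (date, value), no header row, at most one row per date, and dates formatted like "23 july 2009".

```python
from functools import reduce

import pandas as pd


def merge_stocks(paths):
    """Outer-join two-column TSV files (date, value) on the date.

    `paths` maps a column name (e.g. "Stock1") to a file path.
    Hypothetical helper: assumes no header row in the input files.
    """
    # Name the columns up front so no renaming of 1_x / 1_y is needed later
    frames = [
        pd.read_csv(path, sep="\t", header=None, names=["Date", name])
        for name, path in paths.items()
    ]
    # how="outer" keeps every date that appears in at least one file
    merged = reduce(
        lambda left, right: left.merge(right, on="Date", how="outer"), frames
    )
    # Parse the dates so rows sort chronologically rather than alphabetically
    merged["Date"] = pd.to_datetime(merged["Date"], format="%d %B %Y")
    return merged.sort_values("Date").reset_index(drop=True)
```

To write the result back out as a TSV with a placeholder for missing values, something like merge_stocks(paths).to_csv('compiledStocks.tsv', sep='\t', index=False, na_rep='(no value)') should work, where the output file name is a placeholder. Keeping the missing entries as NaN until the final write leaves the stock columns numeric, which fillna('No value') does not.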