Filtering data using a string -> float cast

Time: 2014-04-25 16:45:24

Tags: python pandas

There are a few things going on here, but I think the code is relatively simple.

The code is as follows:

        import pandas as pd

        def establishAdjustmentFactor(df):
            df['adjFactor']=df['Adj Close']/df['Close'];
            df['chgFactor']=df['adjFactor']/df['adjFactor'].shift(1);
            return df;

        def yahooFinanceAccessor(ticker,year_,month_,day_):
            import datetime
            now = datetime.datetime.now()
            month = str(int(now.strftime("%m"))-1)
            day = str(int(now.strftime("%d"))+1)
            year  = str(int(now.strftime("%Y")))
            data = pd.read_csv('/Users/myDir/Downloads/' + ticker + '.csv');

            data['Date']=float(str(data['Date']).replace('-',''));
            data.set_index('Date')
            data=data.sort(['Date'],ascending=[1]);
            return data

        def calculateLongReturn(df):
            df['Ret']=df['Adj Close'].pct_change();
            return df;

        argStartYear = '2014';
        argStartMonth = '01';
        argStartDay='01';

        argEndYear = '2014';
        argEndMonth = '04';
        argEndDay = '30';

        #read data
        underlying = yahooFinanceAccessor("IBM","1900","01","01");
        #Get one day return
        underlying = establishAdjustmentFactor(calculateLongReturn(underlying));

        #filter here
        underlying = underlying[(underlying['Date'] > long(argStartYear + argStartMonth + argStartDay)) & (underlying['Date'] < long(argEndYear + argEndMonth + argEndDay))];

This will evolve into a function, and argStart(End) will become the function's parameters.

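The idea is that a parent function call will keep a global DataFrame holding the underlying's entire price history, and later calls will access that DataFrame and filter it to the required dates to see whether a split occurred.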
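A minimal sketch of that filtering step, assuming the 'Date' column holds 'YYYY-MM-DD' strings as in a Yahoo Finance CSV. The function name `filter_by_date` and the sample rows are hypothetical and only stand in for the real price history:

```python
import pandas as pd

def filter_by_date(df, start, end):
    """Return the rows whose 'Date' lies strictly between start and end."""
    dates = pd.to_datetime(df['Date'])
    # Each comparison needs its own parentheses, because & binds
    # more tightly than < and >.
    mask = (dates > pd.Timestamp(start)) & (dates < pd.Timestamp(end))
    return df[mask]

# Hypothetical sample standing in for the full price history.
history = pd.DataFrame({
    'Date': ['2013-12-31', '2014-01-02', '2014-04-24', '2014-05-01'],
    'Adj Close': [100.0, 101.0, 102.0, 103.0],
})

subset = filter_by_date(history, '2014-01-01', '2014-04-30')
print(subset)
```

Comparing real timestamps avoids building integer keys by hand, though an integer YYYYMMDD key compares correctly as well.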

Now, when I read in the data and try to cast the result of the read_csv call, I get the following error:

            Traceback (most recent call last):
              File "<stdin>", line 1, in <module>
              File "/Applications/Spyder.app/Contents/Resources/lib/python2.7/spyderlib/widgets/externalshell/sitecustomize.py", line 540, in runfile
                execfile(filename, namespace)
              File "/Users/myDir/Documents/PythonProjects/dailyOptionValuation.py", line 70, in <module>
                underlying = yahooFinanceAccessor("SVXY","1900","01","01");
              File "/Users/myDir/Documents/PythonProjects/dailyOptionValuation.py", line 37, in yahooFinanceAccessor
                data['Date']=float(str(data['Date']).replace('-',''));
            ValueError: invalid literal for float(): 0     20140424
            1     20140423
            2     20140422
            3     20140421
            4     20140417
            5     20140416
            6     20140415
            7     20140414
            8     20140411
            9     20140410
            10    20140409
            11    20140408
            12    20140407

Any input on why this happens would be very helpful!
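For reference, the failure can be reproduced on a small Series. The likely cause is that `str()` applied to a whole column returns the printed form of the entire Series (index labels, values, and the dtype line), which `float()` cannot parse. The sample values below are made up for illustration:

```python
import pandas as pd

dates = pd.Series(['2014-04-24', '2014-04-23'], name='Date')

# str() on a Series gives its multi-line printed form, not a single value.
text = str(dates).replace('-', '')
print(text)

try:
    float(text)
    failed = False
except ValueError:
    # Same kind of error as in the traceback above.
    failed = True

# Converting one element at a time works.
as_float = dates.map(lambda s: float(s.replace('-', '')))
print(as_float)
```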

1 Answer:

Answer 0: (score: 0)

So it looks like I found the problem after poking around a bit, and I changed the way I was thinking about it.

If there is a more efficient way to do this, any input would be great.

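    def operateOverSetToCreateEasyKey(df):
        # Build an integer YYYYMMDD key row by row from the 'Date' strings.
        for i in df.index:
            # .loc is used in place of .ix, which was removed in pandas 1.0.
            df.loc[i,'fmtDate']=int(str(df.loc[i,'Date']).replace('-',''));
        return df;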
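One possibly more efficient alternative to the row-by-row loop is to let pandas do the string replacement on the whole column at once. This is only a sketch, assuming 'Date' contains 'YYYY-MM-DD' strings; the function name `add_fmt_date` and the sample frame are hypothetical:

```python
import pandas as pd

def add_fmt_date(df):
    """Return a copy of df with an integer YYYYMMDD 'fmtDate' column."""
    out = df.copy()
    # Strip the dashes from every value in one pass, then cast to int.
    out['fmtDate'] = out['Date'].str.replace('-', '', regex=False).astype(int)
    return out

# Hypothetical sample rows.
prices = pd.DataFrame({
    'Date': ['2014-04-24', '2014-04-23'],
    'Close': [1.0, 2.0],
})

keyed = add_fmt_date(prices)
print(keyed)
```

The `.str` accessor applies the replacement to each element internally, so no explicit Python loop over the index is needed.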