Pandas DataFrame and Series

Time: 2015-12-22 15:00:11

Tags: python numpy pandas

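I have a DataFrame and a Series, and I would like to get the rolling correlation between them back as a new DataFrame.

So I have 3 columns in df1, and I want to return a new DataFrame that holds the rolling correlation of each of those columns with the Series object.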

import pandas as pd

df1 = pd.read_csv('https://bpaste.net/raw/d0456d3a020b')
df1['Date'] = pd.to_datetime(df1['Date'])
df1 = df1.set_index(df1['Date'])
del df1['Date']


df2 = pd.read_csv('https://bpaste.net/raw/d5cb455cb091')
df2['Date'] = pd.to_datetime(df2['Date'])
df2 = df2.set_index(df2['Date'])
del df2['Date']


pd.rolling_corr(df1, df2)

The result (https://bpaste.net/show/58b59c656ce4) contains only NaNs and 1s.

pd.rolling_corr(df1['IWM_Close'], spy, window=22)

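This returns the Series I want, but I would rather not loop over the DataFrame's columns. Is there a better way to do this?

Thanks.

1 Answer:

Answer 0: (score: 1)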

I believe your second input has to be a Series in order to be correlated with all of the columns in the first DataFrame.

This works:

from datetime import date

import numpy as np
import pandas as pd

index = pd.DatetimeIndex(start=date(2015,1,1), freq='W', periods = 100)
df1 = pd.DataFrame(np.random.random((100,3)), index=index)
df2 = pd.DataFrame(np.random.random((100,1)), index=index)
# .squeeze() turns the one-column DataFrame into a Series
print(pd.rolling_corr(df1, df2.squeeze(), window=20).tail())

Or, for the same result:

df2 = pd.Series(np.random.random(100), index=index)
print(pd.rolling_corr(df1, df2, window=20).tail())

                   0         1         2
2016-10-30 -0.170971 -0.039929 -0.091098
2016-11-06 -0.199441  0.000093 -0.096331
2016-11-13 -0.213728 -0.020709 -0.129935
2016-11-20 -0.075859  0.014667 -0.153830
2016-11-27 -0.114041  0.019886 -0.155472

But this does not work when df2 is passed as a DataFrame. Note the missing .squeeze(): without it, only matching columns are correlated.
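For readers on a recent pandas release, where the module-level pd.rolling_corr is no longer available (it was deprecated in pandas 0.18 in favor of the .rolling() accessor), the same idea can be sketched roughly as below. This is a minimal sketch rather than the answerer's code: the seeded random data, the name spy for the Series, and the window of 20 are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative random data (assumption): three columns plus one benchmark series.
rng = np.random.default_rng(0)
index = pd.date_range(start="2015-01-01", freq="W", periods=100)
df1 = pd.DataFrame(rng.random((100, 3)), index=index)
spy = pd.Series(rng.random(100), index=index)

# DataFrame vs. Series: every column of df1 is correlated with the Series.
by_series = df1.rolling(window=20).corr(spy)

# DataFrame vs. one-column DataFrame: columns are matched by label, so only
# column 0 has a partner; columns 1 and 2 come back as NaN.
by_frame = df1.rolling(window=20).corr(spy.to_frame())

print(by_series.tail())
print(by_frame.tail())
```

Passing a Series avoids the explicit loop over columns, while passing a DataFrame matches columns by label, which would be consistent with the mostly-NaN output described in the question.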