Why aren't the columns in the new DataFrame being replaced?

Time: 2017-08-05 17:28:46

Tags: python pandas dataframe indexing sum

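I have a DataFrame with countries as the index and years (1990-2015) as the column headers. I want to build a new DataFrame, df2, in which each column is the sum of 5 years, e.g. 1995-1999, 2000-2004, and so on. This is what I tried: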

df2 = pd.DataFrame(index=df.index[:], columns=['1995', '2000', '2005', '2010', '2015'])
df2['1995'] = df.iloc[0:4].sum(axis=1)

But it does not replace the NaN values. What am I doing wrong? Thanks in advance.

3 answers:

Answer 0: (score: 2)

Step 1

Use df.T.reset_index to transpose the frame and reset the index:
df2 = df.T.reset_index(drop=True)

Step 2

Use df.groupby to group the rows into blocks of 5, then sum each block with dfGroupBy.agg, passing np.nansum:

df2 = df2.groupby(df2.index // 5).agg(np.nansum).T

Step 3

Assign the column labels in place:

df2.columns = pd.to_datetime(df.columns[::5]).year  + 5

Complete example:

df = ... # Borrowed from Bharath

df2 = df.T.reset_index(drop=True)
df2 = df2.groupby(df2.index // 5).sum().T
df2.columns = pd.to_datetime(df.columns[::5]).year  + 5

print(df2)

Output:

         1995  2000  2005  2010
Country                        
IN         72    29   100     2
EG         31    40    40    24
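
The grouping in Step 2 relies on integer division of the positional index. As a minimal sketch of that mechanism, assuming a made-up frame of ones with 12 yearly columns (the data and the names `t`, `keys` and `block_sums` are illustrative only, not part of the answer):

```python
import numpy as np
import pandas as pd

# Hypothetical data: 2 countries, 12 yearly columns, every value 1.
years = [str(y) for y in range(1990, 2002)]
df = pd.DataFrame(
    np.ones((2, 12), dtype=int),
    index=pd.Index(["IN", "EG"], name="Country"),
    columns=years,
)

# After transposing and resetting, the rows are positions 0..11.
t = df.T.reset_index(drop=True)

# Integer division maps positions 0-4 to key 0, 5-9 to key 1, 10-11 to key 2.
keys = t.index // 5

# One summed row per key, transposed back so countries are rows again.
block_sums = t.groupby(keys).sum().T
print(list(keys))
print(block_sums)
```

The last block is simply shorter when the number of years is not a multiple of 5, so its sum covers fewer columns.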

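Answer 1: (score: 1)

I think you are looking for the sum of every 5 columns starting from a particular column. One way is to slice the data and concatenate the pieces in a for loop, i.e. if you have the DataFrame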

df = pd.DataFrame({'Country':['IN','EG'],'1990':[2,4],'1991':[4,5],'1992':[2,4],'1993':[2,4],'1994':[62,14],'1995':[21,4],'1996':[2,14],'1997':[2,4],'1998':[2,14],'1999':[2,4],'2000':[2,4],'2001':[2,14],'2002':[92,4],'2003':[2,4],'2004':[2,14],'2005':[2,24]})
df.set_index('Country',drop=True,inplace=True)
         1990  1991  1992  1993  1994  1995  1996  1997  1998  1999  2000  \
Country                                                                     
IN          2     4     2     2    62    21     2     2     2     2     2   
EG          4     5     4     4    14     4    14     4    14     4     4   

         2001  2002  2003  2004  2005  
Country                                
IN          2    92     2     2     2  
EG         14     4     4    14    24   

Then:

df2 = pd.DataFrame(index=df.index[:])
columns=['1990','1995', '2000', '2005']
for x in columns:
    df2 = pd.concat([df2,df[df.columns[df.columns.tolist().index(x):][0:5]].sum(axis=1)],axis=1)

df2.columns= columns

Output:

         1990  1995  2000  2005
Country                        
IN         72    29   100     2
EG         31    40    40    24

If you want different column labels:

df2.columns = ['1990-1994','1995-1999','2000-2004','2005-']

Hope that helps.
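
As for why the attempt in the question leaves NaN values, one likely cause is that `df.iloc[0:4]` slices rows rather than columns, so the summed Series only covers the first four countries and index alignment fills the rest with NaN. A minimal sketch, assuming a made-up frame of six countries (A-F) whose values are all 1:

```python
import pandas as pd

# Hypothetical data: 6 countries, ten yearly columns, every value 1.
years = [str(y) for y in range(1990, 2000)]
df = pd.DataFrame(1, index=pd.Index(list("ABCDEF"), name="Country"), columns=years)

df2 = pd.DataFrame(index=df.index, columns=["1990", "1995"])

# iloc[0:4] selects ROWS 0-3, so the summed Series only has entries for A-D.
# Assignment aligns on the index, which leaves E and F as NaN.
df2["1990"] = df.iloc[0:4].sum(axis=1)

# iloc[:, 5:10] selects COLUMNS 5-9 (1995-1999) for every row.
df2["1995"] = df.iloc[:, 5:10].sum(axis=1)

print(df2)
```

Note also that a slice such as `0:4` covers four positions; five yearly columns need `0:5`.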

Answer 2: (score: 1)

You can:

  • convert the columns with to_datetime
  • resample along axis=1 in 5A (5-year) bins and aggregate with sum
  • finally, get the years from the columns with DatetimeIndex.year and subtract 4

df.columns = pd.to_datetime(df.columns, format='%Y')
df2 = df.resample('5A', axis=1, closed='left').sum()
df2.columns = df2.columns.year - 4
print(df2)

         1990  1995  2000  2005
Country                        
IN         72    29   100     2
EG         31    40    40    24

If you need different year labels, you can add to the year instead:

df.columns = pd.to_datetime(df.columns, format='%Y')
df2 = df.resample('5A', axis=1, closed='left').sum()
df2.columns = df2.columns.year + 1
print(df2)

         1995  2000  2005  2010
Country                        
IN         72    29   100     2
EG         31    40    40    24
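
In recent pandas releases (2.x), the `axis` argument of `resample` and the `'A'` frequency alias are reported as deprecated, so the snippets above may emit warnings or fail depending on the installed version. A minimal sketch of a version-independent alternative that swaps `resample` for a `groupby` on a computed 5-year bin label; the sample data and the names `years` and `bin_start` are assumptions for illustration:

```python
import pandas as pd

# Hypothetical data: columns '1990'..'2005'; IN holds 1..16, EG holds all ones.
df = pd.DataFrame(
    {str(year): [year - 1989, 1] for year in range(1990, 2006)},
    index=pd.Index(["IN", "EG"], name="Country"),
)

# Turn the string labels into integers and compute each year's bin start,
# e.g. 1990-1994 -> 1990, 1995-1999 -> 1995.
years = df.columns.astype(int)
bin_start = (years - years.min()) // 5 * 5 + years.min()

# Group the transposed rows by bin start, sum, and transpose back.
df2 = df.T.groupby(bin_start.to_numpy()).sum().T
print(df2)
```

Adding an offset to `bin_start` (for example `+ 5`) relabels the columns in the same way the `year + 1` variant above does.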
