Row-wise operations in a Pandas DataFrame

Time: 2017-11-29 06:25:18

Tags: python pandas dataframe

I have a world indicators dataset in this format:

country     year    indicatorName       value
USA         1970    Agricultural Land   ...
USA         1970    Crop production     ...
...
USA         2000    Agricultural Land   ...
USA         2000    Crop production     ...
...
Mexico      1970    Agricultural Land   ...
Mexico      1970    Crop production     ...
...
Mexico      2000    Agricultural Land   ...
Mexico      2000    Crop production     ...

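There are other indicators that I have not included here, but these two are the ones I am interested in. I want to divide the value of Crop production by the value of Agricultural Land for each country and each year. Let's call the result crop_prod_density.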

I don't know how to proceed from here:

df.groupby(['country', 'year'])

How do I go from here to produce the following outputs?

  1. Add a new indicator row

country     year    indicatorName       value
USA         1970    Agricultural Land   ...
USA         1970    Crop production     ...
USA         1970    crop_prod_density   ...

  2. Add a new column with the same value for all rows of a (country, year) group

country     year    indicatorName       value   crop_prod_density
USA         1970    Agricultural Land   ...     us_value_1970
USA         1970    Crop production     ...     us_value_1970
...
Mexico      2000    Agricultural Land   ...     mx_value_2000
Mexico      2000    Crop production     ...     mx_value_2000

  3. A new DataFrame with only the values of this column

1 answer:

Answer 0: (score: 2)

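You can first reshape with set_index and unstack, then divide the Crop production column by the Agricultural Land column with div. The snippets that follow assume df already holds that reshaped frame, indexed by country and year, with the crop_prod_density column added.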
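The code for this first step did not survive in the text, so here is a minimal sketch of what it could look like. The sample values are taken from the printed outputs in this answer; the name `raw` for the long-format input is an assumption, and `df` is chosen so that the later snippets can run on the result.

```python
import pandas as pd

# Long-format sample data; values copied from the outputs printed in this answer.
# The variable name `raw` is hypothetical.
raw = pd.DataFrame({
    "country": ["Mexico"] * 4 + ["USA"] * 4,
    "year": [1970, 1970, 2000, 2000] * 2,
    "indicatorName": ["Agricultural Land", "Crop production"] * 4,
    "value": [10, 5, 10, 4, 10, 2, 10, 3],
})

# Long -> wide: one row per (country, year), one column per indicator.
df = raw.set_index(["country", "year", "indicatorName"])["value"].unstack()

# Element-wise division of the two indicator columns.
df["crop_prod_density"] = df["Crop production"].div(df["Agricultural Land"])
print(df)
```

Note that unstack raises an error if the same (country, year, indicatorName) combination appears more than once, so this sketch assumes each combination is unique.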

Then reshape with stack:

df1 = df.stack().reset_index(name='value')
print(df1)
   country  year      indicatorName  value
0   Mexico  1970  Agricultural Land   10.0
1   Mexico  1970    Crop production    5.0
2   Mexico  1970  crop_prod_density    0.5
3   Mexico  2000  Agricultural Land   10.0
4   Mexico  2000    Crop production    4.0
5   Mexico  2000  crop_prod_density    0.4
6      USA  1970  Agricultural Land   10.0
7      USA  1970    Crop production    2.0
8      USA  1970  crop_prod_density    0.2
9      USA  2000  Agricultural Land   10.0
10     USA  2000    Crop production    3.0
11     USA  2000  crop_prod_density    0.3

To add the result as a new column instead, append crop_prod_density to the index before calling stack. The final reindex is only needed to change the column order:

df2 = (df.set_index(['crop_prod_density'], append=True)
         .stack()
         .reset_index(name='value')
         .reindex(columns=['country','year','indicatorName','value','crop_prod_density']))
print(df2)
  country  year      indicatorName  value  crop_prod_density
0  Mexico  1970  Agricultural Land     10                0.5
1  Mexico  1970    Crop production      5                0.5
2  Mexico  2000  Agricultural Land     10                0.4
3  Mexico  2000    Crop production      4                0.4
4     USA  1970  Agricultural Land     10                0.2
5     USA  1970    Crop production      2                0.2
6     USA  2000  Agricultural Land     10                0.3
7     USA  2000    Crop production      3                0.3
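If the original row order and any other indicator rows need to be preserved, the per-group ratio can also be joined back onto the long-format frame. This is an alternative sketch, not the method used in this answer; the names `raw`, `wide`, `density` and `out` are hypothetical, and the sample values are the ones printed above.

```python
import pandas as pd

# Long-format sample data (hypothetical name `raw`), values as printed above.
raw = pd.DataFrame({
    "country": ["Mexico"] * 4 + ["USA"] * 4,
    "year": [1970, 1970, 2000, 2000] * 2,
    "indicatorName": ["Agricultural Land", "Crop production"] * 4,
    "value": [10, 5, 10, 4, 10, 2, 10, 3],
})

# One row per (country, year), one column per indicator.
wide = raw.set_index(["country", "year", "indicatorName"])["value"].unstack()

# Named Series indexed by (country, year).
density = (wide["Crop production"] / wide["Agricultural Land"]).rename("crop_prod_density")

# Look up the ratio for every row by its (country, year) key; row order is unchanged.
out = raw.join(density, on=["country", "year"])
print(out)
```

Because join keeps every row of `raw`, indicators other than the two used in the ratio would simply receive the same crop_prod_density value for their group.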

Finally, drop the unnecessary columns and create columns from the MultiIndex:

df3 = (df.drop(['Crop production', 'Agricultural Land'], axis=1)
         .reset_index()
         .rename_axis(None, axis=1))
print(df3)
  country  year  crop_prod_density
0  Mexico  1970                0.5
1  Mexico  2000                0.4
2     USA  1970                0.2
3     USA  2000                0.3
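Since the question starts from df.groupby(['country', 'year']), the same one-row-per-group result can also be sketched with groupby and apply. This is only an illustration under the assumption that every group contains exactly one row for each of the two indicators; the helper `density` and the names `raw` and `result` are hypothetical, and the reshape approach above is likely faster on large data.

```python
import pandas as pd

# Long-format sample data (hypothetical name `raw`), values as printed above.
raw = pd.DataFrame({
    "country": ["Mexico"] * 4 + ["USA"] * 4,
    "year": [1970, 1970, 2000, 2000] * 2,
    "indicatorName": ["Agricultural Land", "Crop production"] * 4,
    "value": [10, 5, 10, 4, 10, 2, 10, 3],
})

def density(group: pd.DataFrame) -> float:
    """Return Crop production / Agricultural Land for one (country, year) group."""
    values = group.set_index("indicatorName")["value"]
    return values["Crop production"] / values["Agricultural Land"]

# Select only the columns the helper needs, then compute one scalar per group.
result = (
    raw.groupby(["country", "year"])[["indicatorName", "value"]]
       .apply(density)
       .reset_index(name="crop_prod_density")
)
print(result)
```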
