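Averaging two identically formatted DataFrames in pandas

Time: 2020-01-19 19:07:52

Tags: python pandas

I have two pandas DataFrames loaded from CSV files. Each has two columns: column A is an ID, with the same values in the same order in both CSVs, and column B is a numeric value.

I need to create a new CSV whose column A is identical to that of the first two and whose column B is the average of column B from the two original CSVs.

I am creating the two DataFrames like this: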

If I do

df1 = pd.read_csv(path).set_index('A')
df2 = pd.read_csv(otherPath).set_index('A')

then newDF has the IDs in column A in the wrong order.

If I do

newDf = (df1['B'] + df2['B'])/2
newDf.to_csv(...)

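the first line raises an error: "ValueError: cannot reindex from a duplicate axis"

It seems like this should be trivial. What am I doing wrong?

1 answer:

Answer 0: (score: 1)

Try using merge instead of setting the index.

For example, say we have the following DataFrames: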

import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [3, 4, 5, 6, 7]})
df2 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [7, 4, 3, 10, 23]})

Then we merge them and create a new column holding the mean of the two B columns.

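# Both frames have a column named B, so merge suffixes them as B_x and B_y.
together = df1.merge(df2, on='A')
together.loc[:, "mean"] = (together['B_x'] + together['B_y']) / 2
# Keep only the ID column and the averaged values.
together = together[['A', 'mean']]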

together is then:

   A  mean
0  1   5.0
1  2   4.0
2  3   4.0
3  4   8.0
4  5  15.0
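
Since the question says column A holds the same IDs in the same order in both files, another option may be to skip both set_index and merge and average the rows by position. The following is only a sketch under that assumption: the sample frames stand in for the two pd.read_csv calls, and the names has_duplicate_ids, averaged and csv_text are made up for illustration. It also checks A for duplicate IDs, which is one plausible cause of the "duplicate axis" error once A becomes the index.

```python
import pandas as pd

# Stand-in data; in the question these come from pd.read_csv(...).
df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [3, 4, 5, 6, 7]})
df2 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [7, 4, 3, 10, 23]})

# Duplicate IDs are one likely cause of the "duplicate axis" error once
# column A is used as the index, so check for them up front.
has_duplicate_ids = df1["A"].duplicated().any() or df2["A"].duplicated().any()

# Assumption taken from the question: same IDs, same order, in both files.
if not df1["A"].equals(df2["A"]):
    raise ValueError("column A differs between the two frames")

# Both frames keep their default RangeIndex, so B is added row by row.
averaged = pd.DataFrame({"A": df1["A"], "B": (df1["B"] + df2["B"]) / 2})

# index=False keeps the row numbers out of the CSV; with no path given,
# to_csv returns the CSV text instead of writing a file.
csv_text = averaged.to_csv(index=False)
```

If A does turn out to contain repeated IDs, merging on A would pair every matching row with every other matching row, so the positional version above is probably the safer of the two in that case. To write a file rather than get a string, pass a path as the first argument to to_csv.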