I think this is a very simple problem, but I could not find another post where a similar case was solved.
I have a Pandas dataframe like this:
group1 group2 meandiff lower upper reject
0 bacc dry_sed 2575.1697 2033.6713 3116.6681 True
1 bacc junc_hal -81.8513 -555.8132 392.1106 False
2 bacc other_trees -1.2333 -512.6246 510.1579 False
3 bacc phrag 613.2256 0.4309 1226.0204 True
4 bacc water -1074.4667 -1687.2614 -461.6719 True
5 bacc wet_sed -437.1854 -943.2217 68.8508 False
6 dry_sed junc_hal -2657.0210 -3068.3186 -2245.7234 True
7 dry_sed other_trees -2576.4030 -3030.3269 -2122.4792 True
8 dry_sed phrag -1961.9441 -2527.6677 -1396.2204 True
9 dry_sed water -3649.6364 -4215.3600 -3083.9127 True
10 dry_sed wet_sed -3012.3551 -3460.2374 -2564.4728 True
11 junc_hal other_trees 80.6179 -290.1464 451.3823 False
12 junc_hal phrag 695.0769 193.6165 1196.5373 True
13 junc_hal water -992.6154 -1494.0758 -491.1550 True
14 junc_hal wet_sed -355.3341 -718.6767 8.0084 False
15 other_trees phrag 614.4590 77.4825 1151.4354 True
16 other_trees water -1073.2333 -1610.2098 -536.2569 True
17 other_trees wet_sed -435.9521 -846.9253 -24.9788 True
18 phrag water -1687.6923 -2321.9951 -1053.3895 True
19 phrag wet_sed -1050.4111 -1582.2901 -518.5320 True
20 water wet_sed 637.2812 105.4022 1169.1603 True
I want to create a contingency table between group1 and group2, but with the value from the reject column in each cell.
It should look like this:
bacc dry_sed junc_hal other_trees phrag water wet_sed
bacc NA 1 0 0 1 1 0
dry_sed 1 NA 1 1 1 1 1
junc_hal 0 1 NA 0 1 1 0
other_trees 0 1 0 NA 1 1 1
phrag 1 1 1 1 NA 1 1
water 1 1 1 1 1 NA 1
wet_sed 0 1 0 1 1 1 NA
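The NA entries are only placeholders; any number could go there.
Is there a direct way to summarize the data like this? Before I start looping over the table, I want to make sure there is no simple, direct way to do it.
Thanks in advance.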
Answer 0: (score: 0)
You can pivot the dataframe:
df.pivot(index='group1', columns='group2', values='reject')
group2 dry_sed junc_hal other_trees phrag water wet_sed
group1
bacc True False False True True False
dry_sed None True True True True True
junc_hal None None False True True False
other_trees None None None True True True
phrag None None None None True True
water None None None None None True
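Note that the pivoted result is not square: there is no bacc column and no wet_sed row, because bacc never appears in group2 and wet_sed never appears in group1. A minimal sketch of one way to square it up, assuming a small made-up three-group subset of the data (the names `wide`, `labels` and `square` are only illustrative):

```python
import pandas as pd

# Hypothetical three-group subset of the question's data, for illustration only.
df = pd.DataFrame({
    "group1": ["bacc", "bacc", "dry_sed"],
    "group2": ["dry_sed", "junc_hal", "junc_hal"],
    "reject": [True, False, True],
})

wide = df.pivot(index="group1", columns="group2", values="reject")

# Every label that occurs in either column, in sorted order.
labels = sorted(set(df["group1"]) | set(df["group2"]))

# Reindex both axes so rows and columns carry the same labels;
# pairs that are not in the data stay missing.
square = wide.reindex(index=labels, columns=labels)
print(square)
```

This only makes the shape square. The lower triangle and the diagonal are still missing, since each pair occurs in one direction only.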
Answer 1: (score: 0)
Assuming your dataframe is named df, you can do the following:
df['reject_flag'] = df['reject'].astype(int)
output = df.pivot_table(index='group1', columns='group2', values='reject_flag')
This gives the following:
group2 dry_sed junc_hal other_trees phrag water wet_sed
group1
bacc 1.0 0.0 0.0 1.0 1.0 0.0
dry_sed NaN 1.0 1.0 1.0 1.0 1.0
junc_hal NaN NaN 0.0 1.0 1.0 0.0
other_trees NaN NaN NaN 1.0 1.0 1.0
phrag NaN NaN NaN NaN 1.0 1.0
water NaN NaN NaN NaN NaN 1.0
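The output above fills only the upper triangle, while the table in the question is symmetric. One possible way to get both halves, sketched here on a made-up three-group subset, is to append a copy of the rows with group1 and group2 swapped before pivoting. The names `mirrored`, `both` and `table` are just illustrative, and this assumes each pair appears only once in the original data:

```python
import pandas as pd

# Hypothetical three-group subset of the question's data, for illustration only.
df = pd.DataFrame({
    "group1": ["bacc", "bacc", "dry_sed"],
    "group2": ["dry_sed", "junc_hal", "junc_hal"],
    "reject": [True, False, True],
})

# Swap the two group columns to get the reverse direction of every pair.
mirrored = df.rename(columns={"group1": "group2", "group2": "group1"})

# Stack both directions; concat aligns the columns by name.
both = pd.concat([df, mirrored], ignore_index=True)
both["reject_flag"] = both["reject"].astype(int)

# Each (group1, group2) pair is now unique in both directions, so pivot works.
table = both.pivot(index="group1", columns="group2", values="reject_flag")
print(table)
```

The diagonal stays NaN because no group is compared with itself, which matches the NA placeholders in the question. The values come out as floats for the same reason; something like `table.fillna(0).astype(int)` would give plain integers if a placeholder of 0 on the diagonal is acceptable.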