Question

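I created a new column in a dataframe by dividing one column by another, and then took a subset of the dataframe. For some reason, the values of the new column in the subset differ from the values in the original dataframe. Here is my code: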

import pandas as pd 
from pandas import DataFrame, Series 

with open('mc2016_all.txt', 'r') as file_in:  # read the comma-delimited file mc2016_all.txt
    mc_subgroups = pd.read_table(file_in, delimiter = ',', header=0)

list(mc_subgroups.columns.values)

mc_subgroups['Percentage of Students'] = (mc_subgroups['Students Tested'])/(mc_subgroups['Total Tested At Entity Level'])

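# .copy() makes the subset an independent frame, so the column assignments below do not trigger SettingWithCopyWarning
mc_math_EL = mc_subgroups[(mc_subgroups['Test Id']==2) & (mc_subgroups['Subgroup ID']==160) & (mc_subgroups['Grade']==13)].copy()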

mc_math_EL['Percentage Met']= pd.to_numeric(mc_math_EL['Percentage Standard Met'], errors='coerce')

mc_math_EL['Percentage Exceeded']= pd.to_numeric(mc_math_EL['Percentage Standard Exceeded'], errors='coerce')

mc_math_EL['Both Met and Exceeded'] = mc_math_EL['Percentage Exceeded'] + mc_math_EL['Percentage Met']

mc_math_EL1= mc_math_EL.sort_values(by='Both Met and Exceeded', ascending=False, na_position='last')

mc_math_EL2 = mc_math_EL1[['County Code','District Code', 'School Code','Subgroup ID','Total Tested At Entity Level','Grade','Test Id','Students Tested','Percentage Standard Exceeded', 'Percentage Standard Met', 'Both Met and Exceeded', 'Percentage of Students']].copy()

mc_math_EL2.to_csv("mc_math_EL.csv")

The output of the mc_subgroups['Percentage of Students'] column looks like this:

Out[13]: 
0        0.152082
1        0.151848
2        0.155345
3        0.155647
4        0.151159
5        0.150873
6        0.144203
7        0.144004
8        0.139117
9        0.139206
10       0.137732
11       0.137746

But mc_math_EL2['Percentage of Students'] comes out as all 1.0:

Out[14]: 
23314    1.0
22365    1.0
23606    1.0
1907     1.0
17751    1.0
8110     1.0
35354    1.0
8651     1.0
10042    1.0
9747     1.0

Is there a way to freeze the 'Percentage of Students' column from the first dataframe so that the same percentages show up in the mc_math_EL2 dataframe?
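
One way to narrow this down is to check whether the subset actually changed any values, or whether the filtered rows simply have a ratio of 1.0 in the original frame as well. The following is a minimal sketch on made-up data, not the real mc2016_all.txt file: the column names mirror the ones above, while the values and the frame name df are assumptions for illustration only. It compares the subset against the same rows of the original frame, looked up by index label.

```python
import pandas as pd

# Hypothetical toy data: column names mirror the question, values are made up.
df = pd.DataFrame({
    "Subgroup ID": [1, 160, 1, 160],
    "Students Tested": [50, 200, 30, 120],
    "Total Tested At Entity Level": [200, 200, 120, 120],
})
df["Percentage of Students"] = (
    df["Students Tested"] / df["Total Tested At Entity Level"]
)

# Boolean-mask subsetting keeps the original index labels and carries
# the already-computed column values over unchanged.
subset = df[df["Subgroup ID"] == 160].copy()

# Look up the same rows in the original frame by index label and compare.
original_rows = df.loc[subset.index, "Percentage of Students"]
same = subset["Percentage of Students"].equals(original_rows)
print(same)
print(subset["Percentage of Students"])
```

If the comparison comes back True on the real data, the column is already "frozen" and the 1.0 values are what the original frame holds for those rows (in the toy data, the rows with Subgroup ID 160 happen to have Students Tested equal to Total Tested At Entity Level). Note also that the first output shows index labels 0 to 11, while the second shows labels such as 23314, so the two printouts are not the same rows.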

How do I freeze a pandas dataframe column when subsetting the dataframe?

0 answers: