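I created a new column in a DataFrame by dividing one column by another, and then I subset the DataFrame. For some reason, the values of the new column in the subset DataFrame differ from those in the original DataFrame. Here is my code: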
import pandas as pd
from pandas import DataFrame, Series

# Read the comma-delimited mc2016_all.txt file into a DataFrame
with open('mc2016_all.txt', 'r') as file_in:
    mc_subgroups = pd.read_table(file_in, delimiter=',', header=0)
list(mc_subgroups.columns.values)

# Share of students in each row relative to the total tested at the entity level
mc_subgroups['Percentage of Students'] = mc_subgroups['Students Tested'] / mc_subgroups['Total Tested At Entity Level']

# Keep only Test Id 2, Subgroup ID 160, Grade 13; .copy() avoids SettingWithCopyWarning on the assignments below
mc_math_EL = mc_subgroups[(mc_subgroups['Test Id'] == 2) & (mc_subgroups['Subgroup ID'] == 160) & (mc_subgroups['Grade'] == 13)].copy()

# Convert the percentage columns to numbers; values that cannot be parsed become NaN
mc_math_EL['Percentage Met'] = pd.to_numeric(mc_math_EL['Percentage Standard Met'], errors='coerce')
mc_math_EL['Percentage Exceeded'] = pd.to_numeric(mc_math_EL['Percentage Standard Exceeded'], errors='coerce')
mc_math_EL['Both Met and Exceeded'] = mc_math_EL['Percentage Exceeded'] + mc_math_EL['Percentage Met']

# Sort by the combined percentage, highest first, with NaN rows last
mc_math_EL1 = mc_math_EL.sort_values(by='Both Met and Exceeded', ascending=False, na_position='last')
mc_math_EL2 = mc_math_EL1[['County Code', 'District Code', 'School Code', 'Subgroup ID', 'Total Tested At Entity Level', 'Grade', 'Test Id', 'Students Tested', 'Percentage Standard Exceeded', 'Percentage Standard Met', 'Both Met and Exceeded', 'Percentage of Students']].copy()
mc_math_EL2.to_csv("mc_math_EL.csv")
The output of mc_subgroups['Percentage of Students'] looks like this:
Out[13]:
0 0.152082
1 0.151848
2 0.155345
3 0.155647
4 0.151159
5 0.150873
6 0.144203
7 0.144004
8 0.139117
9 0.139206
10 0.137732
11 0.137746
But mc_math_EL2['Percentage of Students'] returns 1.0 for every row:
Out[14]:
23314 1.0
22365 1.0
23606 1.0
1907 1.0
17751 1.0
8110 1.0
35354 1.0
8651 1.0
10042 1.0
9747 1.0
Is there a way to freeze the 'Percentage of Students' column from the first DataFrame so that the same percentages show up in the mc_math_EL2 DataFrame?
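
One thing that may be worth checking first: the two outputs above show different rows (index labels 0 to 11 in the first, 23314, 22365, ... in the second), so they cannot be compared directly. Boolean filtering keeps both the index labels and the values that were already computed, so the subset can be checked against the full DataFrame by label. The sketch below uses a small made-up frame in place of mc2016_all.txt; the data values and the names df, subset and same_rows are assumptions for illustration only.

```python
import pandas as pd

# Made-up data standing in for mc2016_all.txt; column names follow the question.
df = pd.DataFrame({
    "Subgroup ID": [1, 160, 1, 160],
    "Students Tested": [50, 20, 30, 40],
    "Total Tested At Entity Level": [200, 20, 300, 40],
})
df["Percentage of Students"] = df["Students Tested"] / df["Total Tested At Entity Level"]

# Boolean filtering keeps the original index labels and the computed values.
subset = df[df["Subgroup ID"] == 160].copy()

# Look up the same rows in the full frame by index label and compare.
same_rows = df.loc[subset.index, "Percentage of Students"]
print(subset["Percentage of Students"].equals(same_rows))
```

If the same comparison on the real data (mc_subgroups.loc[mc_math_EL2.index, 'Percentage of Students'] against mc_math_EL2['Percentage of Students']) also matches, then the column was carried over unchanged. In that case the 1.0 values would simply mean that 'Students Tested' equals 'Total Tested At Entity Level' in the filtered rows.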