我在python程序中导入了一个.csv文件,其中包含许多使用pandas模块的列。在我的代码中,我刚刚导入了前三列。代码和示例文件如下。
import pandas as pd
fields = ['TEST ONE', 'TEST TWO', 'TEST THREE']
df1=pd.read_csv('List.csv', skipinitialspace=True, usecols=fields)
示例文件
如何在我的python程序中找到 TEST ONE 和 TEST TWO 列的差异,并将其存储在代码内的单独位置/列/数组中,以便可以在需要时从中提取值。我想找到新列的平均值和最大值,它是作为前两列的差异而生成的。
答案 0 :(得分:2)
做这样的事情。
df1['diff'] = df1['TEST ONE'] - df1['TEST TWO']
#The Dataframe would be df1 throughout
# This will store it as a column of that same dataframe.
# When you need the difference, use that column just like normal pandas column.
mean_of_diff = df1['diff'].mean()
max_of_diff = df1['diff'].max()
# For third value of difference use the third index of dataframe
third_diff = df1.loc[2, 'diff']
注意:我从0开始使用2作为索引。此外,index也可以是字符串或日期。传递适当的指数值以获得所需的结果。
答案 1 :(得分:1)
Difference = df1['TEST ONE'] - df['TEST TWO']
差异将是熊猫系列。你可以使用均值和最大值
Difference.mean()
Difference.max()