Python:计算从csv文件导入的两列的差异,并在python脚本中存储到另一列

时间:2018-03-19 05:15:33

标签: python python-3.x pandas data-analysis

我在python程序中导入了一个.csv文件,其中包含许多使用pandas模块的列。在我的代码中,我刚刚导入了前三列。代码和示例文件如下。

import pandas as pd
fields = ['TEST ONE', 'TEST TWO', 'TEST THREE']
df1=pd.read_csv('List.csv', skipinitialspace=True, usecols=fields)

示例文件

如何在我的python程序中找到 TEST ONE TEST TWO 列的差异,并将其存储在代码内的单独位置/列/数组中,以便可以在需要时从中提取值。我想找到新列的平均值和最大值,它是作为前两列的差异而生成的。

2 个答案:

答案 0 :(得分:2)

做这样的事情。

df1['diff'] =  df1['TEST ONE'] - df1['TEST TWO']
#The Dataframe would be df1 throughout
# This will store it as a column of that same dataframe.
# When you need the difference, use that column just like normal pandas column.
mean_of_diff = df1['diff'].mean()
max_of_diff = df1['diff'].max()
# For third value of difference use the third index of dataframe
third_diff = df1.loc[2, 'diff']

注意:我从0开始使用2作为索引。此外,index也可以是字符串或日期。传递适当的指数值以获得所需的结果。

答案 1 :(得分:1)

Difference = df1['TEST ONE'] - df['TEST TWO']

差异将是熊猫系列。你可以使用均值和最大值

Difference.mean()
Difference.max()