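I am trying to find the minimum, maximum, and average of "Measured_Power" for every possible combination of rate and frequency. I have many rates and frequencies (10 rates, 10 frequencies). My CSV file looks like this: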
Channel, Rate, Length, Frequency, Expected_Power, Measured_Power, Expected_Eq, Measured_Eq,
A, 27, 1000, 100, 20, 20.16, <-23.0, -27.33,
A, 6, 1000, 100, 20, 20.12, <-23.0, -25.96,
A, 3, 1000, 100, 20, 20.05, <-23.0, -26.34,
A, 27, 1000, 101, 20, 20.11, <-23.0, -24.88,
A, 6, 1000, 101, 20, 20.26, <-23.0, -25.55,
A, 3, 1000, 101, 20, 20.08, <-23.0, -25.42,
B, 27, 1000, 100, 20, 20.5, <-23.0, -26.98,
B, 6, 1000, 100, 20, 20.21, <-23.0, -24.61,
B, 3, 1000, 100, 20, 20.17, <-23.0, -23.54,
...
I tried:
import numpy
file = r'C:\data.csv'
# skip_header=1 skips the header row; column 5 is Measured_Power
c = numpy.genfromtxt(file, dtype='float', delimiter=',', skip_header=1, usecols=5, usemask=True)
print(c.max())
print(c.min())
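I can find the overall maximum and minimum, but how do I break them down by channel, rate, and frequency? Any help would be appreciated. Expected output for Measured_Power: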
Channel, Rate, Max, Min, Average,
A, 3, .., .., ..,
A, 6, .., .., ..,
., ., .., .., ..,
., ., .., .., ..,
., ., .., .., ..,
A, 27,.., .., ..,
B, 3, .., .., ..,
B, 6, .., .., ..,
., ., .., .., ..,
., ., .., .., ..,
., ., .., .., ..,
B, 27,.., .., ..,
Answer 0 (score: 1)
I hope I understood what you want. You want the minimum, maximum, and average of Measured_Power for every possible combination of Rate and Frequency, right?
Well, you can do this quickly with pandas:
import pandas as pd
data = pd.read_csv('data_file.csv')
grouped_measured_power = data.groupby([' Rate', ' Frequency'])[' Measured_Power']
min_measured_power_by_rate_and_freq = grouped_measured_power.min()
max_measured_power_by_rate_and_freq = grouped_measured_power.max()
average_measured_power_by_rate_and_freq = grouped_measured_power.mean()
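That's it! Note that I put a space in front of each column name because the CSV file has a space after every comma, but you may prefer to clean up the data file instead.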
For the record, here is the output for your example:
> min_measured_power_by_rate_and_freq
Rate Frequency
3 100 20.05
101 20.08
6 100 20.12
101 20.26
27 100 20.16
101 20.11
Name: Measured_Power, dtype: float64
> max_measured_power_by_rate_and_freq
Rate Frequency
3 100 20.05
101 20.08
6 100 20.21
101 20.26
27 100 20.50
101 20.11
Name: Measured_Power, dtype: float64
> average_measured_power_by_rate_and_freq
Rate Frequency
3 100 20.050
101 20.080
6 100 20.165
101 20.260
27 100 20.330
101 20.110
Name: Measured_Power, dtype: float64
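About the leading spaces in the column names: if you would rather not edit the file, one option is to let pandas strip them while reading. The snippet below is only a sketch. It uses a few rows copied from the question, with io.StringIO standing in for the real file, and it assumes every line ends with a trailing comma as in the sample.

```python
import io
import pandas as pd

# A few rows copied from the question; io.StringIO stands in for the real file.
sample = io.StringIO(
    "Channel, Rate, Length, Frequency, Expected_Power, Measured_Power, Expected_Eq, Measured_Eq,\n"
    "A, 27, 1000, 100, 20, 20.16, <-23.0, -27.33,\n"
    "A, 6, 1000, 100, 20, 20.12, <-23.0, -25.96,\n"
    "B, 27, 1000, 100, 20, 20.5, <-23.0, -26.98,\n"
)

# skipinitialspace=True drops the blank that follows each comma,
# so the header becomes 'Rate' rather than ' Rate'.
data = pd.read_csv(sample, skipinitialspace=True)

# The trailing comma on every line yields an empty, unnamed last column; drop it.
data = data.dropna(axis=1, how="all")

grouped = data.groupby(["Rate", "Frequency"])["Measured_Power"]
print(grouped.min())
```

With the names cleaned this way, the groupby calls no longer need the extra space in each column name.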
Each result is a Series with a MultiIndex (Rate, Frequency)... you may also want to unstack it.
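To give an idea of what unstacking does, here is a small sketch. The DataFrame is a hypothetical miniature of the data with clean column names, not the real file.

```python
import pandas as pd

# Hypothetical miniature of the data set, using clean column names.
data = pd.DataFrame({
    "Rate": [3, 3, 6, 6],
    "Frequency": [100, 101, 100, 101],
    "Measured_Power": [20.05, 20.08, 20.12, 20.26],
})

min_power = data.groupby(["Rate", "Frequency"])["Measured_Power"].min()

# unstack() moves the innermost index level (Frequency) into the columns,
# leaving one row per Rate and one column per Frequency.
table = min_power.unstack()
print(table)
```

This form is often easier to read when you want to compare frequencies side by side for each rate.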
Edit

I remembered that you can actually do better by applying several aggregation functions at once, so you can do this:
import pandas as pd
data = pd.read_csv('data_file.csv')
grouped_measured_power = data.groupby([' Rate', ' Frequency'])[' Measured_Power']
# Named aggregation (pandas 0.25+): each keyword becomes an output column.
# Passing a renaming dict to aggregate() is no longer supported on a grouped Series.
result = grouped_measured_power.agg(average='mean', max='max', min='min')
And you get everything together in one table:
> result
average max min
Rate Frequency
3 100 20.050 20.05 20.05
101 20.080 20.08 20.08
6 100 20.165 20.21 20.12
101 20.260 20.26 20.26
27 100 20.330 20.50 20.16
101 20.110 20.11 20.11
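If what you actually want is the layout from the question (one row per Channel and Rate, with Max, Min, and Average computed across frequencies), the same idea works with different grouping keys. This is a sketch under that assumption. The rows are the sample from the question, typed in as a DataFrame with clean column names instead of being read from the file.

```python
import pandas as pd

# The nine sample rows from the question (only the columns needed here).
data = pd.DataFrame({
    "Channel": ["A", "A", "A", "A", "A", "A", "B", "B", "B"],
    "Rate": [27, 6, 3, 27, 6, 3, 27, 6, 3],
    "Frequency": [100, 100, 100, 101, 101, 101, 100, 100, 100],
    "Measured_Power": [20.16, 20.12, 20.05, 20.11, 20.26, 20.08, 20.5, 20.21, 20.17],
})

# Group by Channel and Rate, so each group spans all frequencies.
summary = (
    data.groupby(["Channel", "Rate"])["Measured_Power"]
    .agg(Max="max", Min="min", Average="mean")
    .reset_index()  # turn the MultiIndex back into ordinary columns
)
print(summary)
```

reset_index() is optional; it just gives a flat table that you can write out with summary.to_csv(...).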