For each column y, compute the mean of column x if column y meets a condition

Asked: 2019-01-17 17:53:48

Tags: python pandas dataframe

How can I retrieve the values of column Z, and their mean, wherever any value is > 1?

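import numpy as np
import pandas as pd

data = [9, 2, 3, 4, 5, 6, 7, 8]
df = pd.DataFrame(np.random.randn(8, 5), columns=['A', 'B', 'C', 'D', 'E'])
fd = pd.DataFrame(data, columns=['Z'])

df = pd.concat([df, fd], axis=1)

l = []
for x, y in df.iterrows():
    # Series.iteritems() was removed in pandas 2.0; items() is the replacement
    for i, s in y.items():
        if s > 1:
            l.append(x)
            print(df['Z'])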

The expected output would most likely be a dictionary, with the column names as keys and the mean of Z as the values.

3 Answers:

Answer 0: (score: 1)

Is this what you mean?

df[df['Z']>1].loc[:,'Z'].mean(axis=0) 

df[df['Z']>1]['Z'].mean()
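
Both expressions select the rows where Z itself is greater than 1 and average Z over those rows. As a minimal sketch, the small hand-made frame below (its values and the names `chained`, `with_loc` and `z_mean` are illustrative assumptions, not taken from the question) suggests that they agree with the single-step `.loc` form:

```python
import pandas as pd

# Hypothetical data: two of the five Z values are not greater than 1.
df = pd.DataFrame({"Z": [9, 0, 3, 1, 6]})

# Filter the rows first, then pick the column.
chained = df[df["Z"] > 1]["Z"].mean()
with_loc = df[df["Z"] > 1].loc[:, "Z"].mean(axis=0)

# Row mask and column label in a single .loc call.
z_mean = df.loc[df["Z"] > 1, "Z"].mean()

print(chained, with_loc, z_mean)
```

Note that this filters on Z only; it does not look at the other columns.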

Answer 1: (score: 1)

Use a dictionary comprehension:

res = {col: df.loc[df[col] > 1, 'Z'].mean() for col in df.columns[:-1]}
# {'A': 9.0, 'B': 5.0, 'C': 8.0, 'D': 7.5, 'E': 6.666666666666667}

This uses the following setup:

import numpy as np
import pandas as pd

np.random.seed(0)
data = [9, 2, 3, 4, 5, 6, 7, 8]
df = pd.DataFrame(np.random.randn(8, 5), columns=['A', 'B', 'C', 'D', 'E'])
fd = pd.DataFrame(data, columns=['Z'])
df = pd.concat([df, fd], axis=1)
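
If no value in a column exceeds 1, the comprehension stores NaN for that column, because the mean of an empty selection is NaN. The following is a hedged sketch, using made-up values and the hypothetical names `value_cols` and `res_clean`, of one way to compute the same per-column means and drop such columns:

```python
import pandas as pd

# Hypothetical data: column 'B' never exceeds 1.
df = pd.DataFrame({
    "A": [1.5, 0.2, 2.0, -1.0],
    "B": [0.1, 0.9, -0.3, 1.0],
    "Z": [9, 2, 3, 4],
})

# Every column except Z.
value_cols = df.columns.drop("Z")

# Mean of Z over the rows where each column is greater than 1.
res = {col: df.loc[df[col] > 1, "Z"].mean() for col in value_cols}

# Keep only the columns that matched at least one row (mean is not NaN).
res_clean = {col: m for col, m in res.items() if pd.notna(m)}

print(res_clean)
```

Selecting the columns with `df.columns.drop("Z")` rather than `df.columns[:-1]` is a design choice for this sketch: it does not depend on Z being the last column.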

Answer 2: (score: 1)

I am not sure I understood your question correctly, but do you mean this:

import pandas as pd
import numpy as np

data=[9,2,3,4,5,6,7,8]
columns = ['A', 'B', 'C', 'D','E']
df = pd.DataFrame(np.random.randn(8, 5),columns=columns)
fd=pd.DataFrame(data,columns=['Z'])

df=pd.concat([df,fd], axis=1)
print('df = \n', str(df))

anyGreaterThanOne = (df[columns] > 1).any(axis=1)
print('anyGreaterThanOne = \n', str(anyGreaterThanOne))
filtered = df[anyGreaterThanOne]
print('filtered = \n', str(filtered))
Zmean = filtered['Z'].mean()
print('Zmean = ', str(Zmean))

Result:

df = 
           A         B         C         D         E  Z
0 -2.170640 -2.626985 -0.817407 -0.389833  0.862373  9
1 -0.372144 -0.375271 -1.309273 -1.019846 -0.548244  2
2  0.267983 -0.680144  0.304727  0.302952 -0.597647  3
3  0.243549  1.046297  0.647842  1.188530  0.640133  4
4 -0.116007  1.090770  0.510190 -1.310732  0.546881  5
5 -1.135545 -1.738466 -1.148341  0.764914 -1.140543  6
6 -2.078396  0.057462 -0.737875 -0.817707  0.570017  7
7  0.187877  0.363962  0.637949 -0.875372 -1.105744  8

anyGreaterThanOne = 
 0    False
1    False
2    False
3     True
4     True
5    False
6    False
7    False
dtype: bool

filtered = 
           A         B         C         D         E  Z
3  0.243549  1.046297  0.647842  1.188530  0.640133  4
4 -0.116007  1.090770  0.510190 -1.310732  0.546881  5

Zmean =  4.5
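
The run above is not seeded, so its numbers change on every execution. As a reproducible sketch, the frame below hard-codes the values printed in that result (to the precision shown) and applies the same row-wise `any` filter; the names `rows`, `any_gt_one` and `z_mean` are illustrative:

```python
import pandas as pd

columns = ["A", "B", "C", "D", "E"]

# Values copied from the printed result above.
rows = [
    [-2.170640, -2.626985, -0.817407, -0.389833,  0.862373],
    [-0.372144, -0.375271, -1.309273, -1.019846, -0.548244],
    [ 0.267983, -0.680144,  0.304727,  0.302952, -0.597647],
    [ 0.243549,  1.046297,  0.647842,  1.188530,  0.640133],
    [-0.116007,  1.090770,  0.510190, -1.310732,  0.546881],
    [-1.135545, -1.738466, -1.148341,  0.764914, -1.140543],
    [-2.078396,  0.057462, -0.737875, -0.817707,  0.570017],
    [ 0.187877,  0.363962,  0.637949, -0.875372, -1.105744],
]
df = pd.DataFrame(rows, columns=columns)
df["Z"] = [9, 2, 3, 4, 5, 6, 7, 8]

# True for rows where at least one of A..E is greater than 1.
any_gt_one = (df[columns] > 1).any(axis=1)

# Mean of Z over those rows only.
z_mean = df.loc[any_gt_one, "Z"].mean()

print(df.index[any_gt_one].tolist())
print(z_mean)
```

Unlike a per-column dictionary, this yields a single mean over the rows that match in any of the columns A to E.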