Question

I am trying to apply a function to different readings of each measure. Can this be done without reshaping the DataFrame?

import random
import pandas as pd

df = pd.DataFrame({
    'index': sorted(['A', 'B']*3),
    'measure': [i for i in range(0,3)]*2,
    'reading': [random.random() for i in range(0,6)]
})


  index  measure   reading
0     A        0  0.260492
1     A        1  0.805028
2     A        2  0.548699
3     B        0  0.014042
4     B        1  0.719705
5     B        2  0.398824

How can I apply a function, such as a simple difference, to the different readings for each index?

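Here I assume the function is applied to the readings for measures 0 and 1. Since I need to run the calculation for different measures, the choice of measures should be part of the call.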

The desired output is as follows:

  index  applied
0     A  0.5445359999999999
1     B  0.705663
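
To make the target concrete, here is a minimal sketch of the arithmetic behind those two numbers: the reading for measure 1 minus the reading for measure 0, using the values from the sample table above. The dictionary layout is only an illustration, not how the data is actually stored.

```python
# Readings for measures 0 and 1, copied from the sample table above.
readings = {
    'A': {0: 0.260492, 1: 0.805028},
    'B': {0: 0.014042, 1: 0.719705},
}

# "applied" is the reading for measure 1 minus the reading for measure 0.
applied = {idx: vals[1] - vals[0] for idx, vals in readings.items()}
print(applied)
```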

Answer 1

Try this:

import random
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'index': sorted(['A', 'B']*3),
    'measure': [i for i in range(0,3)]*2,
    'reading': [random.random() for i in range(0,6)]
})

print(df)

#   index  measure   reading
# 0     A        0  0.869707
# 1     A        1  0.120680
# 2     A        2  0.772035
# 3     B        0  0.565548
# 4     B        1  0.577074
# 5     B        2  0.290668

start = 0
stop = 1

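# start and stop are positions in the per-group array of consecutive differences
# (np.diff), so this assumes the rows of each group are sorted by measure.
# np.sum() adds the differences in that slice, which equals
# reading[stop] - reading[start]; the result is signed, not absolute.
# If stop - start is always 1, np.diff(x)[start] gives the same value without np.sum().

df = df.groupby('index').agg(
    applied=('reading', lambda x: np.sum(np.diff(x)[start:stop]))
)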

print(df)

#         applied
# index
# A     -0.749027
# B      0.011526
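
If the rows are not guaranteed to be sorted by measure, selecting by measure value rather than by position may be safer. Below is a minimal sketch, assuming each (index, measure) pair occurs exactly once. The helper name apply_between is hypothetical, and the fixed readings are copied from the question's sample table so the result is reproducible.

```python
import pandas as pd

# Fixed readings copied from the question's sample table.
df = pd.DataFrame({
    'index': sorted(['A', 'B'] * 3),
    'measure': list(range(3)) * 2,
    'reading': [0.260492, 0.805028, 0.548699, 0.014042, 0.719705, 0.398824],
})

def apply_between(frame, func, start, stop):
    """Hypothetical helper: call func on the readings for two measures, per index."""
    # Index the readings by (index, measure) so they can be looked up by value.
    readings = frame.set_index(['index', 'measure'])['reading']
    first = readings.xs(start, level='measure')   # one reading per index
    second = readings.xs(stop, level='measure')
    # func receives two Series aligned on 'index' and should return a Series.
    return func(first, second).rename('applied').reset_index()

result = apply_between(df, lambda a, b: b - a, start=0, stop=1)
print(result)
```

This does not pivot the frame, although it does build a temporary Series indexed by (index, measure). Any function of two aligned Series can be passed in place of the difference.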
