查找列中间隔内的最高和最低值位置?

时间:2017-03-31 23:42:49

标签: pandas max grouping min series

鉴于这个pandas数据框有两列,'Values'和'Intervals'。如何获得第三列“MinMax”,指示该值是该区间内的最大值还是最小值?对我来说,挑战是间隔长度和间隔之间的距离不固定,因此我发布了这个问题。

+-------+----------+-----------+---------+
| index |  Value   | Intervals | Min/Max |
+-------+----------+-----------+---------+
|     0 | 1879.289 | np.nan    |         |
|     1 | 1879.281 | np.nan    |         |
|     2 | 1879.292 | 1         |         |
|     3 | 1879.295 | 1         |         |
|     4 | 1879.481 | 1         |         |
|     5 | 1879.294 | 1         |         |
|     6 | 1879.268 | 1         | min     |
|     7 | 1879.293 | 1         |         |
|     8 | 1879.277 | 1         |         |
|     9 | 1879.285 | 1         |         |
|    10 | 1879.464 | 1         |         |
|    11 | 1879.475 | 1         |         |
|    12 | 1879.971 | 1         |         |
|    13 | 1879.779 | 1         |         |
|    17 | 1879.986 | 1         |         |
|    18 | 1880.791 | 1         | max     |
|    19 |  1880.29 | 1         |         |
|    55 | 1879.253 | np.nan    |         |
|    56 | 1878.268 | np.nan    |         |
|    57 |  1875.73 | 1         |         |
|    58 | 1876.792 | 1         |         |
|    59 | 1875.977 | 1         | min     |
|    60 | 1876.408 | 1         |         |
|    61 | 1877.159 | 1         |         |
|    62 | 1877.187 | 1         |         |
|    63 | 1883.164 | 1         |         |
|    64 | 1883.171 | 1         |         |
|    65 | 1883.495 | 1         |         |
|    66 | 1883.962 | 1         |         |
|    67 | 1885.158 | 1         |         |
|    68 | 1885.974 | 1         | max     |
|    69 | 1886.479 | np.nan    |         |
|    70 | 1885.969 | np.nan    |         |
|    71 | 1884.693 | 1         |         |
|    72 | 1884.977 | 1         |         |
|    73 | 1884.967 | 1         |         |
|    74 | 1884.691 | 1         | min     |
|    75 | 1886.171 | 1         | max     |
|    76 | 1886.166 | np.nan    |         |
|    77 | 1884.476 | np.nan    |         |
|    78 |  1884.66 | 1         | max     |
|    79 | 1882.962 | 1         |         |
|    80 | 1881.496 | 1         |         |
|    81 | 1871.163 | 1         | min     |
|    82 | 1874.985 | 1         |         |
|    83 | 1874.979 | 1         |         |
|    84 | 1871.173 | np.nan    |         |
|    85 | 1871.973 | np.nan    |         |
|    86 | 1871.682 | np.nan    |         |
|    87 | 1872.476 | np.nan    |         |
|    88 | 1882.361 | 1         | max     |
|    89 | 1880.869 | 1         |         |
|    90 | 1882.165 | 1         |         |
|    91 | 1881.857 | 1         |         |
|    92 | 1880.375 | 1         |         |
|    93 |  1880.66 | 1         |         |
|    94 | 1880.891 | 1         |         |
|    95 | 1880.377 | 1         |         |
|    96 | 1881.663 | 1         |         |
|    97 |  1881.66 | 1         |         |
|    98 | 1877.888 | 1         |         |
|    99 |  1875.69 | 1         |         |
|   100 | 1875.161 | 1         | min     |
|   101 | 1876.697 | np.nan    |         |
|   102 | 1876.671 | np.nan    |         |
|   103 | 1879.666 | np.nan    |         |
|   111 | 1877.182 | np.nan    |         |
|   112 | 1878.898 | 1         |         |
|   113 | 1878.668 | 1         |         |
|   114 | 1878.871 | 1         |         |
|   115 | 1878.882 | 1         |         |
|   116 | 1879.173 | 1         | max     |
|   117 | 1878.887 | 1         |         |
|   118 |  1878.68 | 1         |         |
|   119 | 1878.872 | 1         |         |
|   120 | 1878.677 | 1         |         |
|   121 | 1877.877 | 1         |         |
|   122 | 1877.669 | 1         |         |
|   123 |  1877.69 | 1         |         |
|   124 | 1877.684 | 1         |         |
|   125 |  1877.68 | 1         |         |
|   126 | 1877.885 | 1         |         |
|   127 | 1877.863 | 1         |         |
|   128 | 1877.674 | 1         |         |
|   129 | 1877.676 | 1         |         |
|   130 | 1877.687 | 1         |         |
|   131 | 1878.367 | 1         |         |
|   132 | 1878.179 | 1         |         |
|   133 | 1877.696 | 1         |         |
|   134 | 1877.665 | 1         | min     |
|   135 | 1877.667 | np.nan    |         |
|   136 | 1878.678 | np.nan    |         |
|   137 | 1878.661 | 1         | max     |
|   138 | 1878.171 | 1         |         |
|   139 | 1877.371 | 1         |         |
|   140 | 1877.359 | 1         |         |
|   141 | 1878.381 | 1         |         |
|   142 | 1875.185 | 1         | min     |
|   143 | 1875.367 | np.nan    |         |
|   144 | 1865.492 | np.nan    |         |
|   145 | 1865.495 | 1         | max     |
|   146 | 1866.995 | 1         |         |
|   147 | 1866.672 | 1         |         |
|   148 | 1867.465 | 1         |         |
|   149 | 1867.663 | 1         |         |
|   150 | 1867.186 | 1         |         |
|   151 | 1867.687 | 1         |         |
|   152 | 1867.459 | 1         |         |
|   153 | 1867.168 | 1         |         |
|   154 | 1869.689 | 1         |         |
|   155 | 1869.693 | 1         |         |
|   156 | 1871.676 | 1         |         |
|   157 | 1873.174 | 1         | min     |
|   158 | 1873.691 | np.nan    |         |
|   159 | 1873.685 | np.nan    |         |
+-------+----------+-----------+---------+

在下面的第三列中,您可以看到每个区间的最大值和最小值。

{{1}}

2 个答案:

答案 0 :(得分:2)

isnull = data.iloc[:, 1].isnull()
minmax = data.groupby(isnull.cumsum()[~isnull])[0].agg(['idxmax', 'idxmin'])
data.loc[minmax['idxmax'], 'MinMax'] = 'max'
data.loc[minmax['idxmin'], 'MinMax'] = 'min'
data.MinMax = data.MinMax.fillna('')
print(data)

            0    1 MinMax
0    1879.289  NaN       
1    1879.281  NaN       
2    1879.292  1.0       
3    1879.295  1.0       
4    1879.481  1.0       
5    1879.294  1.0       
6    1879.268  1.0    min
7    1879.293  1.0       
8    1879.277  1.0       
9    1879.285  1.0       
10   1879.464  1.0       
11   1879.475  1.0       
12   1879.971  1.0       
13   1879.779  1.0       
14   1879.986  1.0       
15   1880.791  1.0    max
16   1880.290  1.0       
17   1879.253  NaN       
18   1878.268  NaN       
19   1875.730  1.0    min
20   1876.792  1.0       
21   1875.977  1.0       
22   1876.408  1.0       
23   1877.159  1.0       
24   1877.187  1.0       
25   1883.164  1.0       
26   1883.171  1.0       
27   1883.495  1.0       
28   1883.962  1.0       
29   1885.158  1.0       
..        ...  ...    ...
85   1877.687  1.0       
86   1878.367  1.0       
87   1878.179  1.0       
88   1877.696  1.0       
89   1877.665  1.0    min
90   1877.667  NaN       
91   1878.678  NaN       
92   1878.661  1.0    max
93   1878.171  1.0       
94   1877.371  1.0       
95   1877.359  1.0       
96   1878.381  1.0       
97   1875.185  1.0    min
98   1875.367  NaN       
99   1865.492  NaN       
100  1865.495  1.0    min
101  1866.995  1.0       
102  1866.672  1.0       
103  1867.465  1.0       
104  1867.663  1.0       
105  1867.186  1.0       
106  1867.687  1.0       
107  1867.459  1.0       
108  1867.168  1.0       
109  1869.689  1.0       
110  1869.693  1.0       
111  1871.676  1.0       
112  1873.174  1.0    max
113  1873.691  NaN       
114  1873.685  NaN       

[115 rows x 3 columns]

答案 1 :(得分:1)

data.columns=['Value','Interval']

data['Ingroup'] = (data['Interval'].notnull() + 0)

Use data['Interval'].notnull() to separate the groups...
Use cumsum() to number them with `groupno`...
Use groupby(groupno)..

Finally you want something using apply/idxmax/idxmin to label the max/min

But of course a for-loop as you suggested is the non-Pythonic but possibly simpler hack.
相关问题