Question

I am looking for the simplest, most direct way to return a DataFrame or a list of the column names that hold the value "1".

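Say I start with the DataFrame built below. What I want is a DataFrame like the second table further down, or some other, smarter way of grouping the values equal to "1".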

import pandas as pd 

dates = pd.date_range('1/1/2017', periods=4, freq='D')
df = pd.DataFrame(
    {'W01': [0, 0, 0, 1], 'W02': [0, 1, 0, 0], 'W03': [0, 0, 0, 1]},
    index=dates,
)

df

            W01  W02  W03
2017-01-01    0    0    0
2017-01-02    0    1    0
2017-01-03    0    0    0
2017-01-04    1    0    1

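This is the DataFrame I would like to get:

           Value  X1    X2
2017-01-01  1     NaN   NaN
2017-01-02  1     W02   NaN
2017-01-03  1     NaN   NaN
2017-01-04  1     W01   W03

Alternatively, could the solution return a list like this?

2017-01-01, NaN
2017-01-02, W02
2017-01-03, NaN
2017-01-04, W01, W03

My actual DataFrame has 85 columns and almost 700 rows, so the solution should be able to handle those dimensions.

pandas' get_value function looks promising, but I can't figure it out:

df.get_value(dates, col="1")

Or I could use a lambda, but it doesn't give me all the information I am looking for.

Help?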

Answer 1

You can use the following (assuming numpy is imported as np):

In [2784]: (df.apply(lambda x: ', '.join(x.index[x.astype(bool)]), axis=1)
              .replace('', np.nan))
Out[2784]:
2017-01-01         NaN
2017-01-02         W02
2017-01-03         NaN
2017-01-04    W01, W03
Freq: D, dtype: object
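
For anyone who wants to run this outside an interactive session, here is roughly the same idea as a standalone script. It is only a sketch: it assumes that only cells exactly equal to 1 should count, so it compares with `row == 1` instead of `astype(bool)` (which would also pick up any other non-zero value). The names `row` and `names` are mine, not from the answer above.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2017', periods=4, freq='D')
df = pd.DataFrame(
    {'W01': [0, 0, 0, 1], 'W02': [0, 1, 0, 0], 'W03': [0, 0, 0, 1]},
    index=dates,
)

# For each row, join the labels of the columns whose value equals 1.
# Assumption: an exact match on 1 is wanted, not "any non-zero value".
names = df.apply(
    lambda row: ', '.join(row.index[(row == 1).to_numpy()]),
    axis=1,
)

# Rows without a match yield an empty string; turn those into NaN.
names = names.replace('', np.nan)
print(names)
```

With 85 columns and about 700 rows, a row-wise apply like this should still be quick enough, though I have not timed it on data of that size.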

Or, to get one column per match:

In [2787]: df.apply(lambda x: pd.Series(x.index[x.astype(bool)]), axis=1)
Out[2787]:
              0    1
2017-01-01  NaN  NaN
2017-01-02  W02  NaN
2017-01-03  NaN  NaN
2017-01-04  W01  W03
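
If the goal is the exact layout from the question (Value, X1, X2), the frame above can probably be relabelled along these lines. This is a sketch under my own assumptions: that the `Value` column simply repeats the value being searched for, and that the X-columns are numbered from 1. The name `wide` is hypothetical.

```python
import pandas as pd

dates = pd.date_range('1/1/2017', periods=4, freq='D')
df = pd.DataFrame(
    {'W01': [0, 0, 0, 1], 'W02': [0, 1, 0, 0], 'W03': [0, 0, 0, 1]},
    index=dates,
)

# One Series of matching column labels per row; shorter rows are padded with NaN.
wide = df.apply(
    lambda row: pd.Series(row.index[(row == 1).to_numpy()]),
    axis=1,
)

# Rename the positional columns 0, 1, ... to X1, X2, ... as in the question.
wide.columns = [f'X{i + 1}' for i in wide.columns]

# Assumption: 'Value' just holds the value that was searched for.
wide.insert(0, 'Value', 1)
print(wide)
```

The number of X-columns follows the row with the most matches, so on real data it may well be more than two.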

Answer 2

Setup

# Reshape to long format (one row per date/column pair), then keep only the 1s
df1=df.reset_index().melt('index')
df1=df1[df1.value.eq(1)]

1

df1.groupby('index')['variable'].apply(lambda x : ','.join(x)).to_frame().reindex(df.index)

Out[846]: 
           variable
2017-01-01      NaN
2017-01-02      W02
2017-01-03      NaN
2017-01-04  W01,W03
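
Note that the setup relies on `reset_index()` calling the new column `index`, which only happens when the index has no name. A self-contained sketch that names the index explicitly might look like this; `date`, `long_df` and `joined` are names I made up for illustration.

```python
import pandas as pd

dates = pd.date_range('1/1/2017', periods=4, freq='D')
df = pd.DataFrame(
    {'W01': [0, 0, 0, 1], 'W02': [0, 1, 0, 0], 'W03': [0, 0, 0, 1]},
    index=dates,
)

# Name the index so the melted id column has a predictable label
# ('date' is a hypothetical name, not something from the question).
long_df = df.rename_axis('date').reset_index().melt('date')

# Keep only the (date, column) pairs whose value is exactly 1.
long_df = long_df[long_df['value'].eq(1)]

# Join the column names per date, then restore the dates that had no match.
joined = long_df.groupby('date')['variable'].agg(','.join).reindex(df.index)
print(joined)
```

Because the filter runs before the groupby, dates without any 1 vanish from the grouped result; the final `reindex(df.index)` is what brings them back as NaN.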

2

df1.groupby('index')['variable'].apply(lambda x : list(x)).apply(pd.Series).reindex(df.index)
Out[852]: 
              0    1
2017-01-01  NaN  NaN
2017-01-02  W02  NaN
2017-01-03  NaN  NaN
2017-01-04  W01  W03

Return a pandas DataFrame with the specific value "1"
