Get rows between two values of a column using Python

Time: 2018-06-07 11:38:53

Tags: python python-3.x pandas dataframe

Suppose I have a data frame like this:

df = {
    'Period': [1996, 'Jan', 'Feb', 'March', 1997, 'Jan', 'Feb', 'March', 1998, 'Jan', 'Feb', 'March'],
    'Some-Values': ['', 'a', 'b', 'c', '', 'd', 'e', 'f', '', 'g', 'h', 'i']
}

I need to extract the rows between the values 1996 and 1997, so that the resulting data frame looks like this:

df_res = {
    'Period': ['Jan', 'Feb', 'March'],
    'Some-Values': ['a', 'b', 'c']
}

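I am currently trying to do this with Pandas, but I cannot find a solution.

2 Answers:

Answer 0: (score: 2)

Try reshaping the data frame into a "proper" layout first, with the year in its own column. The rows can then be selected by year.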

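# Year header rows have an empty 'Some-Values'; copy their 'Period' into 'Year'
df['Year'] = df.loc[df['Some-Values'] == '', 'Period']
# Propagate each year down to the month rows below it
df['Year'] = df['Year'].ffill()
# Drop the year header rows themselves
df = df.loc[df['Period'] != df['Year'], :]
df.loc[df['Year'] == 1996, :]
Out[651]: 
  Period Some-Values  Year
1    Jan           a  1996
2    Feb           b  1996
3  March           c  1996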
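
For reference, the same idea can be written as a self-contained script. This is only a sketch under assumptions: it builds the frame from the question with empty strings in the year rows, treats every row with an empty 'Some-Values' as a year header, and uses `pd.to_numeric` to get a numeric 'Year' column, which the answer above does not do. The names `is_year`, `year`, and `tidy` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Period': [1996, 'Jan', 'Feb', 'March', 1997, 'Jan', 'Feb', 'March',
               1998, 'Jan', 'Feb', 'March'],
    'Some-Values': ['', 'a', 'b', 'c', '', 'd', 'e', 'f', '', 'g', 'h', 'i'],
})

# Assumption: a row is a year header exactly when 'Some-Values' is empty.
is_year = df['Some-Values'] == ''

# Keep 'Period' only on header rows, make it numeric, then fill downwards
# so that every month row carries the year of the header above it.
year = pd.to_numeric(df['Period'].where(is_year)).ffill()

# Drop the header rows and attach the year as a proper column.
tidy = df.loc[~is_year].assign(Year=year[~is_year].astype(int))

# Select the months that belong to 1996.
res = tidy.loc[tidy['Year'] == 1996, ['Period', 'Some-Values']]
print(res)
```

Once the year lives in its own column, any other year can be selected with the same comparison, without searching for the next header row.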

Answer 1: (score: 1)

One way, via pd.Series.idxmax and pd.DataFrame.iloc:

import pandas as pd

df = pd.DataFrame({'Period': [1996,'Jan','Feb','March',1997,'Jan','Feb',
                              'March',1998,'Jan','Feb','March'],
                   'Some-Values': ['','a','b','c','','d','e','f','','g','h','i']})

res = df.iloc[(df['Period'] == 1996).idxmax()+1:(df['Period'] == 1997).idxmax()]

print(res)

  Period Some-Values
1    Jan           a
2    Feb           b
3  March           c

For readability, you can use a slice object:

slicer = slice((df['Period'] == 1996).idxmax()+1,
               (df['Period'] == 1997).idxmax())

res = df.iloc[slicer]
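
Two caveats may be worth keeping in mind with this approach. `idxmax` returns an index label, while `iloc` expects positions, so the two only agree when the frame has the default 0..n-1 index. Also, on a boolean Series with no `True` value, `idxmax` returns the first label rather than failing, so a missing marker goes unnoticed. The helper below is a hedged sketch that works on positions and raises when a marker is absent. The function name `rows_between` and its parameters are hypothetical, not part of pandas.

```python
import pandas as pd


def rows_between(df, column, start, end):
    """Return the rows strictly between the first `start` marker and
    the first `end` marker that follows it (hypothetical helper)."""
    # Positions (not labels) where the column equals the start marker.
    start_pos = (df[column] == start).to_numpy().nonzero()[0]
    if start_pos.size == 0:
        raise KeyError(f"start marker {start!r} not found in {column!r}")
    first = int(start_pos[0])

    # Only accept an end marker that comes after the start marker.
    end_pos = (df[column] == end).to_numpy().nonzero()[0]
    end_pos = end_pos[end_pos > first]
    if end_pos.size == 0:
        raise KeyError(f"end marker {end!r} not found after {start!r}")

    return df.iloc[first + 1:int(end_pos[0])]


df = pd.DataFrame({
    'Period': [1996, 'Jan', 'Feb', 'March', 1997, 'Jan', 'Feb', 'March',
               1998, 'Jan', 'Feb', 'March'],
    'Some-Values': ['', 'a', 'b', 'c', '', 'd', 'e', 'f', '', 'g', 'h', 'i'],
})

res = rows_between(df, 'Period', 1996, 1997)
print(res)
```

Because the slice is computed from positions, the helper should behave the same if the frame carries a non-default index.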