How do I select certain columns by multiple conditions?

Time: 2019-10-31 14:08:34

Tags: python pandas dataframe

I built a DataFrame from a dictionary so that I can manipulate it.

from collections import defaultdict
import pandas as pd

dic_people = defaultdict(dict)

dic_people['A']['language']    = 'English'
dic_people['A']['nationality'] = 'Russia'
dic_people['A']['joined']      = 201010

dic_people['B']['language']    = 'French'
dic_people['B']['nationality'] = 'Canada'
dic_people['B']['joined']      = 201009

dic_people['C']['language']    = 'English'
dic_people['C']['nationality'] = 'Canada'
dic_people['C']['joined']      = 201008

dic_people['D']['language']    = 'French'
dic_people['D']['nationality'] = 'France'
dic_people['D']['joined']      = 201007

dic_people['E']['language']    = 'English'
dic_people['E']['nationality'] = 'Ireland'
dic_people['E']['joined']      = 201011

df = pd.DataFrame.from_dict(dic_people)

>>> df
                A       B        C       D        E
joined        201010  201009   201008  201007   201011
language     English  French  English  French  English
nationality   Russia  Canada   Canada  France  Ireland

I want to select the 2 people who 1) joined earliest and 2) speak English. So the result would be

                A       C      
joined        201010  201008  
language     English  English 
nationality   Russia  Canada 

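I would like to know how to do this. I tried df[ df.loc['language'] == 'English'], but this seems to behave differently when the condition refers to a row rather than a column.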

1 answer:

Answer 0: (score: 6)

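Use DataFrame.loc with a boolean mask built from the 'language' row to filter the columns, then DataFrame.iloc to select the first two columns by position: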

# If necessary, first sort the columns by the values in the 'joined' row,
# so that the earliest joiners come first
#df = df.sort_values('joined', axis=1)

df1 = df.loc[:,  df.loc['language'] == 'English'].iloc[:, :2]
print(df1)
                   A        C
language     English  English
nationality   Russia   Canada
joined        201010   201008
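The output above relies on the columns already being in a suitable order. As a minimal sketch of the same approach with the optional sort applied, the snippet below rebuilds the frame from the question, keeps the English-speaking columns, sorts them by the 'joined' row, and takes the first two. The names `people`, `english` and `earliest` are introduced here for illustration only and are not part of the original answer.

```python
from collections import defaultdict

import pandas as pd

# Rebuild the sample data from the question (people as columns, attributes as rows).
people = {
    'A': ('English', 'Russia', 201010),
    'B': ('French', 'Canada', 201009),
    'C': ('English', 'Canada', 201008),
    'D': ('French', 'France', 201007),
    'E': ('English', 'Ireland', 201011),
}
dic_people = defaultdict(dict)
for name, (language, nationality, joined) in people.items():
    dic_people[name]['language'] = language
    dic_people[name]['nationality'] = nationality
    dic_people[name]['joined'] = joined

df = pd.DataFrame.from_dict(dic_people)

# Boolean mask over the columns: True where the 'language' row equals 'English'.
english = df.loc[:, df.loc['language'] == 'English']

# Sort the remaining columns by the values in the 'joined' row (ascending),
# then keep the first two columns by position.
earliest = english.sort_values('joined', axis=1).iloc[:, :2]
print(earliest)
```

With the sample data, this selects C (201008) and A (201010), in join-date order, and drops E (201011), which also speaks English but joined later.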