I built a DataFrame from a dictionary so that I can work with it.
from collections import defaultdict
import pandas as pd

dic_people = defaultdict(dict)
dic_people['A']['language'] = 'English'
dic_people['A']['nationality'] = 'Russia'
dic_people['A']['joined'] = 201010
dic_people['B']['language'] = 'French'
dic_people['B']['nationality'] = 'Canada'
dic_people['B']['joined'] = 201009
dic_people['C']['language'] = 'English'
dic_people['C']['nationality'] = 'Canada'
dic_people['C']['joined'] = 201008
dic_people['D']['language'] = 'French'
dic_people['D']['nationality'] = 'France'
dic_people['D']['joined'] = 201007
dic_people['E']['language'] = 'English'
dic_people['E']['nationality'] = 'Ireland'
dic_people['E']['joined'] = 201011
df = pd.DataFrame.from_dict(dic_people)
>>> df
                   A        B        C        D        E
joined        201010   201009   201008   201007   201011
language     English   French  English   French  English
nationality   Russia   Canada   Canada   France  Ireland
I want to select the 2 people who 1) joined earliest and 2) speak English. So the result would be
                   A        C
joined        201010   201008
language     English  English
nationality   Russia   Canada
How can I do this? I tried df[df.loc['language'] == 'English'], but filtering seems to work differently when the condition refers to a row rather than a column.
Answer 0: (score: 6)
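Use DataFrame.loc to filter the columns with a boolean mask built from the 'language' row, then use DataFrame.iloc to select the first two columns by position: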
# If necessary, first sort the columns by the values in the 'joined' row:
# df = df.sort_values('joined', axis=1)
df1 = df.loc[:, df.loc['language'] == 'English'].iloc[:, :2]
print(df1)
                   A        C
language     English  English
nationality   Russia   Canada
joined        201010   201008
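
Note that without the sort step, .iloc[:, :2] simply takes the first two English-speaking columns in their existing order, which only happens to match the two earliest joiners in this data. The following is a minimal, self-contained sketch that makes the "earliest" requirement explicit by filtering first and then sorting on the 'joined' row. It assumes a plain dict in place of the defaultdict from the question, and the names people, english and earliest_two are illustrative only.

```python
import pandas as pd

# Same data as in the question, written as a plain nested dict (an assumption
# made for brevity; the question builds it with defaultdict).
people = {
    'A': {'language': 'English', 'nationality': 'Russia', 'joined': 201010},
    'B': {'language': 'French', 'nationality': 'Canada', 'joined': 201009},
    'C': {'language': 'English', 'nationality': 'Canada', 'joined': 201008},
    'D': {'language': 'French', 'nationality': 'France', 'joined': 201007},
    'E': {'language': 'English', 'nationality': 'Ireland', 'joined': 201011},
}
df = pd.DataFrame(people)  # people become columns, attributes become rows

# Keep only the columns whose 'language' row equals 'English'.
english = df.loc[:, df.loc['language'] == 'English']

# Order the remaining columns by the 'joined' row (ascending), then take two.
earliest_two = english.sort_values('joined', axis=1).iloc[:, :2]
print(earliest_two)
```

With this ordering the columns come out as C then A, since C joined before A. If one person per row is more convenient, transposing with df.T first would allow ordinary column-based filtering instead.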