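I have a DataFrame df that I want to export as JSON. I also need to filter on two columns, say col a and col b. By default I need to export the entire DataFrame, but I also want to pass values for col a and col b as optional arguments so that only some of the data is exported. For example, export only the rows where col a = "yes" and col b = "red". I tried df.to_json, but I don't know how to add the filtering. How can I achieve this? I am new to pandas and Python, so a detailed explanation would be appreciated. Thanks for any help!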
Answer 0 (score: 0)
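I think you need boolean indexing with & (and), but you also need to check whether all values are wanted. The solution is to add further conditions and chain them with | (or):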
import pandas as pd

df = pd.DataFrame({'col a':['yes','no','yes', 'yes'],
                   'col b':['red','green','orange','red']})
print (df)
  col a   col b
0   yes     red
1    no   green
2   yes  orange
3   yes     red
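def filtering(a='ALL', b='ALL'):
    # Row-wise masks: True where the column equals the requested value
    m1 = df['col a'] == a
    m2 = df['col b'] == b
    # Scalar flags: True when no filter was requested for that column
    m3 = a == 'ALL'
    m4 = b == 'ALL'
    # A True flag makes its side of the & match every row
    return df[(m1|m3) & (m2|m4)].to_json()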
print (filtering())
{"col a":{"0":"yes","1":"no","2":"yes","3":"yes"},
"col b":{"0":"red","1":"green","2":"orange","3":"red"}}
print (filtering('yes','red'))
{"col a":{"0":"yes","3":"yes"},"col b":{"0":"red","3":"red"}}
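The same idea can be written without a magic 'ALL' string. The following is only a minimal sketch, not part of the original answer: the function name filter_to_json, the frame parameter, and the use of None as the "no filter" default are all assumptions made for illustration. It starts from a mask that keeps every row and narrows it only for the arguments that were actually passed.

```python
import json
import pandas as pd

def filter_to_json(frame, a=None, b=None):
    """Return `frame` as JSON, optionally filtered on 'col a' and 'col b'.

    Hypothetical helper: None (an assumed default) means "do not filter".
    """
    # Start from a mask that keeps every row.
    mask = pd.Series(True, index=frame.index)
    if a is not None:
        mask &= frame["col a"] == a
    if b is not None:
        mask &= frame["col b"] == b
    return frame[mask].to_json()

df = pd.DataFrame({"col a": ["yes", "no", "yes", "yes"],
                   "col b": ["red", "green", "orange", "red"]})

full = json.loads(filter_to_json(df))                     # no arguments: all rows
subset = json.loads(filter_to_json(df, a="yes", b="red"))
print(sorted(subset["col a"]))  # row labels that survived the filter
```

Passing the DataFrame in as an argument, rather than reading a global df, makes the helper reusable, and None cannot collide with a real value in the data the way a string such as 'ALL' could.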
Edit:
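Filtering by lists of values is a similar solution; only the conditions change. You need isin and in (change ALL to some generic value that never appears in the data):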
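def filtering(a=['ALL'], b=['ALL']):
    # Row-wise masks: True where the column value is in the requested list
    m1 = df['col a'].isin(a)
    m2 = df['col b'].isin(b)
    # Scalar flags: True when the list contains the 'ALL' placeholder
    m3 = 'ALL' in a
    m4 = 'ALL' in b
    return df[(m1|m3) & (m2|m4)].to_json()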
print (filtering())
{"col a":{"0":"yes","1":"no","2":"yes","3":"yes"},
"col b":{"0":"red","1":"green","2":"orange","3":"red"}}
print (filtering(['yes'],['red', 'orange']))
{"col a":{"0":"yes","2":"yes","3":"yes"},
"col b":{"0":"red","2":"orange","3":"red"}}