I want to filter my output the way I would with .str.contains(), but this time on an int column. How can I make this code work with pandas in (plain) Python?
import pandas as pd
df = pd.DataFrame({
'yearweek' : [201604, 201604, 201604, 201604, 201605, 201605, 201605, 201605, 201606, 201606, 201606],
'manufacturer' : ['F', 'F', 'S', 'S', 'F', 'F', 'S', 'S', 'F', 'S', 'S'],
'reprint_reason_id' : [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1],
'tot_volume' : [100, 150, 80, 90, 120, 98, 77, 250, 33, 110, 56]})
df1 = df.groupby(by=['yearweek', 'manufacturer']) ['tot_volume'].sum()
df2 = df1.reset_index()
df3 = df2[df2['manufacturer'].str.contains('F') ]
df4 = df3.reset_index()
# This is the line that fails: a Series has no .int accessor (AttributeError)
df5 = df4[df4['yearweek'].int.contains(201604)]
print(df5)
Answer 0: (score: 1)
I think you can use boolean indexing:
import pandas as pd
df = pd.DataFrame({
'yearweek' : [201604, 201604, 201604, 201604, 201605, 201605, 201605, 201605, 201606, 201606, 201606],
'manufacturer' : ['F', 'F', 'S', 'S', 'F', 'F', 'S', 'S', 'F', 'S', 'S'],
'reprint_reason_id' : [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1],
'tot_volume' : [100, 150, 80, 90, 120, 98, 77, 250, 33, 110, 56]})
#convert to int is not necessary in this sample
#df['yearweek'] = df['yearweek'].astype(int)
df1 = df.groupby(by=['yearweek', 'manufacturer'])['tot_volume'].sum().reset_index()
#or you can use
#df1 = df.groupby(by=['yearweek', 'manufacturer'], as_index=False)['tot_volume'].sum()
print(df1)
   yearweek manufacturer  tot_volume
0    201604            F         250
1    201604            S         170
2    201605            F         218
3    201605            S         327
4    201606            F          33
5    201606            S         166
print((df1['manufacturer'].str.contains('F')) & (df1['yearweek'] == 201604))
0     True
1    False
2    False
3    False
4    False
5    False
dtype: bool
df2 = df1[(df1['manufacturer'].str.contains('F')) & (df1['yearweek'] == 201604)]
print(df2)
   yearweek manufacturer  tot_volume
0    201604            F         250
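As a follow-up sketch, the same boolean-indexing idea extends to several integer values through `Series.isin`, which is probably the closest integer counterpart to `.str.contains()`. The data below is the aggregated frame from this answer; the names `wanted_weeks`, `mask`, and `result` are made up for illustration.

```python
import pandas as pd

# Aggregated data, same values as df1 in the answer above
df1 = pd.DataFrame({
    'yearweek': [201604, 201604, 201605, 201605, 201606, 201606],
    'manufacturer': ['F', 'S', 'F', 'S', 'F', 'S'],
    'tot_volume': [250, 170, 218, 327, 33, 166],
})

# Hypothetical list of weeks to keep
wanted_weeks = [201604, 201606]

# Combine conditions with & and wrap each comparison in parentheses
mask = (df1['manufacturer'] == 'F') & df1['yearweek'].isin(wanted_weeks)
result = df1[mask]
print(result)
```

For a single value, `==` is enough; `isin` only becomes useful once there is more than one value to match.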
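Answer 1: (score: 0)
You can convert the column to integers with .astype(int) and then compare with ==, since a Series has no .contains method. Like this:
df4[df4['yearweek'].astype(int) == 201604]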
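A minimal sketch of where the conversion matters, assuming `yearweek` arrived as strings (for example from a CSV file). That assumption is not from the question, where the column is already int. After `.astype(int)`, the filter would typically be an `==` comparison, because a Series has no `.contains` method.

```python
import pandas as pd

# Assumption: yearweek is stored as text rather than as integers
df4 = pd.DataFrame({
    'yearweek': ['201604', '201605', '201606'],
    'tot_volume': [250, 218, 33],
})

# Convert to int, then filter with an equality comparison
df5 = df4[df4['yearweek'].astype(int) == 201604]
print(df5)
```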
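Answer 2: (score: 0)
I used this line to solve my problem of checking the contents of a DataFrame column. It converts the column to strings and then checks their contents: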
df2 = df1[df1['yearweek'].astype('str').str.contains('201604')]
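One thing to watch with the string approach is that `.str.contains()` matches a substring anywhere in the value. The sketch below uses made-up data (the extra 201704 row is not from the question) to show the difference, plus an integer-only alternative for selecting a whole year, assuming the last two digits are always the week.

```python
import pandas as pd

# Hypothetical data; 201704 is added to show substring matching
df1 = pd.DataFrame({
    'yearweek': [201604, 201605, 201606, 201704],
    'tot_volume': [250, 218, 33, 10],
})

as_text = df1['yearweek'].astype(str)

# Substring match: keeps every value that contains '04' anywhere
contains_04 = df1[as_text.str.contains('04')]

# Prefix match: keeps only the weeks of 2016
year_2016 = df1[as_text.str.startswith('2016')]

# Same selection without strings: integer division drops the week digits
year_2016_int = df1[df1['yearweek'] // 100 == 2016]

print(contains_04)
print(year_2016)
```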