Question

I want to filter my output the way I would with .str.contains(), but this time on an int column. How can I make this code work with pandas (or plain Python)?

import pandas as pd

df = pd.DataFrame({
        'yearweek' : [201604, 201604, 201604, 201604, 201605, 201605, 201605, 201605, 201606, 201606, 201606],
        'manufacturer' : ['F', 'F', 'S', 'S', 'F', 'F', 'S', 'S', 'F', 'S', 'S'],
        'reprint_reason_id' : [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1],
        'tot_volume' : [100, 150, 80, 90, 120, 98, 77, 250, 33, 110, 56]})

df1 = df.groupby(by=['yearweek', 'manufacturer'])['tot_volume'].sum()
df2 = df1.reset_index()
df3 = df2[df2['manufacturer'].str.contains('F')]
df4 = df3.reset_index()
# this is the line that fails: a Series has no .int accessor
df5 = df4[df4['yearweek'].int.contains(201604)]
print(df5)

Answer 1

I think you can use boolean indexing:

df = pd.DataFrame({
        'yearweek' : [201604, 201604, 201604, 201604, 201605, 201605, 201605, 201605, 201606, 201606, 201606],
        'manufacturer' : ['F', 'F', 'S', 'S', 'F', 'F', 'S', 'S', 'F', 'S', 'S'],
        'reprint_reason_id' : [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1],
        'tot_volume' : [100, 150, 80, 90, 120, 98, 77, 250, 33, 110, 56]})

# converting to int is not necessary in this sample
# df['yearweek'] = df['yearweek'].astype(int)

df1 = df.groupby(by=['yearweek', 'manufacturer'])['tot_volume'].sum().reset_index()
# or you can use
# df1 = df.groupby(by=['yearweek', 'manufacturer'], as_index=False)['tot_volume'].sum()
print(df1)
   yearweek manufacturer  tot_volume
0    201604            F         250
1    201604            S         170
2    201605            F         218
3    201605            S         327
4    201606            F          33
5    201606            S         166

print((df1['manufacturer'].str.contains('F')) & (df1['yearweek'] == 201604))
0     True
1    False
2    False
3    False
4    False
5    False
dtype: bool

df2 = df1[(df1['manufacturer'].str.contains('F')) & (df1['yearweek'] == 201604)]
print(df2)
   yearweek manufacturer  tot_volume
0    201604            F         250

Answer 2

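You can convert the column to integers with .astype(int). An integer Series has no .contains method, so compare with == after converting, like this: df4[df4['yearweek'].astype(int) == 201604]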
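For anyone unsure what an integer column actually supports, here is a small check, offered as a sketch rather than a definitive test. It assumes a plain integer Series named `weeks` (a made-up name) and shows that neither a `.contains` method nor the `.str` accessor is available on it, while a plain `==` comparison yields the boolean mask.

```python
import pandas as pd

# Hypothetical integer column, standing in for df4['yearweek'].
weeks = pd.Series([201604, 201605, 201606], name='yearweek')

# An integer Series does not expose a .contains method.
has_contains = hasattr(weeks, 'contains')

# The .str accessor only works on string data, so this raises AttributeError.
try:
    weeks.str.contains('201604')
    str_accessor_ok = True
except AttributeError:
    str_accessor_ok = False

# A plain comparison produces the boolean mask used for filtering.
mask = weeks.astype(int) == 201604

print(has_contains, str_accessor_ok, mask.tolist())
```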

Answer 3

I used this line to solve my problem of checking the contents of a DataFrame column. It converts the column to strings and then checks their content:

df2 = df1[df1['yearweek'].astype('str').str.contains('201604')]
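One thing to keep in mind with the string approach is that `.str.contains` does substring matching, so it can match more than the exact number. The sketch below illustrates this with a made-up value (1201604, which is not in the original data) and, under that assumption, compares substring matching with an exact `==` comparison. It also shows where the string conversion is genuinely handy: selecting by prefix, for example every week of 2016, with an integer-division alternative next to it.

```python
import pandas as pd

# 1201604 is a hypothetical value added only to show the substring effect.
df1 = pd.DataFrame({'yearweek': [201604, 201605, 1201604, 201704]})

as_text = df1['yearweek'].astype(str)

# Substring match: also picks up 1201604 because it contains '201604'.
substring = df1[as_text.str.contains('201604')]

# Exact match on the integer value.
exact = df1[df1['yearweek'] == 201604]

# Prefix match on the text form: every week whose number starts with 2016.
year_2016 = df1[as_text.str.startswith('2016')]

# Same selection without strings, assuming the YYYYWW layout used in the question.
year_2016_int = df1[df1['yearweek'] // 100 == 2016]

print(substring)
print(exact)
print(year_2016)
print(year_2016_int)
```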

Python Pandas: check whether a column contains a certain int
