Create a new column based on groupby shift in pandas

Time: 2020-04-28 11:33:02

Tags: pandas pandas-groupby

I have a data frame as shown below:

Session      slot_num        Overbook     ID
s1           1               no           A
s1           2               no           B
s1           2               yes          B
s1           3               no           C
s1           4               no           D
s1           4               yes          D
s1           5               no           E
s1           6               no           F
s1           7               no           G
s2           1               no           A1
s2           2               no           B1
s2           2               no           C1
s2           3               yes          C1
s2           4               no           D1
s2           4               no           E1
s2           5               no           F1
s2           6               no           G1
s2           5               yes          G1

Step 1: From the above, I want to create a column Overbook1 that is Overbook shifted up by one row within each Session group, as shown below.

Now we get:

Session      slot_num        Overbook     ID    Overbook1
s1           1               no           A     no
s1           2               no           B     yes
s1           2               yes          B     no
s1           3               no           C     no
s1           4               no           D     yes
s1           4               yes          D     no
s1           5               no           E     no
s1           6               no           F     no
s1           7               no           G     no
s2           1               no           A1    yes   
s2           2               yes          A1    no
s2           2               no           K1    yes
s2           3               yes          K1    no
s2           4               no           D1    no
s2           4               no           L1    no
s2           5               no           S1    no
s2           6               no           G1    yes
s2           5               yes          G1    no

Step 2: Delete the rows where Overbook == 'yes'.

Final expected output:

Session      slot_num        Overbook     ID    Overbook1
s1           1               no           A     no
s1           2               no           B     yes
s1           3               no           C     no
s1           4               no           D     yes
s1           5               no           E     no
s1           6               no           F     no
s1           7               no           G     no
s2           1               no           A1    yes   
s2           2               no           K1    yes
s2           4               no           D1    no
s2           4               no           L1    no
s2           5               no           S1    no
s2           6               no           G1    yes

2 Answers:

Answer 0: (score: 1)

Use DataFrameGroupBy.shift to create the column, then filter the rows with boolean indexing.
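As a minimal sketch of this approach (not necessarily the answerer's exact code), the snippet below rebuilds part of the question's data, shifts Overbook up by one row within each Session, and then keeps the rows that are not overbooked. The variable names `df` and `result` are assumptions made for illustration, and only the first four s2 rows of the input table are included to keep it short.

```python
import pandas as pd

# Part of the question's input: all s1 rows and the first four s2 rows.
df = pd.DataFrame({
    "Session": ["s1"] * 9 + ["s2"] * 4,
    "slot_num": [1, 2, 2, 3, 4, 4, 5, 6, 7, 1, 2, 2, 3],
    "Overbook": ["no", "no", "yes", "no", "no", "yes", "no", "no", "no",
                 "no", "no", "no", "yes"],
    "ID": ["A", "B", "B", "C", "D", "D", "E", "F", "G",
           "A1", "B1", "C1", "C1"],
})

# Step 1: shift Overbook up by one row inside each Session group.
# The last row of every group has no successor, so it is filled with 'no'.
df["Overbook1"] = df.groupby("Session")["Overbook"].shift(-1, fill_value="no")

# Step 2: boolean indexing, keep only the rows where Overbook is not 'yes'.
result = df[df["Overbook"].ne("yes")]

print(result)
```

Because the shift is computed per group, the last s1 row never picks up a value from the first s2 row.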


Answer 1: (score: 1)

The first step is to create a new column:

# Shift Overbook up by one row within each Session; the last row of each group gets 'no'.
# group_keys=False keeps the original index so the result aligns with df on recent pandas versions.
df['Overbook1'] = df.groupby('Session', group_keys=False).Overbook\
    .apply(lambda s: s.shift(-1, fill_value='no'))

The second step is to select only the desired rows, i.e. those where Overbook is not 'yes'.
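The row selection in the second step might look like the sketch below. It is an illustration under assumptions, not the answerer's original code: the frame is hand-built from the first six s1 rows of the question's intermediate table (with Overbook1 already filled in), and the names `df` and `result` are hypothetical.

```python
import pandas as pd

# First six s1 rows of the intermediate table, Overbook1 already present.
df = pd.DataFrame({
    "Session": ["s1"] * 6,
    "slot_num": [1, 2, 2, 3, 4, 4],
    "Overbook": ["no", "yes", "no", "no", "no", "yes"][:0] or
                ["no", "no", "yes", "no", "no", "yes"],
    "ID": ["A", "B", "B", "C", "D", "D"],
    "Overbook1": ["no", "yes", "no", "no", "yes", "no"],
})

# Keep only the rows that are not overbooked, then renumber the index.
result = df.loc[df["Overbook"] == "no"].reset_index(drop=True)

print(result)
```

Calling `reset_index(drop=True)` is optional; it only replaces the gaps left by the dropped rows with a fresh 0..n-1 index.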