How do I generate ID numbers with Python?

Time: 2019-05-24 05:25:17

Tags: python pandas

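I have a DataFrame. I want to create a unique ID number for each person, and a week-number column based on the person and the date (the records are weekly).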

import pandas as pd
df = pd.DataFrame({ 'name':['one','one','two','two','two','three','four'],
                     'date':['2019-05-01','2019-05-08','2019-05-01','2019-05-08','2019-05-15','2019-05-01','2019-05-15'],
                    "a":range(7)})
df['date'] = pd.to_datetime(df['date'],yearfirst=True)
df = df.sort_values(['name','date'])
print(df)

Here is the data:

    name       date  a
6   four 2019-05-15  6
0    one 2019-05-01  0
1    one 2019-05-08  1
5  three 2019-05-01  5
2    two 2019-05-01  2
3    two 2019-05-08  3
4    two 2019-05-15  4

The expected result is:

    name       date  a    id    week
6   four 2019-05-15  6     1    3
0    one 2019-05-01  0     2    1
1    one 2019-05-08  1     2    2
5  three 2019-05-01  5     3    1 
2    two 2019-05-01  2     4    1
3    two 2019-05-08  3     4    2
4    two 2019-05-15  4     4    3

How can I get the "id" and "week" columns? Thanks!

2 answers:

Answer 0: (score: 1)

As @cs95 suggested, use GroupBy.ngroup for the ID. For the week, divide the day of the month by 7 and round up with numpy.ceil:

import numpy as np
df["Id"] = df.groupby("name").ngroup() + 1
# week of the month: days 1-7 give 1, days 8-14 give 2, and so on
df['week'] = np.ceil(df.date.dt.day / 7).astype(int)
print(df)

    name       date  a  Id  week
6   four 2019-05-15  6   1     3
0    one 2019-05-01  0   2     1
1    one 2019-05-08  1   2     2
5  three 2019-05-01  5   3     1
2    two 2019-05-01  2   4     1
3    two 2019-05-08  3   4     2
4    two 2019-05-15  4   4     3
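
Note that the day-of-month formula restarts at 1 whenever the month changes. If the data can span several months, one possible alternative is to count whole weeks elapsed since the earliest date in the frame. The sketch below assumes that "week 1" should begin on that earliest date. The sample dates are made up for illustration and are not from the question.

```python
import pandas as pd

# Illustrative data that crosses a month boundary (not from the question).
df = pd.DataFrame({
    "name": ["one", "one", "two"],
    "date": pd.to_datetime(["2019-05-29", "2019-06-05", "2019-06-12"]),
})

# Whole weeks elapsed since the earliest date in the frame, counted from 1.
first_day = df["date"].min()
df["week"] = (df["date"] - first_day).dt.days // 7 + 1

# For comparison: week of the month (same result as ceil(day / 7)),
# which restarts when the month changes.
df["week_of_month"] = (df["date"].dt.day - 1) // 7 + 1

print(df)
```

With this data the elapsed-week column keeps increasing across the month boundary, while the week-of-month column drops back to 1 in June.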

Alternatively:

df["Id"] = df.groupby("name").ngroup() + 1
df['week'] =  df.groupby("date").ngroup() + 1
print (df)

    name       date  a  Id  week
6   four 2019-05-15  6   1     3
0    one 2019-05-01  0   2     1
1    one 2019-05-08  1   2     2
5  three 2019-05-01  5   3     1
2    two 2019-05-01  2   4     1
3    two 2019-05-08  3   4     2
4    two 2019-05-15  4   4     3
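
One thing to keep in mind with the groupby("date") variant: ngroup numbers the distinct dates that actually occur, in sorted order, so it behaves like a dense rank and not like a calendar week count. The small sketch below uses made-up dates in which 2019-05-08 is missing. The column name dense_rank is only illustrative.

```python
import pandas as pd

# Illustrative dates with one weekly date (2019-05-08) missing.
dates = pd.to_datetime(["2019-05-01", "2019-05-15", "2019-05-15", "2019-05-22"])
df = pd.DataFrame({"date": dates})

# ngroup() labels each distinct date 0, 1, 2, ... in sorted order.
df["week"] = df.groupby("date").ngroup() + 1

# A dense rank of the dates produces the same labels.
df["dense_rank"] = df["date"].rank(method="dense").astype(int)

print(df)
```

Here 2019-05-15 is labelled 2 even though it falls in the third calendar week, because no row carries the second week's date.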

Answer 1: (score: 1)

I use cumsum to get df['id'] (this relies on df already being sorted by name, as in the question), and ngroup on a groupby of df.date to get df['week']:

df['id'] = df.name.ne(df.name.shift()).cumsum()
df['week'] = df.date.groupby(df.date).ngroup() + 1


Out[408]:
    name       date  a  id  week
6   four 2019-05-15  6   1     3
0    one 2019-05-01  0   2     1
1    one 2019-05-08  1   2     2
5  three 2019-05-01  5   3     1
2    two 2019-05-01  2   4     1
3    two 2019-05-08  3   4     2
4    two 2019-05-15  4   4     3
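
The shift/cumsum trick starts a new ID every time the name differs from the previous row, so it gives one ID per person only when rows with the same name are adjacent. As a rough comparison on deliberately unsorted, made-up data, the sketch below puts it next to pd.factorize (IDs in order of first appearance) and ngroup (IDs in sorted order). The column names are only illustrative.

```python
import pandas as pd

# Illustrative, deliberately unsorted names (not from the question).
df = pd.DataFrame({"name": ["two", "one", "two", "four", "one"]})

# New ID whenever the name differs from the previous row.
df["id_cumsum"] = df["name"].ne(df["name"].shift()).cumsum()

# One ID per distinct name, numbered in order of first appearance.
codes, uniques = pd.factorize(df["name"])
df["id_factorize"] = codes + 1

# One ID per distinct name, numbered in sorted (alphabetical) order.
df["id_ngroup"] = df.groupby("name").ngroup() + 1

print(df)
```

On unsorted input only the factorize and ngroup columns give every occurrence of a name the same ID. On the sorted frame from the question all three approaches agree.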