Pandas DataFrame: get the most recent date in each row

Time: 2017-10-25 01:54:23

Tags: python pandas

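I am trying to go through each row (iterrows?), find the most recent date (a sort function?), and put it in column 'G'.

I am having trouble combining the iteration and the sorting.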

Current data frame:

    A       B           C           D           E           F           G
0   1       20171018    20171019    20171001    20171002    id_123      
1   2       NaN         20171005    20171006    20171003    id_234      
2   3       NaN         NaN         20171019    20171020    id_345      
3   4       NaN         NaN         NaN         20171021    id_456      

Desired output:

    A       B           C           D           E           F           G
0   1       20171018    20171019    20171001    20171002    id_123      20171019
1   2       NaN         20171005    20171006    20171003    id_234      20171006
2   3       NaN         NaN         20171019    20171020    id_345      20171020
3   4       NaN         NaN         NaN         20171021    id_456      20171021

Edit: I have already converted the date columns using datetime.

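1 answer:

Answer 0: (score: 3)

You can use the .max() method on the data frame to get the most recent date. Pass the argument axis=1 so that the maximum is computed across each row rather than down each column.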

import pandas as pd

data = {'A': [1, 2, 3, 4],
        'B': ['20171018', '', '', ''],
        'C': ['20171019', '20171005', '', ''],
        'D': ['20171001', '20171006', '20171019', ''],
        'E': ['20171002', '20171003', '20171020', '20171021'],
        'F': ['id_123','id_234','id_345','id_456']
        }
df = pd.DataFrame(data)

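# convert the YYYYMMDD strings to datetimes; empty strings become NaT
for c in 'BCDE':
    df[c] = pd.to_datetime(df[c], format='%Y%m%d')

# create a new column with the row-wise maximum (NaT values are skipped)
df['G'] = df[['B','C','D','E']].max(axis=1)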
print(df)

   A          B          C          D          E       F          G
0  1 2017-10-18 2017-10-19 2017-10-01 2017-10-02  id_123 2017-10-19
1  2        NaT 2017-10-05 2017-10-06 2017-10-03  id_234 2017-10-06
2  3        NaT        NaT 2017-10-19 2017-10-20  id_345 2017-10-20
3  4        NaT        NaT        NaT 2017-10-21  id_456 2017-10-21
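
As a rough sketch of how this relates to the iterrows idea in the question, the snippet below rebuilds the same sample data, computes the row-wise maximum both ways, and then formats G as YYYYMMDD strings to match the desired output. The names `date_cols`, `latest` and `latest_loop` are illustrative only, and the string formatting step is an assumption about what the asker wants to store in G.

```python
import pandas as pd

# Same sample data as in the answer; empty strings stand for missing dates.
df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': ['20171018', '', '', ''],
    'C': ['20171019', '20171005', '', ''],
    'D': ['20171001', '20171006', '20171019', ''],
    'E': ['20171002', '20171003', '20171020', '20171021'],
    'F': ['id_123', 'id_234', 'id_345', 'id_456'],
})

date_cols = ['B', 'C', 'D', 'E']  # illustrative name for the date columns
for c in date_cols:
    df[c] = pd.to_datetime(df[c], format='%Y%m%d')

# Vectorized version: row-wise maximum, NaT is skipped by default.
latest = df[date_cols].max(axis=1)

# Row-by-row version (the iterrows idea), shown for comparison only.
latest_loop = pd.Series(
    [max(row[date_cols].dropna()) for _, row in df.iterrows()],
    index=df.index,
)
print((latest == latest_loop).all())

# Assumed formatting step: store G as YYYYMMDD strings.
df['G'] = latest.dt.strftime('%Y%m%d')
print(df[['A', 'F', 'G']])
```

Both versions should agree on this data. The vectorized call is shorter and avoids a Python-level loop, so it is usually the better choice on larger frames.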