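Iterate over a pandas DataFrame and extract data from selected columns

Time: 2020-07-16 02:48:46

Tags: python pandas

I have a pandas DataFrame from which I can extract certain rows based on the values of multiple columns.

Code to extract the rows where the column "folder" is True and the column "depth" equals 1: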

folders = df[(df["folder"] == True) & (df['depth'] == 1)]

The folders DataFrame:

    id                                               path               mtime               ctime  folder  num_files  depth
17   2                           \\fileserver\bckup\admin 2020-07-10 16:36:58 2020-07-10 16:17:33    True       16.0      1
19  20                            \\fileserver\bckup\test 2020-07-10 16:19:33 2020-07-10 16:17:46    True        1.0      1

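For the folders DataFrame, I want to select the path and ctime values of each row, work out how old ctime is relative to the current date, and delete the path if it is more than X days old. I am having trouble iterating over the DataFrame to get path and ctime. Could you advise?

Thanks

1 Answer:

Answer 0: (score: 1)

Assuming the df below is your folders DataFrame, you can do this: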

import pandas as pd

# today's date
today = pd.Timestamp('today')

# number of days
x = 6

# age of each folder in whole days (ctime must be a datetime column)
df['days_diff'] = (today - df['ctime']).dt.days

# set path to None where days_diff > x
m = df['days_diff'].gt(x)
df.loc[m, 'path'] = None

cols = ['path', 'ctime', 'days_diff']
print(df[cols])

                                        path               ctime  days_diff
0                  \\fileserver\bckup\admin  2020-07-10 16:17:33          5
1                   \\fileserver\bckup\test  2020-07-10 16:17:46          5
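
With x = 6 nothing is older than the threshold in this sample, so both paths are left in place. If you still need to walk the rows one at a time (for example, to act on each stale path), the following is a minimal sketch rather than a definitive implementation. It rebuilds the two sample rows from the question and assumes a fixed reference date and a threshold of 3 days, both chosen only so the result is reproducible. The names `now`, `max_age_days`, `age_days`, `stale` and `stale_paths` are hypothetical. It only collects the paths; what "delete" should mean (dropping the row, clearing the value, or removing the folder on disk) is left to you.

```python
import pandas as pd

# Sample rows copied from the question's folders DataFrame.
folders = pd.DataFrame(
    {
        "path": [r"\\fileserver\bckup\admin", r"\\fileserver\bckup\test"],
        "ctime": pd.to_datetime(["2020-07-10 16:17:33", "2020-07-10 16:17:46"]),
    },
    index=[17, 19],
)

# Assumption: a fixed reference date so the example is reproducible;
# in real use this would be pd.Timestamp("today").
now = pd.Timestamp("2020-07-16 02:48:46")
max_age_days = 3  # hypothetical threshold standing in for X

# Vectorized step: whole days elapsed since ctime, then a boolean mask.
age_days = (now - folders["ctime"]).dt.days
stale = folders.loc[age_days > max_age_days, ["path", "ctime"]]

# Explicit iteration over the two selected columns, one row at a time.
stale_paths = []
for row in stale.itertuples(index=False):
    # row.path and row.ctime hold the values for a single folder
    stale_paths.append(row.path)

print(stale_paths)
```

Filtering with the mask first and then looping only over the matching rows keeps the date arithmetic vectorized, so the loop is reserved for the per-row action itself.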