I have been working on this problem for hours. It must be a small fix, but somehow I am blind to it.
This thread did not solve my problem.
This is my data:
Date Server
2019-02-13 A
2019-02-13 B
2019-02-13 B
2019-02-17 A
2019-02-17 B
2019-02-17 C
2019-02-19 C
2019-02-19 D
I need to get the list of servers within a given date range. I tried this code:
df['Date'] = pd.to_datetime(df['Date'], format='%Y%m%d').apply(lambda x: x.strftime(format='%Y-%m-%d'))
df = df.set_index(df['Date'])
# This formatting changes the cell content from a format like 20190217 to the
# one shown above. Maybe there is already an error right here.
start_date = pd.to_datetime('20190212', format='%Y%m%d').strftime(format='%Y-%m-%d')
end_date = pd.to_datetime('20190217', format='%Y%m%d').strftime(format='%Y-%m-%d')
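If I write the dates out explicitly, the print statements give the correct result. In my program, however, I need to pass the dates in as start_date and end_date.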
print(df[df.Date.between('2019-02-12','2019-02-17')].Server.unique())
print(df.loc['2019-02-12':'2019-02-17'].Server.unique())
print(df.loc[start_date : end_date].Server.unique())
Output:
['A' 'B' 'C'] - correct
['A' 'B' 'C'] - correct
['A' 'B' 'C' 'D'] - incorrect
What do I need to change in my code?
Thanks for your help!
Answer 0 (score: 1)
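You don't need to call strftime on the Date column. Just change the format to format='%Y-%m-%d', so the column stays a datetime instead of being converted back to strings: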
import pandas as pd
df = pd.DataFrame({'Date': ['2019-02-13', '2019-02-13', '2019-02-13', '2019-02-17', '2019-02-17', '2019-02-17', '2019-02-19', '2019-02-19'],
'Server':['A','B','B','A','B','C','C','D']})
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')
df = df.set_index(df['Date'])
start_date = pd.to_datetime('20190212', format='%Y%m%d').strftime(format='%Y-%m-%d')
end_date = pd.to_datetime('20190217', format='%Y%m%d').strftime(format='%Y-%m-%d')
print(df[df.Date.between('2019-02-12','2019-02-17')].Server.unique())
print(df.loc['2019-02-12':'2019-02-17'].Server.unique())
print(df.loc[start_date : end_date].Server.unique())
The output is:
['A' 'B' 'C']
['A' 'B' 'C']
['A' 'B' 'C']
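As a possible variation on the code above (a sketch, not necessarily what the answerer intended): once the index holds real datetimes, the bounds can stay as Timestamp objects as well, so strftime is not needed for start_date and end_date either. The sort_index call and the `servers` variable are my own additions for illustration.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'Date': ['2019-02-13', '2019-02-13', '2019-02-13', '2019-02-17',
             '2019-02-17', '2019-02-17', '2019-02-19', '2019-02-19'],
    'Server': ['A', 'B', 'B', 'A', 'B', 'C', 'C', 'D'],
})

# Parse the column into datetimes and use it as a sorted index.
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')
df = df.set_index('Date').sort_index()

# Keep the bounds as Timestamps; no strftime round trip.
start_date = pd.to_datetime('20190212', format='%Y%m%d')
end_date = pd.to_datetime('20190217', format='%Y%m%d')

# Label slicing on a DatetimeIndex includes both endpoints.
servers = df.loc[start_date:end_date, 'Server'].unique()
print(servers)
```

Sorting the index is a precaution: slicing with a bound that is not itself in the index (2019-02-12 here) relies on the index being ordered.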
Answer 1 (score: 1)
This should do the trick.
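import pandas as pd
# df is the DataFrame from the question, with columns Date and Server
start_date = '2019-02-12'
end_date = '2019-02-17'
df['Date'] = pd.to_datetime(df['Date'])
# Note: '>' excludes rows dated exactly start_date; use '>=' to include them
print(df.loc[(df['Date'] > start_date) & (df['Date'] <= end_date)].Server.unique())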
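If the dates come in as variables, the same mask idea could be wrapped in a small function. This is only a sketch: `servers_between` is a hypothetical helper name, and it assumes both bounds should be inclusive, like the between call in the question.

```python
import pandas as pd

def servers_between(df, start, end):
    """Return the unique servers with start <= Date <= end.

    Hypothetical helper; assumes df has 'Date' and 'Server' columns.
    """
    dates = pd.to_datetime(df['Date'])
    # Series.between includes both endpoints by default.
    mask = dates.between(pd.Timestamp(start), pd.Timestamp(end))
    return df.loc[mask, 'Server'].unique()

# Same sample data as in the question.
df = pd.DataFrame({
    'Date': ['2019-02-13', '2019-02-13', '2019-02-13', '2019-02-17',
             '2019-02-17', '2019-02-17', '2019-02-19', '2019-02-19'],
    'Server': ['A', 'B', 'B', 'A', 'B', 'C', 'C', 'D'],
})

result = servers_between(df, '2019-02-12', '2019-02-17')
print(result)
```

Because the mask is built from a parsed copy of the column, the helper does not modify df and does not depend on the index.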