Filter a DataFrame by hour

Time: 2018-03-15 12:54:35

Tags: python python-3.x pandas dataframe

Python 3.6. I have a DataFrame that I reduced to 2 columns, text and date (of type datetime). I tried to filter the rows by a specific hour using this code:

import pandas as pd

laDataBrute = {'timestamp':['1519245127727', '1519246924475'], 
        'date creation': ['Wed Feb 21 20:32:07 +0000 2018', 'Wed Feb 21 21:02:04 +0000 2018' ], 
        'texte':['GE CFO says no plans for an equity raise', 'Baker Hughes rises after GE CFO signals plans']}
laDataBrute = pd.DataFrame(laDataBrute)

laDataBrute['date creation'] = pd.to_datetime(laDataBrute['timestamp'], unit='ms')
resultat = laDataBrute.loc[laDataBrute["texte"].str.contains(r'\bGE\b', regex=True) &
                           laDataBrute["date creation"].dt.hour == 21, 
                           ["texte","date creation"]]
print(resultat)

This is the output:

Empty DataFrame
Columns: [texte, date creation]
Index: []

I don't know what I'm doing wrong. Thanks!

1 answer:

Answer 0 (score: 1)

You need parentheses around the comparison, because `&` binds more tightly than `==`:

laDataBrute['texte'].str.contains(r'\bGE\b') & (laDataBrute["date creation"].dt.hour == 21)

Output:

0    False
1     True
dtype: bool
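
Plugged back into the `.loc` call from the question, the parenthesized mask should select only the 21:00 row. A minimal sketch reusing the question's data; the short names `df` and `mask` and the explicit `astype('int64')` cast are my own additions (the cast is there because how `pd.to_datetime` treats numeric strings combined with `unit=` may differ between pandas versions):

```python
import pandas as pd

# Same sample data as in the question (the unused string date column is omitted).
df = pd.DataFrame({
    'timestamp': ['1519245127727', '1519246924475'],
    'texte': ['GE CFO says no plans for an equity raise',
              'Baker Hughes rises after GE CFO signals plans'],
})

# Assumption: cast the millisecond strings to integers before converting.
df['date creation'] = pd.to_datetime(df['timestamp'].astype('int64'), unit='ms')

# Parentheses make the hour comparison run before the element-wise AND.
mask = df['texte'].str.contains(r'\bGE\b', regex=True) & (df['date creation'].dt.hour == 21)

resultat = df.loc[mask, ['texte', 'date creation']]
print(resultat)
```

Building the mask in its own variable first also makes it easy to print and inspect before using it for selection.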

Compare that with what you have:

laDataBrute['texte'].str.contains(r'\bGE\b') & laDataBrute["date creation"].dt.hour == 21

Output:

0    False
1    False
dtype: bool
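
Without the parentheses, Python parses `a & b == 21` as `(a & b) == 21`, so the AND of the two Series is what gets compared with 21. The grouping can be checked with the standard `ast` module; in this small sketch `a` and `b` are placeholder names, not columns from the question:

```python
import ast

# Parse the unparenthesized expression and inspect how Python groups it.
tree = ast.parse("a & b == 21", mode="eval")
top = tree.body

# The outermost node is the comparison; its left operand is the whole `a & b`.
print(type(top).__name__)          # Compare
print(type(top.left).__name__)     # BinOp
print(type(top.left.op).__name__)  # BitAnd

# The same effect with plain Python values:
print(True & 21 == 21)    # (True & 21) == 21  ->  1 == 21  ->  False
print(True & (21 == 21))  # True & True        ->  True
```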