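Python 3.6. I have a DataFrame that I reduced to two columns, a text column and a date column (dtype datetime). I am trying to filter rows by text and by a specific hour with this code: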
import pandas as pd
laDataBrute = {'timestamp':['1519245127727', '1519246924475'],
'date creation': ['Wed Feb 21 20:32:07 +0000 2018', 'Wed Feb 21 21:02:04 +0000 2018' ],
'texte':['GE CFO says no plans for an equity raise', 'Baker Hughes rises after GE CFO signals plans']}
laDataBrute = pd.DataFrame(laDataBrute)
laDataBrute['date creation'] = pd.to_datetime(laDataBrute['timestamp'], unit='ms')
resultat = laDataBrute.loc[laDataBrute["texte"].str.contains(r'\bGE\b', regex=True) &
laDataBrute["date creation"].dt.hour == 21,
["texte","date creation"]]
print(resultat)
This is the output:
Empty DataFrame
Columns: [texte, date creation]
Index: []
I don't know what I'm doing wrong. Thanks!
Answer 0: (score: 1)
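You need parentheses around the comparison. In Python, & binds more tightly than ==, so without them the expression is evaluated as (text_mask & hour) == 21 rather than text_mask & (hour == 21):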
laDataBrute['texte'].str.contains(r'\bGE\b') & (laDataBrute["date creation"].dt.hour == 21)
Output:
0 False
1 True
dtype: bool
Compare that with what you have:
laDataBrute['texte'].str.contains(r'\bGE\b') & laDataBrute["date creation"].dt.hour == 21
Output:
0 False
1 False
dtype: bool
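Putting the fix back into the original filter, a minimal sketch could look like the following. The names df, has_ge and at_21 are only illustrative, and casting the timestamp strings to int64 before conversion is an assumption made here to keep the example robust across pandas versions. Building each mask as its own variable is one way to sidestep the precedence problem entirely.

```python
import pandas as pd

# Same sample data as in the question (only the columns the filter needs).
df = pd.DataFrame({
    "timestamp": ["1519245127727", "1519246924475"],
    "texte": ["GE CFO says no plans for an equity raise",
              "Baker Hughes rises after GE CFO signals plans"],
})

# Assumption: cast the string timestamps to integers before converting,
# so that unit="ms" is applied to numeric epoch values.
df["date creation"] = pd.to_datetime(df["timestamp"].astype("int64"), unit="ms")

# Build each boolean mask separately; the comparison is fully evaluated
# before the two masks are combined with &.
has_ge = df["texte"].str.contains(r"\bGE\b", regex=True)
at_21 = df["date creation"].dt.hour == 21

resultat = df.loc[has_ge & at_21, ["texte", "date creation"]]
print(resultat)
```

With the sample data, only the second row (created at 21:02:04) passes both conditions.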