I'm trying to demonstrate what my problem is. I really don't understand why the native Python <class 'datetime.datetime'> object gets replaced with the pandas-specific <class 'pandas._libs.tslibs.timestamps.Timestamp'> object.

import datetime
import typing

import numpy as np
import pandas as pd
from dateutil.parser import parse


def _normalize_users_dataframe(row: pd.Series) -> pd.Series:
    last_seen: typing.Union[str, datetime.datetime] = row.get('last_seen', '')
    if last_seen:
        # Parse the raw string into a native Python datetime.
        last_seen = parse(last_seen)
        row['last_seen'] = last_seen
    print(type(row['last_seen']).__mro__)  # Shows <class 'datetime.datetime'>, the native Python datetime.
    return row


def process_users_dataframe(filepath: str) -> pd.DataFrame:
    df: pd.DataFrame = pd.read_csv(filepath, sep='\t')
    df.rename(columns=mapping, inplace=True)  # `mapping` (column renames) is defined elsewhere
    df.replace({np.nan: None}, inplace=True)
    df = df.apply(_normalize_users_dataframe, axis=1)
    print(type(df['last_seen'].iloc[0]).__mro__)  # Shows <class 'pandas._libs.tslibs.timestamps.Timestamp'>, a pandas-specific object.
    return df


def main() -> None:
    process_users_dataframe('<dir>')

Inside the _normalize_users_dataframe() function, when I print the type of the last_seen value, it shows <class 'datetime.datetime'>, which is fine. But after running the apply() method on the DataFrame, which returns a new DataFrame object, the type of the last_seen values becomes <class 'pandas._libs.tslibs.timestamps.Timestamp'>.
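The same change of type can be reproduced without apply() or a CSV file. The sketch below is only a minimal reproduction: it assumes (this is my guess, not something confirmed by the code above) that the conversion comes from pandas inferring a datetime64 dtype when it builds a column from datetime objects. The names native, inferred, and as_object are made up for illustration.

```python
import datetime

import pandas as pd

# A native Python datetime, like the one dateutil's parse() returns.
native = datetime.datetime(2021, 5, 17, 12, 30)

# Let pandas infer the dtype: the column is stored as a datetime64 dtype,
# and element access hands back pandas Timestamp objects.
inferred = pd.Series([native])

# Force object dtype: pandas keeps the original Python object as-is.
as_object = pd.Series([native], dtype=object)

print(inferred.dtype, type(inferred.iloc[0]))
print(as_object.dtype, type(as_object.iloc[0]))

# Timestamp is a subclass of datetime.datetime, so isinstance checks still pass.
print(isinstance(inferred.iloc[0], datetime.datetime))
```

If the native objects are really needed, keeping the column as object dtype (for example with astype(object) after the apply() call) is one possible workaround, though it gives up the vectorized datetime operations.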
How does this happen? Is it some deep implementation detail?