I'm trying to read a CSV file that looks like this:
col 1 col 2 col 3 ... col N
0 0 days 00:00:16 0 days 00:00:07 0 days 00:01:02 NaN
.
.
.
15000 0 days 01:40:00 NaN NaN ... NaN
What I've tried:
df = pd.read_csv('file.csv', sep=',', index_col=0, dtype=object)
df = df.applymap(lambda x: pd.to_timedelta(x))
but since I have a lot of columns and rows, it is fairly slow. Is there a better way to do this?
Answer 0 (score: 3)
Neither the dtype nor the parse_dates parameter of read_csv supports timedelta objects. There are a few options here.
apply + to_timedelta
df = df.apply(pd.to_timedelta, errors='coerce')
Alternatively,
for c in df.columns:
    df[c] = pd.to_timedelta(df[c], errors='coerce')
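As a rough, self-contained sketch of the column-wise approach: the sample data below is made up to resemble the question's file, and io.StringIO stands in for 'file.csv'. Both are assumptions for illustration only.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for file.csv
csv_text = """,col 1,col 2,col 3
0,0 days 00:00:16,0 days 00:00:07,0 days 00:01:02
1,0 days 01:40:00,,
"""

# Read everything as strings first, as in the question
df = pd.read_csv(io.StringIO(csv_text), index_col=0, dtype=object)

# Convert one whole column at a time; unparseable values become NaT
df = df.apply(pd.to_timedelta, errors='coerce')

print(df.dtypes)
```

Because to_timedelta receives an entire column per call rather than one cell, this should generally be faster than applymap with a lambda, though the actual gain depends on the data.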
pd.read_csv(..., converters=...)
Another option is to pass the converters argument when loading:
# Integer keys are column positions in the file; position 0 is the index column, so start at 1
f = {i: pd.to_timedelta for i in range(1, N + 1)}
df = pd.read_csv('file.csv', sep=',', index_col=0, converters=f)
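A minimal sketch of the converters route, again on made-up sample data with io.StringIO standing in for 'file.csv'. The names csv_text, n_cols, conv and df2 are hypothetical. Integer keys in converters are treated as column positions in the file, so this sketch starts at 1 to leave the index column (position 0) alone.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for file.csv
csv_text = """,col 1,col 2
0,0 days 00:00:16,0 days 00:00:07
1,0 days 01:40:00,NaN
"""

n_cols = 2  # assumed number of data columns (excluding the index column)

# Map file positions 1..n_cols to pd.to_timedelta; position 0 is the index
conv = {i: pd.to_timedelta for i in range(1, n_cols + 1)}

df2 = pd.read_csv(io.StringIO(csv_text), index_col=0, converters=conv)

print(df2)
print(df2.dtypes)
```

Note that converters are called once per cell, so this is unlikely to be faster than the column-wise to_timedelta approach. It is also worth checking df2.dtypes afterwards, since the converted columns may come back as object dtype holding Timedelta values rather than a native timedelta dtype.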