Question

I'm trying to read a csv file that looks like this:

              col 1             col 2             col 3      ...     col N
0        0 days 00:00:16   0 days 00:00:07   0 days 00:01:02          NaN
.
.
.
15000    0 days 01:40:00         NaN               NaN       ...      NaN

What I've tried:

df = pd.read_csv('file.csv', sep=',', index_col=0, dtype=object)
df = df.applymap(lambda x: pd.to_timedelta(x))

This works, but with many rows and columns it is slow. Is there a better way to do this?

Answer 1

Neither the `dtype` nor the `parse_dates` argument of `read_csv` supports timedelta objects. Here are a few options.

`apply` + `to_timedelta`

df = df.apply(pd.to_timedelta, errors='coerce')

Alternatively,

for c in df.columns:
    df[c] = pd.to_timedelta(df[c], errors='coerce')
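
As a rough sketch of the approach above, the following runs it on a small in-memory CSV. The sample data and the name `csv_text` are made up for illustration and stand in for file.csv; the empty cell plays the role of a missing value. It assumes the file is first read with `dtype=object`, as in the question.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for file.csv
csv_text = (
    ",col 1,col 2\n"
    "0,0 days 00:00:16,0 days 00:00:07\n"
    "1,0 days 01:40:00,\n"
)

# Read everything as strings first, as in the question
df = pd.read_csv(io.StringIO(csv_text), index_col=0, dtype=object)

# Convert one whole column at a time; values that cannot be parsed become NaT
df = df.apply(pd.to_timedelta, errors="coerce")

print(df.dtypes)
print(df)
```

Because `apply` hands each column to `pd.to_timedelta` as a Series, the conversion is done per column rather than per cell, which is the main difference from the `applymap` version in the question.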

`pd.read_csv(..., converters=...)`

Another option is to pass the converters argument when loading:

# Column 0 is the index (index_col=0), so convert only the N data columns 1..N
f = {i: pd.to_timedelta for i in range(1, N + 1)}  # you can access columns by index
df = pd.read_csv('file.csv', sep=',', index_col=0, converters=f)
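
If the number of columns is not known up front, one possible variation is to read only the header row first and key the converters by column name. The sketch below assumes that layout; the sample data, `csv_text`, and the helper `to_td` are hypothetical names introduced for illustration. The helper passes `errors="coerce"` so that empty or unparseable cells come back as NaT.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for file.csv
csv_text = (
    ",col 1,col 2\n"
    "0,0 days 00:00:16,0 days 00:00:07\n"
    "1,0 days 01:40:00,\n"
)


def to_td(value):
    """Convert one cell (a string) to a Timedelta, or NaT if it cannot be parsed."""
    return pd.to_timedelta(value, errors="coerce")


# Read just the header row to discover the column names
header = pd.read_csv(io.StringIO(csv_text), nrows=0)
data_columns = header.columns[1:]  # skip the first column, which is the index

# One converter per data column, keyed by column name
converters = {name: to_td for name in data_columns}
df_conv = pd.read_csv(io.StringIO(csv_text), index_col=0, converters=converters)

print(df_conv.dtypes)
```

Converters are called once per cell, so this is not necessarily faster than converting whole columns after loading; it is worth timing both on the real file and checking `df_conv.dtypes` afterwards.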
