Pandas: drop() on an int64 column based on value returns object

Time: 2016-06-12 14:22:08

Tags: python pandas

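I need to drop all rows where one column is below a certain value. I use the command below, but it returns the column as object. I need to keep it as int64: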

df["customer_id"] = df.drop(df["customer_id"][df["customer_id"] < 9999999].index)
df = df.dropna()

Afterwards I tried to cast the field back to int64, but that raises the following error, with data from a completely different column:

invalid literal for long() with base 10: '2014/03/09 11:12:27'

2 answers:

Answer 0: (score: 1)

I think you need boolean indexing with reset_index:

import pandas as pd

df = pd.DataFrame({'a': ['s', 'd', 'f', 'g'],
                'customer_id':[99999990, 99999997, 1000, 8888]})
print (df) 
   a  customer_id
0  s     99999990
1  d     99999997
2  f         1000
3  g         8888

df1 = df[df["customer_id"] > 9999999].reset_index(drop=True)
print (df1)
   a  customer_id
0  s     99999990
1  d     99999997
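
As a quick check that this filter leaves the column's dtype alone, here is a minimal sketch on the same sample data. The explicit dtype="int64" and the names mask and kept are assumptions added for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "a": ["s", "d", "f", "g"],
    "customer_id": pd.Series([99999990, 99999997, 1000, 8888], dtype="int64"),
})

# Boolean mask: True for the rows we want to keep.
mask = df["customer_id"] > 9999999

# Selecting whole rows introduces no missing values, so the dtype is unchanged.
kept = df[mask].reset_index(drop=True)
print(kept["customer_id"].dtype)
```

Because whole rows are selected, the other columns stay lined up with customer_id and no later cast should be needed.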

A solution with drop, but it is slower:

df2 = (df.drop(df.loc[df["customer_id"] < 9999999, 'customer_id'].index))
print (df2)
   a  customer_id
0  s     99999990
1  d     99999997

Timings:

In [12]: %timeit df[df["customer_id"] > 9999999].reset_index(drop=True)
1000 loops, best of 3: 676 µs per loop

In [13]: %timeit (df.drop(df.loc[df["customer_id"] < 9999999, 'customer_id'].index))
1000 loops, best of 3: 921 µs per loop
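
The numbers above come from a four-row frame, so they mostly measure fixed overhead. Outside IPython, a rough comparison on a larger frame could look like the sketch below. The frame size, the random data, and the helper names with_mask and with_drop are all assumptions made for illustration, and the result will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 100,000 random ids around the threshold.
rng = np.random.default_rng(0)
big = pd.DataFrame({"customer_id": rng.integers(0, 20_000_000, size=100_000)})


def with_mask(frame):
    # Keep rows at or above the threshold via boolean indexing.
    return frame[frame["customer_id"] >= 9999999].reset_index(drop=True)


def with_drop(frame):
    # Drop rows below the threshold by index label.
    below = frame.index[frame["customer_id"] < 9999999]
    return frame.drop(below).reset_index(drop=True)


t_mask = timeit.timeit(lambda: with_mask(big), number=20)
t_drop = timeit.timeit(lambda: with_drop(big), number=20)
print(f"mask: {t_mask:.4f}s  drop: {t_drop:.4f}s")
```

Both helpers return the same rows, so the comparison is only about speed.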

Answer 1: (score: 0)

What's wrong with slicing the whole frame (and reindexing, if necessary)?

df = df[df["customer_id"] >= 9999999]
df.index = range(0,len(df))
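
To illustrate why slicing the whole frame is safer than assigning a filtered result back to a single column, here is a small sketch. It shows a related effect (index alignment filling the missing rows with NaN), which is not necessarily exactly what happened in the question, where a whole DataFrame was assigned to one column. The names col_only, broken and fixed are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "a": ["s", "d", "f", "g"],
    "customer_id": pd.Series([99999990, 99999997, 1000, 8888], dtype="int64"),
})

# Filter only the column, then assign it back: rows 2 and 3 have no
# matching index label, so pandas fills them with NaN and upcasts to float64.
col_only = df["customer_id"][df["customer_id"] >= 9999999]
broken = df.copy()
broken["customer_id"] = col_only
print(broken["customer_id"].dtype)

# Filtering the whole frame keeps every column aligned and the dtype intact.
fixed = df[df["customer_id"] >= 9999999].reset_index(drop=True)
print(fixed["customer_id"].dtype)
```

With the row-wise filter there is nothing to repair afterwards, so neither dropna() nor a cast back to int64 is required.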