Pandas passes wrong dtypes when calling apply

Asked: 2014-10-16 22:34:03

Tags: python pandas

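I have run into a problem using the pandas.DataFrame.apply function.

It appears to cast every value to bool unless I "touch" the DataFrame by adding a new column. This happens whether I apply along rows or along columns (i.e. axis=0 or axis=1).

My gut tells me I am doing something very wrong here, but I can't figure out what the problem is.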

from datetime import datetime, timedelta
import pandas as pd

start_date = datetime(2014, 1, 1)
end_date = datetime(2014, 1, 3)

events = pd.DataFrame({
    "some_boolean_field": True,
    "timestamp": pd.date_range(start_date, end_date, freq='D')
})

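def do_stuff(event):
    # Print the row (a Series) exactly as apply() passes it in.
    print(event)
    print("")

def run_experiment(message, df):
    print(message)
    print("**********************************")
    print(df)
    print(df.dtypes)
    # axis=1 calls do_stuff once per row.
    df.apply(do_stuff, axis=1)
    print("\n\n")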

run_experiment("BEFORE ADDING EXTRA FIELD", events)

events['foo'] = "WTF"  # Insane hack to get pandas to pass the correct row dtypes when applying the `do_stuff` function.
run_experiment("AFTER ADDING EXTRA FIELD", events)

Output:

BEFORE ADDING EXTRA FIELD
**********************************
  some_boolean_field  timestamp
0               True 2014-01-01
1               True 2014-01-02
2               True 2014-01-03
some_boolean_field              bool
timestamp             datetime64[ns]
dtype: object
some_boolean_field    True
timestamp             True
Name: 0, dtype: bool

some_boolean_field    True
timestamp             True
Name: 1, dtype: bool

some_boolean_field    True
timestamp             True
Name: 2, dtype: bool




AFTER ADDING EXTRA FIELD
**********************************
  some_boolean_field  timestamp  foo
0               True 2014-01-01  WTF
1               True 2014-01-02  WTF
2               True 2014-01-03  WTF
some_boolean_field              bool
timestamp             datetime64[ns]
foo                           object
dtype: object
some_boolean_field                   True
timestamp             2014-01-01 00:00:00
foo                                   WTF
Name: 0, dtype: object

some_boolean_field                   True
timestamp             2014-01-02 00:00:00
foo                                   WTF
Name: 1, dtype: object

some_boolean_field                   True
timestamp             2014-01-03 00:00:00
foo                                   WTF
Name: 2, dtype: object

  • pandas version: 0.14.1
  • numpy version: 1.8.1
  • Operating system: Mac OS X 10.9.5
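
A minimal sketch for narrowing this down, not a confirmed explanation of the behaviour above. It rebuilds the same frame, records the dtype of each row Series that apply(axis=1) passes in (a Series carries a single dtype, so mixed columns have to be coerced to something), and then iterates with itertuples(), which yields plain tuples instead of Series. It assumes a recent pandas release rather than the 0.14.1 used in the question, so results on 0.14.1 may differ; the names row_dtypes and rows are introduced here for illustration only.

```python
import pandas as pd

# Same shape of data as in the question: one bool column, one datetime column.
events = pd.DataFrame({
    "some_boolean_field": True,
    "timestamp": pd.date_range("2014-01-01", "2014-01-03", freq="D"),
})

# Record the dtype of every row Series that apply(axis=1) hands to the function.
# Each row Series has exactly one dtype, whatever the column dtypes are.
row_dtypes = events.apply(lambda row: str(row.dtype), axis=1)
print(row_dtypes.tolist())

# itertuples() yields namedtuples rather than Series, so no single
# row-level dtype is imposed on the fields.
rows = list(events.itertuples(index=False))
for row in rows:
    print(type(row.some_boolean_field).__name__, type(row.timestamp).__name__)
```

If the per-row dtype printed in the first step is bool while the frame holds a datetime column, that points at the row construction inside apply rather than at the applied function.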

0 answers:

No answers