I want to add custom functions to pandas without touching the pandas package itself. I tried the following:
extra_pandas.py:
from pandas import *
class DataFrame2(pandas.core.frame.DataFrame):
    def new_function(self):
        print("I exist")
pandas.core.frame.DataFrame = DataFrame2
my_script.py:
import extra_pandas as pd
df = pd.read_csv('example.csv')
print(df.new_function())
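This doesn't seem to work, and I can't figure out what is wrong. I get the following error:
AttributeError: 'DataFrame' object has no attribute 'new_function'
What am I missing?
Many thanks
Update: I tried an alternative approach and wanted to patch all the pandas reader functions in a loop with this snippet: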
patch_function = [read_csv, read_json, read_html, read_clipboard, read_excel,
                  read_hdf, read_feather, read_parquet, read_msgpack,
                  read_stata, read_sas, read_pickle, read_sql, read_gbq]
for func in patch_function:
    orig_func = func
    def patch(*args, **kwargs):
        return DataFrame(orig_func(*args, **kwargs))
    func = patch
But this doesn't work either. Any idea why?
Thanks
Answer 0 (score: 4):
You can't patch it, but you can replace it:
extra_pandas.py:
from pandas import *
class DataFrame2(DataFrame):
    def new_function(self):
        print("I exist")
DataFrame = DataFrame2
my_script.py:
import extra_pandas as pd
df = pd.DataFrame(pd.read_csv('furniture.csv'))
df.new_function()  # new_function() already prints, so no outer print() is needed
Output:
I exist
Or simply import your own class:
extra_pandas.py:
import pandas as pd
class DataFrame2(pd.DataFrame):
    def new_function(self):
        print("I exist")
my_script.py:
import pandas as pd
from extra_pandas import DataFrame2
df = DataFrame2(pd.read_csv('example.csv'))
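df.new_function()  # new_function() already prints, so no outer print() is needed
Output:
I exist
DataFrame accepts another DataFrame as input when creating a new DataFrame.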
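To make the wrapping step concrete, here is a minimal sketch on an in-memory frame. The column names and values are made up for illustration, and new_function returns its message instead of printing it so the result is easy to check.

```python
import pandas as pd


class DataFrame2(pd.DataFrame):
    def new_function(self):
        # Returns the message instead of printing it (a change from the answer's code).
        return "I exist"


# Illustrative data only; any existing DataFrame can be wrapped the same way.
plain = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
wrapped = DataFrame2(plain)

print(type(wrapped).__name__)
print(wrapped.new_function())
```

Methods that build a new frame, such as head(), typically hand back a plain DataFrame unless the subclass also overrides the _constructor property described in the pandas subclassing documentation.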
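You tried to patch the DataFrame class by rebinding pandas.core.frame.DataFrame. This does not work: read_csv() and the other readers already hold their own reference to the original class, so rebinding the name afterwards does not change what they return. Parts of pandas are also written in Cython and compiled to C extensions, which can further get in the way of monkey patching.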
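One way to see that rebinding the name has no effect on read_csv() is the small experiment below. It is only a sketch: the CSV content is made up, the data is read from an in-memory buffer rather than a file, and the rebinding is undone afterwards so the rest of the session is unaffected.

```python
import io

import pandas as pd
import pandas.core.frame


class DataFrame2(pd.DataFrame):
    def new_function(self):
        return "I exist"


original = pandas.core.frame.DataFrame
pandas.core.frame.DataFrame = DataFrame2  # the rebinding attempted in the question
try:
    # Illustrative CSV text; read_csv() accepts any file-like object.
    df = pd.read_csv(io.StringIO("a,b\n1,2\n"))
finally:
    pandas.core.frame.DataFrame = original  # undo the experiment

# True only if read_csv() picked up the replacement class.
patched_is_used = isinstance(df, DataFrame2)
print(patched_is_used)
```

If the flag comes out False, the reader built its result from the class it imported earlier, which matches the AttributeError reported in the question.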
Alternatively, monkey patch read_csv():
extra_pandas.py:
import pandas as pd
class DataFrame2(pd.DataFrame):
    def new_function(self):
        print("I exist")
orig_read_csv = pd.read_csv
def my_read_csv(*args, **kwargs):
    return DataFrame2(orig_read_csv(*args, **kwargs))
pd.read_csv = my_read_csv
my_script.py:
import pandas as pd
import extra_pandas
df = pd.read_csv('furniture.csv')
df.new_function()  # new_function() already prints, so no outer print() is needed
Output:
I exist
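The same idea can be extended to several readers, which is what the loop in the question's update was aiming for. That loop has two problems: `func = patch` only rebinds the loop variable and never assigns anything on the pandas module, and `orig_func` is looked up when the wrapper is called, so every wrapper would see the last function in the list. A minimal sketch that avoids both follows; `make_patch` and `READERS` are hypothetical names chosen for illustration, only two readers are listed, and new_function returns its message so it can be checked.

```python
import functools
import io

import pandas as pd


class DataFrame2(pd.DataFrame):
    def new_function(self):
        return "I exist"


def make_patch(orig_func):
    """Wrap one reader; orig_func is bound per call, so each wrapper keeps its own."""
    @functools.wraps(orig_func)
    def patched(*args, **kwargs):
        result = orig_func(*args, **kwargs)
        # Some readers can return other objects (read_html returns a list),
        # so only DataFrame results are converted.
        if isinstance(result, pd.DataFrame):
            return DataFrame2(result)
        return result
    return patched


# Hypothetical selection; extend with other reader names available in your pandas version.
READERS = ["read_csv", "read_json"]
for name in READERS:
    # setattr replaces the attribute on the module itself, unlike `func = patch`.
    setattr(pd, name, make_patch(getattr(pd, name)))

df = pd.read_csv(io.StringIO("a,b\n1,2\n"))
print(df.new_function())
```

As with the single read_csv() patch, this only affects code that calls the readers through the pandas module after the patch has run.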