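First of all, the title is not very clear, but I couldn't come up with anything better, so let me explain the problem in detail.
I find myself repeating the same routine with pandas DataFrames a lot: I need to work with only part of a DataFrame (some of its columns) for a while, and afterwards I want to add the remaining columns back. An idea came to mind: a context manager. But I can't come up with a correct implementation (if there even is one...).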
import pandas as pd
import numpy as np
class ProtectColumns:
    def __init__(self, df, protect_cols=[]):
        self.protect_cols = protect_cols
        # preserve a copy of the part we want to protect
        self.protected_df = df[protect_cols].copy(deep=True)
        # create self.df with only the part we want to work on
        self.df = df[[x for x in df.columns if x not in protect_cols]]

    def __enter__(self):
        # return self, or maybe only self.df?
        return self

    def __exit__(self, *args, **kwargs):
        # btw. do i need *args and **kwargs here?
        # append the preserved data back to the original, now changed
        self.df[self.protect_cols] = self.protected_df
if __name__ == '__main__':
    # testing
    # create random DataFrame
    df = pd.DataFrame(np.random.randn(6,4), columns=list("ABCD"))
    # unnecessary step
    df = df.applymap(lambda x: int(100 * x))
    # show it
    print(df)
    # work without cols A and B
    with ProtectColumns(df, ["A", "B"]) as PC:
        # make everything 0
        PC.df = PC.df.applymap(lambda x: 0)
    # this prints the expected output
    print(PC.df)
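However, suppose I don't want to work with PC.df but with df itself. I could do df = PC.df, or copy it inside or after the with block. But is it possible to handle this internally, e.g. in the __exit__ method? Ideally the usage would look like this: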
# unchanged df
print(df)
with ProtectColumns(df, list("AB")) as PC:
    PC.applymap(somefunction)
# df is now changed
print(df)
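For what it's worth, here is a minimal sketch of one direction this could take, assuming it is acceptable to mutate the original df in place. The class name ProtectColumnsInPlace and the _original attribute are made up for illustration and are not part of pandas. The sketch keeps a reference to the caller's DataFrame and, in __exit__, writes the working columns back into that same object, so the name df outside the block sees the changes. As far as I can tell, Python calls __exit__ with exactly three positional arguments (exception type, value, traceback), so **kwargs should not be needed.

```python
import pandas as pd


class ProtectColumnsInPlace:
    """Hypothetical variant: work on a copy of the unprotected columns,
    then write the result back into the ORIGINAL DataFrame on exit."""

    def __init__(self, df, protect_cols=()):
        # keep a reference (not a copy) so the caller's object can be updated
        self._original = df
        self.protect_cols = list(protect_cols)
        work_cols = [c for c in df.columns if c not in self.protect_cols]
        # independent copy of the working part, safe to modify or replace
        self.df = df[work_cols].copy()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # only write back if the block finished without an exception
        if exc_type is None:
            for col in self.df.columns:
                if col not in self.protect_cols:
                    # column assignment mutates the caller's DataFrame;
                    # values are aligned on the index
                    self._original[col] = self.df[col]
        # returning False lets any exception propagate as usual
        return False


if __name__ == "__main__":
    df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [7, 8]})
    with ProtectColumnsInPlace(df, ["A", "B"]) as pc:
        pc.df = pc.df * 0
    # A and B keep their values, C and D are now 0
    print(df)
```

This still goes through pc.df inside the block rather than calling methods on pc directly, and since column assignment aligns on the index, rows dropped inside the block would come back as NaN in the original.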
Thanks for any ideas!