Using SQLAlchemy, I have defined my own TypeDecorator for storing pandas DataFrames in the database as JSON strings.
    class db_JsonEncodedDataFrameWithTimezone(db.TypeDecorator):
        impl = db.Text

        def process_bind_param(self, value, dialect):
            if value is not None and isinstance(value, pd.DataFrame):
                timezone = value.index.tz.zone
                df_json = value.to_json(orient="index")
                data = {'timezone': timezone, 'df': df_json, 'index_name': value.index.name}
                value = json.dumps(data)
            return value

        def process_result_value(self, value, dialect):
            if value is not None:
                data = json.loads(value)
                df = pd.read_json(data['df'], orient="index")
                df.index = df.index.tz_localize('UTC')
                df.index = df.index.tz_convert(data['timezone'])
                df.index.name = data['index_name']
                value = df
            return value
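This works for the first save to the database, and loading works fine too.
The problem appears when I update the value, i.e. modify the DataFrame and try to persist the change to the database. When I call

    db.session.add(entity)
    db.session.commit()
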
I get a traceback indicating that the value comparison is the problem:

    x == y
    ValueError: Can only compare identically-labeled DataFrame Objects.

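For context, here is a minimal sketch of how this error can be reproduced outside the database. It assumes the type falls back to SQLAlchemy's default `compare_values`, which evaluates `x == y` on the old and new attribute values; the frames `old` and `new` and the helper `eq_error` are made up for illustration.

```python
import pandas as pd

# Two frames whose row labels differ, e.g. before and after appending a row.
old = pd.DataFrame({"v": [1, 2]}, index=[0, 1])
new = pd.DataFrame({"v": [1, 2, 3]}, index=[0, 1, 2])


def eq_error(x, y):
    """Return the error message raised by `x == y`, or None if it succeeds."""
    try:
        x == y
    except ValueError as exc:
        return str(exc)
    return None


message = eq_error(old, new)

# DataFrame.equals never raises on mismatched labels; it just returns False.
same = old.equals(new)
```

Here `message` holds the "identically-labeled" error text, while `old.equals(new)` simply reports that the frames differ.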
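So I suspect my problem has to do with how the type coerces or compares values. I have tried three things, all of which failed, and I really don't know what to try next: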
    # 1st failed solution attempt: inserting
    coerce_to_is_types = (pd.DataFrame,)

    # 2nd failed solution attempt: inserting
    def coerce_compared_value(self, op, value):
        return self.impl.coerce_compared_value(op, value)

    # 3rd failed solution attempt
    class comparator_factory(db.Text.comparator_factory):
        def __eq__(self, other):
            try:
                value = (self == other).all().all()
            except ValueError:
                value = False
            return value
Answer 0 (score: 0)
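On my fourth attempt I think I found the answer: I wrote my own comparison function and inserted it into the type class above. This prevents the `x == y` operator from being evaluated on my DataFrames: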
    def compare_values(self, x, y):
        # pandas.util.testing is deprecated; assert_frame_equal lives in pandas.testing
        from pandas.testing import assert_frame_equal
        try:
            assert_frame_equal(x, y, check_names=True, check_like=True)
            return True
        except (AssertionError, ValueError, TypeError):
            return False
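The behaviour of this comparison can be checked on its own, without a database. The following is a sketch only: `frames_equal` is a hypothetical standalone version of the method above, and the sample frames are invented for the demonstration.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def frames_equal(x, y):
    """Standalone version of the comparison: True only if both are equal DataFrames."""
    try:
        # check_like=True ignores the order of rows and columns.
        assert_frame_equal(x, y, check_names=True, check_like=True)
        return True
    except (AssertionError, ValueError, TypeError):
        return False


base = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
reordered = base[["b", "a"]]          # same data, columns in another order
changed = base.copy()
changed.loc[0, "a"] = 99              # one value modified
longer = pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5]})
```

Note that `assert_frame_equal` raises an `AssertionError` when either argument is not a DataFrame, so with this sketch a comparison involving `None` counts as "not equal".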
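Another problem of this kind came up later in my code. The solution was to modify the above so that it first tries the natural comparison and only falls back to the approach above if that fails: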
    try:
        value = x == y
    except Exception:
        # some other overriding comparison method, such as the one above
        ...
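One way the two steps might be combined is sketched below. The `bool(...)` call is an addition that is not in the original answer: `x == y` on two identically-labeled DataFrames returns a DataFrame rather than raising, and converting that to `bool` raises a `ValueError`, which also routes the comparison to the fallback. The names `frames_equal` and `compare_values` as plain functions are hypothetical.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def frames_equal(x, y):
    """Fallback comparison for DataFrames, as in the answer above."""
    try:
        assert_frame_equal(x, y, check_names=True, check_like=True)
        return True
    except (AssertionError, ValueError, TypeError):
        return False


def compare_values(x, y):
    """Try the natural comparison first; fall back to frames_equal if it fails."""
    try:
        # bool() raises ValueError when the result is a DataFrame (ambiguous truth value).
        return bool(x == y)
    except Exception:
        return frames_equal(x, y)


df = pd.DataFrame({"a": [1, 2]})
other = pd.DataFrame({"a": [1, 2, 3]})
```

Inside the TypeDecorator the same logic would take `self` as its first argument.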