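My problem is that I have a preprocessing program that reads data from a csv and reconciles it on 2 client-supplied fields (document count and check total). It parses the data and computes its own totals, then compares the two sets of totals to get a reconciliation.
First, here are my imports: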
from csv import reader, writer, QUOTE_MINIMAL
import logging
from os import getcwd, mkdir, path
from sys import argv
from datetime import date
from types import IntType, FloatType
Next, here is the actual reconciliation step:
def _recon_totals(self):
    """
    Reconcile the check total amount and document count and write out the file name,
    check numbers, vendor names, and timestamp to weekly report.
    """
    # Client totals
    client_doc_count = int(self.header_data[0][6])
    client_check_tot = float(self.header_data[0][7])
    # Double check variable typing for reconciliation totals.
    logging.info('Document count is: {0}'.format(client_doc_count))
    # doc_var_type = type(client_doc_count)
    # assert doc_var_type is IntType, 'Doc count is not an integer: {0}'.format(
    #     doc_var_type)
    logging.info('Check Total is: {0}'.format(client_check_tot))
    # check_var_type = type(client_check_tot)
    # assert check_var_type is FloatType, 'Check tot is not a float: {0}'.format(
    #     check_var_type)
    # RRD totals
    rrd_doc_count = 0
    rrd_check_tot = 0.0
    with open(self.rpt_of, 'a') as rpt_outfile:
        for transact in self.transact_data:
            row_type = transact[0]
            logging.debug('Transaction type is: {0}'.format(row_type))
            if row_type == 'P':
                # Reconciliation
                rrd_doc_count += 1
                trans_chk_amt = float(transact[12])
                # trans_chk_type = type(trans_chk_amt)
                # assert trans_chk_type is FloatType, 'Transaction Check Total is '\
                #     'not a float: {0}'.format(
                #     trans_chk_type)
                rrd_check_tot += trans_chk_amt
                # Reporting
                vend_name = transact[2]
                file_name = self.infile.split('/')[-1]
                print('File name', file_name)
                check_num = transact[9]
                cur_time = date.today()
                rpt_outfile.write('{0:<50}{1:<50}{2:<30}{3}\n'.format(file_name,
                                                                     vend_name,
                                                                     check_num,
                                                                     cur_time))
    # Reconcile totals and return the lists for writing if they are correct
    # if (client_doc_count, client_check_tot) == (rrd_doc_count, rrd_check_tot):
    #     logging.info('Recon totals match!')
    if client_doc_count == rrd_doc_count and client_check_tot == rrd_check_tot:
        # logging.info('Recon totals match!')
        return True
    else:
        raise ValueError('Recon totals do not match! Client: {0} {1} {2} {3}\n'
                         'RRD {4} {5} {6} {7}'.format(client_doc_count,
                                                      client_check_tot,
                                                      type(client_doc_count),
                                                      type(client_check_tot),
                                                      rrd_doc_count,
                                                      rrd_check_tot,
                                                      type(rrd_doc_count),
                                                      type(rrd_check_tot)))
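I'm running 6 files; 4 of them run fine (they pass reconciliation) and 2 fail. That would be normal, since clients do give us bad data, except that I can't find anything in the data that indicates it is wrong. Even my stack trace shows that the client totals and my totals should reconcile: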
ValueError: Recon totals do not match! Client: 2 8739.54 <type 'int'> <type 'float'>
RRD 2 8739.54 <type 'int'> <type 'float'>
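I tried two different ways of writing the statement that compares the two, and got the same result (as expected).
Finally, here is a sample of the problem data (modified, except for the relevant fields). This is the header record with the counts: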
"H","XXX","XXX","XXX","XXX","XXX","2","8739.54","","","","","","","","","","","","","","","",""
And then the rows I reconcile against:
"P","XXX","XXX","XXX","","XXX","XXX","XXX","XXX","XXX","XXX","XXX","846.80",...(more fields that aren't pertinent)
"P","XXX","XXX","XXX","","XXX","XXX","XXX","XXX","XXX","XXX","XXX","7892.74",...(more fields that aren't pertinent)
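For each "P" record, I increment my document count, then add the non-"XXX" field to the running total.
Anyway, any help with this would be greatly appreciated; I can't see any logic error I've made.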
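Answer 0 (score: 2)
I disagree with the answers that suggest a margin of error. That's unreliable (because the margin changes with the number of floats you sum) and doesn't really seem like a good solution. It reminds me of the movie Office Space, where they shave fractions of a cent off each transaction and funnel them into another bank account (your margin of error).
However, I definitely agree with the suggestion to check, using subtraction, that this really is a floating-point error.
I would drop floats and use the decimal library. All you need to do is replace every float constructor with a Decimal constructor: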
from decimal import Decimal
def _recon_totals(self):
    """
    Reconcile the check total amount and document count and write out the file name,
    check numbers, vendor names, and timestamp to weekly report.
    """
    # Client totals
    client_doc_count = int(self.header_data[0][6])
    # Build the Decimal straight from the csv string so no float is ever involved.
    client_check_tot = Decimal(self.header_data[0][7])
    # Double check variable typing for reconciliation totals.
    logging.info('Document count is: {0}'.format(client_doc_count))
    # doc_var_type = type(client_doc_count)
    # assert doc_var_type is IntType, 'Doc count is not an integer: {0}'.format(
    #     doc_var_type)
    logging.info('Check Total is: {0}'.format(client_check_tot))
    # RRD totals
    rrd_doc_count = 0
    rrd_check_tot = Decimal('0')
    with open(self.rpt_of, 'a') as rpt_outfile:
        for transact in self.transact_data:
            row_type = transact[0]
            logging.debug('Transaction type is: {0}'.format(row_type))
            if row_type == 'P':
                # Reconciliation
                rrd_doc_count += 1
                trans_chk_amt = Decimal(transact[12])
                rrd_check_tot += trans_chk_amt
                # Reporting
                vend_name = transact[2]
                file_name = self.infile.split('/')[-1]
                print('File name', file_name)
                check_num = transact[9]
                cur_time = date.today()
                rpt_outfile.write('{0:<50}{1:<50}{2:<30}{3}\n'.format(file_name,
                                                                     vend_name,
                                                                     check_num,
                                                                     cur_time))
    # Reconcile totals and return the lists for writing if they are correct
    # if (client_doc_count, client_check_tot) == (rrd_doc_count, rrd_check_tot):
    #     logging.info('Recon totals match!')
    if client_doc_count == rrd_doc_count and client_check_tot == rrd_check_tot:
        # logging.info('Recon totals match!')
        return True
    else:
        raise ValueError('Recon totals do not match! Client: {0} {1} {2} {3}\n'
                         'RRD {4} {5} {6} {7}'.format(client_doc_count,
                                                      client_check_tot,
                                                      type(client_doc_count),
                                                      type(client_check_tot),
                                                      rrd_doc_count,
                                                      rrd_check_tot,
                                                      type(rrd_doc_count),
                                                      type(rrd_check_tot)))
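Decimal works by storing numbers in base 10 rather than in base 2 as floats do, and floating-point inaccuracy in base 2 is well documented. Since money is generally transacted in base 10, it makes sense to manipulate it in base-10 notation only, rather than effectively converting to base 2 and then back to base 10.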
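To make the base-10 point concrete, here is a minimal sketch that uses the two "P" amounts and the header total from the sample rows in the question. The variable names (`header_total`, `payment_amounts`, `running_total`) are made up for illustration and are not part of the original program.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly,
# so even a trivial sum can miss its "obvious" result.
print(0.1 + 0.2 == 0.3)  # False

# Amounts as they arrive from the csv reader: plain strings.
# (Hypothetical names; values copied from the sample rows in the question.)
header_total = "8739.54"
payment_amounts = ["846.80", "7892.74"]

# Summing Decimals built from those strings is exact in base 10.
running_total = sum((Decimal(a) for a in payment_amounts), Decimal("0"))
print(running_total == Decimal(header_total))  # True

# Build Decimals from strings, not from floats: a Decimal created from a
# float faithfully copies the float's binary approximation.
print(Decimal(0.1) == Decimal("0.1"))  # False
```

The last line is why the code above passes the csv strings directly to `Decimal` instead of converting through `float` first.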
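Answer 1 (score: 0)
I wouldn't rely on floating-point equality checks for real data, because floating-point math is imprecise in all kinds of odd ways. I'd suggest first making sure the discrepancy really is caused by floating-point imprecision: print the difference between the two values you are comparing and confirm that it is very, very small compared to the numbers you are working with. Then I'd suggest defining a margin of error within which the two totals are considered to match; for real-world money, half a cent seems like a natural value for that tolerance.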
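A rough sketch of that two-step check follows. The helper name `totals_match` and its parameters are hypothetical, not taken from the question's code. As far as I know, `'{0}'.format()` on a float in Python 2 rounds to about 12 significant digits, which is probably why both totals show as 8739.54 in the error message; printing with `repr` shows more digits.

```python
def totals_match(client_total, computed_total, tolerance=0.005):
    """Return True if two float totals agree to within half a cent.

    Hypothetical helper for illustration; the default tolerance assumes the
    amounts are in whole currency units with two decimal places.
    """
    difference = abs(client_total - computed_total)
    # repr shows the tiny residue that the default formatting may hide.
    print('Difference: {0!r}'.format(difference))
    return difference < tolerance


# Floating-point noise falls well inside the tolerance...
print(totals_match(0.3, 0.1 + 0.2))        # True
# ...while a genuine one-cent discrepancy does not.
print(totals_match(8739.54, 8739.55))      # False
```

In the question's code this would replace the `client_check_tot == rrd_check_tot` comparison; the document counts are integers and can still be compared with `==`.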