Python invoice2data: unhashable type error

Time: 2021-01-17 12:26:51

Tags: python

Question

I am using the invoice2data package and have run into an error I cannot resolve.

When I set up this template for an invoice:

issuer: My Template
keywords:
- www.webok.com
- 123 4567 89
fields:
  amount: TOTAL\s+.(\d+\.\d+)
  date: Date:\s+(\d{1,2}\/\d{1,2}\/\d{4}\s+\d{1,2}:\d{1,2})
  invoice_number: Reference:\s(\w+)
  operator: Operators:\s(\w+)
options:
  currency: USD
  date_formats:
    - '%d/%m/%Y %G:%i'
  languages:
    - en
  decimal_separator: '.'
lines:
    start: Your Reference:+\s+\w+\n_+
    end: \s+_+\n+\s+TOTAL\s+.(\d+\.\d+)
    line: (?P<description>.+)\s+\((?P<quantity>.+)\)\s+.(?P<price>\d+\.\d+)

With this template I get no errors. However, if I add the following at the end of fields, I get an unhashable type error:

fields:
  ...
  friendly_name:
    parser: static
    value: Amazon

The unhashable type error:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/bin/invoice2data", line 11, in <module>
    load_entry_point('invoice2data==0.3.5', 'console_scripts', 'invoice2data')()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/invoice2data/main.py", line 201, in main
    res = extract_data(f.name, templates=templates, input_module=input_module)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/invoice2data/main.py", line 93, in extract_data
    return t.extract(optimized_str)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/invoice2data/extract/invoice_template.py", line 174, in extract
    res_find = re.findall(v, optimized_str)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 181, in findall
    return _compile(pattern, flags).findall(string)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 237, in _compile
    p, loc = _cache[cachekey]
TypeError: unhashable type: 'OrderedDict'

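Can anyone help? I think the error occurs when the software tries to unpack the options, but I don't know how to fix it.

Here is what I get in debug mode: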

...
DEBUG:invoice2data.extract.invoice_template:field=vat_lines | regexp=OrderedDict([('parser', 'lines'), ('start', 'PAYMENT TYPE\\s+AMOUNT\\s+_+'), ('end', '\\s_+\\s+PLEASE KEEP THIS RECEIPT SAFE'), ('line', '(?P<type_paiment>\\w+)\\s+.(?P<montant>\\d+\\.\\d+)'), ('types', OrderedDict([('montant', 'float')]))])
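
The debug line above shows the field value arriving as an OrderedDict, and the traceback shows that value being handed straight to re.findall. A minimal sketch that reproduces the same class of error outside invoice2data follows. It assumes Python 3 (the traceback itself comes from Python 2.7, and the exact message wording differs between versions), and field_value is a hypothetical stand-in for the parsed YAML mapping:

```python
import re
from collections import OrderedDict

# Hypothetical stand-in for a template field that the YAML loader parsed
# into a mapping rather than a plain regex string.
field_value = OrderedDict([("parser", "static"), ("value", "Amazon")])

error_message = None
try:
    # re looks the pattern up in an internal cache keyed by the pattern
    # itself, so the pattern must be hashable. A mapping is not.
    re.findall(field_value, "some invoice text")
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

This suggests the regex engine is not at fault: any mapping passed where a pattern string is expected should fail the same way, before any matching takes place.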

Thanks for your help!

Minimal setup to reproduce the error:

script.py

import pprint
from invoice2data import extract_data
from invoice2data.extract.loader import read_templates

templates = read_templates('templates/')
result = extract_data("invoice.pdf", templates=templates)

pprint.pprint(result)

With this as the template (in the templates folder):

templates/fr/fr.error.yml

issuer: My Template
keywords:
- www.webok.com
- 123 4567 89
fields:
  amount: TOTAL\s+.(\d+\.\d+)
  date: Date:\s+(\d{1,2}\/\d{1,2}\/\d{4}\s+\d{1,2}:\d{1,2})
  invoice_number: Reference:\s(\w+)
  operator: Operators:\s(\w+)
  vat_lines:
    parser: lines
    start: PAYMENT TYPE\s+AMOUNT\s+_+
    end: \s_+\s+PLEASE KEEP THIS RECEIPT SAFE
    line: (?P<type_paiment>\w+)\s+.(?P<montant>\d+\.\d+)
    types:
      montant: float
options:
  currency: USD
  date_formats:
    - '%d/%m/%Y %G:%i'
  languages:
    - en
  decimal_separator: '.'
lines:
    start: Your Reference:+\s+\w+\n_+
    end: \s+_+\n+\s+TOTAL\s+.(\d+\.\d+)
    line: (?P<description>.+)\s+\((?P<quantity>.+)\)\s+.(?P<price>\d+\.\d+)
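
A side note on the date_formats entry in the template above: %G:%i resembles PHP-style date codes rather than Python strptime directives. The sketch below assumes the formats are ultimately interpreted with Python's strptime rules (an assumption about invoice2data internals, not something the post confirms) and shows how the sample date behaves with the standard %H:%M directives:

```python
from datetime import datetime

raw = "03/12/2020 11:23"  # date string taken from the sample invoice

# %H is the 24-hour clock hour and %M the minute in strptime notation.
parsed = datetime.strptime(raw, "%d/%m/%Y %H:%M")
print(parsed)  # 2020-12-03 11:23:00

bad_format_rejected = False
try:
    # The format string used in the template; %i is not a strptime directive.
    datetime.strptime(raw, "%d/%m/%Y %G:%i")
except ValueError as exc:
    bad_format_rejected = True
    print("rejected:", exc)
```

With day-first parsing the sample date is 3 December 2020. The debug log reports 2020-03-12, which would be consistent with the configured format not being applied, although the post does not establish that.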

invoice.txt (needs to be converted to .pdf)

__________________________________________
            My Template
__________________________________________
Date: 03/12/2020 11:23
Operators: Me
Reference: ABC123
__________________________________________
First product                (1)    €12.93
Second product               (3)    €22.93
Third product                (1)    €12.95
Last product                 (1)    €12.93
                              _________
                        TOTAL       €61.74

VAT/CODE         NET      VAT
_____________________________
20%   S       €93.27   €18.66

PAYMENT TYPE           AMOUNT
_____________________________
CASH                   €61.74
CARD                    €0.00

CHANGE GIVEN            €3.07

__________________________________________
PLEASE KEEP THIS RECEIPT SAFE
FOR GUARANTEE PURPOSES
__________________________________________


Thanks for shopping with us!
VAT Number : 123 4567 89
www.webok.com

Debug output

Finally, the complete invoice2data debug output:

DEBUG:invoice2data.main:START pdftotext result ===========================
DEBUG:invoice2data.main:__________________________________________
             My Template
__________________________________________
Date: 03/12/2020 11:23
Operators: Me
Reference: ABC123
__________________________________________
First product                  (1)    €12.93
Second product                 (3)    €22.93
Third product                  (1)    €12.95
Last product                   (1)    €12.93
                                _________
                         TOTAL        €61.74

VAT/CODE         NET      VAT
_____________________________
20%   S       €93.27   €18.66

PAYMENT TYPE           AMOUNT
_____________________________
CASH                   €61.74
CARD                    €0.00

CHANGE GIVEN             €3.07

__________________________________________
PLEASE KEEP THIS RECEIPT SAFE
FOR GUARANTEE PURPOSES
__________________________________________



Thanks for shopping with us!
VAT Number : 123 4567 89
www.webok.com


DEBUG:invoice2data.main:END pdftotext result =============================
DEBUG:invoice2data.main:Testing 254 template files
DEBUG:invoice2data.extract.invoice_template:Matched template fr.error.yml
DEBUG:invoice2data.extract.invoice_template:START optimized_str ========================
DEBUG:invoice2data.extract.invoice_template:__________________________________________
             My Template
__________________________________________
Date: 03/12/2020 11:23
Operators: Me
Reference: ABC123
__________________________________________
First product                  (1)    €12.93
Second product                 (3)    €22.93
Third product                  (1)    €12.95
Last product                   (1)    €12.93
                                _________
                         TOTAL        €61.74

VAT/CODE         NET      VAT
_____________________________
20%   S       €93.27   €18.66

PAYMENT TYPE           AMOUNT
_____________________________
CASH                   €61.74
CARD                    €0.00

CHANGE GIVEN             €3.07

__________________________________________
PLEASE KEEP THIS RECEIPT SAFE
FOR GUARANTEE PURPOSES
__________________________________________



Thanks for shopping with us!
VAT Number : 123 4567 89
www.webok.com


DEBUG:invoice2data.extract.invoice_template:END optimized_str ==========================
DEBUG:invoice2data.extract.invoice_template:Date parsing: languages=['en'] date_formats=['%d/%m/%Y %G:%i']
DEBUG:invoice2data.extract.invoice_template:Float parsing: decimal separator=.
DEBUG:invoice2data.extract.invoice_template:keywords=['www.webok.com', '123 4567 89']
DEBUG:invoice2data.extract.invoice_template:{'date_formats': ['%d/%m/%Y %G:%i'], 'lowercase': False, 'decimal_separator': '.', 'currency': 'USD', 'replace': [], 'languages': ['en'], 'remove_whitespace': False, 'remove_accents': False}
DEBUG:invoice2data.extract.invoice_template:field=amount | regexp=TOTAL\s+.(\d+\.\d+)
DEBUG:invoice2data.extract.invoice_template:res_find=[u'61.74']
DEBUG:invoice2data.extract.invoice_template:field=date | regexp=Date:\s+(\d{1,2}\/\d{1,2}\/\d{4}\s+\d{1,2}:\d{1,2})
DEBUG:invoice2data.extract.invoice_template:res_find=[u'03/12/2020 11:23']
DEBUG:invoice2data.extract.invoice_template:result of date parsing=2020-03-12 11:23:00
DEBUG:invoice2data.extract.invoice_template:field=invoice_number | regexp=Reference:\s(\w+)
DEBUG:invoice2data.extract.invoice_template:res_find=[u'ABC123']
DEBUG:invoice2data.extract.invoice_template:field=operator | regexp=Operators:\s(\w+)
DEBUG:invoice2data.extract.invoice_template:res_find=[u'Me']
DEBUG:invoice2data.extract.invoice_template:field=vat_lines | regexp=OrderedDict([('parser', 'lines'), ('start', 'PAYMENT TYPE\\s+AMOUNT\\s+_+'), ('end', '\\s_+\\s+PLEASE KEEP THIS RECEIPT SAFE'), ('line', '(?P<type_paiment>\\w+)\\s+.(?P<montant>\\d+\\.\\d+)'), ('types', OrderedDict([('montant', 'float')]))])
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/bin/invoice2data", line 11, in <module>
    load_entry_point('invoice2data==0.3.5', 'console_scripts', 'invoice2data')()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/invoice2data/main.py", line 201, in main
    res = extract_data(f.name, templates=templates, input_module=input_module)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/invoice2data/main.py", line 93, in extract_data
    return t.extract(optimized_str)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/invoice2data/extract/invoice_template.py", line 174, in extract
    res_find = re.findall(v, optimized_str)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 181, in findall
    return _compile(pattern, flags).findall(string)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 237, in _compile
    p, loc = _cache[cachekey]
TypeError: unhashable type: 'OrderedDict'
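
The traceback points at a code path that treats every field value as a regex string. For illustration only, here is a sketch of what type-aware field handling could look like. The helper names extract_field and extract_lines are hypothetical, this is not invoice2data's implementation, and the code assumes Python 3:

```python
import re
from collections import OrderedDict
from collections.abc import Mapping


def extract_lines(spec, text):
    """Hypothetical 'lines' parser: match rows between start and end."""
    start = re.search(spec["start"], text)
    if start is None:
        return []
    rest = text[start.end():]
    end = re.search(spec["end"], rest)
    body = rest[:end.start()] if end else rest
    rows = []
    for raw_line in body.splitlines():
        match = re.search(spec["line"], raw_line)
        if not match:
            continue
        row = match.groupdict()
        # Apply the optional type conversions (only float is sketched here).
        for name, kind in spec.get("types", {}).items():
            if kind == "float" and name in row:
                row[name] = float(row[name])
        rows.append(row)
    return rows


def extract_field(value, text):
    """Hypothetical dispatcher: choose a strategy from the value's type."""
    if isinstance(value, str):
        return re.findall(value, text)  # plain string: use it as a regex
    if isinstance(value, Mapping):
        if value.get("parser") == "static":
            return value["value"]  # fixed value, no matching needed
        if value.get("parser") == "lines":
            return extract_lines(value, text)
    raise TypeError("unsupported field definition: %r" % (value,))


# Payment section copied from the sample invoice.
SAMPLE = (
    "PAYMENT TYPE           AMOUNT\n"
    "_____________________________\n"
    "CASH                   €61.74\n"
    "CARD                    €0.00\n"
    "\n"
    "CHANGE GIVEN            €3.07\n"
    "\n"
    "__________________________________________\n"
    "PLEASE KEEP THIS RECEIPT SAFE\n"
)

VAT_LINES = OrderedDict([
    ("parser", "lines"),
    ("start", r"PAYMENT TYPE\s+AMOUNT\s+_+"),
    ("end", r"\s_+\s+PLEASE KEEP THIS RECEIPT SAFE"),
    ("line", r"(?P<type_paiment>\w+)\s+.(?P<montant>\d+\.\d+)"),
    ("types", OrderedDict([("montant", "float")])),
])

rows = extract_field(VAT_LINES, SAMPLE)
print(rows)
```

In this sketch the line pattern from the template also matches the CHANGE GIVEN row, capturing GIVEN as the payment type, so the start, end, or line pattern may need tightening if that row is unwanted.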

0 answers:

No answers yet.