I need to translate some company legal forms:
ABC GMBH CO & KG
DEF LIMITED LIABILITY CO
XYZ AD
UVW LTEE
The idea is that GMBH CO & KG = GMBH, and LLC = AD = LTEE = LIMITED LIABILITY CO.
I wrote the following code, but it doesn't seem to work. Any ideas?
file = open("fake.txt","r").read()
col = file.split("\n")
abbr = ['LLC', 'GMBH']
full = [
    ('LIMITED LIABILITY COMPANY', 'LIMITED LIABILITY CO', 'LTEE', 'LIMITEE','AD', 'AKTZIONERNO DRUZHESTVO'),
    ('GMBH CO & KG', 'MBH', 'GESELLSCHAFT MIT BESCHRANKTER HAFTUNG')
]
def trans(col):
    i=0
    while i<len(abbr):
        c=0
        while c<len(full[i]):
            for x in full[i][c]:
                if x in col:
                    col = col.replace(x,abbr[i])
            c+=1
        i+=1
    return col
print trans(col)
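Answer 0 (score: 1)
You can create a dictionary whose keys are all the strings that share an abbreviation and whose values are that abbreviation. Then iterate over the input lines and look for those strings.
Here is what I mean: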
>>> lines = ["ABC GMBH CO & KG",
... "DEF LIMITED LIABILITY CO",
... "XYZ AD",
... "UVW LTEE"]
>>> abbr_dict = {}
>>> abbr_dict['GMBH CO & KG'] = 'GMBH'
>>> abbr_dict['MBH'] = 'GMBH'
>>> abbr_dict['GESELLSCHAFT MIT BESCHRANKTER HAFTUNG'] = 'GMBH'
>>> abbr_dict['LIMITED LIABILITY COMPANY'] = 'LLC'
>>> abbr_dict['LIMITED LIABILITY CO'] = 'LLC'
>>> abbr_dict['LTEE'] = 'LLC'
>>> abbr_dict['LIMITEE'] = 'LLC'
>>> abbr_dict['AD'] = 'LLC'
>>> abbr_dict['AKTZIONERNO DRUZHESTVO'] = 'LLC'
>>> for line in lines:
...     for key in abbr_dict:
...         if key in line:
...             line = line.replace(key, abbr_dict[key])
...             print(line)
...             break # This is to prevent multiple replacements on the same line
This prints:
ABC GMBH
DEF LLC
XYZ LLC
UVW LLC
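Note that this may not be the best solution if an input line contains a string such as ABC GMBH AD & KG. Because the keys are matched as substrings, it could replace the MBH inside GMBH with GMBH, or replace AD and give ABC GMBH LLC & KG, which may not be what you need.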
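One way to reduce the substring problem is to match keys only as whole words and to try longer keys first. The following is a minimal sketch, not part of the original answer: the names ABBR, PATTERN and normalize are made up for illustration, and it assumes each line should get at most one replacement, like the loop above.

```python
import re

# Mapping copied from the answer above: long form -> abbreviation.
ABBR = {
    "GMBH CO & KG": "GMBH",
    "MBH": "GMBH",
    "GESELLSCHAFT MIT BESCHRANKTER HAFTUNG": "GMBH",
    "LIMITED LIABILITY COMPANY": "LLC",
    "LIMITED LIABILITY CO": "LLC",
    "LTEE": "LLC",
    "LIMITEE": "LLC",
    "AD": "LLC",
    "AKTZIONERNO DRUZHESTVO": "LLC",
}

# Longer keys go first so that "LIMITED LIABILITY COMPANY" is tried
# before "LIMITED LIABILITY CO".
_keys = sorted(ABBR, key=len, reverse=True)

# (?<!\w) and (?!\w) reject matches glued to another word character,
# so "MBH" no longer matches inside "GMBH".
PATTERN = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(k) for k in _keys) + r")(?!\w)"
)

def normalize(line):
    """Replace the first whole-word legal form in the line (hypothetical helper)."""
    return PATTERN.sub(lambda m: ABBR[m.group(0)], line, count=1)

if __name__ == "__main__":
    for line in ["ABC GMBH CO & KG", "DEF LIMITED LIABILITY CO", "XYZ AD", "UVW LTEE"]:
        print(normalize(line))
```

This only stops matches inside other words. A line that really contains two legal forms, such as ABC GMBH AD & KG, still has its AD replaced, so that case needs a rule of its own.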
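Answer 1 (score: 0)
There are two problems in your code:
for x in full[i][c]:
This iterates over every character of each full[i][c] string, rather than over every element of full[i].
if x in col:
Once the first problem is fixed, this tests whether x exactly equals one of the lines in col, rather than whether it occurs as a substring of a line.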
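Putting both fixes together, the function could look roughly like the sketch below. It keeps the question's abbr and full lists. The helper name trans_line is made up for illustration, and stopping after the first replacement on each line is an assumption borrowed from the other answer.

```python
abbr = ['LLC', 'GMBH']
full = [
    ('LIMITED LIABILITY COMPANY', 'LIMITED LIABILITY CO', 'LTEE', 'LIMITEE',
     'AD', 'AKTZIONERNO DRUZHESTVO'),
    ('GMBH CO & KG', 'MBH', 'GESELLSCHAFT MIT BESCHRANKTER HAFTUNG'),
]

def trans_line(line):
    """Replace the first long form found in one line (hypothetical helper)."""
    for short, long_forms in zip(abbr, full):
        # Fix 1: iterate over the elements of the tuple, not over characters.
        for long_form in long_forms:
            # Fix 2: substring test against a single line, not the list of lines.
            if long_form in line:
                return line.replace(long_form, short)
    return line

def trans(lines):
    """Apply trans_line to every line and return a new list."""
    return [trans_line(line) for line in lines]

if __name__ == "__main__":
    sample = ["ABC GMBH CO & KG", "DEF LIMITED LIABILITY CO", "XYZ AD", "UVW LTEE"]
    print("\n".join(trans(sample)))
```

Because this still uses plain substring matching, short keys such as AD or MBH can match inside longer words.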