Error when splitting string data with Python

Time: 2019-02-28 12:22:02

Tags: python regex split

Hi, I want to split the following string, but I am getting wrong results when splitting it.

import string

description = "ABC:PUNE COLLEGE XYZ:SATARA COLLEGE TTC ACCT KOREGAON SATARA PQR: MUMBAI TTC ACCT NUMBER 45767"

tag_list = ["ABC:", "XYZ:", "TTC ACCT", "PQR:", "TTC ACCT NUMBER"]

for each_tag in tag_list:
    if each_tag[-1] != ":":
        description = description.replace(each_tag, each_tag + ":")
print(description)

tag_list_formatted = ["ABC:", "XYZ:", "TTC ACCT:", "PQR:", "TTC ACCT NUMBER:"]

for each_new_tag in tag_list_formatted:
    description = description.replace(each_new_tag, "|" + each_new_tag)

print(description)

Please see the expected output below.

Expected Output :-
"|ABC:PUNE COLLEGE |XYZ:SATARA COLLEGE |TTC ACCT: KOREGAON SATARA |PQR: MUMBAI |TTC ACCT NUMBER: 45767"

Please see the incorrect output below.

Error Output :-
"|ABC:PUNE COLLEGE |XYZ:SATARA COLLEGE |TTC ACCT: KOREGAON SATARA |PQR: MUMBAI |TTC ACCT: NUMBER 45767"

How can I fix this error in Python? Note the |TTC ACCT: NUMBER in the incorrect output; I want |TTC ACCT NUMBER: instead.

1 answer:

Answer 0: (score: 0)

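This is how Python's str.replace(old, new[, max]) works:

old - the old substring to be replaced.

new - the new substring that replaces the old one.

max - optional; if given, only the first max occurrences are replaced (the Python documentation calls this argument count).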
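A minimal sketch of how the optional third argument changes the result. The sample string s is made up for illustration and is not the string from the question:

```python
# Hypothetical sample string, chosen so that "TTC ACCT" occurs twice,
# once on its own and once as a prefix of "TTC ACCT NUMBER".
s = "TTC ACCT here, TTC ACCT NUMBER there"

# Without the third argument, every occurrence is replaced,
# including the one inside "TTC ACCT NUMBER".
replaced_all = s.replace("TTC ACCT", "TTC ACCT:")
print(replaced_all)

# With the third argument set to 1, only the first occurrence is replaced.
replaced_first = s.replace("TTC ACCT", "TTC ACCT:", 1)
print(replaced_first)
```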

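In your case you have the tag TTC ACCT, and because you call replace without specifying max, every occurrence of TTC ACCT becomes TTC ACCT:, including the one inside TTC ACCT NUMBER. By the time the loop reaches the tag TTC ACCT NUMBER, that text has already turned into TTC ACCT: NUMBER and no longer matches. That is why you see:

TTC ACCT: NUMBER 45767

Change this line:

description = description.replace(each_tag, each_tag + ":")

to this:

description = description.replace(each_tag, each_tag + ":", 1)

which gives the following code: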

import string

description = "ABC:PUNE COLLEGE XYZ:SATARA COLLEGE TTC ACCT KOREGAON SATARA PQR: MUMBAI TTC ACCT NUMBER 45767"

tag_list = ["ABC:", "XYZ:", "TTC ACCT", "PQR:", "TTC ACCT NUMBER"]

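for each_tag in tag_list:
    # Only tags that do not already end with a colon need one appended.
    if each_tag[-1] != ":":
        # The third argument limits the replacement to the first occurrence.
        description = description.replace(each_tag, each_tag + ":", 1)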

print(description)

tag_list_formatted = ["ABC:", "XYZ:", "TTC ACCT:", "PQR:", "TTC ACCT NUMBER:"]
for each_new_tag in tag_list_formatted:
    description = description.replace(each_new_tag, "|" + each_new_tag)

print(description)
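
Note that limiting the replacement to one occurrence works here only because the standalone TTC ACCT appears in the string before TTC ACCT NUMBER. As an alternative sketch, not part of the fix above, a single regular-expression pass that tries longer tags first does not depend on that ordering. The helper name mark_tags is hypothetical, and the sketch assumes that a longer tag should always win over a tag that is its prefix:

```python
import re

description = ("ABC:PUNE COLLEGE XYZ:SATARA COLLEGE TTC ACCT KOREGAON SATARA "
               "PQR: MUMBAI TTC ACCT NUMBER 45767")
tag_list = ["ABC:", "XYZ:", "TTC ACCT", "PQR:", "TTC ACCT NUMBER"]


def mark_tags(text, tags):
    """Prefix every tag with "|" and make sure it ends with ":" (hypothetical helper)."""
    # Put longer tags first so "TTC ACCT NUMBER" is tried before "TTC ACCT".
    ordered = sorted(tags, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(tag) for tag in ordered))

    def add_markers(match):
        tag = match.group(0)
        # Append the colon only when the tag does not already have one.
        return "|" + (tag if tag.endswith(":") else tag + ":")

    # One pass over the text, so an already rewritten tag is never matched again.
    return pattern.sub(add_markers, text)


result = mark_tags(description, tag_list)
print(result)
```

Because the text is scanned once, the intermediate tag_list_formatted list is no longer needed in this variant.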