My text contains strings like the following:
{whatever}:::duplicateString:::{whatever}
{whatever}:::duplicateString:::{whatever}
....
{whatever}:::duplicateString:::{whatever}
{whatever}:::duplicateString:::{whatever}
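How can I remove duplicateString from the text? The main idea is: if it occurs more than once, remove the second word from the line.
My first idea was to read the lines one by one and split each on ":::" to get an array, then iterate over the array while adding entries to a TreeSet. Fine. But how do I glue the line back together afterwards?
I can't recall any mechanism for a task like this. The language doesn't matter, I just need a solution.
Sample text: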
Appliances:::Main
Appliances:::Main:::Appliance Warranties
Appliances:::Main:::Beer Keg Refrigerators
Appliances:::Main:::Beverage Refrigerators
Appliances:::Main:::Ceiling Fans & Accessories
Appliances:::Main:::Ceiling Fans & Accessories:::Accessories
Appliances:::Main:::Ceiling Fans & Accessories:::Accessories:::Downrod Couplers
Appliances:::Main:::Ceiling Fans & Accessories:::Accessories:::Downrods
Appliances:::Main:::Ceiling Fans & Accessories:::Accessories:::Fan Replacement Blades
Ideally, it should look like this:
Appliances:::Main
Appliances:::Appliance Warranties
Appliances:::Beer Keg Refrigerators
Appliances:::Beverage Refrigerators
Appliances:::Ceiling Fans & Accessories
Appliances:::Ceiling Fans & Accessories:::Accessories
Appliances:::Ceiling Fans & Accessories:::Accessories:::Downrod Couplers
Appliances:::Ceiling Fans & Accessories:::Accessories:::Downrods
Appliances:::Ceiling Fans & Accessories:::Accessories:::Fan Replacement Blades
Answer 0 (score: 1)
If duplicateString can only appear as the second word, you could do this (in Python):
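lastWord = None
with open('file.txt') as f:
    for line in f:
        # Strip the trailing newline so the last field compares cleanly
        w = line.rstrip('\n').split(':::')
        # Lines without a separator have no second word
        thisWord = w[1] if len(w) > 1 else None
        # Drop the second word when it repeats the previous line's second word
        if thisWord is not None and thisWord == lastWord:
            del w[1]
        lastWord = thisWord
        print(':::'.join(w))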
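If the repeated word is not guaranteed to sit on consecutive lines, a set of already-seen second words (the same idea as the TreeSet in the question) may work better. The following is only a sketch under that assumption, not the answer's method: the function name `drop_repeated_second_word` and the `sep` parameter are made up for illustration, and it keeps the first occurrence of each second word while removing later ones.

```python
def drop_repeated_second_word(lines, sep=":::"):
    """Remove the second field from a line once that field has been seen before.

    Hypothetical helper: the name and the `sep` parameter are illustrative only.
    """
    seen = set()    # second words encountered so far
    result = []
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) > 1:
            second = fields[1]
            if second in seen:
                del fields[1]       # repeated: drop it from this line
            else:
                seen.add(second)    # first occurrence: keep it
        # sep.join() glues the remaining fields back into one line
        result.append(sep.join(fields))
    return result


sample = [
    "Appliances:::Main",
    "Appliances:::Main:::Appliance Warranties",
    "Appliances:::Main:::Ceiling Fans & Accessories:::Accessories",
]
for cleaned in drop_repeated_second_word(sample):
    print(cleaned)
```

The `sep.join(fields)` call is what glues a split line back together, which is the step the question asks about. Note that this variant would also strip a second word that legitimately recurs under a different first word, so it may need a per-prefix set if that matters for your data.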