I want to count the distinct substrings in a string using Python. Any suggestions?
The dataset is:
Fruit
apple|pear|grape|apple
pear|pear|apple|apple
I want to count the unique fruits in each row. The result I want is:
Fruit                     Num_Fruit
------------------------------------
apple|pear|grape|apple    3
pear|pear|apple|apple     2
Answer 0 (score: 0)
Here is one way to do it.
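data = """
apple|pear|grape|apple
pear|pear|apple|apple
""".strip()

# Split the text into rows, then split each row into its fruits
merged = [x.split("|") for x in data.split("\n")]
for row in merged:
    # Rebuild the original row and append the number of distinct fruits
    temp_val = ["|".join(row), str(len(set(row)))]
    print("|".join(temp_val))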
Output:

apple|pear|grape|apple|3
pear|pear|apple|apple|2
Answer 1 (score: 0)
# test.txt holds the dataset, one row per line
with open("test.txt") as f:
    for line in f:
        line = line.rstrip()           # remove the trailing "\n"
        line = line.replace(" ", "")   # remove extra spaces
        fruits = line.split("|")       # split by the delimiter
        print(line, len(set(fruits)))  # a set keeps only the distinct fruits
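One way is to use the set data structure: converting the split list to a set drops the duplicates, so the length of the set is the number of distinct fruits.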
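The set-based approach can also be wrapped in a small reusable function. This is only a minimal sketch: the name `count_unique` and its `sep` parameter are made up for illustration and do not come from either answer, and it assumes that surrounding whitespace and empty fields should be ignored.

```python
def count_unique(row, sep="|"):
    """Return the number of distinct non-empty items in a delimited string.

    Hypothetical helper: the name and the `sep` parameter are illustrative.
    """
    # Strip whitespace around each item so " apple" and "apple" count as one
    items = [item.strip() for item in row.split(sep)]
    # A set comprehension keeps one copy of each item; empty fields are skipped
    return len({item for item in items if item})


rows = [
    "apple|pear|grape|apple",
    "pear|pear|apple|apple",
]

for row in rows:
    print(row, count_unique(row))
# apple|pear|grape|apple 3
# pear|pear|apple|apple 2
```

Keeping the counting logic in one function makes it easy to apply to rows read from a file or from any other source.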