计算Python中不同子字符串的出现次数

时间:2018-02-21 02:43:00

标签: python distinct

我想用Python找到字符串中不同子字符串的出现。有什么建议吗?

数据集是

Fruit                     
apple|pear|grape|apple      
pear|pear|apple|apple       

我想要算上那种独特的水果。我想要的结果是:

Fruit                   Num_Fruit
------------------------------------
apple|pear|grape|apple       3
====================================
pear|pear|apple|apple        2

2 个答案:

答案 0 :(得分:0)

一个人去吧。

data = """
apple|pear|grape|apple
pear|pear|apple|apple
""".strip()

merged = [x.split("|") for x in data.split("\n")]

for row in merged:
  temp_val = ["|".join(row), str(len(set(row)))]
  print("|".join(temp_val))


apple|pear|grape|apple|3
pear|pear|apple|apple|2

答案 1 :(得分:0)

f=open("test.txt")   #has the dataset
lines=f.readlines()

for line in lines:
    unique=[]           #store unique fruits
    line=line.rstrip()  #remove "\n"
    line=line.replace(" ",'')  #remove extra spaces
    fruits=line.split('|')  #split by delimiter 
    print(line,len(set(fruits)))

一种方法是使用set数据结构。