How do I remove duplicate lines from a string and then print the number of lines removed?
Here is what I have:
import os
sentence = """Sentence1
Sentence1
Sentence2
Sentence3
Sentence4
Sentence4"""
spaces = sentence.replace(" ", "\n") #Makes one word per line
lines = os.linesep.join([s for s in spaces.splitlines() if s]) #Removes empty lines
duplicate = "\n".join(set(lines.split('\n'))) #Removes duplicate lines
numberlines = len(duplicate.split('\n')) #Counts lines
print(duplicate)
print('Lines:', numberlines)
This outputs:
Sentence4
Sentence1
Sentence2
Sentence3
Lines: 4
How can I get this output instead:
Sentence4
Sentence1
Sentence2
Sentence3
Lines: 4
Removed Lines: 2
Thanks :D
Answer 0 (score: 1)
You can use set:
Removed_lines = len(lines.split("\n")) - len(set(lines.split("\n")))
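A minimal sketch of how that one-liner plays out on the sample input, assuming `lines` is the newline-joined string built in the question (the names `all_lines` and `unique_lines` are only illustrative):

```python
# Sample input from the question, already one entry per line
lines = "Sentence1\nSentence1\nSentence2\nSentence3\nSentence4\nSentence4"

all_lines = lines.split("\n")   # every line, duplicates included
unique_lines = set(all_lines)   # a set keeps a single copy of each line

# The difference between the two counts is the number of removed lines
removed_lines = len(all_lines) - len(unique_lines)
print("Removed Lines:", removed_lines)
```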
Answer 1 (score: 1)
Let's go through your code line by line:
spaces = sentence.replace(" ", "\n") #Makes one word per line
So far, so good.
lines = os.linesep.join([s for s in spaces.splitlines() if s]) #Removes empty lines
OK, so you remove the empty lines, but it would be better to keep the result as a list rather than gluing it back into a string, because...:
duplicate = "\n".join(set(lines.split('\n'))) #Removes duplicate lines
...here you split it again, then join the result into a string again...
numberlines = len(duplicate.split('\n')) #Counts lines
...only to split it once more. A better version:
spaces = sentence.split()  # Makes one word per item (splits on any whitespace)
lines = [s for s in spaces if s]  # Removes empty strings (split() already yields none)
duplicate = set(lines)  # Removes duplicate lines (order is not preserved)
numberlines = len(duplicate)  # Counts the unique lines
removed_lines = len(lines) - numberlines  # Counts the removed duplicates
print('\n'.join(duplicate))
print('Lines:', numberlines)
print('Removed:', removed_lines)
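Because a set has no defined order, the lines above may print in any order. If the original order matters, one possible variant is sketched below. It assumes Python 3.7 or later, where dicts preserve insertion order, and `dedupe_lines` is a hypothetical helper name, not something from the question:

```python
def dedupe_lines(text):
    """Return (unique_lines, removed_count), keeping first-seen order."""
    words = text.split()  # one item per whitespace-separated word, no empties
    # dict.fromkeys keeps the first occurrence of each key, in insertion order
    unique = list(dict.fromkeys(words))
    return unique, len(words) - len(unique)


sentence = """Sentence1
Sentence1
Sentence2
Sentence3
Sentence4
Sentence4"""

unique, removed = dedupe_lines(sentence)
print("\n".join(unique))
print("Lines:", len(unique))
print("Removed Lines:", removed)
```

The counting logic is the same as in the set version; only the container changes.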
Answer 2 (score: 0)
import os
sentence = """Sentence1
Sentence1
Sentence2
Sentence3
Sentence4
Sentence4"""
spaces = sentence.replace(" ", "\n")
lines = os.linesep.join([s for s in spaces.splitlines() if s])
duplicate = "\n".join(set(lines.split('\n')))
numberlinesprev = len(lines.split('\n'))  # Line count before removing duplicates
numberlines = len(duplicate.split('\n'))  # Line count after removing duplicates
removed = numberlinesprev - numberlines  # Named 'removed' to avoid shadowing the built-in sum()
print(duplicate)
print('Lines Removed:', removed)
print('Lines:', numberlines)
Output:
Sentence4
Sentence1
Sentence2
Sentence3
Lines Removed: 2
Lines: 4
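If it also helps to know which lines were dropped, the same counts can be derived with `collections.Counter` from the standard library. This is only a sketch built on the sample input, and `removed_per_line` is an illustrative name:

```python
from collections import Counter

sentence = """Sentence1
Sentence1
Sentence2
Sentence3
Sentence4
Sentence4"""

# Count how many times each whitespace-separated line occurs
counts = Counter(sentence.split())

# Every occurrence beyond the first is one that deduplication removes
removed_per_line = {line: n - 1 for line, n in counts.items() if n > 1}
removed_total = sum(removed_per_line.values())

print("Lines Removed:", removed_total)
print("Lines:", len(counts))
```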