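I'm having some trouble with my code. I am trying to find doubled words in a file, for example a doubled "the", and then print the line each one occurs on. So far my code gets the line count right, but it reports every word that is repeated anywhere in the file, not just the words that appear twice in a row.
What do I need to change so that it only counts doubled words?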
my_file = input("Enter file name: ")
lst = []
count = 1
with open(my_file, "r") as dup:
    for line in dup:
        linedata = line.split()
        for word in linedata:
            if word not in lst:
                lst.append(word)
            else:
                print("Found word: {} on line {}".format(word, count))
        count = count + 1
dup.close()
Answer 0: (score: 1)
my_file = input("Enter file name: ")
with open(my_file, "r") as dup:
    # start=1 so that line_num matches human-readable line numbers
    for line_num, line in enumerate(dup, start=1):
        words_in_line = line.split()
        # words_in_line[i] is the word directly before `word`
        duplicates = [word for i, word in enumerate(words_in_line[1:]) if words_in_line[i] == word]
        # `duplicates` now holds the doubled words found on this line
        # do whatever you want with it
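As one way to "do whatever you want" with the result, here is a small sketch that reports each doubled word together with its line number. It pairs neighbouring words with `zip`, which is an equivalent formulation of the index-based comprehension above. The function names `doubled_words` and `report_doubles` and the sample lines are illustrative assumptions, not part of the original answer.

```python
def doubled_words(line):
    """Return every word in `line` that repeats the word directly before it."""
    words = line.split()
    # zip pairs each word with its successor: (w0, w1), (w1, w2), ...
    return [cur for prev, cur in zip(words, words[1:]) if prev == cur]


def report_doubles(lines):
    """Collect (line_number, word) pairs; line numbers start at 1."""
    found = []
    for line_num, line in enumerate(lines, start=1):
        for word in doubled_words(line):
            found.append((line_num, word))
    return found


# Hypothetical sample input; an open file object works the same way,
# because iterating over it also yields one line at a time.
sample = ["Paris in the the spring", "no repeats here", "it is is what it is"]
for line_num, word in report_doubles(sample):
    print("Found word: {} on line {}".format(word, line_num))
```

With the sample above, only the "the" on line 1 and the "is" on line 3 are reported. The second "is" at the end of line 3 is not, because the word before it is "it".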
Answer 1: (score: 0)
Put the code below in a file named THISfile.py and run it to see what it does:
# myFile = input("Enter file name: ")
# line No 2: line with with double 'with'
# line No 3: double ( word , word ) is not a double word
myFile="THISfile.py"
lstUniqueWords = []
noOfFoundWordDoubles = 0
totalNoOfWords = 0
lineNo = 0
lstLineNumbersWithWordDoubles = []
with open(myFile, "r") as myFile:
    for line in myFile:
        lineNo+=1 # memorize current line number
        lineWords = line.split()
        if len(lineWords) > 0: # scan line only if it contains words
            currWord = lineWords[0] # remember already 'visited' word
            totalNoOfWords += 1
            if currWord not in lstUniqueWords:
                lstUniqueWords.append(currWord)
                # put 'visited' word word into lstAllWordsINmyFile (if it is not already there)
            lastWord = currWord # we are done with current, so current becomes last one
            if len(lineWords) > 1 : # proceed only if line has two or more words
                for word in lineWords[1:] : # loop over all other words
                    totalNoOfWords += 1
                    currWord = word
                    if currWord not in lstUniqueWords:
                        lstUniqueWords.append(currWord)
                        # put 'visited' word into lstAllWordsINmyFile (if it is not already there)
                    if( currWord == lastWord ): # duplicate word found:
                        noOfFoundWordDoubles += 1
                        print("Found double word: ['{""}'] in line {}".format(currWord, lineNo))
                        lstLineNumbersWithWordDoubles.append(lineNo)
                    lastWord = currWord
                    # ^--- now after all all work is done, the currWord is considered lastWord
print(
    "noOfDoubles", noOfFoundWordDoubles, "\n",
    "totalNoOfWords", totalNoOfWords, "uniqueWords", len(lstUniqueWords), "\n",
    "linesWithDoubles", lstLineNumbersWithWordDoubles
)
The output should be:
Found double word: ['with'] in line 2
Found double word: ['word'] in line 19
Found double word: ['all'] in line 33
noOfDoubles 3
totalNoOfWords 221 uniqueWords 111
linesWithDoubles [2, 19, 33]
Now you can read through the comments in the code to get a better understanding of how it works.
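The same bookkeeping can be written more compactly. The sketch below keeps the previous-word comparison but stores unique words in a `set`, so the membership test does not slow down as the word list grows. The function name `word_stats` and its return shape are assumptions made for illustration; it takes any iterable of lines, including an open file.

```python
def word_stats(lines):
    """Return (doubles, total_words, unique_word_count, lines_with_doubles)."""
    unique_words = set()
    total_words = 0
    doubles = 0
    lines_with_doubles = []
    for line_no, line in enumerate(lines, start=1):
        last_word = None  # reset per line, so a double never spans two lines
        for word in line.split():
            total_words += 1
            unique_words.add(word)  # a set ignores words it already holds
            if word == last_word:
                doubles += 1
                lines_with_doubles.append(line_no)
            last_word = word
    return doubles, total_words, len(unique_words), lines_with_doubles


# Hypothetical three-line input, including an empty line.
print(word_stats(["a a b", "", "b b b"]))
```

As in the longer version, a line number is appended once per double found, so a line with two doubles appears twice in the list.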
Answer 2: (score: 0)
Here is just a direct answer to the question as asked:
"What do I need to change so that it only counts doubled words?"
Here you go:
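A minimal sketch of such a change, assuming "doubled" means the same word twice in a row on one line: replace the file-wide list `lst` from the question with a variable that holds only the previous word. The function name `find_doubled` and the returned list are illustrative choices, not taken from the question.

```python
def find_doubled(lines):
    """Print and return (word, line_number) for each immediately repeated word."""
    results = []
    count = 1  # line counter, as in the question
    for line in lines:
        previous = None  # replaces `lst`: only the word just seen matters
        for word in line.split():
            if word == previous:
                print("Found word: {} on line {}".format(word, count))
                results.append((word, count))
            previous = word
        count = count + 1
    return results


# With a real file, pass the open file object instead of a list:
#     with open(my_file, "r") as dup:
#         find_doubled(dup)
find_doubled(["this is the the line\n", "the end\n", "so so good good\n"])
```

The "the" on the second sample line is not reported even though the word occurred earlier in the input, which is exactly the difference from the original `lst` approach.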