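I have written this code to find the pairs of artists that appear together more than 50 times in my text file. The artists' names are separated by commas. I can't figure out why my program isn't working properly: the names seem to be extracted correctly, but they come out distorted when printed, with some of their characters missing.
Here is my code: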
# Program written in JetBrains PyCharm Community Edition with `Anaconda (Python version 3.5)`
# Dhruv Marwha
filename = "Artist_lists_small.txt"  # Name of the input file
with open(filename, encoding='utf8') as infile:
    # All unique artist names, found by splitting the whole file on ','.
    something = list(set(infile.read().split(",")))
dictionary = {}  # Maps each artist name to the line numbers it appears on
with open(filename, encoding='utf8') as f:
    # If a name from our list is present in a line, add that line number to the name's list of values.
    for line_num, xyz in enumerate(f):
        for name in something:
            if name in xyz:
                dictionary.setdefault(name, []).append(line_num)
# Remove all keys whose list of values has fewer than 50 entries.
# If a name occurs fewer than 50 times, there's no point in making combinations with it.
for key, value in list(dictionary.items()):
    if len(value) < 50:
        del dictionary[key]
pair_list = []  # Pairs that occur together more than 50 times
for key, value in dictionary.items():
    # Compare each key with every other key to find the intersection of their line-number lists.
    for key1, value1 in dictionary.items():
        if key == key1:
            continue
        # If the two lists share more than 50 line numbers, the pair is valid.
        if len(set.intersection(set(value), set(value1))) > 50:
            pair_list.append((key, key1))
for pair in pair_list:  # Print the list of pairs
    print(pair)
# print(len(pair_list))  # Number of pairs that occur more than 50 times
#############################################################################################################
# Time complexity: O(n^2)
# Space complexity: O(n)
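One thing that may contribute to the distorted names, assuming the file holds one comma-separated list of artists per line: splitting the entire file on ',' does not split at line breaks, so the last name on one line and the first name on the next end up in a single token. Separately, `in` applied to a string is a substring test, so a short name also matches inside a longer one. The small sketch below illustrates both effects on a hypothetical two-line sample (the artist names are made up for illustration and are not taken from the real file).

```python
# Hypothetical two-line sample standing in for the real file.
sample = "Radiohead,Beck\nJeff Beck,Air\n"

# Splitting the whole text on ',' leaves the line break inside a token.
whole_file_tokens = sample.split(",")

# Splitting line by line and stripping whitespace yields clean names.
per_line_names = {
    name.strip()
    for line in sample.splitlines()
    for name in line.split(",")
    if name.strip()
}

# `in` on a string is a substring test, so "Beck" also matches "Jeff Beck".
substring_hit = "Beck" in "Jeff Beck,Air"
# Membership in the list of exact names on the line does not match.
exact_hit = "Beck" in [n.strip() for n in "Jeff Beck,Air".split(",")]

print(whole_file_tokens)
print(sorted(per_line_names))
print(substring_hit, exact_hit)
```

Here the second token of `whole_file_tokens` spans two lines, while `per_line_names` contains only the four clean names.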
Here's the link to the relevant text file.
Also, please tell me whether this program is actually producing the correct output, since it appears to be.
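For comparison, here is a minimal sketch of a per-line way to count co-occurring pairs. It assumes each line is one comma-separated list of artists and that a pair counts once per line. The names `frequent_pairs` and `threshold` are hypothetical and do not come from the code in the post. This is one possible approach, not necessarily the intended one.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(lines, threshold=50):
    """Return the pairs of names that share more than `threshold` lines."""
    counts = Counter()
    for line in lines:
        # Exact names on this line, de-duplicated and sorted so that
        # (a, b) and (b, a) are counted as the same pair.
        names = sorted({name.strip() for name in line.split(",") if name.strip()})
        counts.update(combinations(names, 2))
    return sorted(pair for pair, n in counts.items() if n > threshold)


# Tiny made-up input: "Air" and "Beck" share four lines, the other pairs three.
example_lines = ["Radiohead,Beck,Air"] * 3 + ["Beck,Air"]
result = frequent_pairs(example_lines, threshold=3)
print(result)
```

Passing an open file object in place of the list should work the same way, since iterating over a file yields one line at a time and the trailing newline is removed by `strip()`. Each pair is reported once rather than in both orders.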