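I have tried many potential solutions without success, so I think I'm missing something. I tried taking the values from the dictionary and calling set(x) on them, but that didn't work. I tried using defaultdict, also with no luck. I tried a few other options as well. I believe I'm implementing it incorrectly, but I'm not sure why. Any suggestions? Here is my code: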
import re
import string


def open_file(f):
    with open(input("Enter a file name: "), 'r') as inf:
        #while True:
            #try:
            #    filename=input("Enter a filename: ").strip()
            #    fr=open(filename,"r")
            #    break
            #except IOError:
            #    print("File "+filename+" does not exist. Please reenter")
        #
        lines = 0
        while 1:
            lines += 1
            f = inf.readline()
            f = f.lower()
            # drop one- and two-letter words, then strip punctuation
            f = re.sub(r'\b\w{1,2}\b', '', f)
            f = "".join(i for i in f if i not in string.punctuation)
            #print(lines, f)
            #print(type(lines))
            print_list([f], lines)
            if not f:
                break
#
def print_list(f, lines):
    f = list(filter(None, f))
    for k in range(0, len(f)):
        l = (f[k])
        l = l.strip(' ')
        l = l.strip()
        #print(l, lines )
        #print(type(lines))
        read_data(l, lines)
#
def read_data(l, lines):
    l = [l]
    l = [x for x in l if x]
    d = list(filter(None, l))
    print_list2(d, lines)
this_dic = {}  # word -> line numbers; must exist before print_list2 runs

def print_list2(f, lines):
    global this_dic
    for k in range(0, len(f)):
        my_dictionary = f[k]
        word = my_dictionary.split()
        word = tuple(word)
        # Here is where the dictionary is created: a regular dictionary whose
        # keys are words and whose values are lists. I want to change the list
        # values into sets so that duplicates are removed. I am reading a file
        # line by line and then building a dictionary of the words that appear
        # and the line numbers they appear on. Later I will be asking for user
        # input to search, so if they enter "the" I will say the word "the"
        # exists on line numbers 1, 3, 4, etc. in this file.
        for i in range(0, len(word)):
            if word[i] not in this_dic:
                this_dic[word[i]] = [lines]
            else:
                this_dic[word[i]].append(lines)
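Below is what I found to be the correct way to do this. I have left the code above as it was to show the difference between what I started with and the solution I eventually found.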
for i in range(0, len(word)):
    if word[i] not in this_dic:
        this_dic[word[i]] = set()
    this_dic[word[i]].add(lines)
r = open_file(input("Press Any Key To Begin: You Will Enter A File Name During "
                    "The Next Prompt "))
def find_cooccurence(inp_str, D):
    spit_str = inp_str.split()
    d = set(tuple(D.values()))
    print(spit_str, d)
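For reference, the set-based fix above can also be written with collections.defaultdict(set), which removes the explicit "not in" check. This is only a minimal sketch: the names build_index and sample are made up for illustration, it takes a list of strings instead of reading a file, and unlike the code above it does not drop one- and two-letter words.

```python
from collections import defaultdict
import string


def build_index(lines):
    """Map each word to the set of 1-based line numbers it appears on."""
    index = defaultdict(set)  # missing keys start out as an empty set
    strip_punct = str.maketrans("", "", string.punctuation)
    for line_num, line in enumerate(lines, start=1):
        for word in line.lower().translate(strip_punct).split():
            index[word].add(line_num)  # a set ignores repeated line numbers
    return index


# Hypothetical sample input standing in for the lines of a file.
sample = ["The cat saw the dog.", "A dog barked.", "The end."]
index = build_index(sample)
print(sorted(index["the"]))  # [1, 3]
print(sorted(index["dog"]))  # [1, 2]
```

Because the values are sets, "the" is recorded once for line 1 even though it occurs twice there.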
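Answer 0 (score: 0)
Update: This code opens a file, reads it, and builds a dictionary whose keys are line numbers and whose values are the sets of whitespace-separated words on each line.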
with open(fpath, 'r') as readf_obj:
    line_words = {line_num+1: set(line.split()) for line_num, line in enumerate(readf_obj)}
print(line_words)
Is this what you're looking for?
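If the goal is to find the lines on which several words occur together (a guess based on the name find_cooccurence in the question), a dictionary shaped like line_words can be queried with a subset test. This is a rough sketch under that assumption; lines_containing_all and the sample data are hypothetical.

```python
def lines_containing_all(query, line_words):
    """Return the sorted line numbers whose word set contains every query word."""
    wanted = set(query.lower().split())
    # wanted <= words is True when every wanted word is in that line's set
    return sorted(n for n, words in line_words.items() if wanted <= words)


# Hypothetical data in the same shape as line_words above.
line_words = {
    1: {"the", "cat", "sat"},
    2: {"the", "dog"},
    3: {"cat", "and", "the", "dog"},
}
print(lines_containing_all("the cat", line_words))  # [1, 3]
```

Note that an empty query matches every line, since the empty set is a subset of any set.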
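Original:
You can call set(a_dict.values()) directly; you don't have to convert it to a tuple first.
That said, your question doesn't make clear what you are trying to do, and posting all of your code without any comments or hints about which part to focus on doesn't help either.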
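One caveat on set(a_dict.values()): it only works when the values are hashable. If the values are themselves sets or lists, as in the question's dictionary, Python raises a TypeError, with or without the tuple() call. Assuming what is wanted is every line number across all words, a union is one way to get it; word_lines below is a made-up example.

```python
# Hypothetical word -> line-number sets, shaped like the question's dictionary.
word_lines = {"the": {1, 3}, "dog": {1, 2}}

try:
    set(word_lines.values())  # the values are sets, which are unhashable
except TypeError as exc:
    print("TypeError:", exc)

# Merge all of the value sets into a single set of line numbers.
all_lines = set().union(*word_lines.values())
print(sorted(all_lines))  # [1, 2, 3]
```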