Updating a list as tuples of keys and values: converting list values into sets

Time: 2017-11-21 04:17:56

Tags: python python-3.x list dictionary set

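I have tried implementing a number of potential solutions without success, so I think I am missing something. I tried taking the values from the dictionary and calling set(x) on them, but that did not work. I also tried defaultdict, with no luck, along with a few other options. I am sure I am implementing it incorrectly, but I am not sure why. Any suggestions? Here is my code: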

import re
import string

def open_file(f):
    # Earlier attempt, kept for reference:
    # while True:
    #     try:
    #         filename = input("Enter a filename: ").strip()
    #         fr = open(filename, "r")
    #         break
    #     except IOError:
    #         print("File " + filename + " does not exist. Please reenter")
    with open(input("Enter a file name: "), 'r') as inf:
        lines = 0
        while True:
            f = inf.readline()
            if not f:  # an empty string means end of file
                break
            lines += 1
            f = f.lower()
            f = re.sub(r'\b\w{1,2}\b', '', f)  # drop one- and two-character words
            f = "".join(i for i in f if i not in string.punctuation)
            print_list([f], lines)

def print_list(f, lines):
    f = list(filter(None, f))  # drop empty strings
    for k in range(0, len(f)):
        l = f[k].strip()  # strip() removes all surrounding whitespace
        read_data(l, lines)

def read_data(l, lines):
    l = [l]
    l = [x for x in l if x]
    d = list(filter(None, l))
    print_list2(d, lines)

this_dic = {}  # word -> line numbers; must exist before print_list2 runs

def print_list2(f, lines):
    global this_dic

    for k in range(0, len(f)):
        my_dictionary = f[k]
        word = tuple(my_dictionary.split())

        # Here is where the dictionary is created: a regular dictionary whose
        # keys are words and whose values are lists. I want to change the list
        # values into set values so that duplicates are removed. I am reading a
        # file line by line and then creating a dictionary of the words that
        # appear and the line numbers they appear on. Later I will be asking for
        # user input to search, so if they enter "the" I will say the word "the"
        # exists on line numbers 1, 3, 4, etc. in this file.
        for i in range(0, len(word)):
            if word[i] not in this_dic:
                this_dic[word[i]] = [lines]
            else:
                this_dic[word[i]].append(lines)

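Below is what I found to be the correct way to do this. I have left the code above in place to show the difference between what I started with and the solution I eventually found.
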
        for i in range(0, len(word)):
            if word[i] not in this_dic:
                this_dic[word[i]] = set()
            this_dic[word[i]].add(lines)
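
For comparison, the same word-to-line-numbers index can be sketched with collections.defaultdict(set), which removes the need for the membership check. This is only a minimal sketch: the function name build_index and the sample lines are made up for illustration, and the filtering simply mirrors the code above (lowercase, drop one- and two-character words, strip punctuation).

```python
import re
import string
from collections import defaultdict


def build_index(lines):
    """Map each word to the set of 1-based line numbers it appears on.

    build_index is a hypothetical helper, not part of the original code.
    """
    index = defaultdict(set)
    for line_num, line in enumerate(lines, 1):
        line = line.lower()
        line = re.sub(r'\b\w{1,2}\b', '', line)  # drop one- and two-character words
        line = "".join(ch for ch in line if ch not in string.punctuation)
        for word in line.split():
            index[word].add(line_num)  # a set ignores repeated line numbers
    return index


sample = ["The cat and the dog.", "A dog!", "The end"]  # made-up sample text
index = build_index(sample)
print(sorted(index["the"]))  # [1, 3]
print(sorted(index["dog"]))  # [1, 2]
```

Passing an open file object instead of the list should work the same way, since iterating over a file yields its lines.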


r = open_file(input("Press Any Key To Begin: You Will Enter A File Name During "
                    "The Next Prompt "))

def find_cooccurence(inp_str, D):
    spit_str = inp_str.split()
    d = set(tuple(D.values()))

    print(spit_str, d)
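
One possible reading of find_cooccurence, assuming D maps each word to a set of line numbers and the goal is to report the lines on which every word of the input appears, is to intersect the per-word sets. The name lines_with_all and the sample dictionary below are assumptions for illustration, not part of the original code.

```python
def lines_with_all(inp_str, D):
    """Return the line numbers on which every word of inp_str appears.

    Assumes D maps word -> set of line numbers (hypothetical helper).
    """
    words = inp_str.lower().split()
    if not words:
        return set()
    # Unknown words contribute an empty set, so the intersection is empty.
    sets = [D.get(w, set()) for w in words]
    return set.intersection(*sets)


D = {"the": {1, 3, 4}, "dog": {1, 2}, "end": {3}}  # made-up sample index
print(sorted(lines_with_all("the dog", D)))  # [1]
print(sorted(lines_with_all("the", D)))      # [1, 3, 4]
print(sorted(lines_with_all("the cow", D)))  # []
```

The intersection returns a new set, so the sets stored in D are left unchanged.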

1 Answer:

Answer 0 (score: 0)

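Update: This code opens a file, reads it, and builds a dictionary whose keys are line numbers and whose values are the sets of whitespace-separated words on each line.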

with open(fpath, 'r') as readf_obj:
    line_words = {line_num+1: set(line.split()) for line_num, line in enumerate(readf_obj)}
print(line_words)

Is this what you are looking for?

Original:
You can call set(a_dict.values()) directly; you don't have to convert it to a tuple first.
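
One caveat worth illustrating: set(a_dict.values()) only works when the values are hashable. With list or set values, as in the question, it raises TypeError. Two possible alternatives are merging the values with set().union(...) or converting each value to its own set. The dictionary below is made-up sample data.

```python
a_dict = {"the": [1, 3, 3], "dog": [1, 2]}  # sample data with lists as values

try:
    set(a_dict.values())
except TypeError as exc:
    print("TypeError:", exc)  # lists are unhashable

# Option 1: merge every value into one set of line numbers.
all_lines = set().union(*a_dict.values())
print(sorted(all_lines))  # [1, 2, 3]

# Option 2: convert each value to a set, keeping one set per word.
per_word = {word: set(nums) for word, nums in a_dict.items()}
print(sorted(per_word["the"]))  # [1, 3]
```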

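That said, your question does not make clear what you are trying to do, and posting all of your code without any comments or hints about which part to focus on does not help either.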