Co-occurrence without using a matrix

Date: 2017-11-19 11:25:47

Tags: python python-3.x

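I am trying to solve a co-occurrence problem without using a matrix. I take a text file, lowercase every character, remove the punctuation, and split the text into a list. Then, for each word on the first line, I want the word in a dictionary with the word as the key and the line number as the value. For the second line I want each word as the key and 2 as the value, and so on. I can't figure out how to increment the value to the line number. I tried a for loop and it failed. I also can't figure out how to write a while-loop accumulator, because I'm not sure what I could compare against with x <, > and so on: len(dict) doesn't work, len(list) doesn't work, and I'm not sure what I need. I have been stuck for 7 hours.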

import re
import string
string.punctuation


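def open_file(f):
    with open(input("Enter a file name: "), 'r') as inf:
        # Read the file one line at a time until readline() returns ''.
        while 1:
            f = inf.readline()
            f = f.lower()
            # Drop words of one or two characters, then strip punctuation.
            f = re.sub(r'\b\w{1,2}\b', '', f)
            f = "".join(i for i in f if i not in string.punctuation)
            print_list([f])

            if not f:
                break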

def print_list(f):
    f = list(filter(None, f))
    for k in range(0, len(f)):
        l = (f[k])
        l = l.strip('  ')
        l = l.strip()
        read_data(l)

def read_data(l):
    l = [l]
    l = [x for x in l if x != '']
    d = list(filter(None, l))
    print_list2(d)

def print_list2(f):
    l = 0  # meant to be the line number, but I can't work out how to increment it
    for k in range(0, len(f)):

        my_dictionary = f[k]
        word = my_dictionary.split()

        this_dic = {word[k] : {l} for k in range(0, len(word))}
        print(this_dic)


r = open_file(input("Press Any Key To Begin: You Will Enter A File Name "
                    "During The Next Prompt "))

1 answer:

Answer 0 (score: 0)

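Here is the solution I found. Basically, I was trying to loop later than I should have. The idea is to count inside the while loop that calls readline, then pass the count as an argument through the functions that follow, until it is needed to set the dictionary's values.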

import re
import string
string.punctuation


def open_file(f):
    with open(input("Enter a file name: "), 'r') as inf:
        lines = 0  # running line number, incremented once per readline() call
        while 1:
            lines += 1
            f = inf.readline()
            f = f.lower()
            # Drop words of one or two characters, then strip punctuation.
            f = re.sub(r'\b\w{1,2}\b', '', f)
            f = "".join(i for i in f if i not in string.punctuation)

            print_list([f], lines)

            if not f:
                break

def print_list(f, lines):
    f = list(filter(None, f))
    for k in range(0, len(f)):
        l = (f[k])
        l = l.strip('  ')
        l = l.strip()
        read_data(l, lines)
#

def read_data(l, lines):
    l = [l]
    l = [x for x in l if x != '']
    d = list(filter(None, l))
    print_list2(d, lines)

def print_list2(f, lines):

    for k in range(0, len(f)):

        my_dictionary = f[k]
        word = my_dictionary.split()

        this_dic = {word[k] : {lines} for k in range(0, len(word))}
        print(this_dic)


r = open_file(input("Press Any Key To Begin: You Will Enter A File Name "
                    "During The Next Prompt "))
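
The code above prints a separate dictionary for every line. If the goal is one dictionary covering the whole file, a more compact sketch of the same counting idea could look like the following. It is only an illustration under my own assumptions: `build_line_index` and the sample sentences are hypothetical names and data, not part of the original program, and `enumerate(..., start=1)` stands in for the manual `lines += 1` counter.

```python
import re
import string
from collections import defaultdict


def build_line_index(lines):
    """Map each word to the set of 1-based line numbers it appears on."""
    index = defaultdict(set)
    # enumerate(..., start=1) replaces the hand-written line counter.
    for line_number, line in enumerate(lines, start=1):
        line = line.lower()
        # Same cleaning as above: drop 1-2 character words, then punctuation.
        line = re.sub(r'\b\w{1,2}\b', '', line)
        line = "".join(ch for ch in line if ch not in string.punctuation)
        for word in line.split():
            index[word].add(line_number)
    return dict(index)


# Hypothetical sample input; an open file object can be passed the same way.
sample = ["The cat sat.", "The dog sat, too!", "A cat and a dog."]
index = build_line_index(sample)
print(index)
# Lines on which both words occur: the intersection of their sets.
print(index["cat"] & index["dog"])
```

Because a file object yields its lines when iterated, `build_line_index(inf)` inside a `with open(...) as inf:` block should work the same way as with the list above. Storing sets of line numbers also means two words co-occur wherever their sets intersect, so no matrix is needed.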