Question

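Write a program that creates a concordance file: an index that tells you on which line of a file each word appears. Name the function concord, and have it accept the input file name as its argument. Write the output to a file named concord.txt. If a word occurs on more than one line, the concordance should list every line that contains it. Hint: use a dictionary keyed by word to solve this problem. For example,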

If the input file contains:

I went to the restaurant yesterday.  Hello, I said, to the man who
greeted me at the door.  Where is your restroom?  On my way to the
restroom, I bumped into the waiter boy.  Excuse me, sir, I said.
When I returned to the table, the meal was served.  These are the
best clams I have ever eaten, I said.  My compliments to the chef.
Unfortunately, I was arrested by the officer for not paying my bill.

it produces:

clams  [5]
   is  [2]
 chef  [5]
 ever  [5]
   at  [2]
 have  [5]
table  [4]
 your  [2]
 best  [5]
  sir  [3]
 said  [1, 3, 5]
  for  [6]
  boy  [3]
 when  [4]
   by  [6]
   to  [1, 2, 4, 5]
  way  [2]
  was  [4, 6]
  ...

L08-8) (5 points) Same code as above, but the words in the concordance should be sorted when printed.

This is what I have so far, but it gives me the number of times each word appears rather than the line numbers.

Python code:

def main():
   """
       Main function
   """
   try:
       # Opening file for reading
       fp = open(r"d:\Python\input.txt");

       # Dictionary to hold words and their frequencies
       wordDict = {};

       # Reading data line by line
       for line in fp:
           # Splitting words on space
           words = line.split(" ");

           # Looping over each word in words
           for word in words:

               # Considering only the words with length at-least 1
               if len(word) > 0:
                   # Converting to lower case
                   word = word.lower();

                   # Checking for existence of key in dict
                   if word in wordDict.keys():
                       # If already present, just update frequency
                       wordDict[word] += 1;
                   else:
                       # If new word, start its frequency at 1
                       wordDict[word] = 1;

       # Closing file
       fp.close();

       # Looping over sorted keys of dictionary
       for key in sorted(wordDict):
           # Printing word frequency values
           print(" {0} : {1} ".format(key, wordDict[key]));

   except Exception as ex:
       # Handling exceptions
       print(ex);

# Calling main function      
main();

Answer 1

Try building a dictionary in which the keys are the words and the values are the line numbers each word appears on:

s = """I went to the restaurant yesterday. Hello, I said, to the man who
greeted me at the door. Where is your restroom? On my way to the
restroom, I bumped into the waiter boy. Excuse me, sir, I said.
When I returned to the table, the meal was served. These are the
best clams I have ever eaten, I said. My compliments to the chef.
Unfortunately, I was arrested by the officer for not paying my bill."""

def main():
    words = {}
    for line_no, line in enumerate(s.lower().split("\n"), start=1):
        for word in line.split():
            w = word.strip(" ,.!?")
            # Compare whole words: a substring test such as `w in line`
            # would also count "at" inside "restaurant".
            if w and line_no not in words.setdefault(w, []):
                words[w].append(line_no)

    for w, n in words.items():
        print(w, n)
main()
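
To match the assignment's wording more closely, the same idea could be wrapped in a `concord` function that reads a file and writes `concord.txt`. This is only a sketch: the `out_filename` parameter, the UTF-8 encoding, the one-entry-per-line output layout, and stripping `string.punctuation` (rather than just ` ,.!?`) are my assumptions, not part of the assignment.

```python
import string
from collections import defaultdict


def concord(filename, out_filename="concord.txt"):
    """Map each word to the line numbers it appears on and write the index."""
    index = defaultdict(list)

    with open(filename, encoding="utf-8") as infile:
        for line_no, line in enumerate(infile, start=1):
            for token in line.lower().split():
                # Assumption: strip all ASCII punctuation from both ends.
                word = token.strip(string.punctuation)
                # Skip empty tokens and record each line once per word.
                if word and line_no not in index[word]:
                    index[word].append(line_no)

    # Entries are written in first-seen order (no sorting here).
    with open(out_filename, "w", encoding="utf-8") as outfile:
        for word, line_numbers in index.items():
            outfile.write(f"{word} {line_numbers}\n")

    return dict(index)
```

Calling `concord("input.txt")` (with whatever path your input file actually has) should then leave the index in `concord.txt` and also return it as a plain dict.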

Answer 2

Because one-liners are cool.

s = """I went to the restaurant yesterday. Hello, I said, to the man who
greeted me at the door. Where is your restroom? On my way to the
restroom, I bumped into the waiter boy. Excuse me, sir, I said.
When I returned to the table, the meal was served. These are the
best clams I have ever eaten, I said. My compliments to the chef.
Unfortunately, I was arrested by the officer for not paying my bill."""

words = {word.lower().strip(' .,!?'): [l_n+1 for l_n, l in enumerate(s.split('\n')) if word.lower().strip(' .,!?') in [w.lower().strip(' .,!?') for w in l.split()]] for line in s.split('\n') for word in line.split()}

For readability:

words = {
    word.lower().strip(' .,!?'): [l_n+1 for l_n, l in enumerate(s.split('\n')) 
                                  if word.lower().strip(' .,!?') in [w.lower().strip(' .,!?') 
                                                                     for w in l.split()]]
    for line in s.split('\n')
    for word in line.split()
}

Concordance with line numbers
