Question

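I am reading a file that contains text and passing its contents on to extract noun phrases. The noun phrases print correctly, but when I write them to a text file, either only the first phrase is written or nothing is written at all. Here is the code I wrote to write them to the text file: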

import nltk
import re

file = open("C:\datafiles\entytest.txt", "r")
doclist = [ line for line in file ]
docstr = '' . join(doclist)
sentences = re.split(r'[.!?]', docstr)


grammar = '\n'.join([
  'NP: {<DT>*<NN>*<NN>}',
 ])

for sentence in sentences:
    words = nltk.word_tokenize(sentence)
    tags = nltk.pos_tag(words)
    chunkparser = nltk.RegexpParser(grammar)
    nnphrs = chunkparser.parse(tags)
    print(nnphrs)

f = open("C:\datafiles\nphrs.txt", "w")
for sentence in sentences:
    f.write("'%s',\n" %nnphrs)
f.close()

Answer 1

If you want the phrases to end up in the txt file, you should write them inside the loop, like this:

# Raw string, so the "\n" in the path is not read as a newline
f = open(r"C:\datafiles\nphrs.txt", "w")

for sentence in sentences:
    words = nltk.word_tokenize(sentence)
    tags = nltk.pos_tag(words)
    chunkparser = nltk.RegexpParser(grammar)
    nnphrs = chunkparser.parse(tags)
    # Write inside the loop so every sentence's result is saved
    f.write("'%s',\n" % nnphrs)
    print(nnphrs)
f.close()
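
To see why the position of the write matters, here is a minimal sketch that leaves nltk out entirely. The `parse` function is a hypothetical stand-in for the tokenize/tag/chunk steps, and `io.StringIO` stands in for the output file; both are assumptions made for illustration only.

```python
import io

sentences = ["first sentence", "second sentence", "third sentence"]

def parse(sentence):
    # Hypothetical stand-in for the nltk tokenize/tag/chunk steps.
    return sentence.upper()

# Writing in a second loop, after parsing has finished:
# `result` still holds only the value from the last iteration.
after_loop = io.StringIO()
for sentence in sentences:
    result = parse(sentence)
for sentence in sentences:
    after_loop.write("'%s',\n" % result)

# Writing inside the parsing loop: each result is written as it is produced.
inside_loop = io.StringIO()
for sentence in sentences:
    result = parse(sentence)
    inside_loop.write("'%s',\n" % result)

print(after_loop.getvalue())
print(inside_loop.getvalue())
```

The first buffer ends up with the last result repeated once per sentence, while the second contains one line per sentence.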

Answer 2

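As Khelwood said, the problem is the indentation: the write happens outside the loop that parses the sentences, so it only ever sees the result for one sentence (the last one parsed).

Unlike many other languages, Python uses indentation to delimit blocks. Lines that are indented further than a loop header or other construct are part of that loop or construct.

You can read more about it here.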

# Raw string, so the "\n" in the path is not read as a newline
f = open(r"C:\datafiles\nphrs.txt", "w")
for sentence in sentences:
    words = nltk.word_tokenize(sentence)
    tags = nltk.pos_tag(words)
    chunkparser = nltk.RegexpParser(grammar)
    nnphrs = chunkparser.parse(tags)
    print(nnphrs)
    # Indented under the for, so this runs once per sentence
    f.write("'%s',\n" % nnphrs)
f.close()
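
Separately from the indentation, the path literal in the question may be worth checking. In a normal (non-raw) string, the `\n` in `\nphrs.txt` is read as a newline character, which might explain the "nothing is written" case. This is a guess based on the posted code and has not been tested against the asker's setup. A small sketch of the difference:

```python
# "\n" inside a normal string literal is a newline, even in the middle of a path.
plain = "C:\\datafiles\nphrs.txt"     # "\n" left unescaped, as in the question
raw = r"C:\datafiles\nphrs.txt"       # raw string: backslashes are kept literally
escaped = "C:\\datafiles\\nphrs.txt"  # every backslash doubled

print(repr(plain))    # shows the embedded newline
print(repr(raw))      # shows a literal backslash followed by "n"
print(raw == escaped)
```

Either the raw string or the doubled backslashes gives the intended file name.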

Answer 3

I would use print to write to the file:

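# Raw string, so the "\n" in the path is not read as a newline
with open(r"C:\datafiles\nphrs.txt", "w") as f:
    for sentence in sentences:
        words = nltk.word_tokenize(sentence)
        tags = nltk.pos_tag(words)
        chunkparser = nltk.RegexpParser(grammar)
        nnphrs = chunkparser.parse(tags)
        # print adds the newline and writes to f instead of stdout
        print(nnphrs, file=f)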
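
For anyone who wants to try the `print(..., file=f)` pattern without nltk installed, here is a small self-contained sketch. The strings in `results` are made-up stand-ins for parse trees, and the temporary directory is only there so the example does not touch `C:\datafiles`; both are assumptions for illustration.

```python
import os
import tempfile

# Made-up stand-ins for the parse results.
results = ["(S (NP the/DT cat/NN))", "(S (NP a/DT dog/NN))"]

out_path = os.path.join(tempfile.mkdtemp(), "nphrs.txt")

# The with block closes the file automatically, even if an error occurs.
with open(out_path, "w") as f:
    for result in results:
        print(result, file=f)  # one result per line

# Read the file back to check what was written.
with open(out_path) as f:
    written = f.read()
print(written)
```

Because `print` appends the newline itself, there is no need to build the line with `%` formatting first.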

Writing results to a text file in Python
