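"test.txt" contains two sentences:
sentence1 = A sentence is a grammatical unit consisting of one or more words.
sentence2 = A sentence can also be defined in orthographic terms alone.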
count_line = 0
for line in open('C:/Users/Desktop/test.txt'):
    count_line = count_line +1
    fields = line.rstrip('\n').split('\t')
    ##print count_line, fields
    file = open('C:/Users/Desktop/test_words.txt', 'w+')
    count_word = 0
    for words in fields:
        wordsplit = words.split()
        for word in wordsplit:
            count_word = count_word + 1
            print count_word, word
            file.write(str(count_word) + " " + word + '\n')
    file.close()
My result in "test_words.txt" shows only the words from the second sentence:
1 A
2 sentence
3 can
4 also
5 be
6 defined
7 in
8 orthographic
9 terms
10 alone.
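How can I write the words from the first sentence to "test_words.txt", followed by the words from the second sentence?
Any suggestions?
Answer 0: (score: 3)
In your code, you open and close the output file multiple times (once per input line), so each open overwrites what you wrote for the first sentence. The simple solution is to open it only once and close it only once.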
count_line = 0
# Open outside the loop
file = open('C:/Users/Desktop/test_words.txt', 'w+')
for line in open('C:/Users/Desktop/test.txt'):
    count_line = count_line +1
    fields = line.rstrip('\n').split('\t')
    ##print count_line, fields
    count_word = 0
    for words in fields:
        wordsplit = words.split()
        for word in wordsplit:
            count_word = count_word + 1
            print count_word, word
            file.write(str(count_word) + " " + word + '\n')
# Also close outside the loop
file.close()
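The difference between reopening the output for every line and opening it once can be reproduced in a small sketch. It is written for Python 3 (the thread's code is Python 2), uses a temporary directory instead of the Desktop paths, and the helper names `write_reopening_each_line`, `write_opening_once` and `read_lines` are made up for illustration.

```python
import os
import tempfile


def write_reopening_each_line(lines, path):
    """Mimic the question: reopen the output in 'w+' mode for every input line."""
    for line in lines:
        # 'w+' truncates the file on each open, discarding earlier lines' words.
        with open(path, 'w+') as out:
            for count, word in enumerate(line.split(), 1):
                out.write('{0} {1}\n'.format(count, word))


def write_opening_once(lines, path):
    """Mimic the fix: open the output a single time, before the loop."""
    with open(path, 'w+') as out:
        for line in lines:
            for count, word in enumerate(line.split(), 1):
                out.write('{0} {1}\n'.format(count, word))


def read_lines(path):
    """Return the file's lines without trailing newlines."""
    with open(path) as handle:
        return handle.read().splitlines()


sentences = [
    'A sentence is a grammatical unit consisting of one or more words.',
    'A sentence can also be defined in orthographic terms alone.',
]

demo_dir = tempfile.mkdtemp()
reopened_path = os.path.join(demo_dir, 'reopened.txt')
once_path = os.path.join(demo_dir, 'once.txt')

write_reopening_each_line(sentences, reopened_path)
write_opening_once(sentences, once_path)

reopened = read_lines(reopened_path)
once = read_lines(once_path)
print(len(reopened))  # only the second sentence's words survive
print(len(once))      # words from both sentences
```

With the two sentences from the question, the first file should end up holding only the 10 words of the second sentence, while the second file holds all 22.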
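Answer 1: (score: 0)
This happens because when you open the file a second time, the original text in it is not kept. When you open a file for writing in Python (mode 'w' or 'w+'), you overwrite its contents unless you first store them in a variable and write them back.
Try this code: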
count_line = 0
for n, line in enumerate(open('test.txt')):
    count_line = count_line +1
    fields = line.rstrip('\n').split('\t')
    ##print count_line, fields
    already_text = open('test_words.txt').read() if n > 0 else ''
    file = open('test_words.txt', 'w+')
    count_word = 0
    file.write(already_text)
    for words in fields:
        wordsplit = words.split()
        for word in wordsplit:
            count_word = count_word + 1
            print count_word, word
            file.write(str(count_word) + " " + word + '\n')
    file.close()
This is the output when I run it:
1 A
2 sentence
3 is
4 a
5 grammatical
6 unit
7 consisting
8 of
9 one
10 or
11 more
12 words.
1 A
2 sentence
3 can
4 also
5 be
6 defined
7 in
8 orthographic
9 terms
10 alone.
Here is the code without enumerate():
count_line = 0
n = 0
for line in open('test.txt'):
    count_line = count_line +1
    fields = line.rstrip('\n').split('\t')
    ##print count_line, fields
    already_text = open('test_words.txt').read() if n > 0 else ''
    file = open('test_words.txt', 'w+')
    count_word = 0
    file.write(already_text)
    for words in fields:
        wordsplit = words.split()
        for word in wordsplit:
            count_word = count_word + 1
            print count_word, word
            file.write(str(count_word) + " " + word + '\n')
    file.close()
    n += 1
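An alternative this answer does not mention is append mode: opening the output with 'a' keeps the existing content and writes at the end, so nothing has to be read back and rewritten. A minimal Python 3 sketch, with a hypothetical helper `append_numbered_words` and a temporary output path standing in for "test_words.txt":

```python
import os
import tempfile


def append_numbered_words(line, path):
    """Append one line's words to path, numbering them from 1."""
    # Mode 'a' keeps existing content and writes at the end of the file.
    with open(path, 'a') as out:
        for count, word in enumerate(line.split(), 1):
            out.write('{0} {1}\n'.format(count, word))


sentences = [
    'A sentence is a grammatical unit consisting of one or more words.',
    'A sentence can also be defined in orthographic terms alone.',
]

# A fresh temporary directory, so the file does not exist before the first append.
out_path = os.path.join(tempfile.mkdtemp(), 'test_words.txt')
for sentence in sentences:
    append_numbered_words(sentence, out_path)

with open(out_path) as handle:
    appended = handle.read().splitlines()
print(len(appended))  # words from both sentences
```

One caveat: append mode never clears the file, so running the script again against the same path would add a second copy of the words. Deleting or truncating the file before the loop avoids that.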
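Answer 2: (score: 0)
If possible, you should use with when working with files. It is a context manager and ensures that the files are properly closed once you are done with them (which is signalled by leaving the indented block). Here we use enumerate with its optional start argument, which is one way (of several) to keep the counter running as it moves on to the next line: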
# Open the file
with open('test.txt', 'rb') as f:
    # Open the output (in Python 2.7+, this can be done on the same line)
    with open('text_words.txt', 'wb') as o:
        # Set our counter
        counter = 1
        # Iterate through the file
        for line in f:
            # Strip out newlines and split on whitespace
            words = line.strip().split()
            # Start our enumeration, which will return the index (starting at 1) and
            # the word itself
            for index, word in enumerate(words, counter):
                # Write the word to the file
                o.write('{0} {1}\n'.format(index, word))
            # Increment the counter
            counter += len(words)
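Or, if you want fewer lines, this uses readlines() to read the file into a list whose items are the newline-separated lines. Each line is then split on whitespace and every word is pulled out. This means you are essentially iterating over a list of all the words in the file and, combined with enumerate, you do not need to increment the counter yourself: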
# Open the file
with open('test.txt', 'rb') as f:
    # Open the output (in Python 2.7+, this can be done on the same line)
    with open('text_words.txt', 'wb') as o:
        # Iterate through the file
        for i, w in enumerate((x for l in f.readlines() for x in l.strip().split()), 1):
            o.write('{0} {1}\n'.format(i, w))
Using Python 2.7:
# Open the file
with open('test.txt', 'rb') as f, open('text_words.txt', 'wb') as o:
    # Iterate through the file
    for i, w in enumerate((x for l in f.readlines() for x in l.strip().split()), 1):
        o.write('{0} {1}\n'.format(i, w))
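The snippets above target Python 2. In Python 3, a file opened with 'wb' accepts only bytes, so writing the formatted string would raise a TypeError. A rough Python 3 adaptation of the same running-counter idea, using text mode, might look like the following. The function name `number_words` and the temporary files are assumptions for illustration, not part of the answer.

```python
import os
import tempfile


def number_words(in_path, out_path):
    """Write every word of in_path to out_path with one running counter."""
    with open(in_path) as src, open(out_path, 'w') as dst:
        # Flatten the file into a single stream of words, then number them.
        words = (word for line in src for word in line.split())
        total = 0  # stays 0 if the input contains no words
        for total, word in enumerate(words, 1):
            dst.write('{0} {1}\n'.format(total, word))
    return total


# Build a small input file with the two sentences from the question.
demo_dir = tempfile.mkdtemp()
in_path = os.path.join(demo_dir, 'test.txt')
out_path = os.path.join(demo_dir, 'test_words.txt')
with open(in_path, 'w') as handle:
    handle.write('A sentence is a grammatical unit consisting of one or more words.\n')
    handle.write('A sentence can also be defined in orthographic terms alone.\n')

total = number_words(in_path, out_path)
with open(out_path) as handle:
    numbered = handle.read().splitlines()
print(total)
```

Unlike the output the question asks for, the numbering here does not restart at 1 for the second sentence. It continues across lines, which matches the behaviour of the snippets in this answer.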
Answer 3: (score: 0)
This may be beside the point, but I suggest a cleaner way to write it. You don't need 3 loops:
lines = open('test.txt').readlines()
file = open('test_words.txt', 'w+')
for line in lines:
    words = line.rstrip('\n').split()
    for i, word in enumerate(words):
        # Print the same 1-based number that is written to the file
        print i+1, word
        file.write('%d %s\n' % (i+1, word))
file.close()