def count_spaces(filename):
    input_file = open(filename,'r')
    file_contents = input_file.read()
    space = 0
    tabs = 0
    newline = 0
    for line in file_contents == " ":
        space +=1
    return space
    for line in file_contents == '\t':
        tabs += 1
    return tabs
    for line in file_contents == '\n':
        newline += 1
    return newline
    input_file.close()
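I'm trying to write a function that takes a filename as an argument and returns the total number of spaces, newlines, and tab characters in the file. I wanted to try using a basic for loop and an if statement, but I'm struggling at the moment :/ Any help would be greatly appreciated.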
Answer 0 (score: 0)
from collections import Counter
C = Counter(open(afile).read())
C[' ']
Answer 1 (score: 0)
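Your current code doesn't work because you're combining the loop syntax (for x in y) with a conditional test (x == y) in a single muddled statement. You need to separate the two.
You also need to use just one return statement, because otherwise the first one you reach stops the function and the other values are never returned.
Try: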
for character in file_contents:
    if character == " ":
        space +=1
    elif character == '\t':
        tabs += 1
    elif character == '\n':
        newline += 1
return space, tabs, newline
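For completeness, here is a minimal sketch of how the separated loop and test could fit into a whole function. It assumes you want the single total that the question asks for; the name count_whitespace is made up for this illustration.

```python
def count_whitespace(filename):
    """Return the total number of spaces, tabs, and newlines in a file."""
    space = tabs = newline = 0
    # The with block closes the file even if reading fails.
    with open(filename) as input_file:
        for character in input_file.read():
            if character == " ":
                space += 1
            elif character == "\t":
                tabs += 1
            elif character == "\n":
                newline += 1
    # One return at the very end, after every character has been examined.
    return space + tabs + newline
```

If you would rather have the three counts separately, return the tuple instead of the sum.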
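The code in Joran Beasley's answer is a more Pythonic approach to the problem. Rather than having a separate condition for each kind of character, you can use the collections.Counter class to count the occurrences of every character in the file and just extract the counts of the whitespace characters at the end. A Counter works much like a dictionary.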
from collections import Counter

def count_spaces(filename):
    with open(filename) as in_f:
        text = in_f.read()
    count = Counter(text)
    return count[" "], count["\t"], count["\n"]
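Since the question asks for one total, the per-character counts can simply be summed. This is only a sketch: count_whitespace_total is a hypothetical helper, and it takes the already-read text rather than a filename to keep the example short.

```python
from collections import Counter


def count_whitespace_total(text):
    """Return the combined number of spaces, tabs, and newlines in text."""
    count = Counter(text)
    # A Counter returns 0 for characters it never saw, so no KeyError here.
    return sum(count[ch] for ch in " \t\n")
```

For example, count_whitespace_total("a b\tc\n") gives 3, and a string with no whitespace gives 0.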
Answer 2 (score: 0)
To support large files, you could read a fixed number of bytes at a time:
#!/usr/bin/env python
from collections import namedtuple

Count = namedtuple('Count', 'nspaces ntabs nnewlines')

def count_spaces(filename, chunk_size=1 << 13):
    """Count number of spaces, tabs, and newlines in the file."""
    nspaces = ntabs = nnewlines = 0
    # assume ascii-based encoding and b'\n' newline
    with open(filename, 'rb') as file:
        chunk = file.read(chunk_size)
        while chunk:
            nspaces += chunk.count(b' ')
            ntabs += chunk.count(b'\t')
            nnewlines += chunk.count(b'\n')
            chunk = file.read(chunk_size)
    return Count(nspaces, ntabs, nnewlines)

if __name__ == "__main__":
    print(count_spaces(__file__))
Count(nspaces=150, ntabs=0, nnewlines=20)
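The "assume ascii-based encoding" comment matters because the code counts raw bytes, not characters. The short illustration below (not part of the answer's code; the variable names are made up) shows a case where that assumption breaks: in UTF-16 the byte 0x20 can belong to a character that is not an ASCII space.

```python
# U+2000 (EN QUAD) is not an ASCII space, but its UTF-16-LE encoding is the
# two bytes 0x00 0x20, so a byte-level search for b" " reports a match.
utf16_bytes = "\u2000".encode("utf-16-le")
false_hits = utf16_bytes.count(b" ")

# In UTF-8 every byte of a multi-byte character is >= 0x80, so only the
# real ASCII space in this string matches.
utf8_bytes = "\u2000 x".encode("utf-8")
true_hits = utf8_bytes.count(b" ")

print(false_hits, true_hits)
```

For ASCII-compatible encodings such as UTF-8 or Latin-1, counting bytes gives the same result as counting the ASCII space, tab, and newline characters.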
mmap allows you to treat a file as a bytestring without actually loading the whole file into memory, e.g., you could search for a regex pattern in it:
#!/usr/bin/env python3
import mmap
import re
from collections import Counter, namedtuple

Count = namedtuple('Count', 'nspaces ntabs nnewlines')

def count_spaces(filename, chunk_size=1 << 13):
    """Count number of spaces, tabs, and newlines in the file."""
    nspaces = ntabs = nnewlines = 0
    # assume ascii-based encoding and b'\n' newline
    with open(filename, 'rb', 0) as file, \
         mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
        c = Counter(m.group() for m in re.finditer(br'[ \t\n]', s))
    return Count(c[b' '], c[b'\t'], c[b'\n'])

if __name__ == "__main__":
    print(count_spaces(__file__))
Count(nspaces=107, ntabs=0, nnewlines=18)
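If a regex feels like more machinery than needed, the mapped file can also be scanned with the mmap object's own find method. The following is a rough sketch rather than a replacement for the code above; count_byte is a hypothetical helper, and it makes the same ASCII-based assumption.

```python
import mmap


def count_byte(filename, byte):
    """Count occurrences of a one-byte value such as b' ' in a file.

    Note: mmap cannot map an empty file, so the file must be non-empty.
    """
    n = 0
    with open(filename, "rb") as file, \
         mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
        pos = s.find(byte)
        while pos != -1:
            n += 1
            # Resume the search just past the previous match.
            pos = s.find(byte, pos + 1)
    return n
```

Calling it once each for b' ', b'\t', and b'\n' walks the mapping three times, so the single regex pass may still be the better fit for very large files.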
Answer 3 (score: 0)
In my case a tab (\t) is converted to "    " (four spaces), so I modified the logic a little to take care of that.
def count_spaces(filename):
    with open(filename,"r") as f1:
        contents=f1.readlines()
    total_tab=0
    total_space=0
    for line in contents:
        # a run of four spaces is treated as one tab
        total_tab += line.count("    ")
        total_tab += line.count("\t")
        total_space += line.count(" ")
    print("Space count = ",total_space)
    print("Tab count = ",total_tab)
    print("New line count = ",len(contents))
    return total_space,total_tab,len(contents)