I want to write a program that prints statistics about a text, as a list.
This is my latest attempt so far, and it does not work:
f = open("input.txt", encoding="utf-8")
text = f.read()split()
words = []
one_l_words = []
two_l_words = []
for each in lines:
    words += each.split(" ")
for each in words:
    if len(each) == 1:
        one_l_word.append(each)
for each in words:
    if len(each) == 2:
        two_l_word.append(each)
number_of_1lwords = len(one_l_words)
number_of_2lwords = len(two_l_words)
print(one_l_words)
print(two_l_words)
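The first problem is that my code does not run correctly, but in any case I think it is too complicated. Since I want to count words of every length from 1 to 30, repeating this block per length does not scale, and it should be a simple program.
Basically, the output should be a list like this: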
length | How often a word of this length occurs
2      | 12415
Answer 0 (score: 1)
Try the following, which uses a dictionary:
with open("airline.py") as f:  # replace with your own input file, e.g. input.txt
    words = f.read().split()  # split() with no argument splits on any whitespace
counts = {}
for i in words:
    if len(i) not in counts:
        counts[len(i)] = 1
    else:
        counts[len(i)] += 1
counts = sorted(counts.items(), key=lambda x: x[0])  # converts to a list of (length, count) tuples, sorted by length
print("Length\t\tHow often a word of this length occurs")
for j in counts:
    print(str(j[0]) + "\t\t" + str(j[1]))
Sample output:
Length How often a word of this length occurs
1 21
2 7
3 32
4 4
5 11
6 11
7 5
8 13
9 8
10 14
11 10
12 5
13 12
14 9
15 5
17 3
18 6
19 1
20 1
21 3
22 1
27 1
Answer 1 (score: 0)
You can use a dictionary like this:
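dico = {}
for i in range(1, 31):  # pre-fill lengths 1..30 so no key check is needed when counting
    dico[i] = 0
with open("input.txt", encoding="utf-8") as f:  # the with block closes the file automatically
    text = f.read()
for word in text.split():  # split() with no argument handles newlines and repeated spaces
    if len(word) in dico:  # skip words longer than 30 characters instead of raising KeyError
        dico[len(word)] += 1
print(dico)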
I hope this helps.
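The dictionary in this answer can also be printed in the two-column layout the question asks for. The following is only a minimal sketch, assuming the text is already available as a string; `count_lengths` and `format_table` are hypothetical helper names that do not appear in the original answer.

```python
def count_lengths(text, max_len=30):
    """Count words by length, for lengths 1..max_len (hypothetical helper)."""
    counts = {i: 0 for i in range(1, max_len + 1)}
    for word in text.split():
        if len(word) in counts:  # skip words longer than max_len
            counts[len(word)] += 1
    return counts


def format_table(counts):
    """Return the table as a list of lines, skipping lengths that never occur."""
    lines = ["length | How often a word of this length occurs"]
    for length, n in sorted(counts.items()):
        if n:
            lines.append("%-6d | %d" % (length, n))
    return lines


if __name__ == "__main__":
    sample = "a bb cc ddd"  # stand-in for the contents of input.txt
    print("\n".join(format_table(count_lengths(sample))))
```

Pre-filling the keys keeps the counting loop free of existence checks, at the cost of having to decide what to do with words longer than `max_len`; here they are simply ignored.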
Answer 2 (score: 0)
collections.defaultdict(int) is a good fit for this case:
import collections

def main():
    counts = collections.defaultdict(int)
    with open('input.txt', 'rt', encoding='utf-8') as file:
        for word in file.read().split():
            counts[len(word)] += 1
    print('length | How often a word of this length occurs')
    for i in sorted(counts.keys()):
        print('%-6d | %d' % (i, counts[i]))

if __name__ == '__main__':
    main()
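As a possible variation, the standard library's collections.Counter can do the same tallying in a single expression. This is a sketch rather than part of the original answer; it assumes the text is passed in as a string, and `length_histogram` is a hypothetical name.

```python
from collections import Counter


def length_histogram(text):
    """Map each word length to the number of words of that length."""
    return Counter(len(word) for word in text.split())


if __name__ == "__main__":
    hist = length_histogram("to be or not to be")  # stand-in for file contents
    print("length | How often a word of this length occurs")
    for length in sorted(hist):
        print("%-6d | %d" % (length, hist[length]))
```

Like defaultdict(int), a Counter returns 0 for a length that never occurred, so looking up a missing length does not raise KeyError.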
Answer 3 (score: -1)
# something like this might work
a = 'aAa@!@121'
d = {}
for i in a:
    d[i] = d.get(i, 0) + 1
print(d)
# d has characters as keys; each value is the count of that character in the string
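Note that this snippet counts individual characters, not word lengths. The same dict.get pattern can be pointed at the question's problem by keying on len(word) instead. A minimal sketch, assuming the text is given as a string; `count_by_length` is a hypothetical name:

```python
def count_by_length(text):
    """Count words by length using the dict.get(key, default) pattern."""
    d = {}
    for word in text.split():
        # get() returns 0 the first time a length is seen
        d[len(word)] = d.get(len(word), 0) + 1
    return d


if __name__ == "__main__":
    print(count_by_length("a bb cc ddd"))  # stand-in for file contents
```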