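I came across this programming problem while looking at a job posting on SO. I thought it was pretty interesting, and as a beginner Python programmer I tried to solve it. However, I feel my solution is quite... messy... Can anyone suggest ways to optimize it or make it cleaner? I know it's pretty trivial, but I had fun writing it. Note: Python 2.6
The problem:
Write pseudo-code (or actual code) for a function that takes in a string and returns the letter that appears most often in that string.
My attempt: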
import string
def find_max_letter_count(word):
    alphabet = string.ascii_lowercase
    dictionary = {}
    for letters in alphabet:
        dictionary[letters] = 0
    for letters in word:
        dictionary[letters] += 1
    dictionary = sorted(dictionary.items(),
                        reverse=True,
                        key=lambda x: x[1])
    for position in range(0, 26):
        print dictionary[position]
        if position != len(dictionary) - 1:
            if dictionary[position + 1][1] < dictionary[position][1]:
                break
find_max_letter_count("helloworld")
Output:
>>>
('l', 3)
Updated example:
find_max_letter_count("balloon")
>>>
('l', 2)
('o', 2)
Answer 0 (score: 21)
There are many ways to do this. For example, you can use the Counter class (in Python 2.7 or later):
import collections
s = "helloworld"
print(collections.Counter(s).most_common(1)[0])
If you don't have it, you can do the tally manually (Python 2.5 or later has defaultdict):
d = collections.defaultdict(int)
for c in s:
    d[c] += 1
print(sorted(d.items(), key=lambda x: x[1], reverse=True)[0])
That said, there is nothing too terribly wrong with your implementation.
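The problem statement asks for the most frequent letter, whereas the snippets above count every character in the string. Here is a minimal sketch, assuming that case should be ignored and that non-alphabetic characters should be skipped (neither is specified in the question); most_common_letter is a hypothetical helper name:

```python
import collections

def most_common_letter(text):
    """Return (letter, count) for the most frequent letter, or None if there is none.

    Assumption: case is ignored and non-alphabetic characters are skipped.
    """
    counts = collections.Counter(c for c in text.lower() if c.isalpha())
    if not counts:
        return None
    return counts.most_common(1)[0]

print(most_common_letter("Hello, World!"))
```

Returning None for input without letters is only one possible convention; raising an exception would be equally reasonable.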
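Answer 1 (score: 4)
If you are using Python 2.7, you can do this quickly with the collections module. collections is a module of high-performance data structures. Read more at http://docs.python.org/library/collections.html#counter-objects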
>>> from collections import Counter
>>> x = Counter("balloon")
>>> x
Counter({'o': 2, 'a': 1, 'b': 1, 'l': 2, 'n': 1})
>>> x['o']
2
Answer 2 (score: 2)
Here is a way to find the most common character using a dictionary:
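message = "hello world"
d = {}
letters = set(message)
for l in letters:
    # Keyed by count: characters sharing a count overwrite each other.
    d[message.count(l)] = l
# Dict key order is not guaranteed, so pick the largest count explicitly.
highest = max(d)
print d[highest], highest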
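Because the dictionary above is keyed by count, characters that share a count overwrite one another, so ties are lost. One possible variation, sketched here with a hypothetical helper name chars_by_count, maps each count to the list of all characters that have it:

```python
from collections import defaultdict

def chars_by_count(message):
    """Map each occurrence count to the sorted list of characters with that count."""
    groups = defaultdict(list)
    for ch in sorted(set(message)):
        groups[message.count(ch)].append(ch)
    return dict(groups)

groups = chars_by_count("balloon")
highest = max(groups)  # raises ValueError for an empty string
print(groups[highest], highest)
```

This keeps the count-as-key idea of the answer; calling count() once per distinct character is fine for short strings but is not the fastest option for long ones.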
Answer 3 (score: 1)
If you want all the characters that share the maximum count, you can use a variation on one of the two ideas proposed so far:
import heapq  # Helps finding the n largest counts
import collections
def find_max_counts(sequence):
    """
    Returns an iterator that produces the (element, count)s with the
    highest number of occurrences in the given sequence.
    In addition, the elements are sorted.
    """
    if len(sequence) == 0:
        return  # Ends the generator (raising StopIteration here fails on Python 3.7+)
    counter = collections.defaultdict(int)
    for elmt in sequence:
        counter[elmt] += 1
    counts_heap = [
        (-count, elmt)  # The largest elmt counts are the smallest elmts
        for (elmt, count) in counter.items()]
    heapq.heapify(counts_heap)
    highest_count = counts_heap[0][0]
    while True:
        try:
            (opp_count, elmt) = heapq.heappop(counts_heap)
        except IndexError:
            return
        if opp_count != highest_count:
            return
        yield (elmt, -opp_count)
for (letter, count) in find_max_counts('balloon'):
    print (letter, count)
for (word, count) in find_max_counts(['he', 'lkj', 'he', 'll', 'll']):
    print (word, count)
This yields, for instance:
lebigot@weinberg /tmp % python count.py
('l', 2)
('o', 2)
('he', 2)
('ll', 2)
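This works with any sequence: words, but also ['hello', 'hello', 'bonjour'], for instance.
The heapq structure is very efficient at finding the smallest elements of a sequence without sorting it completely. On the other hand, since there are not that many letters in the alphabet, you could probably also walk through the sorted list of counts until the maximum count no longer appears, without any serious loss of speed.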
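The alternative mentioned last (walking through the sorted counts until the maximum count no longer appears) could look roughly like the sketch below. The function name find_max_counts_sorted is hypothetical, and the sketch assumes the elements can be compared with each other so that ties can be ordered:

```python
import collections
import itertools

def find_max_counts_sorted(sequence):
    """Return the (element, count) pairs that share the highest count."""
    counts = collections.Counter(sequence)
    # Highest count first; ties are ordered by element.
    ordered = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    if not ordered:
        return []
    highest = ordered[0][1]
    # Stop as soon as the count drops below the maximum.
    return list(itertools.takewhile(lambda pair: pair[1] == highest, ordered))

print(find_max_counts_sorted("balloon"))
print(find_max_counts_sorted(['he', 'lkj', 'he', 'll', 'll']))
```

This sorts every distinct element, which costs more than the heap in theory but is unlikely to matter for an alphabet-sized input.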
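Answer 4 (score: 1)
Question: the most frequent character in a string, that is, the character that occurs most often in the input string.
Method 1: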
a = "GiniGinaProtijayi"
d = {}
# Tally each character.
for ch in a:
    d[ch] = d.get(ch, 0) + 1
# Sort by count, descending; the first pair is the most frequent character.
chh, max_count = sorted(d.items(), reverse=True, key=lambda item: item[1])[0]
print(chh)
print(max_count)
Method 2:
a = "GiniGinaProtijayi"
max = 0
chh = ''
count = [0] * 256
for ch in a:
    count[ord(ch)] += 1
for ch in a:
    if count[ord(ch)] > max:
        max = count[ord(ch)]
        chh = ch
print(chh)
Method 3:
import collections
a = "GiniGinaProtijayi"
aa = collections.Counter(a).most_common(1)[0]
print(aa)
Answer 5 (score: 1)
Here is one way using a for loop and count():
w = input()
r = 0   # highest count seen so far
s = ''  # character with that count (stays empty for empty input)
for i in w:
    p = w.count(i)
    if p > r:
        r = p
        s = i
print(s)
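Answer 6 (score: 0)
Here are a few things I would do:
- Use collections.defaultdict instead of the dict you initialise manually.
- Use built-in functions such as max instead of working it out yourself; it's easier.
Here is my final result: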
from collections import defaultdict
def find_max_letter_count(word):
    matches = defaultdict(int)  # makes the default value 0
    for char in word:
        matches[char] += 1
    # items() works on both Python 2 and 3 (iteritems() is Python 2 only)
    return max(matches.items(), key=lambda x: x[1])
find_max_letter_count('helloworld') == ('l', 3)
Answer 7 (score: 0)
def most_frequent(text):
    frequencies = [(c, text.count(c)) for c in set(text)]
    return max(frequencies, key=lambda x: x[1])[0]
s = 'ABBCCCDDDD'
print(most_frequent(s))
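frequencies is a list of tuples that pair each character with its count, as (character, count). We apply max to the tuples using the count as the key and return the character of the winning tuple. In the event of a tie, this solution picks only one character.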
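Which character wins a tie in the solution above depends on the iteration order of the set, which is arbitrary. If a predictable result is wanted, one option is to prefer the character that appears first in the text. This is a minimal sketch with a hypothetical function name, most_frequent_first; it relies on max() returning the first maximal item it encounters:

```python
from collections import Counter

def most_frequent_first(text):
    """Return the most frequent character; ties go to the one seen first in text."""
    counts = Counter(text)
    # Iterating over text visits characters in order of appearance, and
    # max() keeps the first item that reaches the maximum count.
    return max(text, key=counts.__getitem__)

print(most_frequent_first("balloon"))
```

Like the original, this raises ValueError for an empty string, since max() has nothing to choose from.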
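Answer 8 (score: 0)
I noticed that most of the answers return only one item even when several characters are tied for most frequent. Take "iii 444 yyy 999", for example: it contains an equal number of spaces, i's, 4's, y's and 9's. The solution should return all of them, not just the letter i: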
from collections import Counter
sentence = "iii 444 yyy 999"
# Take the count from the first tuple in Counter().most_common(),
# i.e. the largest count
largest_count: int = Counter(sentence).most_common()[0][1]
# Keep every (character, count) tuple whose count equals the largest count
most_common_list: list = [(x, y)
                          for x, y in Counter(sentence).items() if y == largest_count]
print(most_common_list)
# RETURNS
[('i', 3), (' ', 3), ('4', 3), ('y', 3), ('9', 3)]
Answer 9 (score: 0)
My approach does not rely on Python's built-in helper functions; it only uses for loops and if statements.
def most_common_letter():
    string = str(input())
    letters = set(string)
    if " " in letters:  # If you want to count spaces too, ignore this if-statement
        letters.remove(" ")
    max_count = 0
    freq_letter = []
    for letter in letters:
        count = 0
        for char in string:
            if char == letter:
                count += 1
        if count == max_count:
            max_count = count
            freq_letter.append(letter)
        if count > max_count:
            max_count = count
            freq_letter.clear()
            freq_letter.append(letter)
    return freq_letter, max_count
This ensures you get every letter/character that is used the most, not just one. It also returns how often it occurs. Hope this helps :)
Answer 10 (score: 0)
If you cannot use collections for any reason, I would suggest the following implementation:
s = input()
d = {}
# Iterate through the string; if a character is already in the dict,
# just increment its counter, otherwise start it at 1.
for ch in s:
    if ch in d:
        d[ch] += 1
    else:
        d[ch] = 1
# If an empty string was given, print a message saying so
# instead of a character.
print(max(d, key=d.get, default='Empty string was given.'))
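Answer 11 (score: -1)
from collections import Counter
# file: filename
# quant: number of most frequent characters you want
def frequent_letters(file, quant):
    # Read the whole file, closing it automatically afterwards
    with open(file) as f:
        text = f.read()
    return Counter(text).most_common(quant)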