I have to count the signs "a,..,z", "A,..,Z", "space", "." and "," in some data.
I tried this code:
import codecs
from collections import Counter

fh = codecs.open("mydata.txt", encoding = "utf-8")
text = fh.read()
fh1 = unicode(text)
dic_freq_signs = dict(Counter(fh1.split()))
All_freq_signs = dic_freq_signs.items()
List_signs = dic_freq_signs.keys()
List_freq_signs = dic_freq_signs.values()
But it gives me all sorts of signs, not just the ones I am looking for. Can anyone help?
(It has to be unicode.)
Answer 0 (score: 0)
Check out dictionary iteration:
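# each item is a (sign, count) tuple; "somevalue" is a placeholder
All_freq_signs = [item for item in dic_freq_signs.items() if item[0] == "somevalue"]
# or put the test in a helper function
def criteria(item):
    return item[1] % 2 == 0  # example test: keep even counts
All_freq_signs = [item for item in dic_freq_signs.items() if criteria(item)]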
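As a rough illustration of the same filtering idea applied to the signs from the question, here is a minimal Python 3 sketch. The sample string and the name wanted are assumptions made up for this example; they are not part of the original code.

```python
from collections import Counter

# Hypothetical sample data standing in for dic_freq_signs from the question.
dic_freq_signs = dict(Counter("Hello, world. Hi!"))

# Assumed set of wanted signs: a-z, A-Z, space, '.' and ','.
wanted = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ .,")

def criteria(item):
    """Return True when the key of a (sign, count) pair is a wanted sign."""
    sign, _count = item
    return sign in wanted

# Keep only the pairs whose sign passes the test; '!' is dropped here.
All_freq_signs = [item for item in dic_freq_signs.items() if criteria(item)]
```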
Answer 1 (score: 0)
Make sure to import the string module; with it you can easily get the character ranges a to z and A to Z.
import string
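Counter(any_string) gives the count of each character in the string. With split(), the Counter instead returns the count of each word in the string, which contradicts your requirement. So I assume you need character counts.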
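The difference between counting characters and counting words can be seen in a small Python 3 sketch. The sample text is made up for illustration.

```python
from collections import Counter

sample = "to be or not to be"  # hypothetical sample text

# Iterating over a string yields single characters (spaces included).
char_counts = Counter(sample)

# split() yields whitespace-separated words, so whole words are counted.
word_counts = Counter(sample.split())

print(char_counts["o"])   # count of the character 'o'
print(word_counts["to"])  # count of the word 'to'
```

Note that a Counter returns 0 for a key it has never seen, so word_counts[" "] is 0 rather than an error.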
dic_all_chars = dict(Counter(fh1))  # counts of every character in the string
# the characters you want to check; ascii_* is locale-independent
signs = string.ascii_lowercase + string.ascii_uppercase + ' .,'
# dict comprehension: keep a key only if it is one of the wanted characters
dic_freq_signs = {key: value for key, value in dic_all_chars.items()
                  if key in signs}
dic_freq_signs will then contain only the signs you want to count as keys, with their counts as values.
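For readers on Python 3, where str is already unicode and the unicode() call is not needed, the whole approach might look roughly like the sketch below. The function names count_signs and count_signs_in_file are hypothetical, and reporting a count of 0 for signs that never occur is a design choice of this sketch, not something the answer above does.

```python
import string
from collections import Counter

# Assumed set of wanted signs: a-z, A-Z, space, '.' and ','.
SIGNS = string.ascii_letters + " .,"

def count_signs(text, signs=SIGNS):
    """Count only the characters of `text` that appear in `signs`."""
    counts = Counter(ch for ch in text if ch in signs)
    # Report every wanted sign, including those that never occur (count 0).
    return {sign: counts[sign] for sign in signs}

def count_signs_in_file(path, signs=SIGNS):
    """Read a UTF-8 text file and count the wanted signs in it."""
    with open(path, encoding="utf-8") as fh:
        return count_signs(fh.read(), signs)
```

Calling count_signs_in_file("mydata.txt") would then return a dictionary with one entry per wanted sign. To list only the signs that actually occur, drop the entries whose count is 0.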