Say I have the following words that I want to put into a list:
"cat,dog,fish" (first row)
"turtle,charzard,pikachu,lame" (second row)
"232.34,23.4,242.12%" (third row)
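My question is: how do I count the tokens in each row, e.g. the first row has 3, the second has 4, and the third has 3? After that, how do I count the characters, and then work out for each row which token has the most characters? The output should look like this: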
token count = 3, character count = 10, fish has the most characters
token count = 4, character count = 25, charzard has the most characters
token count = 3, character count = 17, 242.12% has the most characters
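I want to use only simple list methods like len(), with the comma as the delimiter. Thanks, I'm really lost, because every time I try to use strip(',') to remove the commas I get an error.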
Answer 0 (score: 4)
Try this. It works on both Python 2 and Python 3.
    rows = ["cat,dog,fish", "turtle,charzard,pikachu,lame", "232.34,23.4,242.12%"]
    for row in rows:
        tokens = row.split(',')
        token_cnt = len(tokens)
        char_cnt = sum([len(token) for token in tokens])
        longest_token = max(tokens, key=len)
        print("token count = %d, character count = %d, %s has the most characters" % (token_cnt, char_cnt, longest_token))
Result:
>>> token count = 3, character count = 10, fish has the most characters
>>> token count = 4, character count = 25, charzard has the most characters
>>> token count = 3, character count = 17, 242.12% has the most characters
Edit: now uses max instead of my silly sort approach to find the longest word, inspired by @inspectorG4dget's answer.
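Since the question asks for simple tools like len(), here is a minimal sketch of the same logic written as an explicit loop. The helper name `summarize` is made up for illustration and is not part of the answer above. It also shows why strip(',') cannot do this job: str.strip only removes characters from the two ends of a string, while str.split cuts the string at every comma.

```python
def summarize(row):
    """Return (token count, character count, longest token) for one comma-separated row."""
    # split, not strip: strip(',') would only trim commas at the two ends of the string
    tokens = row.split(',')
    char_count = 0
    longest = tokens[0]
    for token in tokens:
        char_count += len(token)
        # strict > keeps the first token when two tokens tie, like max(tokens, key=len)
        if len(token) > len(longest):
            longest = token
    return len(tokens), char_count, longest

rows = ["cat,dog,fish", "turtle,charzard,pikachu,lame", "232.34,23.4,242.12%"]
for row in rows:
    print("token count = %d, character count = %d, %s has the most characters" % summarize(row))
```

Returning a tuple rather than printing inside the helper keeps the counting separate from the output format, so the values can be checked or reused.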
Answer 1 (score: 1)
Given a list of strings:
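    from functools import reduce  # built in on Python 2, must be imported on Python 3

    def my_output(string_of_tokens):
        tokens = string_of_tokens.split(",")
        # reduce keeps the longer token of each pair; on a tie it keeps the later one
        longest = reduce(lambda a, b: a if len(a) > len(b) else b, tokens)
        print("token count = %s, character count = %s, %s has the most characters" %
              (len(tokens), sum(map(len, tokens)), longest))

    # named rows rather than list, to avoid shadowing the built-in
    rows = ["cat,dog,fish", "turtle,charzard,pikachu,lame", "232.34,23.4,242.12%"]
    for row in rows:
        my_output(row)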
Answer 2 (score: 1)
Assuming you have a file of comma-separated lines:
    with open('path/to/input') as infile:
        for i, line in enumerate(infile, 1):
            # drop the trailing newline so it is not counted as part of the last token
            toks = line.rstrip('\n').split(',')
            print("row %d: token_count=%d character_count=%d '%s' has the most characters"
                  % (i, len(toks), sum(len(t) for t in toks), max(toks, key=len)))
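Lines read from a file keep their trailing newline, and unless it is removed it gets counted as part of the last token. Below is a minimal sketch that can be run without an input file, using io.StringIO as a stand-in for the opened file. The helper name `summarize_lines` and the sample contents are assumptions made for illustration.

```python
import io

def summarize_lines(lines):
    """Return one (row number, token count, character count, longest token) tuple per line."""
    results = []
    for i, line in enumerate(lines, 1):
        # strip the newline first; otherwise the last token would be e.g. "fish\n"
        toks = line.rstrip('\n').split(',')
        results.append((i, len(toks), sum(len(t) for t in toks), max(toks, key=len)))
    return results

# in-memory stand-in for open('path/to/input'); the u prefix keeps it valid on Python 2 as well
fake_file = io.StringIO(u"cat,dog,fish\nturtle,charzard,pikachu,lame\n232.34,23.4,242.12%\n")
for result in summarize_lines(fake_file):
    print("row %d: token_count=%d character_count=%d '%s' has the most characters" % result)
```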
Answer 3 (score: -2)