I have this list, made up of tag-and-weight strings:
lst = ['rock 101071', 'pop 69159', 'alternative 55777', 'indie 48175',
'electronic 46270', 'female vocalists 42565', 'favorites 39921',
'Love 34901', 'dance 33618', '00s 31432']
I am trying to convert it into tuples, like:
[('rock ', '101071'), ('pop ', '69159'), ('alternative ', '55777'), ('indie ', '48175'),
('electronic ', '46270'), ('female vocalists ', '42565'), ('favorites ', '39921'),
('Love ', '34901'), ('dance ', '33618'), ('s ', '0031432')]
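Here, each string is split into a tuple so that index 0 of each element holds every word except the last, and index 1 holds the last word of the string.
To achieve this, my code is: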
tags = []
weights = []
for i in lst:
    tag = ''.join([x for x in i if not x.isdigit()])
    tags.append(tag)
    weight = ''.join([x for x in i if x.isdigit()])
    weights.append(weight)
Then, if I do:
print zip(tags, weights)
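I get the desired result. Unfortunately, some tags themselves contain digits, for example 00s in lst.
How can I get that entry formatted correctly, i.e. ('00s ', '31432') rather than ('s ', '0031432')?
PS: i.split(" ") is not a suitable alternative, because some tags in the collection consist of several words.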
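To make the failure concrete, the sketch below reruns the digit-filtering approach from the question on the one problematic entry. It is written for Python 3, and the helper name split_by_digits is hypothetical, introduced only for illustration.

```python
def split_by_digits(item):
    """Digit-filtering approach from the question (helper name is hypothetical)."""
    # Every non-digit character goes to the tag and every digit to the weight,
    # regardless of where in the string it appears.
    tag = ''.join(ch for ch in item if not ch.isdigit())
    weight = ''.join(ch for ch in item if ch.isdigit())
    return tag, weight

print(split_by_digits('rock 101071'))  # ('rock ', '101071')
print(split_by_digits('00s 31432'))    # ('s ', '0031432')
```

The digits of the tag 00s end up in the weight, which is why filtering by character class cannot work here; the split has to be based on position instead.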
Answer 0: (score: 6)
You can use str.rsplit() to split each string on a space, with maxsplit set to 1. For example:
>>> lst = ['rock 101071', 'pop 69159', 'alternative 55777', 'indie 48175', 'electronic 46270', 'female vocalists 42565', 'favorites 39921', 'Love 34901', 'dance 33618', '00s 31432']
>>> [s.rsplit(' ', 1) for s in lst]
[['rock', '101071'], ['pop', '69159'], ['alternative', '55777'], ['indie', '48175'], ['electronic', '46270'], ['female vocalists', '42565'], ['favorites', '39921'], ['Love', '34901'], ['dance', '33618'], ['00s', '31432']]
This gives a list of lists, which I think should be fine. If you need tuples, as in the question, convert each result with tuple():
[tuple(s.rsplit(' ', 1)) for s in lst]
Answer 1: (score: 1)
>>> lst = ['rock 101071', 'pop 69159', 'alternative 55777', 'indie 48175',
... 'electronic 46270', 'female vocalists 42565', 'favorites 39921',
... 'Love 34901', 'dance 33618', '00s 31432']
>>>
>>> [tuple(s.rsplit(maxsplit=1)) for s in lst]
[('rock', '101071'), ('pop', '69159'), ('alternative', '55777'), ('indie', '48175'), ('electronic', '46270'), ('female vocalists', '42565'), ('favorites', '39921'), ('Love', '34901'), ('dance', '33618'), ('00s', '31432')]
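The two answers so far call rsplit slightly differently: one passes an explicit ' ' separator, the other passes no separator at all. Here is a minimal sketch of where they diverge, assuming the input may contain repeated spaces (the sample strings are made up for illustration). rsplit(None, 1) is used as the equivalent of rsplit(maxsplit=1) so that it also runs on Python 2.

```python
messy = 'female vocalists   42565'  # three spaces before the weight (made-up input)

# Explicit separator: splits at exactly one space, so the extra spaces stay in the tag.
by_space = messy.rsplit(' ', 1)

# No separator: a run of whitespace counts as a single delimiter.
by_whitespace = messy.rsplit(None, 1)

print(by_space)       # ['female vocalists  ', '42565']
print(by_whitespace)  # ['female vocalists', '42565']

# A string with no whitespace yields a single-element list either way,
# so unpacking it into (tag, weight) would fail for such an entry.
print('rock'.rsplit(None, 1))  # ['rock']
```

For the list in the question, where every entry has exactly one space before the weight, both forms give the same result.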
Answer 2: (score: 0)
This also works for tags that contain several words:
def processdata(lst):
    raw_tuples = [i.split() for i in lst]
    sani_tuples = [(' '.join(i[:-1]), i[-1]) for i in raw_tuples]
    return sani_tuples

if __name__ == '__main__':
    lst = ['rock 101071', 'pop 69159', 'alternative 55777', 'test multi word tag 101020']
    print(processdata(lst))
Output:
[('rock', '101071'), ('pop', '69159'), ('alternative', '55777'), ('test multi word tag', '101020')]
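As a variation on the answers above, str.rpartition can also split at the last space. It always returns three parts, so unpacking never fails. This is only a sketch and not something the answers propose; the function name parse_tag and the choice to convert the weight to int are assumptions made for illustration.

```python
def parse_tag(item):
    """Split 'tag words weight' at the last space (hypothetical helper)."""
    # rpartition returns (head, separator, tail); if there is no space,
    # head and separator are empty strings and tail is the whole input.
    tag, _, weight = item.rpartition(' ')
    return tag, int(weight)  # int() raises ValueError if the last word is not a number

lst = ['rock 101071', 'female vocalists 42565', '00s 31432']
pairs = [parse_tag(s) for s in lst]
print(pairs)
# [('rock', 101071), ('female vocalists', 42565), ('00s', 31432)]
```

Drop the int() call if the weights should stay as strings, as they do in the question.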