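Get the first word of every string in a list

Time: 2015-01-09 16:12:45

Tags: python string list split strip

I have a CSV file that I am reading as shown below. I need to get the first word of every string. I know how to get the first letter, but I am not sure how to get the first word.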

['diffuse systemic sclerosis', 'back', 'public on july 15 2008']
['diffuse systemic sclerosis', 'forearm', 'public on may 9 2014']

I want my output to be

diffuse
back
public
forearm

4 answers:

Answer 0 (score: 3)

You can use a list comprehension and the split() method:

>>> l=['diffuse systemic sclerosis', 'back', 'public on july 15 2008']
>>> [i.split()[0] for i in l]
['diffuse', 'back', 'public']
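
One caveat with the comprehension above: split()[0] raises an IndexError when a string is empty or contains only whitespace, which can happen with blank CSV fields. Here is a minimal sketch that skips such strings, assuming that skipping them is the behavior you want. The helper name first_words is hypothetical and not part of the original answer.

```python
def first_words(strings):
    """Return the first word of each string, skipping blank strings."""
    # s.split() returns [] for '' or whitespace-only input, so filter on it
    return [parts[0] for parts in (s.split() for s in strings) if parts]


row = ['diffuse systemic sclerosis', 'back', '', 'public on july 15 2008']
print(first_words(row))  # ['diffuse', 'back', 'public']
```

If a blank field should produce a placeholder instead of being dropped, the filter can be replaced with a conditional expression.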

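Answer 1 (score: 1)

You can use a set comprehension, which also removes duplicate words. Note that a set is unordered, so the order of the resulting list is not guaranteed:

>>> l = [['diffuse systemic sclerosis', 'back', 'public on july 15 2008'],
...      ['diffuse systemic sclerosis', 'forearm', 'public on may 9 2014']]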

>>> list({i.split()[0] for j in l for i in j})
['back', 'diffuse', 'forearm', 'public']

Answer 2 (score: 0)

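from functools import reduce  # needed in Python 3; reduce was a builtin in Python 2

l = [
    ['diffuse systemic sclerosis', 'back', 'public on july 15 2008'],
    ['diffuse systemic sclerosis', 'forearm', 'public on may 9 2014']
    ]
# d returns the first word of every string in one list
d = lambda o: [a.split()[0] for a in o]
# r combines the first words of two lists
r = lambda a, b: d(a) + d(b)
# the set removes duplicates, so the output order may vary
print("\n".join(set(reduce(r, l))))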
>>> 
public
forearm
diffuse
back

Answer 3 (score: 0)

You can use str.split in a list comprehension. Note that you can specify maxsplit to reduce the number of split operations:

L = ['diffuse systemic sclerosis', 'back', 'public on july 15 2008']

res = [i.split(maxsplit=1)[0] for i in L]
# ['diffuse', 'back', 'public']
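
To illustrate why maxsplit helps, the short sketch below compares a full split with a limited one on one of the sample strings. The variable names s, full and limited are only for illustration.

```python
s = 'public on july 15 2008'

full = s.split()               # splits at every run of whitespace
limited = s.split(maxsplit=1)  # stops after the first split

print(full)     # ['public', 'on', 'july', '15', '2008']
print(limited)  # ['public', 'on july 15 2008']
```

Both lists start with the same first word, but the limited split leaves the rest of the string in one piece instead of splitting it further.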

You can also do the same thing in a functional style:

from operator import itemgetter, methodcaller

splitter = methodcaller('split', maxsplit=1)
res = list(map(itemgetter(0), map(splitter, L)))

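Across multiple lists, if you want to preserve the order in which unique words are first seen, you can use the itertools unique_everseen recipe, which is available in the third-party more_itertools library:

from itertools import chain
from more_itertools import unique_everseen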

L1 = ['diffuse systemic sclerosis', 'back', 'public on july 15 2008']
L2 = ['diffuse systemic sclerosis', 'forearm', 'public on may 9 2014']

res = list(unique_everseen(i.split(maxsplit=1)[0] for i in chain(L1, L2)))

# ['diffuse', 'back', 'public', 'forearm']
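
If installing a third-party package is not an option, a standard-library-only sketch can give the same ordered, de-duplicated result. It relies on dict preserving insertion order, which is guaranteed from Python 3.7 onward. This is an alternative technique, not part of the original answer.

```python
from itertools import chain

L1 = ['diffuse systemic sclerosis', 'back', 'public on july 15 2008']
L2 = ['diffuse systemic sclerosis', 'forearm', 'public on may 9 2014']

# dict keys are unique and keep first-seen order (Python 3.7+)
res = list(dict.fromkeys(i.split(maxsplit=1)[0] for i in chain(L1, L2)))

print(res)  # ['diffuse', 'back', 'public', 'forearm']
```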