I have a list of strings (read from a .tt file) that looks like this:
list1 = ['have\tVERB', 'and\tCONJ', ..., 'tree\tNOUN', 'go\tVERB']
I want to turn it into a dictionary that looks like this:
dict1 = { 'have':'VERB', 'and':'CONJ', 'tree':'NOUN', 'go':'VERB' }
I was thinking of using substitution, but that doesn't work very well. Is there a way to treat the tab character '\t' as the delimiter?
Answer 0 (score: 16)
Try the following:
dict1 = dict(item.split('\t') for item in list1)
Output:
>>> dict1
{'and': 'CONJ', 'go': 'VERB', 'tree': 'NOUN', 'have': 'VERB'}
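One caveat: dict() needs every split to produce exactly two parts, so a line with a second tab would raise an error. A minimal sketch of one way around that, assuming the first tab is the real separator, is to pass maxsplit=1. The 'run' entry below is a made-up example that does not come from the question:

```python
# Sample data in the question's format; the last entry is hypothetical
# and contains a second tab inside the tag part.
list1 = ['have\tVERB', 'and\tCONJ', 'tree\tNOUN', 'go\tVERB', 'run\tVERB\textra']

# maxsplit=1 splits at the first tab only, so each item yields exactly two parts.
dict1 = dict(item.split('\t', 1) for item in list1)

print(repr(dict1['run']))
```

Note that if the same word appears more than once in the list, the last occurrence wins, as with any dict construction.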
Answer 1 (score: 7)
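Since str.split also splits on '\t' by default ('\t' is considered whitespace), you can take a functional approach and pass a map object to dict, which looks quite elegant: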
d = dict(map(str.split, list1))
The dictionary d is now in the wanted form:
print(d)
{'and': 'CONJ', 'go': 'VERB', 'have': 'VERB', 'tree': 'NOUN'}
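If you need to split only on '\t' (ignoring ' ' and '\n') and still want to use the map approach, you can use functools.partial to create a partial object that uses only '\t' as the separator: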
from functools import partial
# split only on '\t', ignoring newlines, spaces, etc.
tabsplit = partial(str.split, sep='\t')
d = dict(map(tabsplit, list1))
With the sample list of strings, this of course yields the same result for d.
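The two variants only differ when an entry contains other whitespace. A small sketch of that difference, using a hypothetical entry whose word contains a space (sample and whitespace_split_ok are illustrative names, not part of the question's data):

```python
from functools import partial

# Hypothetical data: the first word itself contains a space.
sample = ['New York\tNOUN', 'go\tVERB']

# Splitting only on tabs keeps 'New York' together as one key.
tabsplit = partial(str.split, sep='\t')
d = dict(map(tabsplit, sample))

# Splitting on any whitespace breaks the first item into three pieces,
# and dict() raises ValueError for anything that is not a pair.
try:
    dict(map(str.split, sample))
    whitespace_split_ok = True
except ValueError:
    whitespace_split_ok = False

print(d, whitespace_split_ok)
```

So the plain str.split version is fine for single-token words, while the tab-only version is the safer choice when words may contain spaces.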
Answer 2 (score: 4)
Use a simple dict comprehension and str.split (called without arguments, split splits on whitespace):
list1 = ['have\tVERB', 'and\tCONJ', 'tree\tNOUN', 'go\tVERB']
dict1 = {x.split()[0]:x.split()[1] for x in list1}
Result:
{'and': 'CONJ', 'go': 'VERB', 'tree': 'NOUN', 'have': 'VERB'}
EDIT: x.split()[0]:x.split()[1] calls split twice, which is not optimal. The other answers here, which do not use a dict comprehension, do this better.
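If a dict comprehension is still preferred, one possible way to avoid the double split is to unpack from an inner generator so that each string is split only once. This is a sketch, assuming every item contains exactly one tab:

```python
list1 = ['have\tVERB', 'and\tCONJ', 'tree\tNOUN', 'go\tVERB']

# The inner generator splits each string once; the comprehension
# unpacks the resulting pair into key and value.
dict1 = {word: tag for word, tag in (x.split('\t') for x in list1)}

print(dict1)
```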
Answer 3 (score: 3)
Since the split method splits on '\t' by default (as pointed out by Jim Fasarakis-Hilliard), a short way to solve the problem could be:
dictionary = dict(item.split() for item in list1)
print(dictionary)
I also wrote a simpler, more classic approach.
It is not very Pythonic, but it is easy for beginners to understand:
list1 = ['have\tVERB', 'and\tCONJ', 'tree\tNOUN', 'go\tVERB']
dictionary1 = {}
for item in list1:
    splitted_item = item.split('\t')
    word = splitted_item[0]
    word_type = splitted_item[1]
    dictionary1[word] = word_type
print(dictionary1)
Here is the same code with rather detailed comments:
# Let's start with our word list, we'll call it 'list1'
list1 = ['have\tVERB', 'and\tCONJ', 'tree\tNOUN', 'go\tVERB']
# Here's an empty dictionary, 'dictionary1'
dictionary1 = {}
# Let's start to iterate using variable 'item' through 'list1'
for item in list1:
    # Here I split item into two parts, passing the '\t' character
    # to the split function, and put the resulting list of two elements
    # into the 'splitted_item' variable.
    # If you want to know more about the split function, check the link available
    # at the end of this answer
    splitted_item = item.split('\t')
    # Just to make the code more readable, I put the 1st part
    # of the split item (part 0, because we start counting
    # from 0) into the 'word' variable
    word = splitted_item[0]
    # I use the same approach to save the 2nd part of the
    # split item into the 'word_type' variable
    # Yes, you're right: we use 1 because we start counting from 0
    word_type = splitted_item[1]
    # Finally I add the 'word' key to 'dictionary1' with a value of 'word_type'
    dictionary1[word] = word_type
# After the for loop has completed, I print the now
# complete dictionary1 to check that the result is correct
print(dictionary1)
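The loop form also makes it easy to guard against lines that have no tab at all, which would otherwise raise an IndexError at splitted_item[1]. A minimal sketch using str.partition follows; the 'broken line' entry and the skipped list are hypothetical additions for illustration:

```python
# Sample data with one hypothetical malformed entry (no tab).
list1 = ['have\tVERB', 'and\tCONJ', 'broken line', 'tree\tNOUN', 'go\tVERB']

dictionary1 = {}
skipped = []
for item in list1:
    # partition always returns (before, separator, after);
    # separator is an empty string when no tab was found
    word, sep, word_type = item.partition('\t')
    if not sep:
        # Remember the malformed line instead of failing
        skipped.append(item)
        continue
    dictionary1[word] = word_type

print(dictionary1)
print(skipped)
```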
Useful links: