First, here is my code so far; I'll explain it below:
l1 = [
'A',
'B',
'C',
'D'
]
l2 = [
['A', 10],
['B', 20],
['D', 5],
['A', 15],
['B', 30],
['C', 10],
['D', 15]
]
dc = dict(l2)
l3 = [[k, dc.get(k, 0)] for k in l1]
The result is:
['A', 15]
['B', 30]
['C', 10]
['D', 15]
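The first list, l1, consists of a fixed number of keys, and the second list, l2, holds a value for each key given in the first list. The l2 shown here is only an example: I will receive the real values later (as a list), but they will have the same keys as l1. Every key needs to be displayed, keys can repeat, and some keys may have an empty value (for example, item C).
But when the list is turned into a dict, the first value for each key is discarded, and only the dictionary's unique keys are returned.
How can I get a result like the one below?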
['A', 10]
['B', 20]
['C', 0]
['D', 5]
['A', 15]
['B', 30]
['C', 10]
['D', 15]
Another example:
database_keys = [
'First Name',
'Last Name',
'Email',
'City'
]
database_input = [
['First Name', 'John'],
['Last Name', 'Doe'],
['Email', 'johndoe@test.com'],
['First Name', 'Jane'],
['Email', 'jane@test.com']
]
Output:
['First Name', 'John']
['Last Name', 'Doe']
['Email', 'johndoe@test.com']
['City', None]
['First Name', 'Jane']
['Last Name', None]
['Email', 'jane@test.com']
['City', None]
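Answer 0 (score: 3)
I would use a generator to fill in the missing values: keep a cycle of the keys, and yield an empty value whenever the next expected key is not the one in the data: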
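import itertools

def fill_the_blanks(data, keys):
    keys = itertools.cycle(keys)  # endlessly repeat the expected key order
    for name, value in data:
        k = next(keys)
        # Emit placeholders for every expected key that the data skipped.
        while name != k:
            yield [k, None]
            k = next(keys)
        yield [name, value]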
>>> from pprint import pprint
>>> pprint( list(fill_the_blanks(l2, l1)) )
[['A', 10],
 ['B', 20],
 ['C', None],
 ['D', 5],
 ['A', 15],
 ['B', 30],
 ['C', 10],
 ['D', 15]]
>>> pprint( list(fill_the_blanks(database_input,database_keys)) )
[['First Name', 'John'],
 ['Last Name', 'Doe'],
 ['Email', 'johndoe@test.com'],
 ['City', None],
 ['First Name', 'Jane'],
 ['Last Name', None],
 ['Email', 'jane@test.com']]
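Note that the generator above stops as soon as the data runs out, so the last record is not padded (there is no trailing ['City', None] in the output). Here is a minimal sketch of one way to pad it, assuming every record should span the full key list. The name fill_the_blanks_padded is hypothetical and not part of the original answer:

```python
def fill_the_blanks_padded(data, keys):
    """Hypothetical variant of fill_the_blanks that also pads the final record."""
    n = len(keys)
    pos = 0  # running position; keys[pos % n] is the next expected key
    for name, value in data:
        if name not in keys:
            raise ValueError("unexpected key: %r" % (name,))
        # Emit placeholders until the expected key matches the incoming one.
        while keys[pos % n] != name:
            yield [keys[pos % n], None]
            pos += 1
        yield [name, value]
        pos += 1
    # Pad whatever remains of the last, incomplete record.
    while pos % n != 0:
        yield [keys[pos % n], None]
        pos += 1


database_keys = ['First Name', 'Last Name', 'Email', 'City']
database_input = [
    ['First Name', 'John'],
    ['Last Name', 'Doe'],
    ['Email', 'johndoe@test.com'],
    ['First Name', 'Jane'],
    ['Email', 'jane@test.com'],
]
padded = list(fill_the_blanks_padded(database_input, database_keys))
```

With the sample data, padded has eight rows and ends with ['City', None], which matches the output asked for in the question.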
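As an alternative, if you know that the first key, 'First Name', will always mark the start of an entry, why not use dict.fromkeys and fill the dict in until you reach the next "first value":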
def gen_dicts(data, keys):
    first_key = keys[0]
    entry = None  # placeholder for first time
    for name, value in data:
        if name == first_key:
            if entry is not None:  # skip first time
                yield entry
            entry = dict.fromkeys(keys)
        entry[name] = value
    yield entry  # last one
>>> from pprint import pprint
>>> pprint( list(gen_dicts(l2, l1)) )
[{'A': 10, 'B': 20, 'C': None, 'D': 5}, {'A': 15, 'B': 30, 'C': 10, 'D': 15}]
>>> pprint( list(gen_dicts(database_input, database_keys)) )
[{'City': None,
  'Email': 'johndoe@test.com',
  'First Name': 'John',
  'Last Name': 'Doe'},
 {'City': None,
  'Email': 'jane@test.com',
  'First Name': 'Jane',
  'Last Name': None}]
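If you still need the flat list of [key, value] pairs from the question, the dicts can be walked in key order afterwards. This is only a sketch: gen_dicts is repeated from above so the snippet runs on its own, and flatten_entries is a hypothetical helper name:

```python
def gen_dicts(data, keys):
    # Same generator as above, repeated so this snippet is self-contained.
    first_key = keys[0]
    entry = None
    for name, value in data:
        if name == first_key:
            if entry is not None:
                yield entry
            entry = dict.fromkeys(keys)
        entry[name] = value
    yield entry


def flatten_entries(entries, keys):
    """Hypothetical helper: turn each dict back into [key, value] pairs in key order."""
    for entry in entries:
        for k in keys:
            yield [k, entry[k]]


l1 = ['A', 'B', 'C', 'D']
l2 = [['A', 10], ['B', 20], ['D', 5],
      ['A', 15], ['B', 30], ['C', 10], ['D', 15]]
flat = list(flatten_entries(gen_dicts(l2, l1), l1))
```

Because each dict is created with dict.fromkeys(keys), every key is present, so the last record comes out fully padded as well.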
Answer 1 (score: 1)
Here is a dirty way:
l1 = [
'A',
'B',
'C',
'D',
]
l2 = [
['A', 10],
['B', 20],
['D', 5],
['A', 15],
['B', 30],
['C', 10],
['D', 15],
['A', 8],
]
# Assuming elements in l2 are ordered, try to make groups
# of the same length of l1.
l_aux = l1[:]  # keys still missing from the current group
l3 = [[]]
for x in l2:
    if x[0] in l_aux:
        l3[-1].append(x)
        l_aux.remove(x[0])
        continue
    # Key already seen: pad the current group, then start a new one.
    for y in l_aux:
        l3[-1].append([y, 'WHATEVER'])
    l3.append([x])
    l_aux = l1[:]
    l_aux.remove(x[0])
# Pad the last group.
for y in l_aux:
    l3[-1].append([y, 'WHATEVER'])
# Now, you have the elements you want grouped.
# Last step: sort and flatten the list:
l3 = [y for x in l3 for y in sorted(x)]
print('\n'.join(str(x) for x in l3))
# ['A', 10]
# ['B', 20]
# ['C', 'WHATEVER']
# ['D', 5]
# ['A', 15]
# ['B', 30]
# ['C', 10]
# ['D', 15]
# ['A', 8]
# ['B', 'WHATEVER']
# ['C', 'WHATEVER']
# ['D', 'WHATEVER']
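One caveat: sorted(x) orders each group alphabetically by key, which matches l1 only because 'A' through 'D' happen to be in alphabetical order. For keys such as those in the question's second example, a sketch that sorts by position in the key list instead might look like this (the names keys, group, and order are illustrative only):

```python
keys = ['First Name', 'Last Name', 'Email', 'City']

# One padded group, in the order the loop above could have produced it.
group = [
    ['First Name', 'Jane'],
    ['Email', 'jane@test.com'],
    ['Last Name', 'WHATEVER'],
    ['City', 'WHATEVER'],
]

# Map each key to its index so groups can be sorted in key-list order.
order = {k: i for i, k in enumerate(keys)}
ordered = sorted(group, key=lambda pair: order[pair[0]])

# For comparison: plain sorted() orders alphabetically, starting with 'City'.
alphabetical = sorted(group)
```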
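Answer 2 (score: 1)
The issue here is how a dictionary stores values. A dictionary takes your key, calls its __hash__ function, and then stores the value under the result. For strings, two strings with the same value produce the same output when hashed with __hash__. For example: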
>>> a = "foo"
>>> b = "foo"
>>> a == b
True
>>> a.__hash__()
-905768032644956145
>>> b.__hash__()
-905768032644956145
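As you can see, both produce the same value from __hash__. So when a dictionary is asked to store two identical keys, it overwrites the previous value instead of creating a new key.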
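A small illustration of that overwrite, using a shortened version of the question's data. The second half is only a sketch of one way to keep every value, using collections.defaultdict from the standard library, and is not something the question asked for:

```python
from collections import defaultdict

pairs = [['A', 10], ['B', 20], ['A', 15]]

# dict() keeps only the last value seen for each key.
dc = dict(pairs)  # {'A': 15, 'B': 20}

# To keep every value instead, collect them in a list per key.
grouped = defaultdict(list)
for name, value in pairs:
    grouped[name].append(value)
```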
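Looking at your first and second examples, you could use a list of dictionaries instead (assuming every entry starts with "A" or "First Name"). So you could do this: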
dc = []
for s in database_input:
    if s[0] != "First Name":
        # Any other key belongs to the most recent person.
        dc[-1][s[0]] = s[1]
    else:
        # "First Name" starts a new person.
        dc.append({s[0]: s[1]})
Then, to retrieve the "First Name" of the first person you entered from dc, you can use this:
dc[0]["First Name"]
An extension of this is to store them as classes. Suppose we have a class called Person:
class Person(object):
    def __init__(self, personal_information):
        super(Person, self).__init__()
        self.first_name = personal_information["First Name"]
        if "Last Name" in personal_information.keys():
            self.last_name = personal_information["Last Name"]
        if "Email" in personal_information.keys():
            self.email = personal_information["Email"]
        if "City" in personal_information.keys():
            self.city = personal_information["City"]

    def __repr__(self):
        # Just to make things look clean
        return "Person("+self.first_name+")"
This can hold all of our data if we pass in the dictionaries already stored in dc:
people = []
for s in dc:
    people.append(Person(s))
When you want to access the first person's first name:
>>> people
[Person(John), Person(Jane)]
>>> people[0].first_name
'John'
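One thing to watch for: the Person class above only sets last_name, email, and city when the key is present, so people[1].last_name would raise AttributeError for Jane. A sketch of a variant that defaults missing fields to None with dict.get follows; PersonWithDefaults is a hypothetical name, not part of the original answer:

```python
class PersonWithDefaults(object):
    """Hypothetical variant of Person: missing fields default to None."""

    def __init__(self, personal_information):
        # "First Name" is still required, as in the original class.
        self.first_name = personal_information["First Name"]
        # dict.get returns None when the key is absent.
        self.last_name = personal_information.get("Last Name")
        self.email = personal_information.get("Email")
        self.city = personal_information.get("City")

    def __repr__(self):
        return "PersonWithDefaults(" + self.first_name + ")"


jane = PersonWithDefaults({"First Name": "Jane", "Email": "jane@test.com"})
```

Here jane.last_name and jane.city are both None, which mirrors the None placeholders in the question's desired output.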
Which data structure to use is up to you.