Joining lists with repeated values

Time: 2016-06-05 01:33:08

Tags: python list python-3.x

First of all, here is my code so far; I will explain it afterwards:

ll1 = [
'A',
'B',
'C',
'D'
]

l2 = [
['A', 10],
['B', 20],
['D', 5],
['A', 15],
['B', 30],
['C', 10],
['D', 15]
]

dc = dict(l2)
l3 = [[k, dc.get(k, 0)] for k in ll1]

The result is:

['A', 15]
['B', 30]
['C', 10]
['D', 15]

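The first list, ll1, holds a fixed set of keys, and the second list, l2, holds a value for each key in the first list. The l2 shown here is only an example: I will receive the real values later (as a list), but they will use the same keys as ll1. Every key needs to be shown. Keys may repeat, and some keys may have no value (for example, item C).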

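But when the list is turned into a dict, the first value of each key is discarded, and only the unique keys of the dictionary are returned.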

How can I get a result like the one below?

['A', 10]
['B', 20]
['C', 0]
['D', 5]
['A', 15]
['B', 30]
['C', 10]
['D', 15]

Another example:

database_keys = [
'First Name',
'Last Name',
'Email',
'City'
]
database_input = [
['First Name', 'John'],
['Last Name', 'Doe'],
['Email', 'johndoe@test.com'],
['First Name', 'Jane'],
['Email', 'jane@test.com']
]

Output:
['First Name', 'John']
['Last Name', 'Doe']
['Email', 'johndoe@test.com']
['City', None]
['First Name', 'Jane']
['Last Name', None]
['Email', 'jane@test.com']
['City', None]

3 answers:

Answer 0: (score: 3)

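I would use a generator to fill in the missing values: just keep a cycle of the keys, and yield an empty value whenever the next expected key is not the one in the data: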

import itertools
def fill_the_blanks(data, keys):
    keys = itertools.cycle(keys)
    for name, value in data:
        k = next(keys)
        while name!=k:
            yield [k,None]
            k = next(keys)
        yield [name,value]


>>> from pprint import pprint
>>> pprint( list(fill_the_blanks(l2, ll1)) )
[['A', 10],
 ['B', 20],
 ['C', None],
 ['D', 5],
 ['A', 15],
 ['B', 30],
 ['C', 10],
 ['D', 15]]
>>> pprint( list(fill_the_blanks(database_input,database_keys)) )
[['First Name', 'John'],
 ['Last Name', 'Doe'],
 ['Email', 'johndoe@test.com'],
 ['City', None],
 ['First Name', 'Jane'],
 ['Last Name', None],
 ['Email', 'jane@test.com']]
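Note that in the second output above the last record stops at 'Email': the generator only pads gaps it finds before a data item, so a trailing 'City' is never produced. Below is a minimal sketch of a variant that also pads the final record. The name fill_and_pad and the fill parameter are made up for this illustration, and it assumes every key in the data also appears in keys.

```python
def fill_and_pad(data, keys, fill=None):
    """Hypothetical variant of fill_the_blanks that also pads the last record."""
    keys = list(keys)
    if not keys:
        return  # nothing to align the data against
    position = 0  # index in keys of the next key we expect to see
    for name, value in data:
        if name not in keys:
            # Without this check an unknown key would loop forever.
            raise ValueError("unexpected key: %r" % (name,))
        # Emit a filler for every expected key that the data skipped.
        while keys[position] != name:
            yield [keys[position], fill]
            position = (position + 1) % len(keys)
        yield [name, value]
        position = (position + 1) % len(keys)
    # If the data ended in the middle of a record, pad the rest of it.
    while position != 0:
        yield [keys[position], fill]
        position = (position + 1) % len(keys)


database_keys = ['First Name', 'Last Name', 'Email', 'City']
database_input = [
    ['First Name', 'John'],
    ['Last Name', 'Doe'],
    ['Email', 'johndoe@test.com'],
    ['First Name', 'Jane'],
    ['Email', 'jane@test.com'],
]
rows = list(fill_and_pad(database_input, database_keys))
for row in rows:
    print(row)
```

Passing fill=0 with the question's first example should give ['C', 0] for the missing item, which is the filler the question asks for there.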

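Alternatively, if you know that the first key (for example 'First Name') will always mark the start of an entry, why not use dict.fromkeys and then fill the entry in until you reach the next "first" key: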

def gen_dicts(data, keys):
    first_key = keys[0]
    entry = None #placeholder for first time
    for name, value in data:
        if name == first_key:
            if entry is not None: #skip first time
                yield entry
            entry = dict.fromkeys(keys)
        entry[name] = value
    yield entry #last one

>>> from pprint import pprint
>>> pprint( list(gen_dicts(l2, ll1)) )
[{'A': 10, 'B': 20, 'C': None, 'D': 5}, {'A': 15, 'B': 30, 'C': 10, 'D': 15}]
>>> pprint( list(gen_dicts(database_input, database_keys)) )
[{'City': None,
  'Email': 'johndoe@test.com',
  'First Name': 'John',
  'Last Name': 'Doe'},
 {'City': None,
  'Email': 'jane@test.com',
  'First Name': 'Jane',
  'Last Name': None}]
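If the flat list of [key, value] pairs from the question is still needed, dictionaries like the ones above can be expanded back in key order. This is only a sketch: flatten_records is a hypothetical helper name, and the records list is typed in by hand here to keep the example self-contained.

```python
def flatten_records(records, keys):
    """Turn a list of dicts into [key, value] pairs, one full set of keys per dict."""
    # record.get(k) gives None when a key is absent from a record.
    return [[k, record.get(k)] for record in records for k in keys]


keys = ['A', 'B', 'C', 'D']
records = [
    {'A': 10, 'B': 20, 'C': None, 'D': 5},
    {'A': 15, 'B': 30, 'C': 10, 'D': 15},
]
flat = flatten_records(records, keys)
for pair in flat:
    print(pair)
```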

Answer 1: (score: 1)

Here is one way:

l1 = [
'A',
'B',
'C',
'D',
]

l2 = [
['A', 10],
['B', 20],
['D', 5],

['A', 15],
['B', 30],
['C', 10],
['D', 15],

['A', 8],
]

# Assuming elements in l2 are ordered, try to make groups
# of the same length of l1.
l_aux = l1[:]
l3 = [[]]
for x in l2:
    if x[0] in l_aux:
        l3[-1].append(x)
        l_aux.remove(x[0])
        continue
    for y in l_aux:
        l3[-1].append([y, 'WHATEVER'])
    l3.append([x])
    l_aux = l1[:]
    l_aux.remove(x[0])
for y in l_aux:
    l3[-1].append([y, 'WHATEVER'])
# Now, you have the elements you want grouped.
# Last step: sort each group and flatten the list.
# (sorted() orders by key name, which matches l1 here because l1 is alphabetical.)
l3 = [y for x in l3 for y in sorted(x)]
print('\n'.join(str(x) for x in l3))
# ['A', 10]
# ['B', 20]
# ['C', 'WHATEVER']
# ['D', 5]
# ['A', 15]
# ['B', 30]
# ['C', 10]
# ['D', 15]
# ['A', 8]
# ['B', 'WHATEVER']
# ['C', 'WHATEVER']
# ['D', 'WHATEVER']

Answer 2: (score: 1)

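The problem here is how dictionaries store values. A dictionary takes your key, calls its __hash__ function, and then stores the value. When it comes to strings, two strings with the same value will produce the same output when __hash__ed. For example: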

>>> a = "foo"
>>> b = "foo"
>>> a == b
True
>>> a.__hash__()
-905768032644956145
>>> b.__hash__()
-905768032644956145

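As you can see, they both have the same value when __hash__ed. Because the two keys hash to the same value and also compare equal, the dictionary treats them as one key: when it tries to store the second one, it overwrites the previous value instead of creating a new key.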
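A short illustration of that overwrite, using made-up sample pairs. The second half shows one common way to keep every value, grouping them under each key with collections.defaultdict. That grouping is an addition for illustration, not something the question asked for.

```python
from collections import defaultdict

pairs = [['A', 10], ['B', 20], ['A', 15]]

# dict() keeps only the last value seen for each key.
collapsed = dict(pairs)
print(collapsed['A'])   # 15, the earlier 10 has been overwritten
print(len(collapsed))   # 2, only the unique keys remain

# Grouping the values in a list per key keeps all of them.
grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)
print(grouped['A'])     # [10, 15]
```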

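Looking at your first and second examples, you could use a list of dictionaries instead (assuming each entry starts with "A" or "First Name"). So you could do this: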

dc = []
for s in database_input:
    if s[0] != "First Name":
        # any other key belongs to the most recent person
        dc[-1][s[0]] = s[1]
    else:
        # "First Name" starts a new person
        dc.append({s[0]: s[1]})

Then, to retrieve the "First Name" of the first person you entered from dc, you can use this:

dc[0]["First Name"]
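One thing to watch with this layout is that a person only has the keys that actually appeared in the input. A minimal sketch of how lookups behave, assuming dc is built from database_input as described above (the data is repeated here so the example runs on its own):

```python
database_input = [
    ['First Name', 'John'],
    ['Last Name', 'Doe'],
    ['Email', 'johndoe@test.com'],
    ['First Name', 'Jane'],
    ['Email', 'jane@test.com'],
]

# Build one dictionary per person; 'First Name' starts a new person.
dc = []
for key, value in database_input:
    if key == 'First Name':
        dc.append({key: value})
    else:
        dc[-1][key] = value

first_name = dc[0]['First Name']       # present, so indexing is safe
last_name = dc[1].get('Last Name')     # Jane has no 'Last Name': get() returns None
city = dc[1].get('City', 'unknown')    # get() also accepts a default value
print(first_name, last_name, city)
```

Indexing with dc[1]['Last Name'] would raise a KeyError here, which is why get() is used for keys that may be absent.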

An extension of this is to store them as classes. Suppose we have a class called Person:

class Person(object):
    def __init__(self, personal_information):
        super(Person, self).__init__()
        self.first_name = personal_information["First Name"]
        # .get() returns None for a missing key, so every attribute always exists
        self.last_name = personal_information.get("Last Name")
        self.email = personal_information.get("Email")
        self.city = personal_information.get("City")
    def __repr__(self):
        # Just to make things look clean
        return "Person("+self.first_name+")"

This can store all of our data by passing in the dictionaries already stored in dc:

people = []

for s in dc:
    people.append(Person(s))

Then, when you want to access the first person's first name:

>>> people
[Person(John), Person(Jane)]
>>> people[0].first_name
'John'

Which type of data structure you use is up to you.