Question

Suppose we have a list of book objects, each with a title and an author ID:

books = [
    { 'title': 'book1', 'author_id': 'author1' },
    { 'title': 'book2', 'author_id': 'author2' },
    { 'title': 'book3', 'author_id': 'author1' }
]

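How can we efficiently convert this list into a list of author objects, each with a books attribute containing all of that author's books? That is, convert the list above into this one: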

authors = [
    { 'author_id': 'author1', 'books': [{ 'title': 'book1' }, { 'title': 'book3' }] },
    { 'author_id': 'author2', 'books': [{ 'title': 'book2' }] }
]

Here is my attempt at a solution, although it seems inefficient and convoluted:

authors = []

for book in books:
    # Index of the author's object if it has already been added to the array
    existing_author_indices = [i for i in range(len(authors)) if authors[i]['author_id'] == book['author_id']]

    # The author is already in authors, so add the book to its books
    if len(existing_author_indices) > 0:
        authors[existing_author_indices[0]]['books'].append(book)
    # Add the author to authors with this book as the only one yet
    else:
        author = { 'author_id': book['author_id'], 'books': [book] }
        authors.append(author)

Any suggestions would be greatly appreciated.

Answer 1

Using itertools.groupby, you can do the following:

from itertools import groupby

key = lambda d: d['author_id']

authors = [
    {'author_id': k, 'books': [{'title': d['title']} for d in g]}
    for k, g in groupby(sorted(books, key=key), key=key)
]

This sorts the books by author_id, groups them by that key (k), and collects the titles from each group (g).
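
The sort matters because groupby only merges consecutive items that share a key. A small sketch of the difference, reusing the books list from the question (the names unsorted_keys and sorted_keys are just for illustration):

```python
from itertools import groupby

books = [
    {'title': 'book1', 'author_id': 'author1'},
    {'title': 'book2', 'author_id': 'author2'},
    {'title': 'book3', 'author_id': 'author1'},
]

key = lambda d: d['author_id']

# Without sorting, author1 is split into two separate groups
# because its books are not adjacent in the input.
unsorted_keys = [k for k, _ in groupby(books, key=key)]

# After sorting by the same key, each author forms exactly one group.
sorted_keys = [k for k, _ in groupby(sorted(books, key=key), key=key)]

print(unsorted_keys)  # ['author1', 'author2', 'author1']
print(sorted_keys)    # ['author1', 'author2']
```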

That said, wouldn't the following structure be much simpler, without losing any information:

authors = {
    k: [d['title'] for d in g]
    for k, g in groupby(sorted(books, key=key), key=key)
}

# {
#     'author1': ['book1', 'book3'],
#     'author2': ['book2']
# }

Answer 2

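You can use a defaultdict to build a dictionary whose keys are author IDs and whose values are the lists of each author's books. Once you have that, converting it to a list is easy: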

from collections import defaultdict

books = [
    { 'title': 'book1', 'author_id': 'author1' },
    { 'title': 'book2', 'author_id': 'author2' },
    { 'title': 'book3', 'author_id': 'author1' }
]

d = defaultdict(list)
for book in books:
    d[book['author_id']].append({'title': book['title']})

authors = [{'author_id': k, 'books': v} for k, v in d.items()]
# [{'author_id': 'author1', 'books': [{'title': 'book1'}, {'title': 'book3'}]},
#  {'author_id': 'author2', 'books': [{'title': 'book2'}]}]

This runs in O(n) time, because it does not require sorting.
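
A side effect worth knowing: since no sort is involved, authors come out in the order they first appear in the input (dicts preserve insertion order in Python 3.7+). Below is a rough sketch of the same idea wrapped in a function; the name group_books_by_author and the reordered sample list are my own, not from the question:

```python
from collections import defaultdict

def group_books_by_author(books):
    """Group books by author_id in a single pass over the list."""
    grouped = defaultdict(list)
    for book in books:
        grouped[book['author_id']].append({'title': book['title']})
    # Dict iteration follows insertion order, so authors keep first-seen order.
    return [{'author_id': k, 'books': v} for k, v in grouped.items()]

# Hypothetical input where author2 comes first
sample = [
    {'title': 'book2', 'author_id': 'author2'},
    {'title': 'book1', 'author_id': 'author1'},
    {'title': 'book3', 'author_id': 'author1'},
]

result = group_books_by_author(sample)
print([a['author_id'] for a in result])  # ['author2', 'author1']
```

If you need the authors sorted by ID instead, sorting the resulting list afterwards is one option.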

Answer 3

I would suggest this structure (edit it as you see fit, I only keep the titles):

{'author1': ['book1', 'book3'], 'author2': ['book2']}

You can get it like this:

authors = dict()
for book in books:
    author_id = book['author_id']
    if author_id not in authors:
        authors[author_id] = list()
    author_books = authors[author_id]
    book_title = book['title']
    if book_title not in author_books:
        author_books.append(book_title)

Answer 4

This works for me. It just collects the authors in a dict and returns the constructed list at the end:

def trans(books):
    authors = {}
    for bk in books:
        if bk['author_id'] not in authors:
            authors[bk['author_id']] = [{'title': bk['title']}]
        else:
            authors[bk['author_id']].append({'title': bk['title']})

    return [{'author_id': k, 'books': authors[k]} for k in authors]
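
If you prefer to avoid the if/else, dict.setdefault can probably do the same job in one line per book. This is only a sketch of that variation; the name trans_setdefault is made up for the example:

```python
def trans_setdefault(books):
    authors = {}
    for bk in books:
        # setdefault returns the existing list, or stores and returns a new one
        authors.setdefault(bk['author_id'], []).append({'title': bk['title']})
    return [{'author_id': k, 'books': v} for k, v in authors.items()]

books = [
    {'title': 'book1', 'author_id': 'author1'},
    {'title': 'book2', 'author_id': 'author2'},
    {'title': 'book3', 'author_id': 'author1'},
]

authors = trans_setdefault(books)
```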

Answer 5

This works for me. It uses a map, so there is no need for multiple loops:

authors_map = {}
authors = []
for book in books:
    author_id = book['author_id']
    if author_id in authors_map:
        # authors_map stores each author's index in the authors list
        authors[authors_map[author_id]]['books'].append({'title': book['title']})
    else:
        authors_map[author_id] = len(authors)
        authors.append({'author_id': author_id, 'books': [{'title': book['title']}]})

Converting a list of Python objects into a list of one-to-many objects
