I created a list like this:
Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'),
(25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'),
(26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold'),...]
I want to convert it into a dictionary like this:
bookDict = { 24: {'Start': '2008-10-30', 'End': '2008-12-20','reason':'sold'},
25: {'Start': '2009-01-01', 'End': '2009-11-14','reason':'returned'},
26: {'Start': '2010-04-03', 'End': '2010-10-11','reason':'sold'},...}
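Each key in the dictionary is the first value of a tuple in the Book list (it is a code). For each key I want two entries as the value: one for the 'Start' point and one for the 'End' point of that particular code.
I also have another problem. Some codes have more than one 'End' point, each with a different date. I want to keep only the 'End' point with the latest date. Something like this: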
Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End', 'sold'),
(24, '2009-02-04', 'End', 'sold'), (24, '2009-11-25', 'End', 'sold')]
For the example above, the dictionary should keep this:
bookDict = { 24: {'Start': '2008-10-30', 'End': '2009-11-25','reason':'sold'}}
Can someone help me?
Answer 0: (score: 1)
You can use itertools.groupby, min, and max:
import itertools

def quantity_key(d):
    # Turn 'YYYY-MM-DD' into [year, month, day] so dates compare numerically.
    return list(map(int, d[1].split('-')))

Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'), (25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'), (26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold')]
# groupby only groups consecutive items, so Book must already be ordered by code.
new_books = {a:list(b) for a, b in itertools.groupby(Book, key=lambda x:x[0])}
# The earliest record supplies 'Start'; the latest record supplies 'End' and its reason.
final_books = {a:{'Start':min(b, key=quantity_key)[1], 'End':max(b, key=quantity_key)[1], 'reason':max(b, key=quantity_key)[-1]} for a, b in new_books.items()}
Output:
{24: {'Start': '2008-10-30', 'End': '2008-12-20', 'reason': 'sold'}, 25: {'Start': '2009-01-01', 'End': '2009-11-14', 'reason': 'returned'}, 26: {'Start': '2010-04-03', 'End': '2010-10-11', 'reason': 'sold'}}
With more than two values per key:
Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End', 'sold'), (24, '2009-02-04', 'End', 'sold'), (24, '2009-11-25', 'End', 'sold')]
new_books = {a:list(b) for a, b in itertools.groupby(Book, key=lambda x:x[0])}
final_books = {a:{'Start':min(b, key=quantity_key)[1], 'End':max(b, key=quantity_key)[1], 'reason':max(b, key=quantity_key)[-1]} for a, b in new_books.items()}
Output:
{24: {'Start': '2008-10-30', 'End': '2009-11-25', 'reason': 'sold'}}
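One caveat with this approach: itertools.groupby only groups consecutive items, so it relies on Book already being ordered by code. It also takes 'Start' from the earliest record and 'End' from the latest record without looking at the marker. The following is a minimal sketch, not the answer's own code, that assumes the input may be unordered. The helper name build_book_dict is hypothetical. It sorts first and picks Start and End by their marker:

```python
import itertools
from operator import itemgetter

def build_book_dict(book):
    """Hypothetical helper: group records by code and keep the latest 'End'."""
    result = {}
    # groupby only merges adjacent items, so sort by code first.
    ordered = sorted(book, key=itemgetter(0))
    for code, group in itertools.groupby(ordered, key=itemgetter(0)):
        records = list(group)
        starts = [r for r in records if r[2] == 'Start']
        ends = [r for r in records if r[2] == 'End']
        entry = {}
        if starts:
            # Zero-padded 'YYYY-MM-DD' strings sort chronologically.
            entry['Start'] = min(starts, key=itemgetter(1))[1]
        if ends:
            latest = max(ends, key=itemgetter(1))
            entry['End'] = latest[1]
            entry['reason'] = latest[3]
        result[code] = entry
    return result

# Deliberately unordered sample data (illustrative only).
Book = [(25, '2009-01-01', 'Start'), (24, '2008-10-30', 'Start'),
        (24, '2008-12-20', 'End', 'sold'), (25, '2009-11-14', 'End', 'returned'),
        (24, '2009-11-25', 'End', 'sold')]
print(build_book_dict(Book))
```

Sorting makes this O(n log n) rather than O(n), which should not matter for lists of modest size.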
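Answer 1: (score: 0)
Here is a solution that satisfies both criteria.
Whenever it encounters a new book ID, it creates a dict for it and fills that dict in as it comes across the matching data in the list.
For multiple End entries, your date format allows a plain string comparison to find the latest date.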
books = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'),
(25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'),
(26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold'),
(26, '2011-10-11', 'End', 'returned')] # The latest 'End' entry should be picked
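bookDict = {}
for info in books:
    id_ = info[0]
    type_ = info[2]
    # Create the inner dict the first time this book ID is seen.
    book = bookDict.setdefault(id_, {})
    if type_ == 'Start':
        book[type_] = info[1]
    # Only overwrite 'End' when this date is later than the stored one.
    elif type_ == 'End' and info[1] > book.get(type_, ''):
        book[type_] = info[1]
        book['reason'] = info[3]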
Output:
bookDict
# {24: {'Start': '2008-10-30', 'End': '2008-12-20', 'reason': 'sold'},
# 25: {'Start': '2009-01-01', 'End': '2009-11-14', 'reason': 'returned'},
# 26: {'Start': '2010-04-03', 'End': '2011-10-11', 'reason': 'returned'}}
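The string comparison works because the dates are zero-padded YYYY-MM-DD, where lexicographic order matches chronological order. Here is a small check, assuming that format holds for every record (the sample values are illustrative):

```python
from datetime import date

samples = ['2010-10-11', '2011-10-11', '2009-02-04', '2008-12-20']

# Sort once as plain strings and once as real dates; the orders should agree.
by_string = sorted(samples)
by_date = sorted(samples, key=date.fromisoformat)
print(by_string == by_date)

# Without zero padding the shortcut breaks: as strings, February 2009
# compares greater than November 2009 because '2' > '1'.
print('2009-2-4' > '2009-11-25')
```

If the input could contain unpadded or differently ordered dates, parsing them into date objects before comparing would be the safer choice.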
Answer 2: (score: 0)
You can do it like this:
d = {}
for t in Book:
    # Unpack code, date and marker; rest holds the reason when present.
    index, date, marker, *rest = t
    entry = d.setdefault(index, {})
    end_date = entry.get("End", "1900-01-01")
    if marker == "Start" or date > end_date:
        entry[marker] = date
        if rest:
            entry["reason"] = rest[0]
Answer 3: (score: 0)
This only answers the first part of the OP's question, although it can be adapted to the second part.
You can use collections.defaultdict for an O(n) solution:
book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'),
(25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'),
(26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold')]
from collections import defaultdict
d = defaultdict(dict)
for key, date, *data in book:
    d[key][data[0]] = date
    if len(data) == 2:
        d[key]['reason'] = data[1]
Alternatively, you can catch IndexError instead of testing the tuple length:
for key, date, *data in book:
    d[key][data[0]] = date
    try:
        d[key]['reason'] = data[1]
    except IndexError:
        continue
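One possible way to adapt this to the second part of the question (keeping only the latest 'End') is sketched below. It is not part of the original answer. It assumes zero-padded YYYY-MM-DD dates, so strings can be compared directly, and the sample data is illustrative:

```python
from collections import defaultdict

book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End', 'sold'),
        (24, '2009-11-25', 'End', 'sold'), (24, '2009-02-04', 'End', 'returned')]

d = defaultdict(dict)
for key, date, marker, *extra in book:
    entry = d[key]
    if marker == 'End':
        # Skip this record if an equal or later End date is already stored.
        if date <= entry.get('End', ''):
            continue
        entry['End'] = date
        if extra:
            entry['reason'] = extra[0]
    else:
        entry[marker] = date

print(dict(d))
```

The loop still visits each record once, so the O(n) behavior is preserved.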