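Suppose I have some data made up of timestamps, prices, and amounts. The data may be very large, and matching entries can occur anywhere in the list. A simple example looks like this: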
[{"date":1387496043,"price":19.379,"amount":1.000000},
{"date":1387496044,"price":20.20,"amount":2.00000},
{"date":1387496044,"price":10.00,"amount":0.10000},
{"date":1387496044,"price":20.20,"amount":0.300000}]
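How can I process this so that the amounts of any items with the same timestamp and the same price are merged?
The result should look like this (note that the amounts 2.0 and 0.3 have been summed):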
[{"date":1387496043,"price":19.379,"amount":1.000000},
{"date":1387496044,"price":20.20,"amount":2.30000},
{"date":1387496044,"price":10.00,"amount":0.10000}]
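I have tried a number of complicated approaches (using Python 2.7.3), but I don't know Python very well. I'm sure there is a neat way to find two matching entries, update one of them with the new amount, and delete the duplicate.
FYI, here is the test data: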
L=[{"date":1387496043,"price":19.379,"amount":1.000000},{"date":1387496044,"price":20.20,"amount":2.00000},{"date":1387496044,"price":10.00,"amount":0.10000},{"date":1387496044,"price":20.20,"amount":0.300000}]
Answer 0 (score: 2)
A defaultdict-based approach:
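from collections import defaultdict

# Running total of amounts keyed by (date, price); missing keys start at 0.0
d = defaultdict(float)
z = [{"date":1387496043,"price":19.379,"amount":1.000000},
     {"date":1387496044,"price":20.20,"amount":2.00000},
     {"date":1387496044,"price":10.00,"amount":0.10000},
     {"date":1387496044,"price":20.20,"amount":0.300000}]
for x in z:
    d[x["date"], x["price"]] += x["amount"]
# Rebuild the list of dicts. Written to run on both Python 2.7 and Python 3;
# on Python 2 the dict order is arbitrary, so the output order may differ.
print([{"date": k1, "price": k2, "amount": v} for (k1, k2), v in d.items()])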
[{'date': 1387496044, 'price': 10.0, 'amount': 0.1},
{'date': 1387496044, 'price': 20.2, 'amount': 2.3},
{'date': 1387496043, 'price': 19.379, 'amount': 1.0}]
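The output above does not come back in the original order, whereas the expected result in the question keeps each (date, price) pair where it first appeared. If that order matters, a minimal sketch is to swap the defaultdict for collections.OrderedDict. The function name merge_amounts is hypothetical and not part of the original answer:

```python
from collections import OrderedDict

def merge_amounts(records):
    """Sum 'amount' over records sharing (date, price), keeping first-seen order."""
    totals = OrderedDict()
    for rec in records:
        key = (rec["date"], rec["price"])
        # Start from 0.0 the first time a key is seen, then accumulate
        totals[key] = totals.get(key, 0.0) + rec["amount"]
    return [{"date": d, "price": p, "amount": a} for (d, p), a in totals.items()]
```

Calling merge_amounts(z) on the sample data should give three entries in the order 19.379, 20.2, 10.0. Since the amounts are floats, the summed values are subject to the usual floating-point rounding.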
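Answer 1 (score: 1)
The best way to do this is probably to use a dictionary with (date, price) as the key. If you encounter a key that is already present, you combine the two entries so that each key stays unique.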
def combine(L):
    results = {}
    for item in L:
        key = (item["date"], item["price"])
        if key in results:  # combine them
            results[key] = {"date": item["date"], "price": item["price"], "amount": item["amount"] + results[key]["amount"]}
        else:  # don't need to combine them
            results[key] = item
    # list() so the result is a real list on Python 3 as well as Python 2
    return list(results.values())
For your example this is a slightly messy O(n) solution, but it can clearly be generalized to solve your original problem.
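One way the generalization might look, as a rough sketch: pass in the fields that form the key and the field to sum. The name combine_by and its parameters key_fields and sum_field are assumptions made for illustration, not part of the original answer:

```python
def combine_by(records, key_fields, sum_field):
    """Merge records that agree on every field in key_fields, summing sum_field."""
    results = {}
    for item in records:
        key = tuple(item[f] for f in key_fields)
        if key in results:
            # Duplicate key: add this record's value to the stored copy
            results[key][sum_field] += item[sum_field]
        else:
            # First occurrence: store a copy so the caller's dicts are not mutated
            results[key] = dict(item)
    return list(results.values())
```

For the data in the question this would be called as combine_by(L, ("date", "price"), "amount"). Any fields that are neither in the key nor summed keep the value from the first matching record.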
Answer 2 (score: 1)
FWIW, you can also do this with database operations:
records = [
{"date":1387496043,"price":19.379,"amount":1.000000},
{"date":1387496044,"price":20.20,"amount":2.00000},
{"date":1387496044,"price":10.00,"amount":0.10000},
{"date":1387496044,"price":20.20,"amount":0.300000},
]
import sqlite3
db = sqlite3.connect(':memory:')
db.row_factory = sqlite3.Row
db.execute('CREATE TABLE records (date int, price float, amount float)')
db.executemany('INSERT INTO records VALUES (:date, :price, :amount)', records)
sql = 'SELECT date, price, SUM(amount) AS amount FROM records GROUP BY date, price'
records = [dict(row) for row in db.execute(sql)]
print(records)
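For reuse, the same idea could be wrapped in a function. The sketch below is one possible arrangement, not the answer's own code: the name merge_with_sqlite is hypothetical, the connection is closed through contextlib.closing, and an ORDER BY clause is added so the row order is deterministic (sorted by date, then price, rather than the input order).

```python
import sqlite3
from contextlib import closing

def merge_with_sqlite(records):
    """Group records by (date, price) in an in-memory SQLite table and sum amounts."""
    with closing(sqlite3.connect(":memory:")) as db:
        db.row_factory = sqlite3.Row  # rows can be converted to dicts by column name
        db.execute("CREATE TABLE records (date INTEGER, price REAL, amount REAL)")
        db.executemany(
            "INSERT INTO records VALUES (:date, :price, :amount)", records
        )
        sql = (
            "SELECT date, price, SUM(amount) AS amount "
            "FROM records GROUP BY date, price ORDER BY date, price"
        )
        # Build the list before the connection is closed
        return [dict(row) for row in db.execute(sql)]
```

Each input dict needs exactly the keys date, price, and amount for the named placeholders to bind.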