我有一个产品列表,其中包含许多具有id,image_url等属性的对象。如下所示:
total_products
[{u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQCG1ObwtCgqxZIk&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F1000000.png&cfs=1&_nc_hash=AQAPdo31zo9WJk8j', u'id': u'1539966686030963', u'retailer_id': u'product-1000000'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQDyc-Yyic5QLOqH&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F0.png&cfs=1&_nc_hash=AQDhmhPJxFZEpMFX', u'id': u'993388404100117', u'retailer_id': u'product-0'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQAwTzrzAjdKFjmB&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F1000.png&cfs=1&_nc_hash=AQCMMJRJ_r7QB06I', u'id': u'642820939176165', u'retailer_id': u'product-1000'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQBHdbRqB7F6aMKM&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F1.png&cfs=1&_nc_hash=AQDx7P52g0NYBB-3', u'id': u'1411912028843607', u'retailer_id': u'product-1'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQB7aSPmk_j21umz&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F100000.png&cfs=1&_nc_hash=AQAPV5oe_ymaAcXr', u'id': u'942522339181104', u'retailer_id': u'product-100000'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQB69V2cgASUIci1&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F100.png&cfs=1&_nc_hash=AQAk3eZ4vqWYbOW4', u'id': u'1347112758661660', u'retailer_id': u'product-100'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQD44rjEUMk6Yp2H&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F1000001.png&cfs=1&_nc_hash=AQBT_0iB417B08ux', u'id': u'1354204821311003', u'retailer_id': u'product-1000001'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQB4ucqXEbo2DyC7&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F1000002.png&cfs=1&_nc_hash=AQAQ2vuj0WmuXSqw', u'id': u'1776841739008769', u'retailer_id': u'product-1000002'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQBM75VZTNuxqaoq&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F10.png&cfs=1&_nc_hash=AQAUdkc6II5eu47D', u'id': u'1358784964179738', u'retailer_id': u'product-10'}, {u'image_url': u'https://external.xx.fbcdn.net/safe_image.php?d=AQAY0kmVnHXBbhHe&url=http%3A%2F%2Fgigya.jp%2Fdpa%2F10000.png&cfs=1&l&_nc_hash=AQCT1PHl5h1Rhc5r', u'id': u'1337513966312571', u'retailer_id': u'product-10000'}]
我正在阅读包含以下数据的csv文件; -
正如您所看到的,csv_file id
和retailer_id
中的ID对于某些产品是相同的 -
所以如果image_link
和csv file
匹配,我想更改retailer_id
中的id
。
这样做我会逐行阅读csv文件并循环浏览total_products
中的所有产品,如果找到任何匹配项,则更改image_link
代码:
def update_csv(file):
print file
reader = csv.DictReader(open(file))
out_file_name = str(file).replace(".csv", "")
writer = csv.DictWriter(open(out_file_name+"_updated.csv","wb"),fieldnames=reader.fieldnames)
writer.writeheader()
for current_row in reader:
for product in total_products:
retailer_id = product['retailer_id']
if(current_row['id']==retailer_id):
current_row['image_link']= "RajSharma"
print "Match = "+str(retailer_id)+" in "+file
break
writer.writerow(current_row)
这种方法的问题是,如果total_products
包含超过1000-10,000,则运行时间过长。
有没有办法在retailer_id
中找到total_products
,如果有,请更改image_link
?
答案 0 :(得分:3)
首先,从total_products
:
product_ids = set([product['retailer_id'] for product in total_products])
然后,检查current_row['id']
是否在集合中:
for current_row in reader:
if current_row['id'] in product_ids:
current_row['image_link'] = 'RajSharma'
设置搜索速度要快得多,我们只需要一个唯一的产品ID列表进行检查。
使用if current_row['image_link'] in product_ids
利用底层C代码进行循环,优化对集合中值的检查。