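I have two files, and for each line of one file I have to search the other file for the corresponding information. The first file (category 1) contains line-delimited JSON objects, each containing the reviewer, the product ASIN, the rating, and a timestamp (as shown below):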
{"overall": 5.0, "verified": true, "reviewTime": "01 5, 2016", "reviewerID": "A2V0JXLJ9VCNNX", "asin": "B00570RQ0A", "reviewerName": "Amazon Customer", "reviewText": "washer washing", "summary": "Five Stars", "unixReviewTime": 1451952000}
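The other file is the metadata. It also contains line-delimited JSON objects, each containing the product description, image links, and other product information (as shown below):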
{"category": ["Appliances", "Parts & Accessories", "Refrigerator Parts & Accessories"], "description": ["Little Giant ACS-2 Auxiliary Condensate Drain Pan Overflow Shut-off Switch, 48 VAC/VDC, 18\" Leads (599122)Auxiliary condensate switch installs in the condensate drain pan of an air conditioning or refrigeration unit, to turn off the unit if the drain pan approaches overflowLittle Giant ACS-2 Auxiliary Condensate Drain Pan Overflow Shut-off Switch, 48 VAC/VDC, 18\" Leads (599122) Features: ABS housing Polyethylene float 48VAC/VDC max 5 amps max Low voltage 18\" lead wiresLittle Giant ACS-2 Auxiliary Condensate Drain Pan Overflow Shut-off Switch, 48 VAC/VDC, 18\" Leads (599122) Specification: Cord Length: 18\" Shut Off: 0 Voltage: 48 VAC/DC Amps: 5 Weight: 0.21 Height: 3 Width: 1.34 Length: 4.55.", "Little Giant ACS-2 Auxiliary Condensate Drain Pan Overflow Shut-off Switch, 48 VAC/VDC, 18\" Leads (599122)Auxiliary condensate switch installs in the condensate drain pan of an air conditioning or refrigeration unit, to turn off the unit if the drain pan approaches overflowLittle Giant ACS-2 Auxiliary Condensate Drain Pan Overflow Shut-off Switch, 48 VAC/VDC, 18\" Leads (599122) Features: ABS housing Polyethylene float 48VAC/VDC max 5 amps max Low voltage 18\" lead wiresLittle Giant ACS-2 Auxiliary Condensate Drain Pan Overflow Shut-off Switch, 48 VAC/VDC, 18\" Leads (599122) Specification: Cord Length: 18\" Shut Off: 0 Voltage: 48 VAC/DC Amps: 5 Weight: 0.21 Height: 3 Width: 1.34 Length: 4.55"], "fit": "", "title": "Little Giant 599122 ACS-2 Float Switch with 18-Inch Lead, 1-Pack", "also_buy": [], "image": ["https://images-na.ssl-images-amazon.com/images/I/413p5bagSJL._SS40_.jpg"], "tech2": "", "brand": "Little Giant", "feature": ["Little Giant", "ABS housing, polyethylene float", "72\" lead wires"], "rank": [">#128,783 in Tools & Home Improvement (See top 100)", ">#2,384 in Tools & Home Improvement > Appliances > Large Appliance Accessories > Refrigerator Parts & Accessories"], "also_view": ["B000JGH2TM", "B0026WSD4A", "B000SM342Q", "B003QK4KUM", "B005D4RFEM", "B00DK85P9A", "B000FK9W0E", "B013K33QQI", "B079NQ1532", "B004496WNW", "B01N19NQLN", "B000AHT78O"], "details": {}, "main_cat": "Tools & Home Improvement", "similar_item": "", "date": "October 4, 2007", "price": "$19.65", "asin": "B000WQZFFW"}
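Because both files hold one JSON object per line, each line can be parsed on its own with `json.loads`. The following is only a minimal sketch under that assumption: `iter_json_lines` is a hypothetical helper name (it is not part of the code in this post), and the sample lines are abbreviated versions of the records shown above.

```python
import json

def iter_json_lines(lines):
    """Yield one dict per non-empty line of line-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)

# Abbreviated sample records (most fields omitted for brevity).
sample = [
    '{"overall": 5.0, "asin": "B00570RQ0A", "reviewerID": "A2V0JXLJ9VCNNX"}',
    '',
    '{"asin": "B000WQZFFW", "price": "$19.65"}',
]
records = list(iter_json_lines(sample))
```

The same generator accepts an open file object, since iterating over a file yields its lines one at a time.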
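Basically, for each item on each line of the first file, I search the metadata to retrieve the price and product description for that item. Currently I do this with a nested for loop, as shown below. Is there a better way to optimize my code?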
import re

def get_ecoList(category1, metaCat):
    # items, prices and ecoList are globals defined elsewhere
    global meta, prices
    meta = read_metadata(metaCat)
    with open(category1, 'r') as y:
        data = y.readlines()
    tempArr = []
    idx = 0
    for line in data:
        metaFlag = 0
        currLine = line.split(',')
        i = currLine[0]
        # skip items that were already processed
        if i in tempArr:
            continue
        tempArr.append(i)
        items[i] = 0
        prices[idx] = '$1.0' #change to prices[idx] = avg_prices[i]
        # inner loop: scan the whole metadata list for this item
        for k in meta:
            if 'asin' in k and k['asin'] == i:
                metaFlag = 1
                if 'price' in k:
                    if len(k['price']) > 0:
                        prices[idx] = k['price']
                if 'description' in k:
                    if len(k['description']) > 1:
                        k['description'] = ''.join(k['description'])
                        ecoList.append(k['description'])
                    elif len(k['description']) == 0:
                        ecoList.append('N/A')
                    else:
                        ecoList.append(k['description'][0])
                else:
                    ecoList.append('N/A')
                break
        idx += 1
        if metaFlag == 0:
            ecoList.append('N/A')
    prices = [float(re.findall(r"\d+\.\d+", x)[0]) for x in prices if x != 0]
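One direction that might help is to replace the inner scan over the metadata with a dictionary keyed by `asin`, built once, so that each lookup is a constant-time hash access instead of a pass over the whole list. The following is only a sketch under assumptions: the metadata has already been parsed into a list of dicts, and `build_index`, `lookup` and `parse_price` are hypothetical helper names that are not part of the code above. The `'$1.0'` and `'N/A'` fallbacks mirror the defaults used in the loop.

```python
import re

def build_index(meta_records):
    """Map each ASIN to its metadata record (first occurrence wins)."""
    index = {}
    for rec in meta_records:
        asin = rec.get("asin")
        if asin is not None and asin not in index:
            index[asin] = rec
    return index

def lookup(index, asin, default_price="$1.0"):
    """Return (price, description) for an ASIN, with fallbacks."""
    rec = index.get(asin)
    if rec is None:
        return default_price, "N/A"
    price = rec.get("price") or default_price
    desc = rec.get("description")
    if not desc:
        description = "N/A"
    elif isinstance(desc, list):
        description = "".join(desc)
    else:
        description = str(desc)
    return price, description

def parse_price(text):
    """Extract the first decimal number from a price string, or None."""
    match = re.search(r"\d+\.\d+", text)
    return float(match.group()) if match else None

# Made-up metadata records, for illustration only.
meta = [
    {"asin": "B000WQZFFW", "price": "$19.65",
     "description": ["Float switch. ", "ABS housing."]},
    {"asin": "B00570RQ0A", "description": []},
]
index = build_index(meta)
found = lookup(index, "B000WQZFFW")
empty = lookup(index, "B00570RQ0A")
missing = lookup(index, "NOT_IN_META")
```

Building the index costs one pass over the metadata. After that, the total work grows roughly with the sum of the two file sizes rather than their product. Tracking already-seen items in a `set` instead of a list would similarly avoid the linear `in` check on `tempArr`.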