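It has been a long day and night, and I could use some help; I can't think straight about why this isn't working. This script opens a .csv file and checks whether a certain word is contained in the title at row[4]. It iterates over a dictionary whose values are lists, and if a word from a list is in the title, it uses the matching dictionary key as the category to write to a new csv file.
What I get is Uncategorized for every row, which should not be right, because the titles do contain words from the lists.
Thanks for your help.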
Example of the CSV:
BrandsMart USA,http://www.BrandsMartUSA.com,BrandsMart Product Catalog,01:30.8,Mini Bass King Jr Bluetooth Portable Outdoor Speaker,2Boom BX320K,2Boom BX320K Mini Bass King Jr Bluetooth Portable Outdoor Speaker,2BOOMBX320K,2Boom,BX320K,7.08192E+11,,USD,,19.88,19.88,,http://www.example.com,http://www.ftjcfx.com/image-8299186-11133500,https://www.brandsmartusa.com/images/product/large/20225377.jpg,Bluetooth & Docking Speakers,,,,,,,,,,,,,,,,,,,,,,
BrandsMart USA,http://www.BrandsMartUSA.com,BrandsMart Product Catalog,01:30.8,Mini Bass King Jr Bluetooth Portable Outdoor Speaker,2Boom BX320R,2Boom BX320R Mini Bass King Jr Bluetooth Portable Outdoor Speaker,2BOOMBX320R,2Boom,BX320R,7.08192E+11,,USD,,19.88,19.88,,http://www.example.com,http://www.ftjcfx.com/image-8299186-11133500,https://www.brandsmartusa.com/images/product/large/20225393.jpg,Bluetooth & Docking Speakers,,,,,,,,,,,,,,,,,,,,,,
BrandsMart USA,http://www.BrandsMartUSA.com,BrandsMart Product Catalog,01:30.8,Bluetooth Noise-Reducing Earbud Headphones,2Boom EPBT690B,2Boom EPBT690B Bluetooth Noise-Reducing Earbud Headphones,2BOOMEPBT690B,2Boom,EPBT690B,7.44751E+11,,USD,,9.88,9.88,,http://www.example.com,http://www.ftjcfx.com/image-8299186-11133500,https://www.brandsmartusa.com/images/product/large/20225398.jpg,Earbuds,,,,,,,,,,,,,,,,yes,,,
Using Python 3
import os, csv, time
csv_path = os.path.dirname(os.path.abspath(__file__))
row_list = []
# Appliances
category_list = {"Appliance Accessories": ['Air Conditioner Accessories', 'Air Purifier Accessories', "Coffee Maker Accessories",
                                           'Dishwasher Accessories', 'Food Processor Accessories', 'Heater Accessories',
                                           'Humidifier Accessories', 'Humidifier Accessories', 'Mixer Accessories',
                                           'Range & Oven Accessories', 'Range Hood Accessories', 'Refrigerator Accessories',
                                           'Vacuum Accessories', 'Washer & Dryer Accessories'],
                 "Electronics Accessories": ['cables & adapters', 'audio accessories', 'video accessories', 'camcorder accessories',
                                             'cell phone accessories', 'clock radios', 'Digital Book Reader Accessories',
                                             'Digital Picture Frames', 'Electronics Cases & Bags', 'GPS Accessories',
                                             'Projector Accessories', 'Telephone Accessories', 'batteries', 'battery'],
                 "Photography": ['Camcorders', 'Cameras', 'Digital Camera Accessories', 'Digital Cameras', 'Camera', 'Digital Camera',
                                 'Photography', 'Darkroom']}
category = ""
with open(csv_path + "/pre.csv") as f:
    reader = csv.reader(f)
    for row in reader:
        for k, val in category_list.items():
            for v in val:
                if v.lower() in row[4].lower():
                    category = k
                else:
                    category = "Uncategorized"
        new_row = [str(row[0]),  # company
                   str(row[1]),  # company url
                   str(row[4]),  # product name
                   str(row[5]),  # keywords
                   str(row[6]),  # description
                   str(row[7]),  # sku
                   str(row[8]),  # manufacturer
                   str(row[13]),  # saleprice
                   str(row[14]),  # price
                   str(row[15]),  # retailprice
                   str(row[17]),  # buy_link
                   str(row[19]),  # product_image_url
                   str(row[31]),  # promotional_text
                   str(row[36]),  # stock
                   str(row[37]),  # condition
                   str(row[38]),  # warranty
                   str(row[39]),  # shipping_cost
                   category,
                   ]
        row_list.append(new_row)
f.close()
with open(csv_path + "/final.csv", 'w') as ff:
    writer = csv.writer(ff)
    writer.writerows(row_list)
ff.close()
Answer 0 (score: 3)
Well, this block only stores the result for the last value in val:
for v in val:
    if v.lower() in row[4].lower():
        category = k
    else:
        category = "Uncategorized"
In the end it only compares against val[-1], because you overwrite category on every iteration.
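To see the overwrite in isolation, here is a minimal sketch that runs the same loop over a cut-down dictionary. The function name `categorize_buggy`, the shortened `category_list`, and the sample title are made up for illustration; only the loop body comes from the question.

```python
# Cut-down version of category_list from the question (illustrative subset).
category_list = {
    "Electronics Accessories": ["batteries", "battery"],
    "Photography": ["Cameras", "Camera", "Darkroom"],
}


def categorize_buggy(title):
    """Reproduce the question's loop: category is reassigned for every keyword."""
    category = ""
    for k, val in category_list.items():
        for v in val:
            if v.lower() in title.lower():
                category = k
            else:
                # Runs for every non-matching keyword, wiping out earlier matches.
                category = "Uncategorized"
    return category


# "camera" matches, but "Darkroom" is checked afterwards and resets the result.
print(categorize_buggy("Compact Digital Camera"))
```

Because the outer loop keeps going as well, the value that survives is decided by the last keyword of the last list in the dictionary.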
You might want to break out of the loop once a category is found, or do something with this value on each iteration?
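One way to stop at the first match could look like the sketch below. Note that a plain `break` would only leave the inner loop while the outer loop over the categories keeps running, so this version wraps the search in a helper and returns instead. The name `find_category`, its `default` parameter, and the sample data are assumptions for illustration, not part of the original script.

```python
def find_category(title, category_list, default="Uncategorized"):
    """Return the first category with a keyword found in title, else default."""
    title = title.lower()
    for category, keywords in category_list.items():
        for keyword in keywords:
            if keyword.lower() in title:
                # Returning here leaves both loops at once, so later
                # keywords cannot overwrite the match.
                return category
    return default


# Illustrative subset of the question's dictionary.
sample_categories = {
    "Electronics Accessories": ["batteries", "battery"],
    "Photography": ["Cameras", "Camera", "Darkroom"],
}

print(find_category("Compact Digital Camera", sample_categories))
print(find_category("Bluetooth Speaker", sample_categories))
```

In the question's script this would replace the two nested loops with something like `category = find_category(row[4], category_list)`.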
Answer 1 (score: 0)
OK, so I took a little bit from everyone above to fit my needs. Thanks to everyone who helped.
csv_row_map = [0,  # company
               1,  # company url
               4,  # product name
               5,  # keywords
               6,  # description
               7,  # sku
               8,  # manufacturer
               13,  # saleprice
               14,  # price
               15,  # retailprice
               17,  # buy_link
               19,  # product_image_url
               31,  # promotional_text
               36,  # stock
               37,  # condition
               38,  # warranty
               39,  # shipping_cost
               ]
# Invert category_list: lowercase keyword -> category name.
product_to_category_index = {}
for category, products in category_list.items():
    product_to_category_index.update(((product.lower(), category) for product in products))
with open(csv_path + '/pre.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        for k, v in product_to_category_index.items():
            if k in row[4].lower():
                category = v
                break
        else:
            # for/else: runs only when no keyword matched (no break happened).
            category = "Uncategorized"
        #category = product_to_category_index.get(row[4].lower(), "Uncategorized")
        new_row = [row[i] for i in csv_row_map]
        new_row.append(category)
        row_list.append(new_row)
# newline='' keeps the csv module from writing blank lines between rows on Windows.
with open(csv_path + "/final.csv", 'w', newline='') as ff:
    writer = csv.writer(ff)
    writer.writerows(row_list)
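For anyone who wants to try the idea without the real files, here is a self-contained sketch of the same approach that reads and writes in memory. The shortened `category_list`, the `sample_csv` rows, and the names `keyword_to_category` and `rows_out` are assumptions for illustration. It also suggests why the commented-out `.get()` line would not be enough: a dictionary lookup only matches when the whole title equals a keyword, while the loop does a substring test.

```python
import csv
import io

# Illustrative subset of category_list from the question.
category_list = {
    "Electronics Accessories": ["batteries", "battery"],
    "Photography": ["Cameras", "Camera"],
}

# Invert the mapping: lowercase keyword -> category.
keyword_to_category = {
    keyword.lower(): category
    for category, keywords in category_list.items()
    for keyword in keywords
}

# Hypothetical input rows; the product name sits in column 4 as in the question.
sample_csv = (
    "Shop,http://www.example.com,Catalog,01:30.8,AA Battery 4-Pack\n"
    "Shop,http://www.example.com,Catalog,01:30.8,Bluetooth Speaker\n"
)

rows_out = []
for row in csv.reader(io.StringIO(sample_csv)):
    title = row[4].lower()
    for keyword, category in keyword_to_category.items():
        if keyword in title:
            break  # keep the category of the first matching keyword
    else:
        # Reached only when the loop finished without a break.
        category = "Uncategorized"
    rows_out.append([row[0], row[4], category])

# Write to an in-memory buffer instead of final.csv.
out = io.StringIO()
csv.writer(out).writerows(rows_out)
print(out.getvalue())
```

Since the first matching keyword wins, the result for a title that matches keywords from several categories depends on the order of the dictionary.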