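I am trying to remove duplicates from a list before writing it to a JSON file. I have marked the lines that implement this with comments and added extra print statements for debugging. According to my debugging, the code never reaches the print statements and never writes the JSON file. The error is in the trendingBot() function. When the de-duplication lines are commented out, the duplicates are written to the JSON file.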
Code that writes the JSON file with duplicate entries:
convertToJson(quote_name, quote_price, quote_volume, url)

quotesArr = []

# Convert to a JSON file
def convertToJson(quote_name, quote_price, quote_volume, url):
    quoteObject = {
        "url": url,
        "Name": quote_name,
        "Price": quote_price,
        "Volume": quote_volume
    }
    quotesArr.append(quoteObject)

def trendingBot(url, browser):
    browser.get(url)
    trending = getTrendingQuotes(browser)
    for trend in trending:
        getStockDetails(trend, browser)
    # requests finished, write json to file
    # REMOVE ANY DUPLICATE url from the list, then write json to file.
    quotesArr_dict = {quote['url']: quote for quote in quotesArr}
    # print(quotesArr_dict)
    quotesArr = list(quotesArr_dict.values())
    # print(quotesArr)
    with open('trendingQuoteData.json', 'w') as outfile:
        json.dump(quotesArr, outfile)
Answer 0 (score: 2)
If you just want to remove duplicates from a list, you can do it like this:
import json

firstlist = [
    {
        "url": "https://web.tmxmoney.com/quote.php?qm_symbol=ACB&locale=EN",
        "Volume": "Volume:\n12,915,903",
        "Price": "$ 7.67",
        "Name": "Aurora Cannabis Inc."
    },
    {
        "url": "https://web.tmxmoney.com/quote.php?qm_symbol=HNL&locale=EN",
        "Volume": "Volume:\n548,038",
        "Price": "$ 1.60",
        "Name": "Horizon North Logistics Inc."
    },
    {
        "url": "https://web.tmxmoney.com/quote.php?qm_symbol=ACB&locale=EN",
        "Volume": "Volume:\n12,915,903",
        "Price": "$ 7.67",
        "Name": "Aurora Cannabis Inc."
    }
]

# Keep only the first occurrence of each dictionary
newlist = []
for i in firstlist:
    if i not in newlist:
        newlist.append(i)

json.dumps(newlist)
>>>[{"url": "https://web.tmxmoney.com/quote.php?qm_symbol=ACB&locale=EN", "Volume": "Volume:\n12,915,903", "Price": "$ 7.67", "Name": "Aurora Cannabis Inc."}, {"url": "https://web.tmxmoney.com/quote.php?qm_symbol=HNL&locale=EN", "Volume": "Volume:\n548,038", "Price": "$ 1.60", "Name": "Horizon North Logistics Inc."}]
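I used json.dumps to show you the resulting string, but it works just as well if you use json.dump to write it to a file. I tested that too; it just doesn't return anything nice to show here.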
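The `not in` check in this answer compares whole dictionaries, so it only drops entries whose fields are all identical, and it rescans `newlist` for every item. If the URL alone identifies a quote, as the comment in the question suggests, a sketch along the following lines tracks the URLs already seen in a set. The helper name `dedupe_by_url` is an illustrative assumption, and the sample data is shortened from the list above.

```python
import json

def dedupe_by_url(quotes):
    """Return a new list keeping the first quote seen for each URL."""
    seen_urls = set()
    unique = []
    for quote in quotes:
        if quote["url"] not in seen_urls:
            seen_urls.add(quote["url"])
            unique.append(quote)
    return unique

# Shortened sample data (only two of the four fields, for illustration).
quotes = [
    {"url": "https://web.tmxmoney.com/quote.php?qm_symbol=ACB&locale=EN", "Name": "Aurora Cannabis Inc."},
    {"url": "https://web.tmxmoney.com/quote.php?qm_symbol=HNL&locale=EN", "Name": "Horizon North Logistics Inc."},
    {"url": "https://web.tmxmoney.com/quote.php?qm_symbol=ACB&locale=EN", "Name": "Aurora Cannabis Inc."},
]

unique_quotes = dedupe_by_url(quotes)
print(json.dumps(unique_quotes, indent=2))
```

Because the helper returns a new list rather than assigning to `quotesArr`, it can be called inside `trendingBot()` without rebinding the module-level name. In the question's code, the assignment `quotesArr = list(...)` inside the function makes `quotesArr` a local name, so reading it on the preceding line would likely raise `UnboundLocalError`, which may be why the print statements are never reached.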
Answer 1 (score: 1)
I would try using an actual loop instead of the dictionary comprehension:
quote_dict = dict()
for quote in quotesArr:
    url = quote['url']
    if url not in quote_dict:
        quote_dict[url] = quote  # Only add if url is not already in dict

with open('trendingQuoteData.json', 'w') as outfile:
    json.dump(list(quote_dict.values()), outfile)
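Rather than using dictionaries, I would create a Quote class that implements at least equality comparison, so that you can determine when two quotes are equal.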
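One way to read this suggestion is a small class whose equality is defined by the URL. The sketch below assumes that Python's `__eq__` and `__hash__` methods are what is meant; the class layout, the `to_dict` helper, and the example.com URLs are hypothetical, with field names mirroring the dictionary keys in the question.

```python
class Quote:
    """Hypothetical quote record; two quotes are equal when their URLs match."""

    def __init__(self, url, name, price, volume):
        self.url = url
        self.name = name
        self.price = price
        self.volume = volume

    def __eq__(self, other):
        if not isinstance(other, Quote):
            return NotImplemented
        return self.url == other.url

    def __hash__(self):
        # The hash must agree with __eq__ so instances work in sets and as dict keys.
        return hash(self.url)

    def to_dict(self):
        # Same key names as the dictionaries built in the question.
        return {"url": self.url, "Name": self.name,
                "Price": self.price, "Volume": self.volume}

# Placeholder data for illustration only.
quotes = [
    Quote("https://example.com/a", "A Corp", "$ 1.00", "100"),
    Quote("https://example.com/b", "B Corp", "$ 2.00", "200"),
    Quote("https://example.com/a", "A Corp", "$ 1.00", "100"),
]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
unique = list(dict.fromkeys(quotes))
print([q.to_dict() for q in unique])
```

With equality and hashing defined on the class, de-duplication no longer depends on how the quotes are stored, and `to_dict()` gives `json.dump` something it can serialize.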
Answer 2 (score: 0)
The simplest way is to convert it to a set and then convert it back to a list:
mylist = [1,2,3,1,2,3]
mylist2 = list(set(mylist))
print(mylist)
print(mylist2)
This will be the output:
[1, 2, 3, 1, 2, 3]
[1, 2, 3]
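Two caveats apply when carrying the set approach over to the question: a set does not promise to keep the original order, and dictionaries are unhashable, so `set(quotesArr)` would raise `TypeError`. A minimal sketch of two workarounds follows, assuming Python 3.7 or later (where dicts preserve insertion order) and dictionaries whose values are all hashable; the `records` data is made up for illustration.

```python
# Order-preserving de-duplication for hashable items.
mylist = [3, 1, 2, 3, 1]
ordered_unique = list(dict.fromkeys(mylist))
print(ordered_unique)  # [3, 1, 2]

# For a list of dicts, build a hashable key from each dict's items.
records = [
    {"url": "a", "Name": "A"},
    {"url": "b", "Name": "B"},
    {"url": "a", "Name": "A"},
]
unique_records = list({tuple(sorted(r.items())): r for r in records}.values())
print(unique_records)
```

The tuple key treats two dictionaries as duplicates only when every field matches; keying on `r["url"]` instead would de-duplicate by URL alone.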