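I've run into a problem I need help with. I have a nested list in which the inner lists hold information about a stock. The data has no entries for weekends, and I need to fill those gaps by approximation, as follows:
If the volume/price on a given day is x and the next available data point is y, the approximation for the day after x is (x+y)/2. This is repeated, each time using the newly filled value as x, until all gaps are filled.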
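To make the rule concrete, here is a minimal sketch of it applied to a single gap. The function name `fill_gap` and its parameters are hypothetical and only serve as illustration; the numbers are the volumes for days 4 and 7 from the data in this question.

```python
def fill_gap(x, y, n_missing):
    """Approximate n_missing values between x and the next known value y.

    Each new value is the mean of the previous (possibly approximated)
    value and y, which is the rule described above.
    """
    filled = []
    previous = x
    for _ in range(n_missing):
        previous = (previous + y) / 2
        filled.append(previous)
    return filled


# Volume on day 4 is 9442000 and on day 7 is 16683600; days 5 and 6 are missing.
print(fill_gap(9442000, 16683600, 2))  # → [13062800.0, 14873200.0]
```

Note that the values move toward y step by step rather than being spaced evenly, so this is not the same as linear interpolation.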
My container list:
stocks = [['Stock', 'Date', 'Volume', 'Price'], ['GME', 1, 6218300, 184.5], ['GME', 2, 4768300, 177.970001], ['GME', 3, 10047400, 170.259995], ['GME', 4, 9442000, 158.360001], ['GME', 7, 16683600, 141.089996], ['GME', 8, 6806900, 140.990005], ['GME', 9, 21138100, 166.529999], ['GME', 10, 7856800, 156.440002], ['GME', 11, 5139700, 154.690002], ['GME', 14, 10520200, 164.369995], ['GME', 15, 4658600, 158.529999]]
The gaps in my data are the following dates:
missing_values = [5, 6, 12, 13]
An example of the output I want:
updated_stocks = [['Stock', 'Date', 'Volume', 'Price'], ... ['GME', 4, 9442000, 158.360001], ['GME', 5, 13062800.0, 149.72499850000003], ['GME', 6, ...]
I have made an attempt so far, but it failed.
Any input would be appreciated!
Answer 0: (score: 1)
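I added an extra step that converts the nested list into a list of dicts, because I find the values easier to access that way. I use a new function called "get_closest_stock", which sorts the existing stocks by how close their date is to a given date and returns the closest one.
I created a repl where you can see the solution: https://replit.com/@beesperester/FuchsiaVariableTelecommunication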
def get_closest_stock(stock_date, stock_dicts):
    # sort stocks by distance to the given date and return the closest one
    sorted_stock_dicts = sorted(stock_dicts, key=lambda x: abs(stock_date - x["Date"]))
    return sorted_stock_dicts[0]
stocks = [
    ['Stock', 'Date', 'Volume', 'Price'],
    ['GME', 1, 6218300, 184.5],
    ['GME', 2, 4768300, 177.970001],
    ['GME', 3, 10047400, 170.259995],
    ['GME', 4, 9442000, 158.360001],
    ['GME', 7, 16683600, 141.089996],
    ['GME', 8, 6806900, 140.990005],
    ['GME', 9, 21138100, 166.529999],
    ['GME', 10, 7856800, 156.440002],
    ['GME', 11, 5139700, 154.690002],
    ['GME', 14, 10520200, 164.369995],
    ['GME', 15, 4658600, 158.529999]]
missing_values = [5, 6, 12, 13]
# pop first list from nested list to remove the column names
columns = stocks.pop(0)
# convert nested list to list of dicts for easier value access
stock_dicts = []
for stock in stocks:
    stock_dict = dict(zip(columns, stock))
    stock_dicts.append(stock_dict)
# create averaged entries for the missing dates
for missing_value in missing_values:
    # get previous stock from stock dicts
    previous_stock = get_closest_stock(missing_value - 1, stock_dicts)
    # get next stock from stock dicts
    next_stock = get_closest_stock(missing_value + 1, stock_dicts)
    # create new dict with averaged values from previous_stock
    # and next_stock
    average_stock = {
        **previous_stock,
        "Date": missing_value,
        "Volume": (
            (previous_stock["Volume"] + next_stock["Volume"]) / 2.0
        ),
        "Price": (
            (previous_stock["Price"] + next_stock["Price"]) / 2.0
        )
    }
    # append averaged stock so that it can serve as the previous
    # stock for the next missing date
    stock_dicts.append(average_stock)
# sort stocks by date
stock_dicts_sorted = sorted(stock_dicts, key=lambda x: x["Date"])
# convert back to nested lists to match desired output
stocks_including_average_dates = [columns] + [list(x.values()) for x in stock_dicts_sorted]
print(stocks_including_average_dates)
The output is as follows:
[['Stock', 'Date', 'Volume', 'Price'], ['GME', 1, 6218300, 184.5], ['GME', 2, 4768300, 177.970001], ['GME', 3, 10047400, 170.259995], ['GME', 4, 9442000, 158.360001], ['GME', 5, 13062800.0, 149.72499850000003], ['GME', 6, 14873200.0, 145.40749725], ['GME', 7, 16683600, 141.089996], ['GME', 8, 6806900, 140.990005], ['GME', 9, 21138100, 166.529999], ['GME', 10, 7856800, 156.440002], ['GME', 11, 5139700, 154.690002], ['GME', 12, 7829950.0, 159.52999849999998], ['GME', 13, 9175075.0, 161.94999674999997], ['GME', 14, 10520200, 164.369995], ['GME', 15, 4658600, 158.529999]]
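As a possible variation, the gaps could also be detected from the dates themselves instead of being listed by hand. The sketch below is not the code from the repl above. It assumes the rows are already sorted by date, that dates are consecutive integers, and that all rows belong to one stock. The function name `fill_missing_days` is hypothetical.

```python
def fill_missing_days(rows):
    """Fill date gaps in rows of [stock, date, volume, price] (no header row).

    Assumes rows are sorted by integer date and describe a single stock.
    A missing day gets the mean of the previous row (possibly itself
    approximated) and the next available row.
    """
    filled = [list(rows[0])]
    for next_row in rows[1:]:
        previous = filled[-1]
        # keep inserting days until the next known date directly follows
        while next_row[1] - previous[1] > 1:
            previous = [
                previous[0],
                previous[1] + 1,
                (previous[2] + next_row[2]) / 2,
                (previous[3] + next_row[3]) / 2,
            ]
            filled.append(previous)
        filled.append(list(next_row))
    return filled


sample = [['GME', 4, 9442000, 158.360001], ['GME', 7, 16683600, 141.089996]]
result = fill_missing_days(sample)
print([row[1] for row in result])  # → [4, 5, 6, 7]
print([row[2] for row in result])  # → [9442000, 13062800.0, 14873200.0, 16683600]
```

With the header row put back in front, this should produce the same nested-list layout as the output above, without needing the separate missing_values list.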