Question

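I've run into a problem I need help with. I have a nested list structure in which the inner lists contain information about a stock. The data is missing for weekends, and I need to fill these gaps by approximation. The approximation works as follows:

If the volume/price on a given day is x and the next available data point is y, then the approximation for the day after x is (x+y)/2. That value then takes the place of x for the following day, and this repeats until all gaps are filled.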
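To make the rule concrete, here is the arithmetic for the first gap, using the day 4 and day 7 prices from the question's data. The variable names are only illustrative, and this sketch assumes the reading of the rule in which each filled value becomes the new starting point for the next missing day.

```python
# Worked example of the approximation rule for the gap between day 4 and day 7.
# Variable names are illustrative only.
x = 158.360001  # price on day 4 (last known value before the gap)
y = 141.089996  # price on day 7 (next available data point)

day_5 = (x + y) / 2      # first missing day: average of x and y
day_6 = (day_5 + y) / 2  # second missing day: average of the filled value and y

print(day_5, day_6)
```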

My container list:

stocks = [['Stock', 'Date', 'Volume', 'Price'], ['GME', 1, 6218300, 184.5], ['GME', 2, 4768300, 177.970001], ['GME', 3, 10047400, 170.259995], ['GME', 4, 9442000, 158.360001], ['GME', 7, 16683600, 141.089996], ['GME', 8, 6806900, 140.990005], ['GME', 9, 21138100, 166.529999], ['GME', 10, 7856800, 156.440002], ['GME', 11, 5139700, 154.690002], ['GME', 14, 10520200, 164.369995], ['GME', 15, 4658600, 158.529999]]

The gaps in my data are the following dates:

missing_values = [5, 6, 12, 13]

An example of my desired output:

updated_stocks = [['Stock', 'Date', 'Volume', 'Price'], ... ['GME', 4, 9442000, 158.360001], ['GME', 5, 13062800.0, 149.72499850000003], ['GME', 6,  ...]

What I have tried so far, without success:

tab2 <- cbind(data.frame(mAs=c(as.numeric(levels(tabelle$Treatment)))),
              data.frame(Area=c(tabelle$Area)),
              data.frame(StdErr=c(tabelle$StdErr)),
              data.frame(CILower=c(tabelle$CILower)),
              data.frame(CIUpper=c(tabelle$CIUpper)))

Any input would be appreciated!

Answer 1

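I added an extra step that converts the nested list into a list of dicts, because I find the values easier to access that way. I use a new function called "get_closest_stock", which sorts the existing stocks by how close their date is to a given date and returns the closest one.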

I created a Repl where you can check out the solution: https://replit.com/@beesperester/FuchsiaVariableTelecommunication

def get_closest_stock(stock_date, stock_dicts):
    sorted_stock_dicts = sorted(stock_dicts, key=lambda x: abs(stock_date - x["Date"]))

    return sorted_stock_dicts[0]


stocks = [
    ['Stock', 'Date', 'Volume', 'Price'],
    ['GME', 1, 6218300, 184.5],
    ['GME', 2, 4768300, 177.970001],
    ['GME', 3, 10047400, 170.259995],
    ['GME', 4, 9442000, 158.360001],
    ['GME', 7, 16683600, 141.089996],
    ['GME', 8, 6806900, 140.990005],
    ['GME', 9, 21138100, 166.529999],
    ['GME', 10, 7856800, 156.440002],
    ['GME', 11, 5139700, 154.690002],
    ['GME', 14, 10520200, 164.369995],
    ['GME', 15, 4658600, 158.529999]]

missing_values = [5, 6, 12, 13]

# pop first list from nested list to remove the column names
columns = stocks.pop(0)

# convert nested list to list of dicts for easier value access
stock_dicts = []

for stock in stocks:
    stock_dict = dict(zip(columns, stock))

    stock_dicts.append(stock_dict)

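# create averaged entries for the missing dates; missing_values must be in
# ascending order so that a filled day can act as "previous" for the next one
for missing_value in missing_values: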
    # get previous stock from stock dicts
    previous_stock = get_closest_stock(missing_value - 1, stock_dicts)

    # get next stock from stock dicts
    next_stock = get_closest_stock(missing_value + 1, stock_dicts)

    # create new dict with averaged values from previous_stock
    # and next stock
    average_stock = {
        **previous_stock,
        "Date": missing_value,
        "Volume": (
            (previous_stock["Volume"] + next_stock["Volume"]) / 2.0
        ),
        "Price": (
            (previous_stock["Price"] + next_stock["Price"]) / 2.0
        )
    }

    # append averaged stock
    stock_dicts.append(average_stock)

# sort stocks by date
stock_dicts_sorted = sorted(stock_dicts, key=lambda x: x["Date"])

# convert back to nested lists to match desired output
stocks_including_average_dates = [columns] + [list(x.values()) for x in stock_dicts_sorted]

print(stocks_including_average_dates)

The output looks like this:

[['Stock', 'Date', 'Volume', 'Price'], ['GME', 1, 6218300, 184.5], ['GME', 2, 4768300, 177.970001], ['GME', 3, 10047400, 170.259995], ['GME', 4, 9442000, 158.360001], ['GME', 5, 13062800.0, 149.72499850000003], ['GME', 6, 14873200.0, 145.40749725], ['GME', 7, 16683600, 141.089996], ['GME', 8, 6806900, 140.990005], ['GME', 9, 21138100, 166.529999], ['GME', 10, 7856800, 156.440002], ['GME', 11, 5139700, 154.690002], ['GME', 12, 7829950.0, 159.52999849999998], ['GME', 13, 9175075.0, 161.94999674999997], ['GME', 14, 10520200, 164.369995], ['GME', 15, 4658600, 158.529999]]
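If you would rather stay with nested lists and not maintain missing_values by hand, a single pass over the rows can detect the gaps itself. This is only a sketch of an alternative, not the code from the Repl: the function name fill_gaps is made up here, and it assumes a single stock, a header in the first row, rows sorted by date, and integer dates.

```python
def fill_gaps(rows):
    """Return a new nested list with missing dates filled in.

    Assumes rows[0] is the header and the remaining rows are sorted by date,
    each shaped like [stock, date, volume, price] with integer dates.
    """
    header, data = rows[0], rows[1:]
    if not data:
        return [header]

    filled = [list(data[0])]
    for nxt in data[1:]:
        prev = filled[-1]
        # Insert one averaged row per missing date between prev and nxt.
        # Each new row becomes "prev" for the next missing date.
        while prev[1] + 1 < nxt[1]:
            prev = [
                prev[0],
                prev[1] + 1,
                (prev[2] + nxt[2]) / 2,
                (prev[3] + nxt[3]) / 2,
            ]
            filled.append(prev)
        filled.append(list(nxt))
    return [header] + filled


# Small excerpt of the data from the question (days 4, 7 and 8 only).
sample = [
    ['Stock', 'Date', 'Volume', 'Price'],
    ['GME', 4, 9442000, 158.360001],
    ['GME', 7, 16683600, 141.089996],
    ['GME', 8, 6806900, 140.990005],
]

result = fill_gaps(sample)
for row in result:
    print(row)
```

Because the function builds a new list, the input is left untouched, and gaps longer than two days are handled by the same loop.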

Iterating over a nested list structure and approximating missing values
