1     5     100       
2     10    100      
3     5     300    
4     15    200     
5     5     500    
6     15    0    
7     10    400    
8     5     300   
9     10    200    
10    10    0    
11    5     300
12    10    100
13    15    1000    
...   ...   ...
T     ...   ...


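Row 1: at time t = 1 a new order arrives; its price is 5, and the total number of units demanded at this price is 100.

Row 2: at time t = 2 ... the order price is 10; the total demand at price 10 is 100.

Summary: at time t = 2, price 5 has 100 units demanded and price 10 has 100 units.

Row 3: at time t = 3 ... the order price is 5, with an additional 200 units, so the total number of units demanded at price 5 is 300.

Summary: at t = 3, price 5 has 300 units and price 10 has 100 units.

Row 4: t = 4 ... the order price is 15, with 200 units; the total demand at price 15 is 200.

Summary: at t = 4, price 5 has 300 units, price 10 has 100, and price 15 has 200.

Row 5: t = 5 ... the order price is 5, with an additional 200 units, so the total demand at price 5 is 500.

Summary: at t = 5, price 5 has 500 units, price 10 has 100, and price 15 has 200.

Row 6: t = 6, the price is 15, but the third column shows 0 units, meaning the order was cancelled and there is no demand at price 15.

Summary: at t = 6, price 5 has 500 units and price 10 has 100 units.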
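
The running summaries can be reproduced by replaying the orders one at a time. The following is only a minimal sketch on the first six rows of the table. The names `priceLevels` and `book` are made up for illustration, and it assumes that a units value of 0 always means a cancellation.

```matlab
% First six sample rows: [time, price, units]
Data = [1 5 100; 2 10 100; 3 5 300; 4 15 200; 5 5 500; 6 15 0];

priceLevels = unique(Data(:,2)).';     % distinct prices as a row vector
book = NaN(1, numel(priceLevels));     % current units per price; NaN = no demand

for t = 1:size(Data,1)
    col = find(priceLevels == Data(t,2));
    if Data(t,3) == 0
        book(col) = NaN;               % 0 units: the order was cancelled
    else
        book(col) = Data(t,3);         % the units column already holds the new total
    end
    fprintf('t = %d: %s\n', Data(t,1), mat2str(book));
end
```

Each printed line corresponds to one summary; after the last row, `book` holds the state at t = 6.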

I want to put the data into the following two Tx3 matrices, where each row corresponds to one of the "Summary:" lines above:

           [Price=5][Price = 10][Price = 15]
[Time = 1]       5      NaN      NaN
[Time = 2]       5      10       NaN
[Time = 3]       5      10       NaN
[Time = 4]       5      10       15
[Time = 5]       5      10       15
[Time = 6]       5      10       NaN
[Time = 7]       5      10       NaN
[Time = 8]       5      10       NaN
[Time = 9]       5      10       NaN
[Time = 10]      5      NaN      NaN 
[Time = 11]      5      NaN      NaN 
[Time = 12]      5      10       NaN
[Time = 13]      5      10       15
      ...       ...     ...      ...
[Time = T]      ...     ...      ...

           [Price=5][Price = 10][Price = 15]
[Time = 1]       100      NaN       NaN
[Time = 2]       100      100       NaN
[Time = 3]       300      100       NaN
[Time = 4]       300      100       200
[Time = 5]       500      100       200
[Time = 6]       500      100       NaN
[Time = 7]       500      400       NaN
[Time = 8]       300      400       NaN
[Time = 9]       300      200       NaN
[Time = 10]      300      NaN       NaN 
[Time = 11]      300      NaN       NaN 
[Time = 12]      300      100       NaN
[Time = 13]      300      100       1000
      ...        ...      ...       ...
[Time = T]       ...      ...       ...

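In short, the two matrices above let me look up the "price" and "units" at any point in time. Note that each price can have gaps that start whenever its units drop to 0. For example, price = 15 first appears at t = 4 and lasts only two periods, t = 4 and t = 5 (the order is cancelled at t = 6), and then reappears at t = 13.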



Data=sortrows(Data, [2 1]);
[~,~, IndexPrice]=unique(Data(:,2));

Data=                      IndexPrice=

1     5     100                   1
3     5     300                   1
5     5     500                   1
8     5     300                   1 
11    5     300                   1
2     10    100                   2
7     10    400                   2
9     10    200                   2
10    10    0                     2
12    10    100                   2
4     15    200                   3
6     15    0                     3
13    15    1000                  3
...   ...   ...                  ...
T     ...   ...                  ...


OutputPrice=NaN(size(Data,1), max(IndexPrice));             %Preallocate matrix
for j=1:max(IndexPrice)                                     %Go column-wise
    TempData=Data(IndexPrice==j,:);                         %Submatrix for unique "price"
    for i=1:size(TempData,1)
        if TempData(i,3)~=0                                 %Check for discontinuity (0 in col 3)
            OutputPrice(TempData(i,1):end,j)=TempData(1,2); %Fill with values
        else
            OutputPrice(TempData(i,1):end,j)=NaN;           %If there is 0 fill with NaNs
        end
    end
end

OutputUnits=NaN(size(Data,1), max(IndexPrice)); 
for j=1:max(IndexPrice)
    TempData=Data(IndexPrice==j,:);                         %Submatrix for unique "price"
    for i=1:size(TempData,1)
        if TempData(i,3)~=0
            OutputUnits(TempData(i,1):end,j)=TempData(i,3); %The "units" change in contrast to the "prices"
        else
            OutputUnits(TempData(i,1):end,j)=NaN;           %Cancelled order: no demand from here on
        end
    end
end

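The key concern is, of course, the performance of the code; it feels like a "brute-force" way of solving the problem. I would appreciate any suggestions for a more efficient approach.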

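I don't think this version is any clearer than yours, but it is log-linear rather than quadratic, so it should show a performance improvement on large data sets. The idea is to build, for each price, a vector with as many rows as Data that tells, for every entry, where this price was last ordered. That is row posOfDemands(idxLastDemand(hasLastDemand)). (By the way, this is essentially the answer to one of your earlier questions.) For the example of price==5, this produces the vector [1 1 3 3 5 5 5 8 8 8 11 11 11]. Using this vector we fetch the last demands / prices; where they are zero, they have to be replaced with NaN: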

%%// Rename the variables
prices = Data(:,2);
demands = Data(:,3);
%%// Find number of different prices
uniquePrices = unique(prices);
nUniquePrices = length(uniquePrices);
nData = size(prices,1);
[OutputUnits, OutputPrices] = deal(zeros(nData,nUniquePrices));
%%// For each price do:
for i = 1:nUniquePrices
    %%// Find positions of all demands
    posOfDemands = find(prices==uniquePrices(i));
    idxLastDemand = cumsum(prices==uniquePrices(i));
    hasLastDemand = idxLastDemand~=0;
    %%// Get the values of the last demands/prices
    OutputUnits(hasLastDemand,i) = demands(posOfDemands(idxLastDemand(hasLastDemand)));
    OutputPrices(hasLastDemand,i) = prices(posOfDemands(idxLastDemand(hasLastDemand)));
end
%%// Convert 0s to NaNs
OutputPrices(OutputUnits == 0) = NaN;
OutputUnits(OutputUnits == 0) = NaN;
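
To see where the index vector for price==5 comes from, here is a small sketch on the price column of the 13 sample rows. It assumes the rows are in their original time order (not the output of sortrows), and `lastRow` is just an illustrative name.

```matlab
% Price column of the 13 sample rows, in time order
prices = [5 10 5 15 5 15 10 5 10 10 5 10 15].';

isPrice5      = (prices == 5);
posOfDemands  = find(isPrice5);       % rows that carry an order at price 5
idxLastDemand = cumsum(isPrice5);     % number of price-5 orders seen so far
hasLastDemand = idxLastDemand ~= 0;   % false until the first order at this price

% Row of the most recent price-5 order, for every row of the data
lastRow = zeros(size(prices));
lastRow(hasLastDemand) = posOfDemands(idxLastDemand(hasLastDemand));
disp(lastRow.');
```

Price 5 is ordered in the very first row, so the mask is all true here. For a price whose first order comes later, such as 15, the leading entries of idxLastDemand are 0 and cannot be used as indices, which is why the hasLastDemand mask is needed.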



prices = Data(:,2);
demands = Data(:,3);
uniquePrices = unique(prices);
nUniquePrices = length(uniquePrices);
%%// Introduce leading demands of value 0 to get the zeros in the beginning
isDemanded = [true(1,nUniquePrices); bsxfun(@eq, prices, uniquePrices.')];
demands = [0; demands];
%%// Find positions of all demands
[rowOfDemands,ignore_] = find(isDemanded);
idxLastDemand = reshape(cumsum(isDemanded(:)),[],nUniquePrices);
%%// Get the values of the last demands/prices
OutputUnits = demands(rowOfDemands(idxLastDemand(2:end,:)));
OutputUnits(OutputUnits == 0) = NaN;
OutputPrices = ones(size(OutputUnits,1),1)*uniquePrices(:).';
OutputPrices(isnan(OutputUnits)) = NaN;
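
The leading row of true values acts as a sentinel: every price has a dummy "order" of 0 units before the first real row, so no hasLastDemand mask is needed. Below is a single-column sketch of that trick, using the first six sample rows and price 15 only. The names `is15`, `dem`, `rowOf15`, `idx` and `units15` are illustrative and do not appear in the code above.

```matlab
% First six sample rows, in time order
prices  = [5 10 5 15 5 15].';
demands = [100 100 300 200 500 0].';

is15 = [true; prices == 15];     % sentinel entry on top of the real rows
dem  = [0; demands];             % sentinel demand of 0 units

rowOf15 = find(is15);            % positions of the sentinel and of real orders at price 15
idx     = cumsum(is15);          % never 0, thanks to the sentinel

% Drop the sentinel row again and look up the last demand for every time step
units15 = dem(rowOf15(idx(2:end)));
units15(units15 == 0) = NaN;     % 0 units means no demand
disp(units15.');
```

The result has one entry per original row and should agree with the Price = 15 column of the units matrix for t = 1 to 6.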