Question

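I have some working code, but it's a bit of a bottleneck and I can't figure out how to speed it up. It's in a loop, and I can't see how to vectorize it.

I have a 2D array, vals, that represents time-series data. Rows are dates, columns are different series. I'm trying to bucket the data by month so I can perform various operations on it (sum, mean, etc.). Here is my current code: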

allDts; %Dates/times for vals.  Size is [size(vals, 1), 1]
vals;
[Y M] = datevec(allDts);
fomDates = unique(datenum(Y, M, 1)); %first of the month dates

[Y M] = datevec(fomDates);
nextFomDates = datenum(Y, M, DateUtil.monthLength(Y, M)+1);

newVals = nan(length(fomDates), size(vals, 2)); %preallocate for speed

for k = 1:length(fomDates);

The next line is the bottleneck, because it runs on every pass through the loop.

    idx = (allDts >= fomDates(k)) & (allDts < nextFomDates(k));
    bucketed = vals(idx, :);
    newVals(k, :) = nansum(bucketed);
end %for

Any ideas? Thanks in advance.

Answer 1

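This is a difficult problem to vectorize. I can suggest a way to do it using CELLFUN, but I can't guarantee that it will be faster for your problem (you will have to time it yourself on the specific data sets you are using). As discussed in this other SO question, vectorizing is not always faster than a for loop; which option is best can be very problem-specific. With that disclaimer, I'll suggest two solutions for you to try: a CELLFUN version, and a modification of your for-loop version that may run faster.

CELLFUN SOLUTION: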

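[Y,M] = datevec(allDts);
monthStart = datenum(Y,M,1);  % Start date of each month
[monthStart,sortIndex] = sort(monthStart);  % Sort the start dates
% 'last' makes uniqueIndex point at the final row of each month; newer MATLAB
% releases return the first occurrence by default, which would break the diff
[uniqueStarts,uniqueIndex] = unique(monthStart,'last');  % Get unique start dates

% uniqueIndex(:) forces a column so the concatenation works in any release
valCell = mat2cell(vals(sortIndex,:),diff([0; uniqueIndex(:)]));
% Sum along dimension 1 so a month with a single row is not collapsed to a scalar
newVals = cellfun(@(x) nansum(x,1),valCell,'UniformOutput',false);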

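The call to MAT2CELL groups the rows of vals that share the same start date into the cells of the cell array valCell. The variable newVals will be a cell array of length numel(uniqueStarts), where each cell contains the result of applying nansum to the corresponding cell of valCell.

FOR-LOOP SOLUTION: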
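Because the result is a cell array, it usually has to be stacked back into a numeric matrix before further use. The following is a minimal sketch of the whole pipeline on made-up toy data (the dates, the values, and the names lastIndex, rowsPerMonth, sumCell and monthlySums are illustrative assumptions, not part of the question). It assumes nansum is available, as in the question, and requests the 'last' occurrence from unique explicitly so the row counts come out right regardless of the release default.

```matlab
% Toy data (made up): 5 dates spanning two months, 2 series.
allDts = datenum([2010 1 5; 2010 1 20; 2010 2 3; 2010 2 10; 2010 2 28]);
vals   = [1 10; 2 NaN; 3 30; 4 40; NaN 50];

[Y, M] = datevec(allDts);
monthStart = datenum(Y, M, 1);                    % first-of-month for each row
[monthStart, sortIndex] = sort(monthStart);       % group equal months together
[uniqueStarts, lastIndex] = unique(monthStart, 'last');  % last row of each month

rowsPerMonth = diff([0; lastIndex(:)]);           % number of rows in each bucket
valCell = mat2cell(vals(sortIndex, :), rowsPerMonth);

% Sum down the rows of each bucket, ignoring NaNs; dim 1 keeps one value per series.
sumCell = cellfun(@(x) nansum(x, 1), valCell, 'UniformOutput', false);

% Stack the 1-by-nSeries results into an nMonths-by-nSeries matrix.
monthlySums = cell2mat(sumCell);
```

For this toy input, monthlySums is [3 10; 7 120]: one row per month, one column per series.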

[Y,M] = datevec(allDts);
monthStart = datenum(Y,M,1);  % Start date of each month
[monthStart,sortIndex] = sort(monthStart);  % Sort the start dates
% 'last' gives the index of the final row of each month (newer MATLAB
% releases default to the first occurrence)
[uniqueStarts,uniqueIndex] = unique(monthStart,'last');  % Get unique start dates

vals = vals(sortIndex,:);  % Sort the values according to start date
nMonths = numel(uniqueStarts);
uniqueIndex = [0; uniqueIndex(:)];  % Prepend 0; (:) forces a column vector
newVals = nan(nMonths,size(vals,2));  % Preallocate
for iMonth = 1:nMonths
  % Rows of the sorted data that belong to this month
  index = (uniqueIndex(iMonth)+1):uniqueIndex(iMonth+1);
  newVals(iMonth,:) = nansum(vals(index,:),1);  % dim 1 handles one-row months
end
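As a point of comparison with the two versions above, here is a hedged sketch of a different technique that neither answer uses: grouping with ACCUMARRAY. It relies on the third output of unique to label each row with its month bucket, so no sorting or loop is needed. The toy data and the names bucket, cleanVals, rowIdx and colIdx are assumptions made for illustration, and NaNs are zeroed beforehand to imitate nansum. Whether this is actually faster would still have to be timed on the real data.

```matlab
% Toy data (made up): dates deliberately unsorted, 2 series.
allDts = datenum([2010 2 3; 2010 1 5; 2010 2 10; 2010 1 20; 2010 2 28]);
vals   = [3 30; 1 10; 4 40; 2 NaN; NaN 50];

[Y, M] = datevec(allDts);
% bucket(i) is the index into uniqueStarts of the month that row i falls in.
[uniqueStarts, ~, bucket] = unique(datenum(Y, M, 1));

cleanVals = vals;
cleanVals(isnan(cleanVals)) = 0;      % NaNs contribute nothing to a sum

nMonths = numel(uniqueStarts);
nSeries = size(vals, 2);

% Build (month, series) subscripts for every element of vals.
[rowIdx, colIdx] = ndgrid(bucket, 1:nSeries);

% accumarray sums all values that share the same (month, series) subscript.
newVals = accumarray([rowIdx(:), colIdx(:)], cleanVals(:), [nMonths, nSeries]);
```

On this input newVals is [3 10; 7 120]. A per-month mean could be formed along the same lines by dividing by a second accumarray call that counts the non-NaN entries.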

Answer 2

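If all you need to do is form a sum or mean over the rows of a matrix, where the rows are grouped according to another variable (the date), then use my consolidator function. It is designed to do exactly this operation, reducing data based on the values of an indicator series. (Actually, consolidator can also work on n-d data, and with a tolerance, but all you need to do is pass it the month and year information.)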

Find consolidator on the File Exchange on MATLAB Central.

Bucketing algorithm
