How to perform an outer join with finite precision

Asked: 2016-10-18 16:47:56

Tags: matlab timestamp outer-join

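I want to join tables on a time value. Because the timestamps differ slightly between the tables, I would like to specify an absolute threshold: any two timestamps whose difference is below it should be treated as the same.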

I have added an MWE to illustrate what I mean:

PreciseJoin = 

        Time        A     B 
    ____________    _    ___

    1476369169.1    1      1
    1476369169.2    2    NaN
    1476369169.3    3      3
    1476369169.4    4      4
    1476369169.5    5      5


ErrorJoin = 

          Time           A      B 
    ________________    ___    ___

        1476369169.1      1    NaN
     1476369169.1095    NaN      1
        1476369169.2      2    NaN
        1476369169.3      3    NaN
    1476369169.30034    NaN      3
        1476369169.4      4    NaN
    1476369169.40439    NaN      4
        1476369169.5      5    NaN
    1476369169.50382    NaN      5


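I would like the second table to look like the first one, even though there are small differences in the time column. Is this possible?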

1 Answer:

Answer 0 (score: 1)

If you have R2016b, this is an ideal task for the new timetable method synchronize.

tbase = seconds([1476369169.1, 1476369169.2, 1476369169.3, 1476369169.4, 1476369169.5]);
t1 = tbase + seconds(rand(size(tbase)) / 100);
t2 = tbase + seconds(rand(size(tbase)) / 100);

TimetableA = timetable((1:5)', 'VariableNames', {'A'}, 'RowTimes', t1);
TimetableB = timetable((1:5)', 'VariableNames', {'B'}, 'RowTimes', t2);

combined = synchronize(TimetableA, TimetableB, tbase, 'nearest')

Result:

>> combined
combined = 
          Time          A    B
    ________________    _    _
    1476369169.1 sec    1    1
    1476369169.2 sec    2    2
    1476369169.3 sec    3    3
    1476369169.4 sec    4    4
    1476369169.5 sec    5    5

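Ah, following the comments, I realise I missed the "missing values" part of the question. In practice, this means that ismembertol is probably a better fit, and it also gives a solution compatible with R2015a. Here is a slight extension of the problem as originally posed: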

% Use a somewhat extended "base" time-scale
tbase = 1476369169 + (0:0.1:1)';

% Add noise to t1 and t2, selecting different fundamental
% elements from 'tbase'
t1 = tbase(1:7) + (rand(size(tbase(1:7))) / 100);
t2 = tbase(2:2:end) + (rand(size(tbase(2:2:end))) / 100);

% Work out which elements of t1 and t2 are members of tbase, within
% tolerance of 0.01. Use DataScale == 1 for absolute tolerance.
% In each case, the '_lia' output tells us whether each element of
% the time vector is present in 'tbase'; and '_locB' tells us where
% in 'tbase' each element exists (or 0 if the corresponding element
% of '_lia' is false).
[t1_lia, t1_locB] = ismembertol(t1, tbase, 0.01, 'DataScale', 1);
[t2_lia, t2_locB] = ismembertol(t2, tbase, 0.01, 'DataScale', 1);

% Build tables that we can join together.
TA = table((1:numel(t1))', t1_locB, t1, 'VariableNames', {'A', 'locB', 'time'})
TB = table((1:numel(t2))', t2_locB, t2, 'VariableNames', {'B', 'locB', 'time'})

% Filter TA and TB to contain only rows which match 'tbase'
TA = TA(t1_lia, :);
TB = TB(t2_lia, :);

% Join these by location in the common time-base
TAB = outerjoin(TA, TB, 'Keys', {'locB'}, 'MergeKeys', true);
TAB.time = tbase(TAB.locB);
% Don't need the 'locB' variable in this table
TAB.locB = [];
TAB

For me, this produces the following output for TAB:

TAB = 
     A         time_TA          B         time_TB             time    
    ___    ________________    ___    ________________    ____________
      1    1476369169.00123    NaN                 NaN      1476369169
      2    1476369169.10184      1    1476369169.10491    1476369169.1
      3     1476369169.2024    NaN                 NaN    1476369169.2
      4    1476369169.30417      2    1476369169.30489    1476369169.3
      5     1476369169.4005    NaN                 NaN    1476369169.4
      6    1476369169.50903      3    1476369169.50338    1476369169.5
      7    1476369169.60945    NaN                 NaN    1476369169.6
    NaN                 NaN      4      1476369169.709    1476369169.7
    NaN                 NaN      5    1476369169.90369    1476369169.9

Note that I have kept the actual times for A and B here.
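The same idea can be applied to the tables in the question when no separate base time-scale is available. The sketch below is not part of the original answer: it assumes that the timestamps of table A can serve as the reference, that a tolerance of 0.01 seconds is acceptable, and that the data look like the values shown in ErrorJoin. The variable names (tA, tB, tol, joined) are made up for illustration.

```matlab
% Hypothetical reconstruction of the question's data: A has "clean"
% timestamps, B has slightly offset ones and no sample near ...169.2.
tA = 1476369169 + (0.1:0.1:0.5)';
tB = [1476369169.1095; 1476369169.30034; 1476369169.40439; 1476369169.50382];
TA = table(tA, (1:5)', 'VariableNames', {'Time', 'A'});
TB = table(tB, [1; 3; 4; 5], 'VariableNames', {'Time', 'B'});

% Absolute tolerance in seconds (an assumed value, adjust as needed).
tol = 0.01;

% For each time in TB, look for a time in TA within 'tol'.
% DataScale == 1 makes the tolerance absolute, as in the answer above.
[lia, loc] = ismembertol(TB.Time, TA.Time, tol, 'DataScale', 1);

% Snap the matched TB times onto the TA times. Rows of TB without a
% match keep their own timestamp and end up as separate rows.
TB.Time(lia) = TA.Time(loc(lia));

% With identical keys, an ordinary outer join lines the rows up.
joined = outerjoin(TA, TB, 'Keys', 'Time', 'MergeKeys', true)
```

With these inputs the joined table should have five rows, with NaN in B only for the row where A is 2, which is the layout of PreciseJoin. This only works cleanly when the tolerance is much smaller than the spacing between reference timestamps; otherwise one timestamp in B could match more than one timestamp in A.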