Question

我有两个文本文件，每个文件都有一个第一列，其中包含覆盖相同时间但速率不同的时间戳（第二个文件包含更多样本）。但是，由于缺乏精确性，我收集的一些数据（位于这些文本文件的其他列中）在相同的时间戳内变化。

例如，这是第二个文件的摘录

Timestamp RobotYaw
1375448725 -2.183439
1375448725 -2.173082
1375448725 -2.169424

和第一个文件：

Timestamp State
1375448725 8
1375448725 9

我想要做的是创建一个输出文件，它是第二个文件的副本（有更多样本），但有一个额外的列指定当前状态（在第一个文件中给出）。然而，在上面的情况中，由于时间戳不够精确，我不知道何时发生从状态8到9的变化，并且我已经选择将前一半样本的时间戳设置为状态8，另一半州9。

我正在努力用Perl做这件事。现在，它将始终为具有该时间戳的所有样本提供最新状态（9）...

这是当前程序的一个框架，只保留了最小值（这应该打印第二个文件中的所有时间戳以及当前状态）：

my $state = 1;

while ( my $ligne = <SECONDFILE> )
{
    my ( $timestamp , $junk ) = split ( /\s/ , $ligne ) ;

    while ( my $checkline = <FIRSTFILE> )
    {
        my ( $stateTimestamp , $stateJunk ) = split ( /\s/ , $checkline ) ;
        if ( int($timestamp) < int($stateTimestamp) )
        {
            last;
        }
        else
        {
            $state = $stateJunk;
        }
    }
    print OUTPUT $timestamp;
    print OUTPUT " ";
    print OUTPUT $state;
    print OUTPUT "\n";
}

您是否知道如何在一个时间戳内轻松处理这种任意状态变化？由于很难解释，感谢您的阅读，如果您有任何疑问，我很乐意进一步解释......

Answer 1

要知道要打印哪个样本的偏航时间戳，我们需要将每个文件读入一个哈希值。我们可以计算有多少偏航时间戳和多少样本时间戳，以便我们可以计算出要使用的样本数。这应该有用：

    my %first_file;
    my %second_file;

    ### Store in a hash the unique timestamp|state values in order. 
    while ( my $checkline = <FIRSTFILE> ) {
        my ( $stateTimestamp, $statenum ) = split /\s/, $checkline;
        $first_file{$stateTimestamp}{'number'}++;
        my $count = $first_file{$stateTimestamp}{'number'};
        $first_file{$stateTimestamp}{$count} = $statenum;
    }

    ### Store timestamp and yaw values
    while ( my $ligne = <SECONDFILE> ) {
        my ( $timestamp , $robotyaw ) = split /\s/, $ligne;
        $second_file{$timestamp}{'number'}++;
        $second_file{$timestamp}{$robotyaw} = $second_file{$timestamp}{'number'};
    }

    ### MATH TIME IS FUN TIME. Don't forget to sort those hashes!
    foreach ( sort { $a <=> $b } keys %second_file ) {
        my $tstamp = $_;
        my $valhash = $second_file{$tstamp};
        my $change_count = round_up($second_file{$tstamp}{'number'} / $first_file{$tstamp}{'number'});
        foreach ( sort { ${$valhash}{$a} <=> ${$valhash}{$b} } keys %{$valhash} ) {
            my $yaw = $_;
            if ($yaw eq 'number') { next; }
            my $state_num = round_up(${$valhash}{$yaw} / $change_count);
            print OUTPUT "$tstamp $yaw ", $first_file{$tstamp}{$state_num}, "\n";
        }
    }    

    ### We could have used POSIX's ceil here instead  
    sub round_up {
        my $n = shift;
        return(($n == int($n)) ? $n : int($n + 1));
    }

您还应该习惯使用3参数open filehandle ::

open my $firstfile, '<', 'first_file.txt' or die "Can't open first file: $!";
open my $output, '>', 'output_file.txt' or die "Can't open output file: $!";

通过这种方式，它们的范围很合适。

文本文件处理时间戳不够精确

1 个答案: