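I want to generate a report that calculates the number of days a material has been in the warehouse.
The number of days is the difference between the date the material came in ($3 field)
and a manually fed date (01 OCT 2014).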
Input.csv
Des11,Material,DateIN,Des22,Des33,MRP,Des44,Des55,Des66,Location,Des77,Des88
aa,xxx,19-AUG-14.08:08:01,cc,dd,x20,ee,ff,gg,XX128,hh,jj
aa,xxx,19-AUG-14.08:08:01,cc,dd,x20,ee,ff,gg,XX128,hh,jj
aa,yyy,13-JUN-14.09:06:08,cc,dd,x20,ee,ff,gg,XX128,hh,jj
aa,yyy,13-JUN-14.09:06:08,cc,dd,x20,ee,ff,gg,XX128,hh,jj
aa,yyy,05-FEB-14.09:02:09,cc,dd,x20,ee,ff,gg,YY250,hh,jj
aa,yyy,05-FEB-14.09:02:09,cc,dd,y35,ee,ff,gg,YY250,hh,jj
aa,zzz,05-FEB-14.09:02:09,cc,dd,y35,ee,ff,gg,YY250,hh,jj
aa,zzz,11-JUN-13.05:06:17,cc,dd,y35,ee,ff,gg,YY250,hh,jj
aa,zzz,11-JUN-13.05:06:17,cc,dd,y35,ee,ff,gg,YY250,hh,jj
aa,zzz,11-JUN-13.05:06:17,cc,dd,y35,ee,ff,gg,YY250,hh,jj
Currently I am using the command below to populate "Ageing - No of days" in field $13 (thanks to gboffi):
awk -F, 'NR>0 {date=$3;
gsub("[-.]"," ",date);
printf $0 ",";system("date --date=\"" date "\" +%s")}
' Input.csv | awk -F, -v OFS=, -v now=`date --date="01 OCT 2014 " +%s` '
NR>0 {$13=now-$13; $13=$13/24/3600;print $0}' >Op_Step11.csv
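When I run the above command in Cygwin (Windows), it takes 50 minutes for 1 lakh (100,000) rows of sample input.
Since my actual input file contains 25 million rows, the script looks like it will take days.
I am looking for your suggestions to improve the command.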
Expected output:
Des11,Material,DateIN,Des22,Des33,MRP,Des44,Des55,Des66,Location,Des77,Des88,Ageing-NoOfDays
aa,xxx,19-AUG-14.08:08:01,cc,dd,x20,ee,ff,gg,XX128,hh,jj,42.6611
aa,xxx,19-AUG-14.08:08:01,cc,dd,x20,ee,ff,gg,XX128,hh,jj,42.6611
aa,yyy,13-JUN-14.09:06:08,cc,dd,x20,ee,ff,gg,XX128,hh,jj,109.621
aa,yyy,13-JUN-14.09:06:08,cc,dd,x20,ee,ff,gg,XX128,hh,jj,109.621
aa,yyy,05-FEB-14.09:02:09,cc,dd,x20,ee,ff,gg,YY250,hh,jj,237.624
aa,yyy,05-FEB-14.09:02:09,cc,dd,y35,ee,ff,gg,YY250,hh,jj,237.624
aa,zzz,05-FEB-14.09:02:09,cc,dd,y35,ee,ff,gg,YY250,hh,jj,237.624
aa,zzz,11-JUN-13.05:06:17,cc,dd,y35,ee,ff,gg,YY250,hh,jj,476.787
aa,zzz,11-JUN-13.05:06:17,cc,dd,y35,ee,ff,gg,YY250,hh,jj,476.787
aa,zzz,11-JUN-13.05:06:17,cc,dd,y35,ee,ff,gg,YY250,hh,jj,476.787
I don't have permission to change the input format, and I don't have access to perl or python.
Update #3:
BEGIN { FS=OFS="," }
{
    t1=$3
    t2="01-OCT-14.00:00:00"
    # age in days = difference in epoch seconds / 86400
    print $0,(cvttime(t2) - cvttime(t1))/24/3600
}
# Convert "DD-MON-YY.HH:MM:SS" to seconds since the epoch.
function cvttime(t, a) {
    split(t,a,"[-.:]")
    match("JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC",a[2])
    a[2] = sprintf("%02d",(RSTART+2)/3)
    return( mktime("20"a[3]" "a[2]" "a[1]" "a[4]" "a[5]" "a[6]) )
}
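Answer 0 (score: 2)
Since you are using Cygwin, you are using GNU awk, which has its own built-in time functions, so you don't need to call the shell date command. Just adapt this old command of mine to suit your input and output format: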
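# Convert "DD/Mon/YYYY:HH:MM:SS" to seconds since the epoch.
function cvttime(t, a) {
    split(t,a,"[/:]")
    # Month name -> two-digit month number via its offset in the lookup string.
    match("JanFebMarAprMayJunJulAugSepOctNovDec",a[2])
    a[2] = sprintf("%02d",(RSTART+2)/3)
    return( mktime(a[3]" "a[2]" "a[1]" "a[4]" "a[5]" "a[6]) )
}
BEGIN {
    t1="01/Dec/2005:00:04:42"
    t2="01/Dec/2005:17:14:12"
    print cvttime(t2) - cvttime(t1)    # elapsed seconds between t1 and t2
}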
It relies on GNU awk for the time functions; see http://www.gnu.org/software/gawk/manual/gawk.html#Time-Functions
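For what it's worth, here is a rough sketch of how that function might be applied to the question's file in a single awk pass. The original pipeline starts one date process per input line through system(), which is very likely where most of the time goes; mktime() avoids that. This sketch also caches the result per distinct DateIN string, since the sample repeats dates a lot. The file names sample.csv and sample_out.csv, the variable ref, and the array age are made up for illustration. TZ=UTC is an assumption to keep daylight-saving shifts out of the arithmetic, and the "20" prefix assumes all years fall in 2000-2099.

```shell
# Small sample in the question's format (header plus three rows).
cat > sample.csv <<'EOF'
Des11,Material,DateIN,Des22,Des33,MRP,Des44,Des55,Des66,Location,Des77,Des88
aa,xxx,19-AUG-14.08:08:01,cc,dd,x20,ee,ff,gg,XX128,hh,jj
aa,xxx,19-AUG-14.08:08:01,cc,dd,x20,ee,ff,gg,XX128,hh,jj
aa,yyy,13-JUN-14.09:06:08,cc,dd,x20,ee,ff,gg,XX128,hh,jj
EOF

# On Cygwin, awk is GNU awk; mktime() is not part of POSIX awk.
# TZ=UTC is an assumption: it keeps mktime() free of daylight-saving shifts.
TZ=UTC awk -v ref="01-OCT-14.00:00:00" '
# Convert DD-MON-YY.HH:MM:SS to epoch seconds; assumes years 2000-2099.
function cvttime(t,    a) {
    split(t, a, "[-.:]")
    match("JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC", toupper(a[2]))
    a[2] = sprintf("%02d", (RSTART + 2) / 3)
    return mktime("20" a[3] " " a[2] " " a[1] " " a[4] " " a[5] " " a[6])
}
BEGIN { FS = OFS = ","; now = cvttime(ref) }
NR == 1 { print $0, "Ageing-NoOfDays"; next }   # header: append the column name
{
    # Convert each distinct DateIN only once; repeated dates hit the cache.
    if (!($3 in age)) age[$3] = (now - cvttime($3)) / 86400
    print $0, age[$3]
}' sample.csv > sample_out.csv

cat sample_out.csv
```

To run it on the real data, replace sample.csv with Input.csv and drop the heredoc. The cache only helps if dates repeat; if nearly every row has a unique timestamp, the age array grows with the input and can simply be removed.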
Answer 1 (score: 0)
Here is an example in Perl:
use feature qw(say);
use strict;
use warnings;
use Text::CSV;
use Time::Piece;
my $csv = Text::CSV->new;
my $te = Time::Piece->strptime('01-OCT-14', '%d-%b-%y');
my $fn = 'Input.csv';
open (my $fh, '<', $fn) or die "Could not open file '$fn': $!\n";
chomp(my $head = <$fh>);
say "$head,Ageing-NoOfDays";
while (my $line = <$fh>) {
    chomp $line;
    if ($csv->parse($line)) {
        my $t = ($csv->fields())[2];
        my $tp = Time::Piece->strptime($t, '%d-%b-%y.%T');
        my $s = $te - $tp;
        say "$line," . $s->days;
    } else {
        warn "Line could not be parsed: $line\n";
    }
}
close($fh);