I archive hourly data in this format:
2015-09-03 02:00:00 to 2015-09-03 02:59:59|ABC|673
2015-09-03 02:00:00 to 2015-09-03 02:59:59|AABC|52
2015-09-03 02:00:00 to 2015-09-03 02:59:59|ABCD|787
2015-09-03 02:00:00 to 2015-09-03 02:59:59|ADFGE|35
2015-09-03 02:00:00 to 2015-09-03 02:59:59|AGER|41
2015-09-03 02:00:00 to 2015-09-03 02:59:59|ETECFF|1384
2015-09-03 02:00:00 to 2015-09-03 02:59:59|TRIFD|38
2015-09-03 02:00:00 to 2015-09-03 02:59:59|CVGFFHG|166
2015-09-03 03:00:00 to 2015-09-03 03:59:59|FJREER|36
2015-09-03 03:00:00 to 2015-09-03 03:59:59|DFSD|31
2015-09-03 03:00:00 to 2015-09-03 03:59:59|ASBF|38
2015-09-03 03:00:00 to 2015-09-03 03:59:59|ABC|36
2015-09-03 03:00:00 to 2015-09-03 03:59:59|AABC|35
2015-09-03 03:00:00 to 2015-09-03 03:59:59|ABCD|33
2015-09-03 03:00:00 to 2015-09-03 03:59:59|ADFGE|39
2015-09-03 03:00:00 to 2015-09-03 03:59:59|AGER|33
2015-09-03 03:00:00 to 2015-09-03 03:59:59|ETECFF|537
2015-09-03 03:00:00 to 2015-09-03 03:59:59|TRIFD|620635
2015-09-03 03:00:00 to 2015-09-03 03:59:59|ABC|37
2015-09-03 03:00:00 to 2015-09-03 03:59:59|AABC|702
2015-09-03 03:00:00 to 2015-09-03 03:59:59|ABCD|319
2015-09-03 03:00:00 to 2015-09-03 03:59:59|ADFGE|33
2015-09-03 03:00:00 to 2015-09-03 03:59:59|AGER|306
2015-09-03 03:00:00 to 2015-09-03 03:59:59|ETECFF|34
2015-09-03 03:00:00 to 2015-09-03 03:59:59|TRIFD|44
2015-09-03 03:00:00 to 2015-09-03 03:59:59|CVGFFHG|599
2015-09-03 03:00:00 to 2015-09-03 03:59:59|FJREER|30
2015-09-03 03:00:00 to 2015-09-03 03:59:59|DFSD|82
I want to transpose the data as follows:
1. Column 1 should go in as column header
2. Column 2 should go in row header
3. Column 3 is data
4. Any absence of data should be represented as 0 (Zero)
Here is how the transposed data should look:
|2015-09-03 02:00:00 to 2015-09-03 02:59:59|2015-09-03 03:00:00 to 2015-09-03 03:59:59
AABC|52|737
ABC|673|73
ABCD|787|352
ADFGE|35|72
AGER|41|339
ASBF|0|38
CVGFFHG|166|599
DFSD|0|113
ETECFF|1384|571
FJREER|0|66
TRIFD|38|620679
I tried using sed, but that did not work. I am not yet at an advanced level, so I need help.
Answer 0 (score: 1)
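Here is an awk solution. In the 2D array values it accumulates the sum of all rows that share the same key (key) and the same header column index (i).
In the END block, every key is printed with one value per column.
The array cols detects when a new header column appears.
hdrs keeps the headers in the correct order for output.
keys only holds the list of all keys.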
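awk -F'|' '
{
    hdr = $1; key = $2; val = $3
    # First time this header is seen: assign it the next column index
    if (!(hdr in cols)) {
        cols[hdr] = ++column
        hdrs[column] = hdr
    }
    i = cols[hdr]
    keys[key] = 1
    # Sum repeated (column, key) entries
    values[i, key] += val
}
END {
    # Header row: empty first cell, then one cell per time range
    for (i = 1; i <= column; i++)
        printf "|%s", hdrs[i]
    printf "\n"
    # asorti() is a GNU awk (gawk) extension; it sorts the keys into sort[]
    n = asorti(keys, sort)
    for (j = 1; j <= n; j++) {
        key = sort[j]
        printf "%s", key
        # Adding 0 prints a missing cell as 0
        for (i = 1; i <= column; i++)
            printf "|%s", values[i, key] + 0
        printf "\n"
    }
}'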
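Because asorti() exists only in GNU awk, the script above fails in other awk implementations. The following is a minimal sketch of the same idea that avoids asorti() by sorting the key names with a small insertion sort. The function name transpose_sorted and the array names names and seen are assumptions made for this illustration, not part of the answer.

```shell
# Hypothetical helper: reads "range|key|count" lines on stdin and
# prints the transposed table with keys sorted as strings.
transpose_sorted() {
    awk -F'|' '
    {
        # Assign a column index to each new time range, in input order
        if (!($1 in cols)) { cols[$1] = ++ncol; hdrs[ncol] = $1 }
        # Remember each key once so it can be sorted later
        if (!($2 in seen)) { seen[$2] = 1; names[++nkey] = $2 }
        # Sum repeated entries for the same (column, key) pair
        values[cols[$1], $2] += $3
    }
    END {
        # Insertion sort of the key names; appending "" forces string comparison
        for (a = 2; a <= nkey; a++) {
            tmp = names[a]
            for (b = a - 1; b >= 1 && (names[b] "") > (tmp ""); b--)
                names[b + 1] = names[b]
            names[b + 1] = tmp
        }
        for (i = 1; i <= ncol; i++) printf "|%s", hdrs[i]
        printf "\n"
        for (j = 1; j <= nkey; j++) {
            printf "%s", names[j]
            # Adding 0 prints a missing cell as 0
            for (i = 1; i <= ncol; i++) printf "|%s", values[i, names[j]] + 0
            printf "\n"
        }
    }'
}
```

It could be called as transpose_sorted < YourFile. The insertion sort is quadratic in the number of distinct keys, which should be acceptable for a handful of keys like the ones in the question.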
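Answer 1 (score: 0)
I think that in awk you can create an array indexed by strings, that is, a dictionary with column 1 as the key.
Each element of that array should hold another string-indexed array, with column 2 as the key.
Then process each line, creating new array elements where necessary, and add column 3 to the stored value.
For help with array syntax in awk:
http://www.thegeekstuff.com/2010/03/awk-arrays-explained-with-5-practical-examples/
See example 1 in section 5 for how simple the final solution can be.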
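The accumulation step described above can be sketched as follows. Portable awk has no true nested arrays, so this sketch emulates the two-level dictionary with a composite subscript sum[$1, $2], which awk joins internally with SUBSEP (GNU awk 4.0 and later also accepts real nested arrays written as sum[$1][$2]). The function name sum_cells and the array names are assumptions made for this illustration.

```shell
# Hypothetical helper: reads "range|key|count" lines on stdin and
# prints one "range|key|total" line per distinct (range, key) pair.
sum_cells() {
    awk -F'|' '
    # A new element is created automatically the first time a pair is seen
    { sum[$1, $2] += $3 }
    END {
        for (k in sum) {
            # Split the composite subscript back into its two keys
            split(k, part, SUBSEP)
            print part[1] "|" part[2] "|" sum[k]
        }
    }'
}
```

The order of a for (k in sum) loop is unspecified, so the output can be piped through sort when a stable order is needed, for example sum_cells < YourFile | sort. This only produces the totals; laying them out as a table still needs the header and key bookkeeping shown in the other answers.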
Answer 2 (score: 0)
Another awk solution:
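awk -F '|' '
{
    # Sum the values per (header, record) pair; duplicates are added together
    Data[$1, $2] += $3
    # Remember each header and record name once, in order of first appearance
    if (index("|" Headers, "|" $1 "|") == 0) Headers = Headers $1 "|"
    if (index("|" Records, "|" $2 "|") == 0) Records = Records $2 "|"
}
END {
    # Strip the trailing separator before splitting to avoid an empty last element
    sub(/\|$/, "", Headers)
    sub(/\|$/, "", Records)
    cHeader = split(Headers, aHeader, "|")
    cRecord = split(Records, aRecord, "|")
    print "|" Headers
    for (iRecord = 1; iRecord <= cRecord; iRecord++) {
        printf "%s", aRecord[iRecord]
        for (iHeader = 1; iHeader <= cHeader; iHeader++)
            printf "|%s", Data[aHeader[iHeader], aRecord[iRecord]] + 0
        print ""
    }
}
' YourFile
Using += sums repeated entries for the same time range and key, and adding 0 when printing forces a missing value to appear as 0.
Rows are printed in order of first appearance, not sorted.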