I have a file with four tab-separated columns. In the last column, trailing tabs sometimes appear inside the quotation marks. This is similar to the question "trim leading and trailing spaces from a string in awk".
My attempt so far breaks the formatting: the quotes around each column are lost, although the tabs between the columns are preserved. Here is an example of the input:
col1 col2 col3 col4
"12" "d" "5" "this is great"
"13" "d" "6" "this is great<tab>"
"14" "d" "7" "this is great<tab><tab>"
"15" "d" "8" "this is great"
"16" "d" "9" "this is great<tab>"
What am I doing wrong?
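Answer 0 (score: 2)
You need to tell awk that the output field separator (OFS) is a quote as well. When a field is modified, awk rebuilds the record by joining the fields with OFS, which defaults to a single space, so the quotes consumed as input separators are not restored unless OFS is set. For example: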
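awk -v OFS='"' -F '"' '
# Split on double quotes: a data row has 9 fields and $8 is the text of column 4.
NF == 9 {
    if ($8 ~ /\t$/) {
        # Remove every trailing tab inside the last quoted column.
        gsub(/\t+$/, "", $8)
    }
}
# Print every line, including the header and unchanged rows.
1' input.txt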
Output:
col1 col2 col3 col4
"12" "d" "5" "this is great"
"13" "d" "6" "this is great"
"14" "d" "7" "this is great"
"15" "d" "8" "this is great"
"16" "d" "9" "this is great"