Question

I have a file in the following format:

#id|firstName|lastName|gender|birthday|creationDate|locationIP|browserUsed
933|Mahinda|Perera|male|19891203|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox

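As you can see, the delimiter is "|" and the fifth field is "birthday". I want to use sed to insert "-" into the 8-digit value so that I get a result like this:

|1989-12-03|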

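My attempt was: sed 's/..../&-/;s/:$//' | sed 's/......./&-/;s/:$//'

But this command changes the beginning of every line in my file. I want to change only the fifth field. Is this possible with sed?

Please note that this is homework.

Thank you very much.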

Answer 1

Original...

$ cat data
#id|firstName|lastName|gender|birthday|creationDate|locationIP|browserUsed
933|Mahinda|Perera|male|19891203|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox

Transformed...

$ cat data | sed -r 's/^(([^|]+\|){4})([0-9]{4})([0-9]{2})([0-9]{2})(.+)$/\1\3-\4-\5\6/'
#id|firstName|lastName|gender|birthday|creationDate|locationIP|browserUsed
933|Mahinda|Perera|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox

Here is some background to help you understand it...

^                               # Start of Line
 (([^|]+\|){4})                 # Grab the first 4 fields in \1 (note \2 is not useful for us here)
 ([0-9]{4})([0-9]{2})([0-9]{2}) # Split up the field we want to modify in \3, \4 and \5
 (.+)                           # Grab whatever is left in \6
$                               # End of Line
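One thing the breakdown above implies: because the first four fields are matched with [^|]+, each of them must be non-empty, otherwise the line is left untouched. Here is a small sketch to see that. The sample lines and the function name fix_birthday are made up for illustration, and I use -E (accepted by both GNU and BSD sed as far as I know) where the answer uses -r.

```shell
# Hypothetical wrapper around the substitution from this answer.
fix_birthday() {
  sed -E 's/^(([^|]+\|){4})([0-9]{4})([0-9]{2})([0-9]{2})(.+)$/\1\3-\4-\5\6/'
}

# Made-up sample input: the second line has an empty third field.
printf '%s\n' \
  '933|Mahinda|Perera|male|19891203|rest' \
  '934|Ana||female|19900105|rest' |
fix_birthday
# 933|Mahinda|Perera|male|1989-12-03|rest
# 934|Ana||female|19900105|rest   (unchanged: [^|]+ cannot match the empty field)
```

If empty fields can occur in your data, [^|]* is the more forgiving choice.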

Answer 2

Although you can certainly do what you want with sed, awk is almost certainly the better tool. The following was tested with BSD awk, gawk, and mawk:

awk -F'|' '
  BEGIN {OFS=FS}
  NF==1 {print; next}
  {sub(/^....../, "&-", $5);
   sub(/^..../, "&-", $5);
   print;
  } '

You may want to make the above more robust against unexpected values in column 5.
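One possible way to add that robustness is sketched below: rewrite the field only when it is exactly eight digits and pass every other line through unchanged. This is just a sketch, not something I tested on all three awks. The function name format_birthday is made up, and the digit class is spelled out eight times on the assumption that some awk versions do not support {8} interval expressions.

```shell
# Hypothetical helper: reformat field 5 only when it is exactly eight digits.
format_birthday() {
  awk -F'|' '
    BEGIN { OFS = FS }
    # Eight explicit [0-9] classes avoid relying on {8} interval support.
    $5 ~ /^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/ {
      $5 = substr($5, 1, 4) "-" substr($5, 5, 2) "-" substr($5, 7, 2)
    }
    # Print every line, modified or not (header and odd values pass through).
    { print }
  '
}

# Made-up sample input: a header, a normal record, and a bad birthday value.
printf '%s\n' \
  '#id|firstName|lastName|gender|birthday|creationDate' \
  '933|Mahinda|Perera|male|19891203|2010-03-17' \
  '934|Ana|Silva|female|unknown|2011-01-01' |
format_birthday
```

Only the second line should come out changed, with 1989-12-03 in the fifth field.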

If you really need to use sed, one approach is to use [^|]*; for example, if your sed supports extended regular expressions:

sed -r 's/^(([^|]*\|){4})(....)(..)(..)/\1\3-\4-\5/'

Note that \2 is not used here.

(On a Mac, use -E instead of -r.)
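If your sed has neither -r nor -E, the same idea can probably be written with basic regular expressions, where | is a literal character and groups and intervals are backslash-escaped. The sketch below is my own variant, not part of the command above: the name fix_birthday_bre is made up, it requires digits rather than any characters, and it assumes a sixth field follows, since it insists on a | right after the eight digits.

```shell
# Hypothetical BRE variant: no -r/-E needed; "|" is literal in basic regex.
fix_birthday_bre() {
  sed 's/^\(\([^|]*|\)\{4\}\)\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)|/\1\3-\4-\5|/'
}

# Made-up sample input: a normal record, one with an empty field,
# and one whose fifth field has nine digits (left unchanged).
printf '%s\n' \
  '933|Mahinda|Perera|male|19891203|2010-03-17' \
  '934|Ana||female|19900105|2011-01-01' \
  '935|Bo|Li|male|198912031|2012-02-02' |
fix_birthday_bre
```

Requiring the trailing | means a value longer than eight digits is not split in the middle.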

How to modify pipe-separated values with sed
