Populate a CSV column based on the value of another column

Time: 2019-10-20 21:53:37

Tags: regex bash csv awk

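In a Bash script, I want to populate a column that is currently empty (column 5) based on the value of another column (column 1).

I think I can use awk to get the desired result, but I am having trouble with the syntax: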

awk -F, '
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_f_[0-9]{3}[a-z]\.tif$/{$5="Text"}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_[a-b]_[0-9]{1,3}[a-z]?\.tif$/{$5="Front matter"}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_y_[0-9]{1,3}[a-z]?\.tif$/{$5="Back matter"}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_z_1[a-z]?\.tif$/{$5="Back matter"}
' file.csv

My input looks like this:

File Name,Item Sequence,Visibility,Title
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0001_a_1.tif,1,discovery,Front Board Outside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0002_a_1a.tif,2,discovery,Front Board Outside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0003_b_000.tif,3,discovery,Front Board Inside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0009_b_003v.tif,9,discovery,Flyleaf 003v
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0010_f_001r.tif,10,discovery,f. 001r
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0060_y_001r.tif,60,discovery,Flyleaf 001r
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0070_y_999.tif,70,discovery,Back Board Inside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0071_z_1.tif,71,discovery,Back Board Outside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0072_z_1a.tif,72,discovery,Back Board Outside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0073_z_2.tif,73,discovery,Spine
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0074_z_3.tif,74,discovery,Fore edge

The desired result should populate the fifth column (IIIF Range) with the values I assign above (Front matter, Text, Back matter), or leave it blank, depending on the value of column 1 (File Name).

1 answer:

Answer 0 (score: 2)

You can use the ~ operator to match a string against a regular expression pattern:

awk -F, 'BEGIN{OFS=","}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_f_[0-9]{3}[a-z]\.tif$/{$5="Text"}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_[a-b]_[0-9]{1,3}[a-z]?\.tif$/{$5="Front matter"}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_y_[0-9]{1,3}[a-z]?\.tif$/{$5="Back matter"}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_z_1[a-z]?\.tif$/{$5="Back matter"}
1' file.csv
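
Compared with the attempt in the question, the command above adds OFS="," so that a record rebuilt after assigning $5 is joined with commas again, and a trailing 1, an always-true pattern whose default action prints each record. Without a print action, awk would do the assignments and output nothing.

If you also want the header to carry the new column name and unmatched rows to end with an empty fifth field, a variant along these lines may work. It is only a sketch under several assumptions: sample.csv and out.csv are hypothetical file names, the header text "IIIF Range" is taken from the question's description, and the always-present (possibly empty) fifth field is a guess at the desired layout. The patterns are spelled out without {n} intervals because some older awk implementations do not support interval expressions. Like the original, it splits on every comma, so it does not handle quoted fields that contain commas.

```shell
# Hypothetical sample file; the rows are copied from the question's input.
cat > sample.csv <<'EOF'
File Name,Item Sequence,Visibility,Title
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0001_a_1.tif,1,discovery,Front Board Outside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0010_f_001r.tif,10,discovery,f. 001r
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0071_z_1.tif,71,discovery,Back Board Outside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0073_z_2.tif,73,discovery,Spine
EOF

awk -F, '
BEGIN {
    OFS = ","
    # Shared file-name prefix, written without {n} intervals for portability.
    pre = "sld_[a-z][a-z][a-z][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9]_"
    # One to three digits, an optional letter, then the extension.
    num = "[0-9][0-9]?[0-9]?[a-z]?\\.tif$"
}
NR == 1 { $5 = "IIIF Range"; print; next }   # header row: name the new column (assumed name)
{ $5 = "" }                                  # default: empty fifth field on every data row
$1 ~ (pre "f_[0-9][0-9][0-9][a-z]\\.tif$") { $5 = "Text" }
$1 ~ (pre "[a-b]_" num)                    { $5 = "Front matter" }
$1 ~ (pre "y_" num)                        { $5 = "Back matter" }
$1 ~ (pre "z_1[a-z]?\\.tif$")              { $5 = "Back matter" }
{ print }
' sample.csv > out.csv

cat out.csv
```

Rows such as the z_2 (Spine) image match none of the patterns, so in this variant they keep the empty fifth field and end with a trailing comma.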