I have a text file that looks like this:
P0000X4SRN4H|PR|18.16129|-66.72835|728402000004797|Quebrada la Pastora|72.98479461669922|imgn19w062_13.img|1
P0000X4SRMQ5|PR|18.1619|-66.72427|728402000003808|Rio Cidra|335.3082275390625|imgn19w061_13.img|1
P0000X4SRMXN|PR|18.16106|-66.72144|728402000004007|Rio Cidra|143.83212280273438|imgn19w067_13.img|1
P0000X4SRMP5|PR|18.16221|-66.72382|728402000003318|Quebrada Muerto|451.31011962890625|imgn19w067_13.img|1
P0000X4SRMMC|PR|18.16377|-66.72496|728402000003318|Quebrada Muerto|102.55789947509766|imgn19w065_13.img|1
P0000X4SRMLA|PR|18.1592|-66.71959|728402000006409|Rio Cidra|254.85401916503906|imgn19w069_13.img|1
P0000X4SRMRC|PR|18.16403|-66.72557|728402000003318|Quebrada Muerto|284.13861083984375|imgn19w061_13.img|1
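I want to split the data by column 7 (counting from zero, i.e. the image file name just before the trailing 1), which contains values such as 'imgn19w067_13.img' and 'imgn19w061_13.img'. I need to write a Pig script that creates a folder named after each value, for example 'imgn19w061_13.img', and puts all the rows that have that value in column 7 into it. Any help is much appreciated.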
Note: it would be even better if the folder name were imgn19w061_13 rather than imgn19w061_13.img.
Output:
Folder name: imgn19w061_13
containing a text file with these rows:
P0000X4SRMRC|PR|18.16403|-66.72557|728402000003318|Quebrada Muerto|284.13861083984375|imgn19w061_13.img|1
P0000X4SRMRC|PR|18.16403|-66.72557|728402000003318|Quebrada Muerto|284.13861083984375|imgn19w061_13.img|1
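
To make the expected result concrete, here is a rough Python sketch of the partitioning I am after. It is not Pig and not a solution, only an illustration of the behaviour I want the Pig script to reproduce. The function name partition_by_image, the output file name part.txt, and the zero-based column index 7 are my own assumptions for this example.

```python
import os
from collections import defaultdict


def partition_by_image(lines, out_root, col=7, sep="|"):
    """Group rows by the image name in field `col` (zero-based) and write
    each group to <out_root>/<image name without .img>/part.txt.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        name = line.split(sep)[col]
        # Drop the ".img" suffix so the folder is e.g. imgn19w061_13
        folder = name[:-len(".img")] if name.endswith(".img") else name
        groups[folder].append(line)

    for folder, rows in groups.items():
        target = os.path.join(out_root, folder)
        os.makedirs(target, exist_ok=True)
        with open(os.path.join(target, "part.txt"), "w") as fh:
            fh.write("\n".join(rows) + "\n")
    return dict(groups)
```

Calling partition_by_image(open("input.txt"), "out") on the sample above (input.txt and out are placeholder names) would create one folder per distinct image name, each holding the original rows unchanged.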