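Extracting a multi-line attribute value from an HTML file

Time: 2016-04-26 17:23:13

Tags: macos shell parsing sed

I am writing a shell script on Mac OS X in which I need to extract the multi-line value attribute of a hidden field: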

type="hidden" name="PaRes"    value="eJzteVdv7MaW7rsA/QfD97FhM3eThqwBc2Y3c3i5YJNs5hybv/5SksP2Pp45c2aAwTzcBgSRi1Wr
VtUK31dVb/+21dUPSzKMedv8+iP0M/jjD0kTtXHepL/+aFvcT/iP//b+ZmVDkjBmEs1D8v6mJuMY
pskPefzrj9f/CyEQgUEEAv34/nYjjWT8lNdj+vM3334e87RJ4qPJb2O9H0P9DL8Bv78eSocoC5vp
/S2MekrU3lECJkAYfQN+e3+rk0Fk3iEYQbHzBSfegC/BG/Bn19v88TQeNm55/E41ybQkrn677OJ9
361eKBfenON5wX99Az5avMXhlLzDIHQGUfj8A3T+BUN+gS5vwKf8rftQR9btfOi+nDHwDfhW8nas
xnAs1vMdh89vwB9vb8nWtU1ytDjm98fzG/CncV3YvIPf/Q7dh/TN8t7fprz+xigQ/QXDf0EPoz7l
b+MUTvP47r8Bvz29ReGyvJMkSZGGgZUr+f3vmOxnk7ckyt9B7DDq+P/Zi6zSdsinrP4w9a+CN+DD
FODTo+9v5uG+Y7Ah+eGIl2b89cdsmrpfAGBd159X5Od2SAH4mAQAEsDRID68/X9+/OqVxGLzaP+l
bnTYtE0ehVW+h9MRHGoyZW38wx+2/Z0ay/jQBAEGS/90qPopgtDmpw8JiEDYoRP4e6XfzOw/M8r3
xg5j+NOYhR+hD3yn6P3NSB7JR0QkP9iG+B9kBJOnyTj9V4b/fehvNfyuzwmrOXmX0BYBczbgOvqM
XIxwmemrZ2+BrZW//t7vq+Ub8Ie9v03my3PfrNBXQxGeMOBiSpqcCeMuCeeTWvKppluzz/CiZgxb
Cuptd3v2VgDkiVrVBSIVy01N5RVwL1FKP50sTfS0TavXF/ER3LFLoj90Wy0yFini3GA4JC0kIjrN
mQLcpYqqogvOJP1i62dWPwMVZGqbwk5P27svii7IsV9fNDt7fclxi7gkDVbYRYQ+ATqm1TzLB4BS
xQwJ0KuLLSG/zqRRplitpoKmYWXpP4JMIRmeLgSvXf1Wnx3GxLnXFwe4ogy0KwoQKXV9y6bdu48W
bTZANYj3uBfgsxit9pQ0Hapuj8aDpkdhY9P5apasZu+72kpCVgNCcipfX4ga0KV1V+QwlahTmMJr
f8WvwuSNtI6x9pVQl6omf/31m0j6zTNy8vzyhIeBBBNO4dcTnQxT/jhC+ihVqihylkXT5O6m5CpS
ZHr87aRGpWWflTlPrCBF6jZHMjQ1FayikiVPQjZLZSptgOLGFqROpZpzdLReX2g7y+4etYd89Qws
9qqS61frTeWMeqsCT994hnS/erQWAxltYFLqHY47/0kJUU1AEcNaKsXyry8fPelNlW3YecZ1VQQ2
sUY7e1epLxvITTUddhOUiqLsJ2YHngYqlQb5pUapVrRqui/JbSBmS6Qd2vSS4vKsCDwJDN2g82EO
DExsP97h0NWqCKTAkLenO080oatOASItsUfOPkxMKoV6jMVCKqOumsUiry8qw64q1x5SEfxd+pts
bHDc04G07FlAMt51RNdkdpzMzfVvaA3w5y0J8MfNyZ93Kp/3xZ/31x9XnN/ea/8/K5YUKA=="></input></td>
    </tr>

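I need to extract the value of the hidden field named PaRes, that is, everything after value=" up to "></input>

I tried the sed and cut commands, but my results were not accurate. Any suggestions?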

1 Answer:

Answer 0: (score: 0)

I tried several solutions using sed and awk, but I could not get them to work.

Here is a solution in Perl:

match.pl

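#!/usr/bin/perl
use strict;
use warnings;

# Slurp the whole file so the regex can match across line breaks.
my $filename = $ARGV[0];
open(my $fh, '<', $filename) or die "cannot open file $filename: $!";
my $content = do { local $/; <$fh> };
close($fh);

# Capture the text between value=" and the next double quote for the
# field named PaRes. [^"]* also matches newlines, so /s is not needed,
# and the match cannot run on to a later quote in the file.
if ($content =~ m/name="PaRes"\s+value="([^"]*)"/) {
    print $1;
}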

Call it from your bash script like this:

MATCH="$(perl /path/to/match.pl /path/to/file.html)"
echo "${MATCH}"

This is not an ideal solution, but it works. Hope it helps!
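For a script that should stay in plain shell, one possible alternative is to join the lines first and then let sed match on a single line. The sketch below is not part of the original answer. The function name extract_pares is hypothetical, and it assumes that the attributes appear in the order name="PaRes" value="...", that the value contains no double quotes, and that the line breaks inside the value are only wrapping and may be dropped.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the hidden field named PaRes.
# Assumes attribute order name="PaRes" value="..." and no double quotes
# inside the value.
extract_pares() {
  # Delete CR/LF so the wrapped value becomes a single line, then print
  # only the text captured between value=" and the next double quote.
  tr -d '\r\n' < "$1" |
    sed -n 's/.*name="PaRes"[[:space:]]*value="\([^"]*\)".*/\1/p'
}
```

In a script, PARES="$(extract_pares /path/to/file.html)" would then hold the value as one unbroken string. Unlike the Perl script, this version removes the line breaks from the value, which may or may not be what the consumer of the value expects.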