Cut shell output between a specific start and end string

Time: 2015-07-18 17:16:59

Tags: shell cut

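I have an html/text file that contains downloadable video URLs, but I need to cut the file down to just the links.

For example, each link starts with url240=, url360=, url480= and so on, followed by the URL itself (LinkOfUrl), and ends with &jpg...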

I want to turn url240=https://cs542402.vk.me/6/u313752528/videos/68ec547387.240.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA into this ->> https://cs542402.vk.me/6/u313752528/videos/68ec547387.240.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA

How can I do that?

My goal is to grab the video link and download it from the terminal, without a browser and without typing anything by hand. I plan to write a shell script that does this automatically, so I need to assign the link to a variable; after that I will run axel $variable. (In case anyone wonders how I got it: I used wget to fetch the html/text file.)

Here is my original html/text file.

And here is a small piece of it:

</param> <param name="flashvars" value="uid=313752528&amp;vid=171193750&amp;oid=313752528&amp;host=https://cs542402.vk.me/&amp;vtag=68ec547387&amp;ltag=l_f4d9714c&amp;vkid=171193750&amp;md_title=Hi.Ki3+ep08&amp;md_author=Kadem+%26%23199%3Belik&amp;author_href=/id313752528&amp;hd=3&amp;no_flv=1&amp;hd_def=0&amp;dbg_on=0&amp;t=0&amp;duration=1399&amp;thumb=https://pp.vk.me/c621922/v621922528/26d5d/buRyDSImp1Y.jpg&amp;hash=17cc43bba064a6bd4c887b2b336e28a1&amp;hash2=8a5ce8fb95dac3b7&amp;angle=0&amp;img_angle=0&amp;repeat=0&amp;show_ads=0&amp;show_ads_postroll=0&amp;legal_owner=0&amp;eid1=0&amp;slot=0&amp;g=0&amp;a=0&amp;puid34=0&amp;water_mark=&amp;can_rotate=1&amp;no_adfox=1&amp;ads_preview=0&amp;puid4=0&amp;url240=https://cs542402.vk.me/6/u313752528/videos/68ec547387.240.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&amp;url360=https://cs542402.vk.me/6/u313752528/videos/68ec547387.360.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&amp;url480=https://cs542402.vk.me/6/u313752528/videos/68ec547387.480.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&amp;url720=https://cs542402.vk.me/6/u313752528/videos/68ec547387.720.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&amp;jpg=https://pp.vk.me/c621922/v621922528/26d5c/Hqe6Th45LmA.jpg&amp;timeline_thumbs=1&amp;timeline_thumbs_jpg=https://pp.vk.me/c623120/v623120528/403ab/W0o1WeSlf5o.jpg,https://pp.vk.me/c622717/v622717528/3bcdc/enzNOAL2PJw.jpg,https://pp.vk.me/c625626/v625626528/3edb9/y44h0j9UBT4.jpg&amp;timeline_thumbs_per_row=10&amp;timeline_thumbs_per_image=100&amp;timeline_thumbs_total=279&amp;timeline_thumb_width=133&amp;timeline_thumb_height=75&amp;ip_subm=1&amp;proxy=psv4&amp;https=1&amp;video_ext=1&amp;is_yandex=0&apm;&amp;is_vk=1&amp;is_ext=1&amp;referrer=&amp;lang_add=Add+to+M
y+Videos&amp;lang_added=Video+added+to+My+Videos&amp;lang_share=Share&amp;lang_like=Like&amp;lang_volume_on=Unmute&amp;lang_volume_off=Mute&amp;lang_volume=Volume&amp;lang_hdsd=Change+Video+Quality&amp;lang_open_popup=Expand&amp;lang_fullscreen=Full+Screen&amp;lang_window=Minimize&amp;lang_rotate=Rotate&amp;lang_ads_link=Advertiser%27s+Site&amp;lang_ads=Ads&amp;lang_ads_skip=Skip+ad&amp;lang_next=Next+video&amp;lang_replay=Replay&amp;lang_next_cancel=Cancel&amp;lang_ads_skip_time=Skip+ads+in+%7Btime%7D&amp;lang_report_problem=Report+a+problem..&amp;video_play_hd=Watch+in+HD&amp;video_stop_loading=Stop+Download&amp;video_player_version=VK+Video+Player&amp;goto_orig_video=Go+to+Video&amp;video_get_video_code=Copy+video+code&amp;video_load_error=The+video+has+not+uploaded+yet+or+the+server+is+not+available&amp;video_get_current_url=Copy+frame+link"></param> <param name="wmode" value="opaque"></param> <embed id="flash_video_obj" align="top" src="/swf/video.swf?94" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="100%" height="100%" wmode="opaque" 
flashvars=uid=313752528&amp;vid=171193750&amp;oid=313752528&amp;host=https://cs542402.vk.me/&amp;vtag=68ec547387&amp;ltag=l_f4d9714c&amp;vkid=171193750&amp;md_title=Hi.Ki3+ep08&amp;md_author=Kadem+%26%23199%3Belik&amp;author_href=/id313752528&amp;hd=3&amp;no_flv=1&amp;hd_def=0&amp;dbg_on=0&amp;t=0&amp;duration=1399&amp;thumb=https://pp.vk.me/c621922/v621922528/26d5d/buRyDSImp1Y.jpg&amp;hash=17cc43bba064a6bd4c887b2b336e28a1&amp;hash2=8a5ce8fb95dac3b7&amp;angle=0&amp;img_angle=0&amp;repeat=0&amp;show_ads=0&amp;show_ads_postroll=0&amp;legal_owner=0&amp;eid1=0&amp;slot=0&amp;g=0&amp;a=0&amp;puid34=0&amp;water_mark=&amp;can_rotate=1&amp;no_adfox=1&amp;ads_preview=0&amp;puid4=0&amp;url240=https://cs542402.vk.me/6/u313752528/videos/68ec547387.240.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&amp;url360=https://cs542402.vk.me/6/u313752528/videos/68ec547387.360.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&amp;url480=https://cs542402.vk.me/6/u313752528/videos/68ec547387.480.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&amp;url720=https://cs542402.vk.me/6/u313752528/videos/68ec547387.720.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&amp;jpg=https://pp.vk.me/c621922/v621922528/26d5c/Hqe6Th45LmA.jpg&amp;timeline_thumbs=1&amp;timeline_thumbs_jpg=https://pp.vk.me/c623120/v623120528/403ab/W0o1WeSlf5o.jpg,https://pp.vk.me/c622717/v622717528/3bcdc/enzNOAL2PJw.jpg,https://pp.vk.me/c625626/v625626528/3edb9/y44h0j9UBT4.jpg&amp;timeline_thumbs_per_row=10&amp;timeline_thumbs_per_image=100&amp;timeline_thumbs_total=279&amp;timeline_thumb_width=133&amp;timeline_thumb_height=75&amp;ip_subm=1&amp;proxy=psv4&amp;https=1&amp;video_ext=1&amp;is_yandex=0&apm;&amp;is_vk=1&amp;is_ext=1&amp;referrer=&amp;lang_add=Add+to+My+Videos&amp;lang_added=Video+
added+to+My+Videos&amp;lang_share=Share&amp;lang_like=Like&amp;lang_volume_on=Unmute&amp;lang_volume_off=Mute&amp;lang_volume=Volume&amp;lang_hdsd=Change+Video+Quality&amp;lang_open_popup=Expand&amp;lang_fullscreen=Full+Screen&amp;lang_window=Minimize&amp;lang_rotate=Rotate&amp;lang_ads_link=Advertiser%27s+Site&amp;lang_ads=Ads&amp;lang_ads_skip=Skip+ad&amp;lang_next=Next+video&amp;lang_replay=Replay&amp;lang_next_cancel=Cancel&amp;lang_ads_skip_time=Skip+ads+in+%7Btime%7D&amp;lang_report_problem=Report+a+problem..&amp;video_play_hd=Watch+in+HD&amp;video_stop_loading=Stop+Download&amp;video_player_version=VK+Video+Player&amp;goto_orig_video=Go+to+Video&amp;video_get_video_code=Copy+video+code&amp;video_load_error=The+video+has+not+uploaded+yet+or+the+server+is+not+available&amp;video_get_current_url=Copy+frame+link></embed> </object>

2 answers:

Answer 0: (score: 0)

This should give you some ideas...

sed 's/&amp;/\n/g' | grep url240 | cut -d= -f2

It separates the parameters joined by "&amp;", greps the line you need, and takes the right-hand side of the = sign.
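A quick way to check these three steps is to run them on a short line. The sketch below uses a made-up, shortened stand-in for the real file (the example.com URL is not from the question) and assumes GNU sed, which turns \n in the replacement into a newline. Because the sample value contains a second = sign, -f2 stops there, while the standard cut field range -f2- keeps everything after the first =.

```shell
#!/bin/sh
# Made-up sample with the same shape as the flashvars string.
sample='vid=1&amp;url240=https://example.com/v.240.mp4?extra=abc&amp;jpg=x'

# Field 2 only: the value is cut off at the second "=".
short=$(printf '%s\n' "$sample" | sed 's/&amp;/\n/g' | grep url240 | cut -d= -f2)

# Field 2 through the end of the line: the whole value survives.
full=$(printf '%s\n' "$sample" | sed 's/&amp;/\n/g' | grep url240 | cut -d= -f2-)

printf '%s\n%s\n' "$short" "$full"
```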

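If the content itself contains = signs (as these URLs do, in ?extra=), cut based on the length of the variable name instead. For example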

var="url240"; sed 's/&amp;/\n/g' | grep $var | cut -c $((${#var}+2))-
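The same idea can be wrapped in a small function so that any parameter name can be passed in. This is only a sketch: the name get_param is hypothetical, the sample input is made up, and GNU sed is assumed for \n in the replacement. It also anchors the grep at the start of the line, so a name such as jpg would not match timeline_thumbs_jpg as well, and it keeps only the first hit because the parameters appear more than once in the file.

```shell
#!/bin/sh
# get_param NAME: read flashvars text on stdin and print the value of NAME.
# Hypothetical helper; assumes GNU sed (\n in the replacement means newline).
get_param() {
    name=$1
    sed 's/&amp;/\n/g' |                   # one parameter per line
        grep "^${name}=" |                 # keep only lines starting with NAME=
        head -n 1 |                        # first occurrence only
        cut -c "$(( ${#name} + 2 ))-"      # drop "NAME=" (name length + 1 characters)
}

# Example with a shortened, made-up input line:
printf '%s\n' 'puid4=0&amp;url240=https://example.com/v.240.mp4?extra=abc&amp;url360=https://example.com/v.360.mp4?extra=abc&amp;jpg=x' |
    get_param url240
```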

Answer 1: (score: 0)

Many thanks to karakfa for the idea! I solved my problem with this command:

sed 's/&amp;/\n/g' MyFile  | grep url240 | sed -n '1p'

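The first sed command replaces every &amp; with a newline, so after the grep I get 3 lines (the last one is very long). The second sed then keeps only the first line of that output, and it works well now. But my output looks like this: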

url240=https://cs542402.vk.me/6/u313752528/videos/68ec547387.240.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA

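So I added tail -c +8, which starts the output at the 8th byte and therefore skips the 7 characters of url240=, and my final command looks like this: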

sed 's/&amp;/\n/g' MyFile  | grep url240 | sed -n '1p' | tail -c +8
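Since the goal in the question was to get the link into a variable for axel, the pipeline can be wrapped in command substitution. A minimal sketch, assuming GNU sed and the file name MyFile used above; the names first_url240 and video_url are made up for illustration, and the axel call is left commented out.

```shell
#!/bin/sh
# Sketch: store the first url240 link of a file in a variable.
# "first_url240" and "video_url" are hypothetical names; GNU sed is assumed.
first_url240() {
    # $1 is the file fetched with wget (MyFile in the command above).
    sed 's/&amp;/\n/g' "$1" | grep url240 | sed -n '1p' | tail -c +8
}

if [ -f MyFile ]; then
    # Command substitution also strips the trailing newline.
    video_url=$(first_url240 MyFile)
    printf '%s\n' "$video_url"
    # axel "$video_url"    # uncomment to start the download
fi
```

Quoting "$video_url" keeps the shell from interpreting the ? in the URL as a glob pattern.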