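I have an html/text file that contains downloadable video URLs, but I need to cut the file down to just the links.
For example, each link starts with url240=, url360=, or 480 and so on, followed by the URL itself (LinkOfUrl), and the list ends at &jpg...
I would like to get url240=https://cs542402.vk.me/6/u313752528/videos/68ec547387.240.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA
or, which is what I am really after, just https://cs542402.vk.me/6/u313752528/videos/68ec547387.240.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA
How can I do that?
My goal is to grab the video link and download it from the terminal, without using a browser and without typing anything in by hand. I plan to write a shell script that does this automatically, so I need the link in a variable; after that I will run axel $variable. (In case anyone wonders how I got the html/text file: I fetched it with wget.)
Here is my original html/text file.
I have a small piece of the file: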
</param> <param name="flashvars" value="uid=313752528&vid=171193750&oid=313752528&host=https://cs542402.vk.me/&vtag=68ec547387&ltag=l_f4d9714c&vkid=171193750&md_title=Hi.Ki3+ep08&md_author=Kadem+%26%23199%3Belik&author_href=/id313752528&hd=3&no_flv=1&hd_def=0&dbg_on=0&t=0&duration=1399&thumb=https://pp.vk.me/c621922/v621922528/26d5d/buRyDSImp1Y.jpg&hash=17cc43bba064a6bd4c887b2b336e28a1&hash2=8a5ce8fb95dac3b7&angle=0&img_angle=0&repeat=0&show_ads=0&show_ads_postroll=0&legal_owner=0&eid1=0&slot=0&g=0&a=0&puid34=0&water_mark=&can_rotate=1&no_adfox=1&ads_preview=0&puid4=0&url240=https://cs542402.vk.me/6/u313752528/videos/68ec547387.240.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&url360=https://cs542402.vk.me/6/u313752528/videos/68ec547387.360.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&url480=https://cs542402.vk.me/6/u313752528/videos/68ec547387.480.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&url720=https://cs542402.vk.me/6/u313752528/videos/68ec547387.720.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&jpg=https://pp.vk.me/c621922/v621922528/26d5c/Hqe6Th45LmA.jpg&timeline_thumbs=1&timeline_thumbs_jpg=https://pp.vk.me/c623120/v623120528/403ab/W0o1WeSlf5o.jpg,https://pp.vk.me/c622717/v622717528/3bcdc/enzNOAL2PJw.jpg,https://pp.vk.me/c625626/v625626528/3edb9/y44h0j9UBT4.jpg&timeline_thumbs_per_row=10&timeline_thumbs_per_image=100&timeline_thumbs_total=279&timeline_thumb_width=133&timeline_thumb_height=75&ip_subm=1&proxy=psv4&https=1&video_ext=1&is_yandex=0&apm;&is_vk=1&is_ext=1&referrer=&lang_add=Add+to+My+Videos&lang_added=Video+added+to+My+Videos&lang_share=Share&lang_like=Like&lang_volume_on=Unmute&lang_volume_off=Mute&lang_volume=Volume&lang_hdsd=Change+Video+Quality&lang_open_popup=Expand&lang_fullscreen=Full+Screen
&lang_window=Minimize&lang_rotate=Rotate&lang_ads_link=Advertiser%27s+Site&lang_ads=Ads&lang_ads_skip=Skip+ad&lang_next=Next+video&lang_replay=Replay&lang_next_cancel=Cancel&lang_ads_skip_time=Skip+ads+in+%7Btime%7D&lang_report_problem=Report+a+problem..&video_play_hd=Watch+in+HD&video_stop_loading=Stop+Download&video_player_version=VK+Video+Player&goto_orig_video=Go+to+Video&video_get_video_code=Copy+video+code&video_load_error=The+video+has+not+uploaded+yet+or+the+server+is+not+available&video_get_current_url=Copy+frame+link"></param> <param name="wmode" value="opaque"></param> <embed id="flash_video_obj" align="top" src="/swf/video.swf?94" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="100%" height="100%" wmode="opaque" flashvars=uid=313752528&vid=171193750&oid=313752528&host=https://cs542402.vk.me/&vtag=68ec547387&ltag=l_f4d9714c&vkid=171193750&md_title=Hi.Ki3+ep08&md_author=Kadem+%26%23199%3Belik&author_href=/id313752528&hd=3&no_flv=1&hd_def=0&dbg_on=0&t=0&duration=1399&thumb=https://pp.vk.me/c621922/v621922528/26d5d/buRyDSImp1Y.jpg&hash=17cc43bba064a6bd4c887b2b336e28a1&hash2=8a5ce8fb95dac3b7&angle=0&img_angle=0&repeat=0&show_ads=0&show_ads_postroll=0&legal_owner=0&eid1=0&slot=0&g=0&a=0&puid34=0&water_mark=&can_rotate=1&no_adfox=1&ads_preview=0&puid4=0&url240=https://cs542402.vk.me/6/u313752528/videos/68ec547387.240.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&url360=https://cs542402.vk.me/6/u313752528/videos/68ec547387.360.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&url480=https://cs542402.vk.me/6/u313752528/videos/68ec547387.480.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA&url720=https://cs542402.vk.me/6/u313752528/videos/68ec547387.720.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip
0ui3dDEkULDSbkuzpaOA&jpg=https://pp.vk.me/c621922/v621922528/26d5c/Hqe6Th45LmA.jpg&timeline_thumbs=1&timeline_thumbs_jpg=https://pp.vk.me/c623120/v623120528/403ab/W0o1WeSlf5o.jpg,https://pp.vk.me/c622717/v622717528/3bcdc/enzNOAL2PJw.jpg,https://pp.vk.me/c625626/v625626528/3edb9/y44h0j9UBT4.jpg&timeline_thumbs_per_row=10&timeline_thumbs_per_image=100&timeline_thumbs_total=279&timeline_thumb_width=133&timeline_thumb_height=75&ip_subm=1&proxy=psv4&https=1&video_ext=1&is_yandex=0&apm;&is_vk=1&is_ext=1&referrer=&lang_add=Add+to+My+Videos&lang_added=Video+added+to+My+Videos&lang_share=Share&lang_like=Like&lang_volume_on=Unmute&lang_volume_off=Mute&lang_volume=Volume&lang_hdsd=Change+Video+Quality&lang_open_popup=Expand&lang_fullscreen=Full+Screen&lang_window=Minimize&lang_rotate=Rotate&lang_ads_link=Advertiser%27s+Site&lang_ads=Ads&lang_ads_skip=Skip+ad&lang_next=Next+video&lang_replay=Replay&lang_next_cancel=Cancel&lang_ads_skip_time=Skip+ads+in+%7Btime%7D&lang_report_problem=Report+a+problem..&video_play_hd=Watch+in+HD&video_stop_loading=Stop+Download&video_player_version=VK+Video+Player&goto_orig_video=Go+to+Video&video_get_video_code=Copy+video+code&video_load_error=The+video+has+not+uploaded+yet+or+the+server+is+not+available&video_get_current_url=Copy+frame+link></embed> </object>
Answer 0 (score: 0)
This should give you some ideas...
sed 's/&/\n/g' | grep url240 | cut -d= -f2
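This splits the parameters joined by "&" onto separate lines, greps the line you need, and takes the right-hand side of the = sign.
If the value itself contains an = sign (as these URLs do in ?extra=...), cut -d= -f2 stops at that second =, so cut by the length of the variable name instead. For example: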
var="url240"; sed 's/&/\n/g' | grep $var | cut -c $((${#var}+2))-
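Since the question asks for the link in a variable, the split, grep, and cut steps above could be wrapped in a small function. This is only a sketch: extract_param is a hypothetical name, tr is used in place of sed 's/&/\n/g' because not every sed interprets \n in the replacement as a newline, and cut -d= -f2- (with the trailing dash) is assumed to be acceptable as another way to keep the = signs inside the URL.

```shell
# Hypothetical helper (not from the original answer):
# print the value of one &-separated parameter read from stdin.
# Usage: extract_param NAME < FILE
extract_param() {
    # 1. tr puts every &-separated parameter on its own line.
    # 2. grep keeps only lines that start with "NAME=".
    # 3. head keeps the first match (the file repeats the parameters).
    # 4. cut drops "NAME=" and keeps everything after the first "=".
    tr '&' '\n' | grep "^$1=" | head -n 1 | cut -d= -f2-
}

# Example use with the file from the question (left commented out):
# url=$(extract_param url240 < MyFile)
# axel "$url"
```

Anchoring the pattern with ^ keeps a request for jpg from also matching timeline_thumbs_jpg.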
Answer 1 (score: 0)
Many thanks to karakfa for the idea! I solved my problem with this command:
sed 's/&/\n/g' MyFile | grep url240 | sed -n '1p'
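The first sed command replaces each & with a newline, so grep then gives me 3 lines (the last one is very long). After that, sed -n '1p' keeps only the first line of the output, and now it works well. But my output looked like this: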
url240=https://cs542402.vk.me/6/u313752528/videos/68ec547387.240.mp4?extra=JVSeMzU2-msXVuseackTFCgnVNEPOnboxMdXN3ZGH1l91djR7bk9DzaxyKFGJ2SXg39BcZxJM6tBip0ui3dDEkULDSbkuzpaOA
So I added tail -c +8 to drop the 7-character url240= prefix, and my final command looks like this:
sed 's/&/\n/g' MyFile | grep url240 | sed -n '1p' | tail -c +8
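To get from this command to the variable the question asks for, one option is to capture the matching line and strip the prefix with shell parameter expansion instead of counting characters. The sketch below runs on a made-up sample string (the example.invalid URL is a placeholder, not real data) and uses tr rather than sed for the split, on the assumption that some sed implementations do not treat \n in the replacement as a newline.

```shell
# Hypothetical sample standing in for MyFile (the real file is the wget output).
sample='x=1&url240=https://example.invalid/v.240.mp4?extra=abc&jpg=y'

# Same idea as the command above: split on &, keep the first url240 line.
line=$(printf '%s\n' "$sample" | tr '&' '\n' | grep url240 | sed -n '1p')

# Strip the literal prefix "url240=" rather than skipping a fixed
# number of characters, so the step does not depend on the name length.
url=${line#url240=}

printf '%s\n' "$url"
# axel "$url"    # download step from the question, left commented out here
```

With the real file, the printf on the sample would be replaced by reading MyFile, for instance tr '&' '\n' < MyFile.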