How to get an image URL from Telegram using Python

Time: 2019-11-12 15:23:03

Tags: python telegram

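Is it possible to use Python to get the direct image URL from an HTTP link to a Telegram post?

I have a direct link to a Telegram post, for example: https://t.me/tele2slack/223

I can find the image URL with the Chrome inspector. For my link, the image URL is: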


https://cdn4.telesco.pe/file/oxrpfWsqyBeFI3KIxPqBf-5A1k_OEiueCdwpuhR0oWtM7_88zpYi7kRsADHYobpByICSfImn_CffaxWr2nC6E49BSFchpKRKO5bkNPsFmefhsjdLstZwtHaeZGqHkqWFcGbtujPcmigwJkl7gH7tjHJrqlpmhZmGS7QnacF8PNxpocVMqaQXRxLW7kAwm6lVxLYo6AJqNb8bdZ5RXJgd6mQG0v5QINvTwtJNdioEWDAjtsufsxHVgzdUK1yBn1M3cjmhjfv8o4uMyi0bhsdFV_q21e0Sqj-QvUi-99JCPSHNVlLBfoWQEtSCeErPE45UrlqbnELYOznvLq_CeE6BcQ.jpg

Is there any way to automate this process with Python?

I tried a GET request, but unfortunately the response contained nothing useful:

import requests
response = requests.get("https://t.me/tele2slack/223")

Response:

    <!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Telegram: Contact @tele2slack</title>
    <meta name="viewport" content="width=device-width, initial-scale=1.0">

<meta property="og:title" content="Tele2slack_dev">
<meta property="og:image" content="https://telegram.org/img/t_logo.png">
<meta property="og:site_name" content="Telegram">
<meta property="og:description" content="??#TSN #отчетности #сша
Tyson Foods Q4 Earnings: 
-Q4 Adj EPS &#036;1.21 (est &#036;1.25)
-Q4 Revenue &#036;10.88 Bln (esat &#036;11.0 Bln)">

<meta property="twitter:title" content="Tele2slack_dev">
<meta property="twitter:image" content="https://telegram.org/img/t_logo.png">
<meta property="twitter:site" content="@Telegram">

<meta property="al:ios:app_store_id" content="686449807">
<meta property="al:ios:app_name" content="Telegram Messenger">
<meta property="al:ios:url" content="tg://resolve?domain=tele2slack&amp;post=224">

<meta property="al:android:url" content="tg://resolve?domain=tele2slack&amp;post=224">
<meta property="al:android:app_name" content="Telegram">
<meta property="al:android:package" content="org.telegram.messenger">

<meta name="twitter:card" content="summary">
<meta name="twitter:site" content="@Telegram">
<meta name="twitter:description" content="??#TSN #отчетности #сша
Tyson Foods Q4 Earnings: 
-Q4 Adj EPS &#036;1.21 (est &#036;1.25)
-Q4 Revenue &#036;10.88 Bln (esat &#036;11.0 Bln)
">
<meta name="twitter:app:name:iphone" content="Telegram Messenger">
<meta name="twitter:app:id:iphone" content="686449807">
<meta name="twitter:app:url:iphone" content="tg://resolve?domain=tele2slack&amp;post=224">
<meta name="twitter:app:name:ipad" content="Telegram Messenger">
<meta name="twitter:app:id:ipad" content="686449807">
<meta name="twitter:app:url:ipad" content="tg://resolve?domain=tele2slack&amp;post=224">
<meta name="twitter:app:name:googleplay" content="Telegram">
<meta name="twitter:app:id:googleplay" content="org.telegram.messenger">
<meta name="twitter:app:url:googleplay" content="https://t.me/tele2slack/224">

<meta name="apple-itunes-app" content="app-id=686449807, app-argument: tg://resolve?domain=tele2slack&post=224">
    <link rel="shortcut icon" href="//telegram.org/favicon.ico?3" type="image/x-icon" />
    <link href="https://fonts.googleapis.com/css?family=Roboto:400,700" rel="stylesheet" type="text/css">
    <!--link href="/css/myriad.css" rel="stylesheet"-->
    <link href="//telegram.org/css/bootstrap.min.css?3" rel="stylesheet">
    <link href="//telegram.org/css/telegram.css?177" rel="stylesheet" media="screen">
  </head>
  <body>

    <div class="tgme_page_wrap">
      <div class="tgme_head_wrap">
        <div class="tgme_head">
          <a href="//telegram.org/" class="tgme_head_brand">
            <i class="tgme_logo"></i>
          </a>
        </div>
      </div>
      <a class="tgme_head_dl_button" href="//telegram.org/dl?tme=6dae9c11480edfa67e_2093069837989679044">
        Don't have <strong>Telegram</strong> yet? Try it now!<i class="tgme_icon_arrow"></i>
      </a>
      <div class="tgme_page tgme_page_post">
        <div class="tgme_page_widget" id="widget">
  <script async src="https://telegram.org/js/telegram-widget.js?7" data-telegram-post="tele2slack/224" data-width="100%"></script>
</div>
<div class="tgme_page_widget_actions" id="widget_actions">
  <div class="tgme_page_widget_actions_cont">
    <div class="tgme_page_widget_action_right">
      <div class="tgme_page_context_btn"><a class="tgme_action_button_new" href="/s/tele2slack/224"><span class="tgme_action_button_label">Context</span></a></div>
    </div>
    <div class="tgme_page_widget_action_left">
      <div class="tgme_page_embed_btn">
        <a class="tgme_action_button_new" onclick="return toggleEmbed();"><span class="tgme_action_button_label">Embed</span></a>
      </div>
    </div>
    <div class="tgme_page_widget_action">
      <a class="tgme_action_button_new" href="tg://resolve?domain=tele2slack&post=224">View In Channel</a>
    </div>
    <div class="tgme_page_embed_action">
      <textarea class="tgme_page_embed_code" rows="3" id="embed_code_field" readonly>&lt;script async src=&quot;https://telegram.org/js/telegram-widget.js?7&quot; data-telegram-post=&quot;tele2slack/224&quot; data-width=&quot;100%&quot;&gt;&lt;/script&gt;</textarea>
      <div class="tgme_page_copy_action">
        <a class="tgme_action_button_new" onclick="return copyEmbedCode();">Copy</a>
      </div>
    </div>
  </div>
</div>
      </div>
    </div>

    <div id="tgme_frame_cont"></div>

    <script type="text/javascript">

var protoUrl = "tg:\/\/resolve?domain=tele2slack&post=224";
if (false) {
  var iframeContEl = document.getElementById('tgme_frame_cont') || document.body;
  var iframeEl = document.createElement('iframe');
  iframeContEl.appendChild(iframeEl);
  var pageHidden = false;
  window.addEventListener('pagehide', function () {
    pageHidden = true;
  }, false);
  window.addEventListener('blur', function () {
    pageHidden = true;
  }, false);
  if (iframeEl !== null) {
    iframeEl.src = protoUrl;
  }
  !false && setTimeout(function() {
    if (!pageHidden) {
      window.location = protoUrl;
    }
  }, 2000);
}
else if (protoUrl) {
  setTimeout(function() {
    window.location = protoUrl;
  }, 100);
}


    </script>
    <script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');

ga('create', 'UA-45099287-3', 'auto', {'sampleRate': 5});
ga('set', 'anonymizeIp', true);
ga('send', 'pageview');function toggleEmbed() {
  var widget_actions = document.getElementById('widget_actions');
  if (widget_actions.classList.contains('embed_opened')) {
    widget_actions.classList.remove('embed_opened');
  } else {
    widget_actions.classList.add('embed_opened');
    if (!document.body.classList.contains('fixed_actions')) {
      window.scrollTo(0, document.body.offsetHeight);
    }
    selectEmbedCode();
  }
  checkActionsPosition();
  return false;
}
function selectEmbedCode() {
  var field = document.getElementById('embed_code_field');
  field.focus();
  field.setSelectionRange(0, field.value.length);
}
function copyEmbedCode() {
  selectEmbedCode();
  document.execCommand('copy');
  return false;
}
function checkActionsPosition() {
  var widget = document.getElementById('widget');
  var widget_actions = document.getElementById('widget_actions');
  var widget_rect = widget.getBoundingClientRect();
  var actions_bottom = widget_rect.bottom + widget_actions.offsetHeight - 1;
  var client_bottom = window.innerHeight || html.clientHeight;
  if (actions_bottom > client_bottom) {
    widget.style.marginBottom = widget_actions.offsetHeight + 'px';
    document.body.classList.add('fixed_actions');
  } else {
    widget.style.marginBottom = '';
    document.body.classList.remove('fixed_actions');
  }
}
function postMessageHandler(event) {
  try { var data = JSON.parse(event.data); }
  catch(e) { var data = {}; }
  if (data.event == 'resize') {
    setTimeout(checkActionsPosition, 50);
  }
}
window.addEventListener('resize', checkActionsPosition);
window.addEventListener('scroll', checkActionsPosition);
window.addEventListener('message', postMessageHandler);
</script>
  </body>
</html>
<!-- page generated in 10.93ms -->
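
Reading the response above: the static page carries only the generic Telegram logo in og:image, and the post itself is pulled in by the widget script through its data-telegram-post attribute, so the photo URL is not in this HTML at all. Below is a minimal sketch, using only the standard library, that pulls both values out of such a response. The names PostPageParser and inspect_post_page are illustrative and not from the original post.

```python
from html.parser import HTMLParser


class PostPageParser(HTMLParser):
    """Collects the og:image value and the widget's data-telegram-post value."""

    def __init__(self):
        super().__init__()
        self.og_image = None
        self.widget_post = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:image":
            self.og_image = attrs.get("content")
        elif tag == "script" and "data-telegram-post" in attrs:
            self.widget_post = attrs["data-telegram-post"]


def inspect_post_page(html):
    """Return (og_image, widget_post); either item is None when absent."""
    parser = PostPageParser()
    parser.feed(html)
    return parser.og_image, parser.widget_post


# Two lines copied from the response shown above.
sample = '''
<meta property="og:image" content="https://telegram.org/img/t_logo.png">
<script async src="https://telegram.org/js/telegram-widget.js?7" data-telegram-post="tele2slack/224" data-width="100%"></script>
'''

og_image, widget_post = inspect_post_page(sample)
print(og_image)     # https://telegram.org/img/t_logo.png
print(widget_post)  # tele2slack/224
```

If og_image is just the Telegram logo, the real image has to come from whatever the widget loads, which is why a rendered page (for example in a browser) shows a URL that the plain GET does not.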

1 Answer:

Answer 0 (score: 0)

I usually use Selenium for web scraping and other automation.

Take a look at this solution; it may help:

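from urllib.request import urlretrieve

from selenium import webdriver
from selenium.webdriver.common.by import By

# start a Chrome session (needs Chrome and a matching driver installed)
driver = webdriver.Chrome()
driver.get('https://www.google.com/')

# get the image from the Google home page
# (the XPath is only an example and must match the current page markup)
img = driver.find_element(By.XPATH, '//*[@id="hplogo"]/img')
src = img.get_attribute('src')

# download the image
urlretrieve(src, "google_logo.png")

# end the browser session
driver.quit()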

To get the XPath, right-click the page > Inspect element > right-click the HTML element > click Copy XPath.
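
For a Telegram post specifically, the photo may not be an img element with a src attribute. Here is a hedged sketch that works without a browser, under two assumptions that are not confirmed by the original post and may change at any time: that requesting the post URL with ?embed=1 returns the widget HTML, and that the photo is a CSS background-image on an element with the class tgme_widget_message_photo_wrap. The function names are illustrative.

```python
import re
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Matches background-image:url('...'), url("...") or url(...) in an inline style.
_CSS_URL = re.compile(r"background-image\s*:\s*url\(\s*['\"]?([^'\")]+)['\"]?\s*\)")


class BackgroundImageParser(HTMLParser):
    """Collects background-image URLs from elements carrying a given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self.class_name in classes:
            match = _CSS_URL.search(attrs.get("style") or "")
            if match:
                self.urls.append(match.group(1))


def extract_photo_urls(html, class_name="tgme_widget_message_photo_wrap"):
    # ASSUMPTION: the class name reflects Telegram's widget markup; verify it
    # in the browser inspector before relying on it.
    parser = BackgroundImageParser(class_name)
    parser.feed(html)
    return parser.urls


def fetch_post_photo_urls(post):
    """post is 'channel/message_id', e.g. 'tele2slack/223'."""
    # ASSUMPTION: the ?embed=1 variant of the post URL serves the widget HTML.
    url = "https://t.me/{}?embed=1".format(post)
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_photo_urls(html)


# Offline check with a hand-written fragment (not real Telegram output).
sample_widget = (
    '<a class="tgme_widget_message_photo_wrap" href="https://t.me/tele2slack/223" '
    'style="width:800px;background-image:url(\'https://cdn4.telesco.pe/file/example.jpg\')"></a>'
)
print(extract_photo_urls(sample_widget))
```

Calling fetch_post_photo_urls("tele2slack/223") would perform the real request. If it returns an empty list, the markup probably differs from the assumptions above, and the Selenium route with an XPath copied from the inspector is the fallback.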