PHP cURL returns 404 from AWS, but works from other hosts and even from my own machine

Date: 2016-06-06 16:00:10

Tags: php ubuntu amazon-web-services curl ssh

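I've been banging my head against this for a while. I wrote PHP code to search the site https://paytm.com. It works from the other hosting providers I used before, but not from AWS.

My machine runs Ubuntu with PHP 7, and curl is installed and working. Searches with curl work against other websites, and curl over HTTPS works too. The same code runs perfectly on other hosts (over ssh and from the front end), and the URL even works in a browser.

The URL I'm trying to access is: curl -v 'https://search.paytm.com/search/?page_count=1&items_per_page=10&quality=high&curated=1&cat_tree=1&from=organic&channel=web&version=2&userQuery=iphone'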

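Things I have checked:

  1. The certificates are up to date.
  2. The code runs correctly on my machine and on other hosting providers.
  3. The AWS setup is Ubuntu 12.04, running PHP.
  4. Set the machine up again from scratch.
  5. Ran the script from a browser.
  6. Tested HTTPS curl requests to Google, Twitter, and LinkedIn from my AWS setup; they work fine.

The response from my AWS setup is: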

    
        ubuntu@ip-172-31-20-200:/usr/local/share$ curl -v https://search.paytm.com/search/?page_count=1&items_per_page=10&quality=high&curated=1&cat_tree=1&from=organic&channel=web&version=2&userQuery=iphone
        [1] 12595
        [2] 12596
        [3] 12597
        [4] 12598
        [5] 12599
        [6] 12600
        [7] 12601
        [8] 12602
        ubuntu@ip-172-31-20-200:/usr/local/share$ * Hostname was NOT found in DNS cache
        *   Trying 96.6.72.42...
        * Connected to search.paytm.com (96.6.72.42) port 443 (#0)
        * successfully set certificate verify locations:
        *   CAfile: none
          CApath: /etc/ssl/certs
        * SSLv3, TLS handshake, Client hello (1):
        * SSLv3, TLS handshake, Server hello (2):
        * SSLv3, TLS handshake, CERT (11):
        * SSLv3, TLS handshake, Server key exchange (12):
        * SSLv3, TLS handshake, Server finished (14):
        * SSLv3, TLS handshake, Client key exchange (16):
        * SSLv3, TLS change cipher, Client hello (1):
        * SSLv3, TLS handshake, Finished (20):
        * SSLv3, TLS change cipher, Client hello (1):
        * SSLv3, TLS handshake, Finished (20):
        * SSL connection using ECDHE-RSA-AES256-GCM-SHA384
        * Server certificate:
        *    subject: C=IN; ST=Uttar Pradesh; L=Noida; O=One 97 Communications Limited; CN=secure.paytm.in
        *    start date: 2015-10-29 00:00:00 GMT
        *    expire date: 2016-10-28 23:59:59 GMT
        *    subjectAltName: search.paytm.com matched
        *    issuer: C=US; O=GeoTrust Inc.; CN=GeoTrust SSL CA - G3
        *    SSL certificate verify ok.
        > GET /search/?page_count=1 HTTP/1.1
        > User-Agent: curl/7.35.0
        > Host: search.paytm.com
        > Accept: */*
        > 
        
        404 Not Found
        404 Not Found
        nginx
        * Connection #0 to host search.paytm.com left intact

The result from a different hosting provider (and from my own machine) is:

    
        [ps527167]$ curl -v https://search.paytm.com/search/?page_count=1&items_per_page=10&quality=high&curated=1&cat_tree=1&from=organic&channel=web&version=2&userQuery=iphone
        [1] 26241
        [2] 26242
        [3] 26243
        [4] 26244
        [5] 26245
        [6] 26246
        [7] 26247
        [8] 26248
        [2]   Done                    items_per_page=10
        [3]   Done                    quality=high
        [4]   Done                    curated=1
        [5]   Done                    cat_tree=1
        [6]   Done                    from=organic
        [7]-  Done                    channel=web
        [ps527167]$ * Hostname was NOT found in DNS cache
        *   Trying 23.218.97.132...
        * Connected to search.paytm.com (23.218.97.132) port 443 (#0)
        * successfully set certificate verify locations:
        *   CAfile: none
          CApath: /etc/ssl/certs
        * SSLv3, TLS handshake, Client hello (1):
        * SSLv3, TLS handshake, Server hello (2):
        * SSLv3, TLS handshake, CERT (11):
        * SSLv3, TLS handshake, Server key exchange (12):
        * SSLv3, TLS handshake, Server finished (14):
        * SSLv3, TLS handshake, Client key exchange (16):
        * SSLv3, TLS change cipher, Client hello (1):
        * SSLv3, TLS handshake, Finished (20):
        * SSLv3, TLS change cipher, Client hello (1):
        * SSLv3, TLS handshake, Finished (20):
        * SSL connection using ECDHE-RSA-AES256-GCM-SHA384
        * Server certificate:
        *    subject: C=IN; ST=Uttar Pradesh; L=Noida; O=One 97 Communications Limited; CN=secure.paytm.in
        *    start date: 2015-10-29 00:00:00 GMT
        *    expire date: 2016-10-28 23:59:59 GMT
        *    subjectAltName: search.paytm.com matched
        *    issuer: C=US; O=GeoTrust Inc.; CN=GeoTrust SSL CA - G3
        *    SSL certificate verify ok.
        > GET /search/?page_count=1 HTTP/1.1
        > User-Agent: curl/7.35.0
        > Host: search.paytm.com
        > Accept: */*
    
        < HTTP/1.1 200 OK
        < Content-Type: application/json; charset=utf-8
        * Server openresty is not blacklisted
        < Server: openresty
        < Strict-Transport-Security: max-age=31536000
        < Strict-Transport-Security: max-age=31536000
        < X-Frame-Options: SAMEORIGIN
        < X-PAYTM-SRV-ID: pawslmktsearchapp04
        < Date: Mon, 06 Jun 2016 15:22:18 GMT
        < Content-Length: 907
        < Connection: keep-alive
        * Connection #0 to host search.paytm.com left intact
    
        {"result_type":"grid","default_sorting_param":"sort_relevance=1","meta":{"version":"1.0.0","query":null,"category":[],"mappedQuery":null},"sorting_keys":[{"name":"Relevance","urlParams":"sort_relevance","default":"sort_relevance=1"},{"name":"New","urlParams":"sort_new","default":"sort_new=1"},{"name":"Price","urlParams":"sort_price","default":"sort_price=0"}],"frontend_filters":[],"filters":[{"title":"In-Stock","values":[{"id":"0","name":"0"},{"id":"1","name":"In-Stock"}],"filter_param":"availability","type":"boolean"}],"search_suggestion":null,"has_more":false,"total_count":0,"grid_layout":[],"featured_products":[],"related_searches":[],"search_user_id":"eyJhbGciOiJIUzI1NiJ9.NjZmN2M3OWMtODllZC00OTRjLWI2MDYtNjNiODhlNWE4MTZi.7RvVFv4nDejWf4joXgy_TMdVaaQRTC2F-QPP6FvU_gQ","search_id":"eyJhbGciOiJIUzI1NiJ9.N2JhYzJkNjYtZWE4Ny00OTVmLWE2NzktMTlmM2QwNzY2OWI2.o9dbkUxVEWk8j8PdZnafMzxuAIzubTY9TJFiwyhEy8Q"}
        [1]-  Done                    curl -v https://search.paytm.com/search/?page_count=1
        [8]+  Done                    version=2
    
    

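I'm really at a loss as to what to do. I searched Stack Overflow thoroughly and found a few suggested fixes, such as setting cookies or a user agent, but neither worked. As you can see, the other host sends the same User-Agent string and still gets the correct response, and it even runs the same version of Ubuntu. Any help from the community would be great.

To wrap up: my skill level with Ubuntu, the terminal, ssh and so on is only beginner, so you may have to drop the jargon and explain things in basic terms... sorry.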

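4 answers:

Answer 0: (score: 1)

search.paytm.com is hosted on the Akamai CDN, so different clients are routed to different servers. Notice that your first connection went to 96.6.72.42 and the second went to 23.218.97.132. When I look up the hostname I get different IPs, alternating between 104.97.19.53 and 23.203.115.39; I'm in Boston on Comcast, and traceroute shows these are in Comcast data centers in Boston and New York City.

For some reason, the page you want is only available on the second server. That is a problem with their configuration, and you would need to contact them about it.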
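
One way to test this theory from a single machine is curl's `--resolve HOST:PORT:ADDRESS` option, which skips DNS and connects to a chosen address while still sending the normal Host header and SNI. The sketch below only prints the commands so they can be reviewed before running; `pinned_curl_cmd` is a hypothetical helper name, and the two IPs are the ones from the question's transcripts, which may no longer serve this hostname.

```shell
#!/bin/sh
host='search.paytm.com'
url='https://search.paytm.com/search/?page_count=1&items_per_page=10&quality=high&curated=1&cat_tree=1&from=organic&channel=web&version=2&userQuery=iphone'

# pinned_curl_cmd IP: print a curl command that bypasses DNS for $host
# and connects to the given IP instead.
# Hypothetical helper for illustration; it prints the command, it does not run it.
pinned_curl_cmd() {
  printf "curl -v --resolve %s:443:%s '%s'\n" "$host" "$1" "$url"
}

# IPs taken from the two transcripts in the question (assumed, may be stale).
for ip in 96.6.72.42 23.218.97.132; do
  pinned_curl_cmd "$ip"
done
```

If both printed commands are run from the AWS instance and both return 404, the edge server is probably not what makes the difference, and the client's source address becomes the more likely factor.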

Answer 1: (score: 1)

Your problem is that you are not escaping your URL. The & character has a special meaning to the shell.

Try the following:

curl -v 'https://search.paytm.com/search/?page_count=1&items_per_page=10&quality=high&curated=1&cat_tree=1&from=organic&channel=web&version=2&userQuery=iphone'

The clue is this output:

[1] 26241
[2] 26242
[3] 26243
[4] 26244
[5] 26245
[6] 26246
[7] 26247
[8] 26248
[2]   Done                    items_per_page=10
[3]   Done                    quality=high
[4]   Done                    curated=1
[5]   Done                    cat_tree=1
[6]   Done                    from=organic
[7]-  Done                    channel=web

These are shell background jobs. The shell tried to run the following commands and put each of them in the background:

$ curl -v https://search.paytm.com/search/?page_count=1 &
$ items_per_page=10 &
$ quality=high &
[... and so on ...]
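
The effect can be reproduced without touching the network. In this minimal sketch, `printf` stands in for curl and simply echoes the argument it receives; the shortened URL and the `quoted`/`unquoted` variable names are illustrative assumptions. The unquoted form runs in a child `sh` so its background jobs stay contained.

```shell
#!/bin/sh
# Quoted: the whole URL, including every &, reaches the command as one argument.
quoted=$(printf 'received: %s\n' 'https://search.paytm.com/search/?page_count=1&items_per_page=10&userQuery=iphone')
printf '%s\n' "$quoted"

# Unquoted: the child shell ends the command at the first &, runs it in the
# background, and treats the remaining pieces as separate variable assignments.
unquoted=$(sh -c 'printf "received: %s\n" https://search.paytm.com/search/?page_count=1&items_per_page=10&userQuery=iphone; wait')
printf '%s\n' "$unquoted"
```

The second line printed stops at `page_count=1`, which matches the `GET /search/?page_count=1` request line in both verbose transcripts in the question.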

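Answer 2: (score: 0)

The fact that the two systems resolve the same hostname to two completely different IP ranges suggests that the target IP changed recently, but the failing system is still getting the old IP in its DNS response.

I suggest using Google's free DNS servers: https://developers.google.com/speed/public-dns/docs/using#linux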
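
Before changing the resolver configuration, it may help to compare what the current resolver returns with what Google Public DNS (8.8.8.8) returns. A minimal sketch, assuming `dig` is installed (on Ubuntu it is provided by the `dnsutils` package); `compare_resolvers` is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# compare_resolvers HOST: print the A records for HOST as seen by the
# system's default resolver and by Google Public DNS (8.8.8.8).
# Hypothetical helper for illustration.
compare_resolvers() {
  echo "default resolver:"
  dig +short "$1" A
  echo "Google Public DNS (8.8.8.8):"
  dig +short "$1" A @8.8.8.8
}
```

Running `compare_resolvers search.paytm.com` on the AWS instance shows both sets of answers side by side. Names served by a CDN can legitimately resolve to different addresses depending on location, so differing answers alone may not prove that a stale record is the cause.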

Answer 3: (score: 0)

It turned out that the entire AWS IP range had been blacklisted, which is why the error showed up.