Why do I get an "Access Denied" error when fetching the source code of nike.com?

Time: 2019-04-27 16:35:49

Tags: c++ windows file web-scraping sfml

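I am writing a simple program: you type in a link and it fetches that link's source code. When I run it, it works perfectly with most links, but not with some, nike.com links in particular. The code is shown below: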

#include <windows.h>
#include <wininet.h>   // link against wininet.lib
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main()
{
  // Output file; each run appends the downloaded page to temp.txt.
  ofstream fs_obj("temp.txt", ios::out | ios::app);
  HINTERNET hInternet = InternetOpenA("InetURL/1.0", INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0 );

  // InternetConnectA takes a host name only; the path goes to HttpOpenRequestA.
  HINTERNET hConnection = InternetConnectA( hInternet, "www.nike.com", 80, NULL, NULL, INTERNET_SERVICE_HTTP, 0, 0 ); //enter host here

  HINTERNET hData = HttpOpenRequestA( hConnection, "GET", "/t/air-max-97-on-air-jasmine-lasode-shoe-jMqQbs", NULL, NULL, NULL, INTERNET_FLAG_KEEP_CONNECTION, 0 ); //enter path here

  char buf[ 2048 ] ;

  if( !HttpSendRequestA( hData, NULL, 0, NULL, 0 ) )
  {
    printf( "request failed, error %lu\n", (unsigned long)GetLastError() ) ;
  }
  string total;
  DWORD bytesRead = 0 ;
  DWORD totalBytesRead = 0 ;

  while( InternetReadFile( hData, buf, sizeof( buf ), &bytesRead ) && bytesRead != 0 )
  {
    total.append( buf, bytesRead ) ; // append exactly the bytes that were read
    printf( "%lu bytes read\n", (unsigned long)bytesRead ) ;

    totalBytesRead += bytesRead ;
  }

  // save the source code to the temp.txt file
  fs_obj<<total<<"\n--------------------end---------------------\n";
  fs_obj.close();
  printf( "\n\n END -- %lu TOTAL bytes read\n", (unsigned long)totalBytesRead ) ;

  cout<<endl<<total<<endl; // also print the source code to the console
  InternetCloseHandle( hData ) ;
  InternetCloseHandle( hConnection ) ;
  InternetCloseHandle( hInternet ) ;
  system("pause");
}
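
WinINet's InternetConnectA expects a server name rather than a full URL, and the object path is passed separately to HttpOpenRequestA, so a link typed by the user has to be split first. Below is a minimal, portable sketch of that split. The names `UrlParts` and `split_url` are hypothetical and not part of WinINet, and the parsing is deliberately simple (no port, user info, or validation).

```cpp
#include <string>

// Hypothetical helper type: the pieces WinINet wants as separate arguments.
struct UrlParts {
    std::string scheme; // "http" or "https"; empty if the URL had no scheme
    std::string host;   // e.g. "www.nike.com"
    std::string path;   // e.g. "/t/..."; "/" if the URL had no path
};

// Splits "scheme://host/path" into its parts.
// Assumption: the host is everything between "://" and the first '/'.
UrlParts split_url(const std::string& url) {
    UrlParts parts;
    std::string rest = url;

    const std::string::size_type scheme_end = url.find("://");
    if (scheme_end != std::string::npos) {
        parts.scheme = url.substr(0, scheme_end);
        rest = url.substr(scheme_end + 3);
    }

    const std::string::size_type path_start = rest.find('/');
    if (path_start == std::string::npos) {
        parts.host = rest;
        parts.path = "/";
    } else {
        parts.host = rest.substr(0, path_start);
        parts.path = rest.substr(path_start);
    }
    return parts;
}
```

With this, `parts.host` would be the argument for InternetConnectA and `parts.path` the object name for HttpOpenRequestA. For an `https` link the connection would presumably also need `INTERNET_DEFAULT_HTTPS_PORT` and the `INTERNET_FLAG_SECURE` request flag. WinINet also ships its own parser, InternetCrackUrlA, which may be the better choice in a Windows-only program.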

I expected it to produce a file containing the exact source code, but this is what I found in temp.txt:

<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>

You don't have permission to access "http&#58;&#47;&#47;www&#46;nike&#46;com&#47;t&#47;air&#45;vapormax&#45;2019&#45;mens&#45;shoe&#45;wr4C0z&#47;AR6631&#45;007" on this server.<P>
Reference&#32;&#35;18&#46;d6fd241&#46;1556374977&#46;d088ca9
</BODY>
</HTML>
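
The URL in this error page is written with decimal HTML character references (`&#58;` is `:`, `&#47;` is `/`, `&#46;` is `.`, `&#45;` is `-`), which makes it hard to see what the server was actually asked for. As a diagnostic aid, here is a small sketch that decodes such references. The function name `decode_numeric_entities` is hypothetical, and the sketch only handles decimal ASCII codes, leaving everything else untouched.

```cpp
#include <string>

// Hypothetical helper: replaces decimal references such as "&#58;" with the
// ASCII character they encode. Named entities, hex references, and codes
// above 127 are copied through unchanged.
std::string decode_numeric_entities(const std::string& text) {
    std::string out;
    std::string::size_type i = 0;
    while (i < text.size()) {
        if (text.compare(i, 2, "&#") == 0) {
            std::string::size_type j = i + 2;
            int code = 0;
            // Accumulate digits; stop early once the value leaves the ASCII range.
            while (j < text.size() && text[j] >= '0' && text[j] <= '9' && code <= 127) {
                code = code * 10 + (text[j] - '0');
                ++j;
            }
            // Accept only "&#<digits>;" with a code between 1 and 127.
            if (j > i + 2 && j < text.size() && text[j] == ';' && code > 0 && code <= 127) {
                out += static_cast<char>(code);
                i = j + 1;
                continue;
            }
        }
        out += text[i];
        ++i;
    }
    return out;
}
```

Applied to the quoted string above, the decoded URL reads `http://www.nike.com/t/air-vapormax-2019-mens-shoe-wr4C0z/AR6631-007`, so the rejected request was a plain `http` request, and it was for a different product page than the one hard-coded in the program.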

0 answers:

No answers