How can I use Java to simulate a browser's request for dynamic data?

Time: 2019-07-16 18:06:00

Tags: java cookies httpurlconnection

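When I go to a certain web page, I see a lot of data that I want to scrape. When I choose "View Source", the data is not there, so I know it is loaded dynamically.

To find out what Chrome actually sends to fetch the data, I press F12 to open the developer console. I go to the "Network" tab and then the "XHR" sub-tab, where I see a list of requests. I click the one whose "Preview" tab shows the data I want. Then I right-click that request and choose "Copy" -> "Copy request headers".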

I pasted the copied headers into a text file. Next, I followed the tutorial at https://www.baeldung.com/java-http-request to send the same request from Java.

Here are the request headers as they appear in the text file:

GET /index.cfm?zaction=AUCTION&Zmethod=UPDATE&FNC=LOAD&AREA=W&PageDir=0&doR=1&tx=1563305131817&bypassPage=1&test=1&_=1563305131817 HTTP/1.1
Host: charlotte.realforeclose.com
Connection: keep-alive
Accept: application/json, text/javascript, */*; q=0.01
X-Requested-With: XMLHttpRequest
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36
Origin: http://evil.com/
Referer: https://charlotte.realforeclose.com/index.cfm?zaction=AUCTION&Zmethod=PREVIEW&AUCTIONDATE=07/23/2019
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Cookie: cfid=6f228aa1-bb7e-4734-92ff-39eabf23ed9b; cftoken=0; AWSELB=E7779D5F1C1F6ABE3513A5C5B6B0C754520B66675A407900314ABAC5333A52E93FD1A8D7401D89BC8D5E8B98059C8AAC5507D12A2C6ED07F7E7CB77311BD7FB09B738DB945; _ga=GA1.2.1823487290.1563231012; _gid=GA1.2.1418453663.1563231012; _gcl_au=1.1.273755450.1563231013; __utmc=65865852; __utmz=65865852.1563231014.1.1.utmcsr=realauction.com|utmccn=(referral)|utmcmd=referral|utmcct=/client-sites; CF_CLIENT_CHARLOTTE_REALFORECLOSE_TC=1563285507505; __utma=65865852.1823487290.1563231012.1563300430.1563305081.3; __utmt_UA-51657054-1=1; _gat=1; testcookiesenabled=enabled; CF_CLIENT_CHARLOTTE_REALFORECLOSE_LV=1563305130572; CF_CLIENT_CHARLOTTE_REALFORECLOSE_HC=1454; __utmb=65865852.4.10.1563305081

But the output I get back is garbled, as if it were encoded. Am I doing this the right way? If so, how do I decode it?

I attached a screenshot of the output because it would not paste as text.
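
One possible cause, offered as a guess rather than a confirmed diagnosis: the copied headers include "Accept-Encoding: gzip, deflate, br", which allows the server to send a compressed body. As far as I know, HttpURLConnection on a standard JDK does not decompress the body for you, so the raw bytes would print as garbage. The sketch below assumes the response is gzip-compressed and that the server reports this in the Content-Encoding response header. The class name ResponseDecoder and its helper methods are made up for illustration, and the demo compresses a small string locally instead of contacting the site.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the original question.
public class ResponseDecoder {

    // Wraps the raw body stream according to the Content-Encoding value.
    // Only "gzip" and uncompressed bodies are handled here; the JDK has no
    // built-in Brotli ("br") decoder, so anything else is rejected.
    static InputStream unwrap(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding == null) {
            return raw; // header absent: body is not compressed
        }
        String encoding = contentEncoding.trim().toLowerCase(Locale.ROOT);
        if (encoding.isEmpty() || encoding.equals("identity")) {
            return raw;
        }
        if (encoding.equals("gzip")) {
            return new GZIPInputStream(raw);
        }
        throw new IOException("Unsupported Content-Encoding: " + contentEncoding);
    }

    // Reads the whole stream and decodes it as UTF-8 text (assumed charset).
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Test helper: gzip-compresses a string so the demo needs no network access.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("{\"ok\":true}");
        String decoded = readAll(unwrap(new ByteArrayInputStream(compressed), "gzip"));
        System.out.println(decoded);
    }
}
```

With a real connection the call would look like unwrap(con.getInputStream(), con.getContentEncoding()), where con is the HttpURLConnection. Because "br" cannot be decoded without a third-party library, it may be simpler to send "Accept-Encoding: gzip" only, or to leave that header out so the server replies uncompressed.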

0 answers:

No answers yet.