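When I go to a certain web page, I see a lot of data that I want to scrape. When I "View Source", the data isn't there, so I know it is loaded dynamically.
To find out what Chrome actually sends to fetch that data, I press F12 to open the developer tools. Then I go to the "Network" tab and its "XHR" sub-tab, where I see a list of requests. I click the one whose "Preview" tab shows the data I need. Then I right-click that request and choose "Copy" -> "Copy request headers".
This is what I get when I paste it into a text file: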
Resources:
  ECSTrigger:
    Type: AWS::Events::Rule
    Properties:
      ...
      Targets: # target of trigger: ECS
        - Arn:
            Fn::Sub: 'arn:aws:ecs:${AWS::Region}:${AWS::AccountId}:cluster/${ClusterName}'
          Id: 'EcsTriggerTarget'
          InputTransformer:
            InputPathsMap:
              s3_bucket: "$.detail.requestParameters.bucketName"
              s3_key: "$.detail.requestParameters.key"
            InputTemplate: '{"containerOverrides": [{"environment": [{"name": "S3_BUCKET", "value": <s3_bucket>}, {"name": "S3_KEY", "value": <s3_key>}]}]}'
          EcsParameters:
            LaunchType: FARGATE
            PlatformVersion: LATEST
            TaskCount: 1
            TaskDefinitionArn:
              Ref: Task
            NetworkConfiguration:
              AwsVpcConfiguration:
                AssignPublicIp: DISABLED
                SecurityGroups: ...
                Subnets: ...
          RoleArn:
            Fn::GetAtt: EcsTriggerRole.Arn
  EcsTriggerRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Action: 'sts:AssumeRole'
            Principal:
              Service: 'events.amazonaws.com'
      ManagedPolicyArns:
        - Fn::Sub: 'arn:${AWS::Partition}:iam::aws:policy/service-role/AmazonEC2ContainerServiceEventsRole'
Next, I followed the tutorial here: https://www.baeldung.com/java-http-request
My Java code is as follows:
GET /index.cfm?zaction=AUCTION&Zmethod=UPDATE&FNC=LOAD&AREA=W&PageDir=0&doR=1&tx=1563305131817&bypassPage=1&test=1&_=1563305131817 HTTP/1.1
Host: charlotte.realforeclose.com
Connection: keep-alive
Accept: application/json, text/javascript, */*; q=0.01
X-Requested-With: XMLHttpRequest
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36
Origin: http://evil.com/
Referer: https://charlotte.realforeclose.com/index.cfm?zaction=AUCTION&Zmethod=PREVIEW&AUCTIONDATE=07/23/2019
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Cookie: cfid=6f228aa1-bb7e-4734-92ff-39eabf23ed9b; cftoken=0; AWSELB=E7779D5F1C1F6ABE3513A5C5B6B0C754520B66675A407900314ABAC5333A52E93FD1A8D7401D89BC8D5E8B98059C8AAC5507D12A2C6ED07F7E7CB77311BD7FB09B738DB945; _ga=GA1.2.1823487290.1563231012; _gid=GA1.2.1418453663.1563231012; _gcl_au=1.1.273755450.1563231013; __utmc=65865852; __utmz=65865852.1563231014.1.1.utmcsr=realauction.com|utmccn=(referral)|utmcmd=referral|utmcct=/client-sites; CF_CLIENT_CHARLOTTE_REALFORECLOSE_TC=1563285507505; __utma=65865852.1823487290.1563231012.1563300430.1563305081.3; __utmt_UA-51657054-1=1; _gat=1; testcookiesenabled=enabled; CF_CLIENT_CHARLOTTE_REALFORECLOSE_LV=1563305130572; CF_CLIENT_CHARLOTTE_REALFORECLOSE_HC=1454; __utmb=65865852.4.10.1563305081
But my output comes back as garbled, apparently encoded text. Am I doing this right? If so, how do I decode it?
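One possible explanation, offered as a sketch rather than a confirmed diagnosis: the copied headers include `Accept-Encoding: gzip, deflate, br`. If the Java code sends that header as-is, the server may reply with a compressed body, and printing those bytes directly would look like gibberish. The example below assumes that is the cause. The class name `ResponseDecoder` and its helper methods are hypothetical names chosen for illustration; only the `java.util.zip` and `java.io` classes are real JDK APIs. It picks a decompressing stream based on the `Content-Encoding` response header and demonstrates the round trip on an in-memory gzip payload instead of the real site.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper: not part of any library or of the tutorial's code.
final class ResponseDecoder {

    // Wraps the raw body stream according to the Content-Encoding response header.
    static InputStream decode(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding == null) {
            return raw; // no encoding declared, body is already plain
        }
        switch (contentEncoding.trim().toLowerCase(Locale.ROOT)) {
            case "gzip":
                return new GZIPInputStream(raw);
            case "deflate":
                // Assumes the zlib-wrapped form; some servers send raw deflate instead.
                return new InflaterInputStream(raw);
            default:
                // e.g. "br": the JDK ships no Brotli decoder, so the bytes pass through unchanged.
                return raw;
        }
    }

    // Reads a stream fully and interprets the bytes as UTF-8 (an assumption about the charset).
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Test helper: gzip-compresses a string so the demo needs no network access.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("{\"ok\":true}");
        String body = readAll(decode(new ByteArrayInputStream(compressed), "gzip"));
        System.out.println(body); // prints {"ok":true}
    }
}
```

With an `HttpURLConnection` named `con`, the same idea would be `ResponseDecoder.decode(con.getInputStream(), con.getContentEncoding())`. Two simpler alternatives, under the same assumption: leave the `Accept-Encoding` header out of the Java request entirely, or send only `Accept-Encoding: gzip` so the server cannot choose Brotli, which the JDK cannot decode on its own.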