Curl works, Jsoup returns HTTP error 500

Time: 2016-02-02 22:25:41

Tags: curl jsoup

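I'm trying to do web scraping in Java, with plans to eventually move this code to Android, so for now I'm trying Jsoup. Using Chrome's DevTools, I extracted the request headers and the curl command for the request that returns data from the page. I can run the following command in curl and it works: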

curl 'mySite/campaign/List' -H 'Cookie: __RequestVerificationToken_L0N5YXJhV2ViUG9ydGFs0=IECNY-SOnB09IY9MQMm3xL1bSbASe8Eha9J1fWupurHtmlldojgqpaljhzIuhfFh6zRnOygjsrKyuhj2krWiSSNXif76gRNH_39lGvyMJ0I1; ASP.NET_SessionId=gojtobwzycl0lvs0ip4glf3n; myCompany.WEB.PORTAL.AUTH=40C13BAF08884380F805B99E217754F3D35920CE1861DEBB580DC143DA4249C4682C33A36DD29272A3A844880110E4D0EC1F24298E4D1B2A4A94E3FA2CAC08B934989ACF155616D6CB5665338FF3CFF82EAD87BF93EB46FA3BA6AAE6B00401F9' -H 'Origin: mySite' -H 'Accept-Encoding: gzip, deflate' -H 'Accept-Language: en-US,en;q=0.8' -H 'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36' -H 'Content-Type: application/json;charset=UTF-8' -H 'Accept: */*' -H 'Referer: mySite/campaign' -H 'X-Requested-With: XMLHttpRequest' -H 'Connection: keep-alive' -H '__RequestVerificationToken: G2RD7FtHMG12j00zNuLtiSZSWquXAOvh1hUNxObxMCFIZclrQueAo4d3cZonI1MZ7hxELl56yi5hci5vpC78m4Sh8PivHwRcKImcCibi9xk1' --data-binary '{"PageNumber":2,"SortColumn":"ScheduledRunDate","SortAscending":false,"PageSize":20,"CollectionSize":308,"SelectedAccountId":"1","SearchTerm":"","ShowInactive":true}' --compressed

I also copied the request headers from Chrome DevTools:

POST mySite/campaign/List HTTP/1.1
Host: mySite
Connection: keep-alive
Content-Length: 165
Origin: mySite
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36
Content-Type: application/json;charset=UTF-8
Accept: */*
X-Requested-With: XMLHttpRequest
__RequestVerificationToken: G2RD7FtHMG12j00zNuLtiSZSWquXAOvh1hUNxObxMCFIZclrQueAo4d3cZonI1MZ7hxELl56yi5hci5vpC78m4Sh8PivHwRcKImcCibi9xk1
Referer: mySite/campaign
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.8
Cookie: __RequestVerificationToken_L0N5YXJhV2ViUG9ydGFs0=IECNY-SOnB09IY9MQMm3xL1bSbASe8Eha9J1fWupurHtmlldojgqpaljhzIuhfFh6zRnOygjsrKyuhj2krWiSSNXif76gRNH_39lGvyMJ0I1; ASP.NET_SessionId=gojtobwzycl0lvs0ip4glf3n; myCompany.WEB.PORTAL.AUTH=40C13BAF08884380F805B99E217754F3D35920CE1861DEBB580DC143DA4249C4682C33A36DD29272A3A844880110E4D0EC1F24298E4D1B2A4A94E3FA2CAC08B934989ACF155616D6CB5665338FF3CFF82EAD87BF93EB46FA3BA6AAE6B00401F9

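I then tried to convert this to Jsoup, with no luck. I tried sending only the headers, and also the headers together with PageNumber, ScheduledRunDate, etc. Both attempts throw org.jsoup.HttpStatusException: HTTP error fetching URL. Status=500. Here is the code I tried: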

Document pageDoc = Jsoup.connect("mySite/campaign/List")
                .cookies(loginCookies)
                //.header("Cookie",cookieList)
                .userAgent("Mozilla/5.0")
                .referrer("mySite/campaign")
                //.data("Username", username)
                //.data("Password", password)
                //.followRedirects(true)
                .header("Accept","*/*")
                .header("Accept-Encoding","gzip, deflate")
                .header("Accept-Language","en-US,en;q=0.8")
                .header("Connection","keep-alive")
                .header("Content-Type", "application/json;charset=UTF-8")
                .header("Host","mySite")
                .header("Origin", "mySite")
                .header("Referer","mySite/campaign")
                .header("User-Agent","Mozilla/5.0 (Windows NT 6.1: WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36")
                .header("X-Requested-With", "XMLHttpRequest")
                .header("__RequestVerificationToken", pageToken)
                .header("Content-Length", "165") //not sure if needed. If it is, no idea how to get
                .data("PageNumber","2")
                .data("SortColumn", "ScheduledRunDate")
                .data("SortAscending", "false")
                .data("PageSize", "20")
                .data("CollectionSize", "308")
                .data("SelectedAccountId", "1")
                .data("SearchTerm", "")
                .data("ShowInactive", "true")               
                .ignoreContentType(true)
                .post();

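I can confirm that all my tokens are correct. When I comment out .header("X-Requested-With", "XMLHttpRequest"), I get the generic error page (which is expected), so I know I'm connecting; but when I leave it in, I get the 500. I can also confirm that all the "mySite" links are correct; I only had to strip them out for my company. I'm also not sure whether or how to add PageNumber, SortColumn, SortAscending, etc. in Jsoup, so I just blindly added them as data parameters as shown above.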

1 Answer:

Answer 0: (score: 0)

Try removing .header("Content-Length", "165") and .header("Content-Type", "application/json;charset=UTF-8"). Jsoup can add them for you.
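
On the "no idea how to get" comment next to Content-Length in the question: the value is simply the number of bytes in the encoded request body, which is why an HTTP client can normally compute it without help. The following is a minimal sketch using only the standard library; the class and method names (ContentLengthSketch, contentLength) are made up for illustration, and the JSON string is copied from the --data-binary argument of the curl command.

```java
import java.nio.charset.StandardCharsets;

public class ContentLengthSketch {
    // JSON body copied from the --data-binary argument of the curl command in the question.
    static final String JSON_BODY =
        "{\"PageNumber\":2,\"SortColumn\":\"ScheduledRunDate\",\"SortAscending\":false,"
        + "\"PageSize\":20,\"CollectionSize\":308,\"SelectedAccountId\":\"1\","
        + "\"SearchTerm\":\"\",\"ShowInactive\":true}";

    // Content-Length counts bytes of the encoded body, not Java chars,
    // so the body has to be encoded with the charset that will be sent (UTF-8 here).
    static int contentLength(String body) {
        return body.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(contentLength(JSON_BODY));
    }
}
```

For this body the result is 165, which matches the Content-Length header captured in DevTools. A hard-coded "165" only stays correct as long as the body that is actually sent is byte-for-byte that JSON string.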

Also try using FormElement. See this FormElement example.
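
One more thing that may be worth checking, although nothing in the question confirms it is the cause of the 500: Jsoup's .data(key, value) adds form data parameters, so on a POST they go out as form-style key=value pairs, whereas the working curl command sends a single JSON document. The sketch below uses only the standard library to show roughly what the form-style body would look like for the fields in the question; BodyFormatSketch, questionFields and formEncode are hypothetical names, and the encoding is an approximation of form encoding rather than Jsoup's own code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class BodyFormatSketch {
    // The same key/value pairs that the question passes to .data(...), in the same order.
    static Map<String, String> questionFields() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("PageNumber", "2");
        fields.put("SortColumn", "ScheduledRunDate");
        fields.put("SortAscending", "false");
        fields.put("PageSize", "20");
        fields.put("CollectionSize", "308");
        fields.put("SelectedAccountId", "1");
        fields.put("SearchTerm", "");
        fields.put("ShowInactive", "true");
        return fields;
    }

    // Joins the pairs as key=value&key=value, URL-encoding each key and value.
    static String formEncode(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : fields.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Compare this output with the JSON passed to --data-binary in the curl command.
        System.out.println(formEncode(questionFields()));
    }
}
```

If the endpoint only accepts the JSON form, the body would need to be sent as one raw string instead of as .data(...) pairs. As far as I know, Jsoup 1.9.1 and later provide Connection.requestBody(String) for that purpose; check it against the version you are using.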