curl vs. wget产生不同的重定向和结果

时间:2014-07-02 21:37:38

标签: http redirect curl wget

以下网址已在另一个问题中发布。

使用wget,您可以按预期获得csv文件,但curl最终会将您重定向到不同的位置。我想知道两个命令之间的区别是什么,或者如何在curl中获得相同的结果。

wget的

wget --output-document=test.csv --no-check-certificate 'https://docs.google.com/spreadsheet/ccc?key=0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc&output=csv'

卷曲

curl --location --insecure --output test.csv 'https://docs.google.com/spreadsheet/ccc?key=0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc&output=csv'

更新了标题信息

标题比较

wget 1

--2014-07-03 09:54:30--  https://docs.google.com/spreadsheet/ccc?key=0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc&output=csv
Resolving docs.google.com... 74.125.226.98, 74.125.226.100, 74.125.226.102, ...
Connecting to docs.google.com|74.125.226.98|:443... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 302 Moved Temporarily
  Content-Type: text/html; charset=UTF-8
  Cache-Control: no-cache, no-store, max-age=0, must-revalidate
  Pragma: no-cache
  Expires: Fri, 01 Jan 1990 00:00:00 GMT
  Date: Thu, 03 Jul 2014 13:54:30 GMT
  X-Robots-Tag: noindex, nofollow, nosnippet
  Location: https://www.google.com/url?q=https://docs.google.com/spreadsheet/ccc?key%3D0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc%26output%3Dcsv%26pref%3D2&sa=p
  Set-Cookie: NID=67=D4vu38cFuNFB-qdFSdaVBpLKJ94VcnpcVDfEpoyECGG-EesJlxBW4Rwb-AA-XAF7ztGOAIzx3u2YYqwRlt516cv3i6jSa9Pazf3uK-hyR5p5QoEYaZ-MqRpj9H_utLwU;Domain=.google.com;Path=/;Expires=Fri, 02-Jan-2015 13:54:30 GMT;HttpOnly
  P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
  X-Content-Type-Options: nosniff
  X-XSS-Protection: 1; mode=block
  Server: GSE
  Alternate-Protocol: 443:quic
  Transfer-Encoding: chunked
Location: https://www.google.com/url?q=https://docs.google.com/spreadsheet/ccc?key%3D0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc%26output%3Dcsv%26pref%3D2&sa=p [following]

卷曲1

HTTP/1.1 302 Moved Temporarily
Content-Type: text/html; charset=UTF-8
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Date: Thu, 03 Jul 2014 13:59:48 GMT
X-Robots-Tag: noindex, nofollow, nosnippet
Location: https://www.google.com/url?q=https://docs.google.com/spreadsheet/ccc?key%3D0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc%26output%3Dcsv%26pref%3D2&sa=p
Set-Cookie: NID=67=QTFWWFkySepW985crZ2dZk1JfQ8gGj_H59HwYp-SMcOvYl0J4JU3VfDGCqppxFcEPt-e48qr0yJOx2ImUKH65LlgvuLyF3Ec842bPFq-BFg9a7YWEP_5Uq8YJrJ58taL;Domain=.google.com;Path=/;Expires=Fri, 02-Jan-2015 13:59:48 GMT;HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Server: GSE
Transfer-Encoding: chunked

wget 2

--2014-07-03 09:54:30--  https://www.google.com/url?q=https://docs.google.com/spreadsheet/ccc?key%3D0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc%26output%3Dcsv%26pref%3D2&sa=p
Resolving www.google.com... 74.125.225.144, 74.125.225.145, 74.125.225.148, ...
Connecting to www.google.com|74.125.225.144|:443... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 302 Found
  X-Frame-Options: ALLOWALL
  Location: https://docs.google.com/spreadsheet/ccc?key=0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc&output=csv&pref=2
  Cache-Control: private
  Content-Type: text/html; charset=UTF-8
  Set-Cookie: PREF=ID=1f6208c8ba0c71f9:FF=0:TM=1404395670:LM=1404395670:S=HaS679Z5xbmJBKs7; expires=Sat, 02-Jul-2016 13:54:30 GMT; path=/; domain=.google.com
  Date: Thu, 03 Jul 2014 13:54:30 GMT
  Server: gws
  Content-Length: 311
  X-XSS-Protection: 1; mode=block
  Alternate-Protocol: 443:quic
Location: https://docs.google.com/spreadsheet/ccc?key=0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc&output=csv&pref=2 [following]

卷曲2

HTTP/1.1 302 Found
X-Frame-Options: ALLOWALL
Location: https://docs.google.com/spreadsheet/ccc?key=0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc&output=csv&pref=2
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=432f03534cff2fd2:FF=0:TM=1404395989:LM=1404395989:S=1NwOiUYJQYKfn6qF; expires=Sat, 02-Jul-2016 13:59:49 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=EjeYW1PP63Nxk5upQVhEVreT_prZXQYQy4WVKZCHkY3cXffcTWyvXIJkt4Tg07LUoHo3GSkEg6qDh5ff5ESGhksbjT50ytYRd0SyKp7quyorpbT4GMhnbORlkFfTNdRc; expires=Fri, 02-Jan-2015 13:59:49 GMT; path=/; domain=.google.com; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Date: Thu, 03 Jul 2014 13:59:49 GMT
Server: gws
Content-Length: 311
X-XSS-Protection: 1; mode=block
Alternate-Protocol: 443:quic

wget 3

--2014-07-03 09:54:31--  https://docs.google.com/spreadsheet/ccc?key=0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc&output=csv&pref=2
Connecting to docs.google.com|74.125.226.98|:443... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 302 Moved Temporarily
  Content-Type: text/html; charset=UTF-8
  Location: https://docs.google.com/spreadsheet/ccc?key=0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc&output=csv
  Date: Thu, 03 Jul 2014 13:54:31 GMT
  Expires: Thu, 03 Jul 2014 13:54:31 GMT
  Cache-Control: private, max-age=0
  X-Content-Type-Options: nosniff
  X-Frame-Options: SAMEORIGIN
  X-XSS-Protection: 1; mode=block
  Server: GSE
  Alternate-Protocol: 443:quic
  Transfer-Encoding: chunked
Location: https://docs.google.com/spreadsheet/ccc?key=0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc&output=csv [following]

卷曲3

HTTP/1.1 302 Moved Temporarily
Content-Type: text/html; charset=utf-8
Location: https://www.google.com/accounts/ServiceLogin?service=wise&passive=1209600&continue=https://docs.google.com/spreadsheet/ccc?key%3D0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc%26output%3Dcsv%26pref%3D2&followup=https://docs.google.com/spreadsheet/ccc?key%3D0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc%26output%3Dcsv%26pref%3D2&ltmpl=sheets
Content-Length: 2270
Set-Cookie: NID=67=NdTD41weGlHPUtsUMwF0a7ugZ5Hfof3Q8CFsy2gQcJuBaH8ugZIYppe2PWWhP5fEMtdToEi76-lQH_lAJUeLEkNo1nObesgzEnSSg3HEJeb-5vYrAs4fwR7bM7Ourxeh;Domain=.google.com;Path=/;Expires=Fri, 02-Jan-2015 13:59:49 GMT;HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Date: Thu, 03 Jul 2014 13:59:49 GMT
Expires: Thu, 03 Jul 2014 13:59:49 GMT
Cache-Control: private, max-age=0
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Server: GSE

wget 4(最终)

--2014-07-03 09:54:31--  https://docs.google.com/spreadsheet/ccc?key=0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc&output=csv
Reusing existing connection to docs.google.com:443.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Content-Type: text/csv; charset=utf-8
  Cache-Control: no-cache, no-store, max-age=0, must-revalidate
  Pragma: no-cache
  Expires: Fri, 01 Jan 1990 00:00:00 GMT
  Date: Thu, 03 Jul 2014 13:54:31 GMT
  X-Robots-Tag: noindex, nofollow, nosnippet
  Content-Disposition: attachment; filename="Download Test Spreadsheet.csv"
  X-Content-Type-Options: nosniff
  X-XSS-Protection: 1; mode=block
  Server: GSE
  Alternate-Protocol: 443:quic
  Transfer-Encoding: chunked

卷曲4

HTTP/1.1 302 Moved Temporarily
Content-Type: text/html; charset=UTF-8
Location: https://accounts.google.com/ServiceLogin?service=wise&passive=1209600&continue=https%3A%2F%2Fdocs.google.com%2Fspreadsheet%2Fccc%3Fkey%3D0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc%26output%3Dcsv%26pref%3D2&followup=https%3A%2F%2Fdocs.google.com%2Fspreadsheet%2Fccc%3Fkey%3D0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc%26output%3Dcsv%26pref%3D2&ltmpl=sheets
Content-Length: 556
Date: Thu, 03 Jul 2014 13:59:49 GMT
Expires: Thu, 03 Jul 2014 13:59:49 GMT
Cache-Control: private, max-age=0
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Server: GSE
Alternate-Protocol: 443:quic

卷曲5(最终)

HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Strict-Transport-Security: max-age=10893354; includeSubDomains
Set-Cookie: GAPS=1:v3eXsN1lqmN5ryz1eyf2iMBP2uoIGg:wiYHYyLrGeoRHUfk;Path=/;Expires=Sat, 02-Jul-2016 13:59:49 GMT;Secure;HttpOnly;Priority=HIGH
X-Frame-Options: DENY
Date: Thu, 03 Jul 2014 13:59:49 GMT
Expires: Thu, 03 Jul 2014 13:59:49 GMT
Cache-Control: private, max-age=0
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Content-Length: 0
Server: GSE
Alternate-Protocol: 443:quic

1 个答案:

答案 0 :(得分:1)

一个很好的调试技术是打开该链接,同时在chrome中打开开发人员工具栏并查看网络选项卡。可以右键单击该选项卡中的所有请求以显示用于下载该信息的cURL命令。

在您的情况下,问题似乎是wget正在为您处理cookie,而cURL则没有。这应该很容易解决:

curl 'https://docs.google.com/spreadsheet/ccc?key=0At2sqNEgxTf3dEt5SXBTemZZM1gzQy1vLVFNRnludHc&output=csv' --location --cookie tmp.cookie
# Foo,Bar,Baz
# 1,2,3
# 4,5,6