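Is there a way to read a table from a web URL, or from the output of an external command run through a pipe? It looks like DataFrame.readtable only supports reading from a file.
For example, in R we can do this: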
df = read.table(url("http://example.com/data.txt"))
x = read.table(pipe("zcat data.txt | sed /^#/d | cut -f '11-13'"), colClasses=c("integer","integer","integer"), fill=TRUE, row.names=NULL)
Answer 0 (score: 7)
julia> using DataFrames, Requests
julia> resp = get("https://data.cityofnewyork.us/api/views/kku6-nxdu/rows.csv?accessType=DOWNLOAD")
Response(200 OK, 17 headers, 27350 bytes in body)
julia> tbl = readtable(IOBuffer(resp.data));
julia> names(tbl)
46-element Array{Symbol,1}:
:JURISDICTION_NAME
:COUNT_PARTICIPANTS
:COUNT_FEMALE
:PERCENT_FEMALE
:COUNT_MALE
:PERCENT_MALE
:COUNT_GENDER_UNKNOWN
:PERCENT_GENDER_UNKNOWN
:COUNT_GENDER_TOTAL
:PERCENT_GENDER_TOTAL
:COUNT_PACIFIC_ISLANDER
:PERCENT_PACIFIC_ISLANDER
:COUNT_HISPANIC_LATINO
:PERCENT_HISPANIC_LATINO
:COUNT_AMERICAN_INDIAN
:PERCENT_AMERICAN_INDIAN
:COUNT_ASIAN_NON_HISPANIC
⋮
:PERCENT_PERMANENT_RESIDENT_ALIEN
:COUNT_US_CITIZEN
:PERCENT_US_CITIZEN
:COUNT_OTHER_CITIZEN_STATUS
:PERCENT_OTHER_CITIZEN_STATUS
:COUNT_CITIZEN_STATUS_UNKNOWN
:PERCENT_CITIZEN_STATUS_UNKNOWN
:COUNT_CITIZEN_STATUS_TOTAL
:PERCENT_CITIZEN_STATUS_TOTAL
:COUNT_RECEIVES_PUBLIC_ASSISTANCE
:PERCENT_RECEIVES_PUBLIC_ASSISTANCE
:COUNT_NRECEIVES_PUBLIC_ASSISTANCE
:PERCENT_NRECEIVES_PUBLIC_ASSISTANCE
:COUNT_PUBLIC_ASSISTANCE_UNKNOWN
:PERCENT_PUBLIC_ASSISTANCE_UNKNOWN
:COUNT_PUBLIC_ASSISTANCE_TOTAL
:PERCENT_PUBLIC_ASSISTANCE_TOTAL
julia> eltypes(tbl)
46-element Array{Type,1}:
Int64
Int64
Int64
Float64
Int64
Float64
Int64
Int64
Int64
Int64
Int64
Float64
Int64
Float64
Int64
Float64
Int64
⋮
Float64
Int64
Float64
Int64
Float64
Int64
Int64
Int64
Int64
Int64
Float64
Int64
Float64
Int64
Int64
Int64
Int64
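Answer 1 (score: 0)
Since Requests has been deprecated in favor of HTTP, here is an example request that uses HTTP.request and wraps the body of the response res in an IOBuffer.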
julia> using CSV, HTTP
julia> res = HTTP.request("GET", "http://users.csc.calpoly.edu/~dekhtyar/365-Winter2015/data/CARS/cars-data.csv")
HTTP.Messages.Response:
"""
HTTP/1.1 200 OK
Date: Wed, 16 May 2018 12:46:39 GMT
Server: Apache/2.4.18 (Ubuntu)
Last-Modified: Mon, 05 Jan 2015 23:29:09 GMT
ETag: "330f-50bf00ea05b40"
Accept-Ranges: bytes
Content-Length: 13071
Content-Type: text/csv
Id,MPG,Cylinders,Edispl,Horsepower,Weight,Accelerate,Year
1,18,8,307,130,3504,12,1970
2,15,8,350,165,3693,11.5,1970
3,18,8,318,150,3436,11,1970
⋮
13071-byte body
"""
julia> res_buffer = IOBuffer(res.body)
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=13071, maxsize=Inf, ptr=1, mark=-1)
julia> using DataFrames, DataStreams
julia> df = CSV.read(res_buffer)
406×8 DataFrames.DataFrame
│ Row │ Id │ MPG │ Cylinders │ Edispl │ Horsepower │ Weight │ Accelerate │ Year │
├─────┼─────┼─────┼───────────┼────────┼────────────┼────────┼────────────┼──────┤
│ 1 │ 1 │ 18 │ 8 │ 307.0 │ 130 │ 3504 │ 12.0 │ 1970 │
│ 2 │ 2 │ 15 │ 8 │ 350.0 │ 165 │ 3693 │ 11.5 │ 1970 │
│ 3 │ 3 │ 18 │ 8 │ 318.0 │ 150 │ 3436 │ 11.0 │ 1970 │
⋮
│ 405 │ 405 │ 28 │ 4 │ 120.0 │ 79 │ 2625 │ 18.6 │ 1982 │
│ 406 │ 406 │ 31 │ 4 │ 119.0 │ 82 │ 2720 │ 19.4 │ 1982 │
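Neither answer covers the pipe half of the question, and the transcripts above reflect older package versions. The following is a minimal sketch, assuming a recent CSV.jl in which CSV.read takes the sink type (DataFrame) as its second argument. The helper names table_from_url and table_from_command are hypothetical and not part of any package; the zcat/sed/cut pipeline is only a rough analogue of the R pipe() call in the question.

```julia
using CSV, DataFrames, HTTP

# Hypothetical helper: download `url` and parse the response body as a table.
# HTTP.get throws on non-2xx status codes by default.
function table_from_url(url::AbstractString; kwargs...)
    res = HTTP.get(url)
    return CSV.read(IOBuffer(res.body), DataFrame; kwargs...)
end

# Hypothetical helper: run an external command (or pipeline) and parse
# whatever it writes to standard output.
function table_from_command(cmd::Base.AbstractCmd; kwargs...)
    bytes = read(cmd)   # runs the command and returns its stdout as bytes
    return CSV.read(IOBuffer(bytes), DataFrame; kwargs...)
end

# Building the pipeline does not run it; the commands execute only when
# the pipeline is passed to table_from_command.
cmd = pipeline(`zcat data.txt`, `sed '/^#/d'`, `cut -f 11-13`)
# df = table_from_command(cmd; delim='\t', header=false)
```

Keyword arguments such as delim and header are forwarded to CSV.read, so the same helpers should work for tab-separated or headerless input. Reading the whole output into memory first keeps the sketch simple; for very large outputs a streaming approach may be preferable.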