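CKAN provides the ckanapi package for accessing the CKAN API from Python or the command line.
I can use it to download metadata, create resources, and so on. But I cannot create a package and upload resources to it in a single API call. (Packages are also called datasets.)
Internally, ckanapi scans all keys, moving any file-like parameters into a separate dict, which it passes to requests.session.post() through the files=.. parameter.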
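The scanning step described above can be pictured with a small sketch. This is not ckanapi's actual code: split_file_params is a hypothetical helper, and treating "has a callable read()" as the test for file-like is an assumption made for illustration. It also shows that a file nested inside another value is not picked up when only top-level values are inspected.

```python
import io


def split_file_params(params):
    """Separate file-like values from ordinary values in a flat dict."""
    data, files = {}, {}
    for key, value in params.items():
        # Assumption for this sketch: anything with a callable read()
        # counts as file-like.
        if callable(getattr(value, "read", None)):
            files[key] = value
        else:
            data[key] = value
    return data, files


data, files = split_file_params({
    "name": "joe_data",
    "upload": io.BytesIO(b"spreadsheet bytes"),
    # Nested file-like object: only top-level values are inspected,
    # so this one stays inside data.
    "resources": [{"upload": io.BytesIO(b"nested bytes")}],
})
print(sorted(data))   # ['name', 'resources']
print(sorted(files))  # ['upload']
```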
This is the closest I have gotten, but CKAN returns an HTTP 500 error (copied from this guide to requests):
with ckanapi.RemoteCKAN('http://myckan.example.com', apikey='real-key', user_agent=ua, username='joe', password='pwd') as ckan:
    ckan.action.package_create(name='joe_data',
                               resources=('report.xls',
                                          open('/path/to/file.xlsx', 'rb'),
                                          'application/vnd.ms-excel',
                                          {'Expires': '0'}))
I also tried resources=open('path/file'), files=open('file'), and shorter or longer tuples, but got the same 500 error each time.
The requests documentation says:
:param files: (optional) Dictionary of ``'filename': file-like-objects``
for multipart encoding upload.
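I cannot pass resources={'filename': open('file')} through ckanapi, because ckanapi does not detect the file. It tries to pass it to requests as a normal parameter and fails ("BufferedReader is not JSON serializable", because it tries to make the file a POST parameter). I get the same error if I try to pass a list of files. But the API is able to create a package and add many resources in one call.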
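The serialization failure can be reproduced without CKAN at all. In this minimal sketch an in-memory io.BytesIO stands in for the opened file (an assumption so that no file on disk is needed); the exact wording of the message varies between Python versions and names the concrete file type.

```python
import io
import json

# A file-like object nested inside "resources", as in the failing attempts.
payload = {
    "name": "joe_data",
    "resources": [{"upload": io.BytesIO(b"file bytes")}],
}

try:
    json.dumps(payload)
    error_message = None
except TypeError as exc:
    # json has no encoding for file objects, so it raises TypeError.
    error_message = str(exc)

print(error_message)
```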
So how do I create a package and multiple resources with a single ckanapi call?
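Answer 0 (score: 0)
I was curious about this and thought I would run some tests. Unfortunately, I have not used the CLI you mentioned, but I hope this helps you and anyone else who stumbles on this.
I am not positive, but I suspect your resources argument is not formatted correctly. resources needs to be a list of dictionaries.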
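Applied to ckanapi, the list-of-dictionaries shape might look like the sketch below. It is untested against a live CKAN site and only covers resources that link to a URL, not file uploads. build_dataset and create_dataset are hypothetical helper names, the second resource URL is made up for illustration, and the site URL and API key are the placeholders from the question.

```python
def build_dataset(name, owner_org, resource_urls):
    """Return package_create arguments with resources as a list of dicts."""
    return {
        "name": name,
        "owner_org": owner_org,
        "resources": [{"url": url} for url in resource_urls],
    }


def create_dataset(ckan_url, api_key, dataset):
    """Send the dataset to CKAN in a single package_create call."""
    # Third-party package, imported lazily so build_dataset works without it.
    import ckanapi

    with ckanapi.RemoteCKAN(ckan_url, apikey=api_key) as ckan:
        return ckan.action.package_create(**dataset)


dataset = build_dataset(
    "joe_data",
    "bbb9682e-b58c-4826-bf4b-b161581056be",
    ["http://www.resource_domain.com/doc.kml",
     "http://www.resource_domain.com/report.xls"],  # hypothetical second URL
)
print(dataset["resources"])

# To send it (placeholder URL and key):
# create_dataset("http://myckan.example.com", "real-key", dataset)
```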
Here is a Ruby script (currently my language of choice) that performs the insert in a single API call:
# Ruby script to create a package and resource in one api call.
# You can run this in https://repl.it/languages/ruby
# Don't forget to update URLs and API key.
require 'csv'
require 'json'
require 'net/http'
hash_to_json = {
  "title" => 'test1',
  "name" => 'test1',
  "owner_org" => 'bbb9682e-b58c-4826-bf4b-b161581056be',
  "resources" => [
    {
      "url" => 'http://www.resource_domain.com/doc.kml'
    }
  ]
}.to_json
uri = URI('http://ckan_app_domain.com:5000/api/3/action/package_create')
Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Post.new uri
  request['Authorization'] = 'user-api-key'
  request.body = hash_to_json
  response = http.request request
  puts response.body
end
Here is a simple Python script that does the same thing (thanks to the CKAN documentation for the template I modified):
#!/usr/bin/env python
# Note: written for Python 2 (urllib2). On Python 3 these functions live in
# urllib.request and urllib.parse.
import urllib2
import urllib
import json
import pprint
# Put the details of the dataset we're going to create into a dict.
dataset_dict = {
    'name': 'my_dataset_name',
    'notes': 'A long description of my dataset',
    'owner_org': 'bbb9682e-b58c-4826-bf4b-b161581056be',
    # Resources must be a list of dicts, one dict per resource.
    'resources': [
        {
            'url': 'example.com'
        }
    ]
}
# Use the json module to dump the dictionary to a string for posting.
data_string = urllib.quote(json.dumps(dataset_dict))
# We'll use the package_create function to create a new dataset.
request = urllib2.Request(
    'http://ckan_app_domain.com:5000/api/3/action/package_create')
# Creating a dataset requires an authorization header.
# Replace 'user-api-key' with your API key, from your user account on the
# CKAN site that you're creating the dataset on.
request.add_header('Authorization', 'user-api-key')
# Make the HTTP request.
response = urllib2.urlopen(request, data_string)
assert response.code == 200
# Use the json module to load CKAN's response into a dictionary.
response_dict = json.loads(response.read())
assert response_dict['success'] is True
# package_create returns the created package as its result.
created_package = response_dict['result']
pprint.pprint(created_package)
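The script above targets Python 2. For Python 3, a rough equivalent using only the standard library might look like the following sketch. build_package_create_request is a hypothetical helper name, the URL and key are the same placeholders as above, and the request is only built here, not sent, so nothing is verified against a real CKAN site.

```python
import json
import urllib.parse
import urllib.request


def build_package_create_request(base_url, api_key, dataset):
    """Build (but do not send) a package_create POST request."""
    # Same encoding as the script above: URL-quoted JSON as the request body.
    body = urllib.parse.quote(json.dumps(dataset)).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/3/action/package_create", data=body)
    request.add_header("Authorization", api_key)
    return request


request = build_package_create_request(
    "http://ckan_app_domain.com:5000",
    "user-api-key",
    {"name": "my_dataset_name", "resources": [{"url": "example.com"}]},
)
print(request.get_method())  # POST
print(request.full_url)

# To send it against a real CKAN site:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read())["result"])
```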