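How to submit Mechanical Turk ExternalQuestions with boto3

Time: 2017-10-11 15:31:03

Tags: boto3

I'm trying to create a question on Mechanical Turk programmatically with boto3, but I seem to be doing something wrong: the ExternalQuestion data structure that create_hit requires appears to be missing.

I tried to create the HIT like this: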

import boto3

#...

client = boto3.client(
    'mturk',
    endpoint_url=endpoint_url,
    region_name=region_name,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
)

question = ExternalQuestion(external_url=question_target, frame_height=800)

response = client.create_hit(
        MaxAssignments=10,
        Title='Test',
        Description='This is a test of ExternalQuestion',
        Question=question,
        AssignmentDurationInSeconds=60,
        LifetimeInSeconds=24 * 60 * 60,
        Reward=0.01)

It fails with:

Traceback (most recent call last):
  File "createTask.py", line 21, in <module>
    question = ExternalQuestion(external_url=question_target, frame_height=800)
NameError: name 'ExternalQuestion' is not defined

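Any suggestions on how to proceed are highly appreciated.

3 answers:

Answer 0 (score: 0)

This is a direct excerpt from my production code. I open an XML file; you can get a template from the requester site and then just modify it to include your own JavaScript and HTML. I'll attach a sample below.

Python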

import boto3
region_name = 'us-east-1'
aws_access_key_id = '*********************'
aws_secret_access_key = '*********************'
endpoint_url = 'https://mturk-requester-sandbox.us-east-1.amazonaws.com'

# Uncomment this line to use in production
#endpoint_url = 'https://mturk-requester.us-east-1.amazonaws.com'
client = boto3.client(
    'mturk',
    endpoint_url=endpoint_url,
    region_name=region_name,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
)
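# `somefile` is the base name of the question XML file; it is defined elsewhere.
with open("K:/" + str(somefile) + ".xml", "r") as questionSampleFile:
    questionSample = questionSampleFile.read()

# Exclude workers whose locale is WF (00000000000000000071 is the system Locale qualification).
localRequirements = [{
    'QualificationTypeId': '00000000000000000071',
    'Comparator': 'NotIn',
    'LocaleValues': [{
        'Country': 'WF'
    }],
    'RequiredToPreview': True
}]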
xReward = '0.25'
# Create the HIT 
response = client.create_hit(
    MaxAssignments = 1,
    #AutoApprovalDelayInSeconds = 259200,
    # 2 days for lifetime
    LifetimeInSeconds = 172800,
    # 1.5 hours to finish the assignment
    AssignmentDurationInSeconds = 5400,
    Reward = xReward,
    Title = 'Enter Missing Data',
    Keywords = 'data entry, typing, inspection',
    Description = 'Edit and Add Data from PDF',
    Question = questionSample,
    QualificationRequirements = localRequirements
)

XML

<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[

]]>
  </HTMLContent>
  <FrameHeight>900</FrameHeight>
</HTMLQuestion>
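
As an alternative to editing the XML file by hand, the same envelope can be assembled in Python. The following is only a minimal sketch: the helper name build_html_question and the placeholder HTML are assumptions made for illustration, and the namespace is copied from the sample above. A real HIT page also needs a form that submits the worker's answers back to Mechanical Turk, which is left out here.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the sample XML above.
HTML_QUESTION_XMLNS = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd"
)


def build_html_question(html, frame_height=900):
    """Hypothetical helper: wrap an HTML document in an HTMLQuestion envelope."""
    if "]]>" in html:
        # A CDATA section cannot contain its own terminator.
        raise ValueError("HTML content must not contain ']]>'")
    return (
        '<HTMLQuestion xmlns="%s">\n'
        '  <HTMLContent><![CDATA[\n%s\n]]>\n'
        '  </HTMLContent>\n'
        '  <FrameHeight>%d</FrameHeight>\n'
        '</HTMLQuestion>' % (HTML_QUESTION_XMLNS, html, frame_height)
    )


# Placeholder page, used only to show the shape of the result.
sample_html = "<!DOCTYPE html><html><body><p>Enter the missing data</p></body></html>"
question_xml = build_html_question(sample_html)

# Parsing the result confirms that it is well-formed XML.
root = ET.fromstring(question_xml)
```

The resulting question_xml string can be passed as the Question argument of create_hit in place of the file contents.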

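Answer 1 (score: 0)

If you still have an older version of boto installed, the simplest approach is to use the get_as_xml() method of boto's ExternalQuestion class, as in the code below. If you look at the output of get_as_xml(), you will see that the XML is simple enough to generate yourself.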

import boto3
from boto.mturk.question import ExternalQuestion

mturk = boto3.client(
    'mturk',
    endpoint_url='https://mturk-requester-sandbox.us-east-1.amazonaws.com',
    region_name='us-east-1',
    aws_access_key_id='your_access_key',
    aws_secret_access_key='your_secret_key',
)

question = ExternalQuestion("https://example.com/mypage.html", frame_height=600)
new_hit = mturk.create_hit(
    Title='Answer a simple question',
    Description='Help research a topic',
    Keywords='question, answer, research',
    Reward='0.15',
    MaxAssignments=1,
    LifetimeInSeconds=172800,
    AssignmentDurationInSeconds=600,
    AutoApprovalDelayInSeconds=14400,
    Question=question.get_as_xml(),   # <--- this does the trick
)
print("HITID = " + new_hit['HIT']['HITId'])

You need to make sure that you escape characters in your question URL so that the result is valid XML.
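
To illustrate the escaping point, here is a minimal sketch that builds the ExternalQuestion XML by hand and escapes the URL with the standard library's xml.sax.saxutils.escape. The helper name external_question_xml and the example URL are assumptions made for illustration; the schema URL is the 2006-07-14 one that also appears in another answer in this thread.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# ExternalQuestion schema URL, as quoted elsewhere in this thread.
EXTERNAL_QUESTION_XMLNS = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"
)


def external_question_xml(external_url, frame_height):
    """Hypothetical helper: build ExternalQuestion XML with an escaped URL."""
    return (
        '<ExternalQuestion xmlns="%s">'
        '<ExternalURL>%s</ExternalURL>'
        '<FrameHeight>%d</FrameHeight>'
        '</ExternalQuestion>'
        % (EXTERNAL_QUESTION_XMLNS, escape(external_url), frame_height)
    )


# A URL with a query string: the raw "&" would make the XML invalid.
url = "https://example.com/mypage.html?task=1&mode=test"
question_xml = external_question_xml(url, 600)

# The ampersand is written as "&amp;", and parsing recovers the original URL.
root = ET.fromstring(question_xml)
tag = "{%s}ExternalURL" % EXTERNAL_QUESTION_XMLNS
assert root.find(tag).text == url
```

The string in question_xml can be passed directly as Question to create_hit, so no boto import is needed.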

Answer 2 (score: 0)

If you are looking for the same class interface as in classic boto, you can mimic it with the following standalone snippet:

class ExternalQuestion:
    """
    An object for constructing an External Question.
    """
    schema_url = "http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"
    template = '<ExternalQuestion xmlns="%(schema_url)s"><ExternalURL>%%(external_url)s</ExternalURL><FrameHeight>%%(frame_height)s</FrameHeight></ExternalQuestion>' % vars()

    def __init__(self, external_url, frame_height):
        self.external_url = external_url
        self.frame_height = frame_height

    def get_as_params(self, label='ExternalQuestion'):
        return {label: self.get_as_xml()}

    def get_as_xml(self):
        return self.template % vars(self)