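Error message when submitting a HIT to Amazon Mechanical Turk

Date: 2017-04-30 17:39:33

Tags: python amazon boto3 mechanicalturk

I am having trouble submitting a HIT to the Amazon Mechanical Turk sandbox.

I am using the following code to submit the HIT: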

external_content = """"
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://MY_HOST_GOES_HERE/</ExternalURL>
  <FrameHeight>400</FrameHeight>
</ExternalQuestion>
"""

import boto3

import os

region_name = 'us-east-1'

aws_access_key_id = 'MYKEY'
aws_secret_access_key = 'MYSECRETKEY'

endpoint_url = 'https://mturk-requester-sandbox.us-east-1.amazonaws.com'

# Uncomment this line to use in production
# endpoint_url = 'https://mturk-requester.us-east-1.amazonaws.com'

client = boto3.client('mturk',
                      endpoint_url=endpoint_url,
                      region_name=region_name,
                      aws_access_key_id=aws_access_key_id,
                      aws_secret_access_key=aws_secret_access_key,
                      )

# This will return $10,000.00 in the MTurk Developer Sandbox
print(client.get_account_balance()['AvailableBalance'])


response = client.create_hit(Question=external_content,
                             LifetimeInSeconds=60 * 60 * 24,
                             Title="Answer a simple question",
                             Description="Help research a topic",
                             Keywords="question, answer, research",
                             AssignmentDurationInSeconds=120,
                             Reward='0.05')

# The response included several helpful fields
hit_group_id = response['HIT']['HITGroupId']
hit_id = response['HIT']['HITId']

# Let's construct a URL to access the HIT
sb_path = "https://workersandbox.mturk.com/mturk/preview?groupId={}"
hit_url = sb_path.format(hit_group_id)

print(hit_url)

The error message I get is:

botocore.exceptions.ClientError: An error occurred (ParameterValidationError) when calling the CreateHIT operation: There was an error parsing the XML question or answer data in your request.  Please make sure the data is well-formed and validates against the appropriate schema. Details: Content is not allowed in prolog. (1493572622889 s)

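What could be causing this? As far as I can tell, the XML fully conforms to the XML schema hosted on Amazon's servers.

The HTML returned by the external host is: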

<!DOCTYPE html>
<head>
<meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>
<script src='https://s3.amazonaws.com/mturk-public/externalHIT_v1.js' type='text/javascript'></script>
</head>
<body>
<!-- HTML to handle creating the HIT form -->
<form name='mturk_form' method='post' id='mturk_form' action='https://workersandbox.mturk.com/mturk/externalSubmit'>
<input type='hidden' value='' name='assignmentId' id='assignmentId'/>
<!-- This is where you define your question(s) --> 
<h1>Please name the company that created the iPhone</h1>
<p><textarea name='answer' rows=3 cols=80></textarea></p>
<!-- HTML to handle submitting the HIT -->
<p><input type='submit' id='submitButton' value='Submit' /></p></form>
<script language='Javascript'>turkSetAssignmentID();</script>
</body>
</html>

Thanks

1 answer:

Answer 0 (score: 1)

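The message "Details: Content is not allowed in prolog." is the clue. It means the parser found content before the point where the XML document is expected to begin. This usually happens when junk characters (think smart quotes or non-printable ASCII values) end up there. These can be a real pain in the butt to diagnose.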
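One way to hunt for such characters is to look at whatever sits in front of the first `<`. This is a minimal sketch, assuming the question XML is held in a Python string; `leading_junk` and `describe` are hypothetical helper names, not part of boto3 or the MTurk API:

```python
import unicodedata


def leading_junk(xml_text):
    """Return the non-whitespace characters that precede the first '<'."""
    prolog = xml_text.partition("<")[0]
    return "".join(ch for ch in prolog if not ch.isspace())


def describe(chars):
    """Pair each character's code point with its Unicode name."""
    return [(hex(ord(ch)), unicodedata.name(ch, "UNNAMED")) for ch in chars]


# A stray smart quote in front of the root element (illustrative input only).
sample = "\u201c\n<ExternalQuestion/>"
print(describe(leading_junk(sample)))
```

An empty result means nothing but whitespace precedes the root element, so the problem would have to be somewhere else.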

In your case it is easier to debug, but still frustrating. Take a look at this line:

external_content = """"

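Python needs only three quotation marks (""") to open a multi-line string literal, so your fourth " is treated as part of the string and ends up as the first character of the XML. Change that line to: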

external_content = """
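To see the difference locally before calling `create_hit`, here is a rough sketch using the standard library's `xml.etree.ElementTree`. `is_well_formed` is a hypothetical helper. It only checks that the string parses as XML, not that it validates against the MTurk schema, and the local parser's error text will differ from the one the service returns.

```python
import xml.etree.ElementTree as ET

# Four quotes: the first three open the string, the fourth is literal content.
broken = """"
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://MY_HOST_GOES_HERE/</ExternalURL>
  <FrameHeight>400</FrameHeight>
</ExternalQuestion>
"""

# Three quotes: the string starts with a newline, which XML tolerates here.
fixed = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://MY_HOST_GOES_HERE/</ExternalURL>
  <FrameHeight>400</FrameHeight>
</ExternalQuestion>
"""


def is_well_formed(xml_text):
    """Hypothetical helper: True if xml_text parses as XML, else False."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(repr(broken[:1]), is_well_formed(broken))
print(repr(fixed[:1]), is_well_formed(fixed))
```

Running a check like this on `external_content` right before the API call catches this class of mistake without a round trip to the sandbox.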

Then you're golden. I just tested it and it works. Sorry for all the frustration, but hopefully this unblocks you. Happy Sunday!