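How do I test unseen data against a deployed model?
The unseen data is in CSV format and has nearly 20 features.
Most tutorials I have found demonstrate this with sentiment analysis or movie reviews, and essentially pass in just a single sentence.
One example: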
test_review = "Nothing but a disgusting materialistic pageant of glistening abed remote control greed zombies, totally devoid of any heart or heat. A romantic comedy that has zero romantic chemestry and zero laughs!"
test_words = review_to_words(test_review)
print(test_words)
def bow_encoding(words, vocabulary):
    bow = [0] * len(vocabulary) # Start by setting the count for each word in the vocabulary to zero.
    for word in words.split(): # For each word in the string
        if word in vocabulary: # If the word is one that occurs in the vocabulary, increase its count.
            bow[vocabulary[word]] += 1
    return bow
test_bow = bow_encoding(test_words, vocabulary)
print(test_bow)
len(test_bow)
xgb_predictor = xgb.deploy(initial_instance_count = 1, instance_type = 'ml.m4.xlarge')
import boto3
runtime = boto3.Session().client('sagemaker-runtime')
xgb_predictor.endpoint
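# First attempt: this call fails because Body must be bytes or a string, not a Python list.
response = runtime.invoke_endpoint(EndpointName = xgb_predictor.endpoint,
                                   ContentType = 'text/csv',
                                   Body = test_bow)
# Second attempt: serialize the list to a comma-separated string and encode it as bytes.
response = runtime.invoke_endpoint(EndpointName = xgb_predictor.endpoint,
                                   ContentType = 'text/csv',
                                   Body = ','.join([str(val) for val in test_bow]).encode('utf-8'))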
print(response)
response = response['Body'].read().decode('utf-8')
print(response)
When the data I want to test is not a single sentence but a set of several features per record, how should I go about testing it?
Thanks
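To make the question concrete, here is a minimal sketch of what I imagine the multi-feature case might look like. It assumes the endpoint accepts `text/csv` with one record per line, that the columns are in the same order as in training, and that there is no header row or label column in the payload. The helper names (`read_feature_rows`, `rows_to_csv_payload`, `predict_rows`) and the three-column sample data are hypothetical, made up only for illustration; my real file has nearly 20 columns.

```python
import csv
import io


def read_feature_rows(csv_text, has_header=True):
    """Parse CSV text into a list of rows, skipping the header row if present."""
    reader = csv.reader(io.StringIO(csv_text))
    rows = [row for row in reader if row]  # drop blank lines
    return rows[1:] if has_header else rows


def rows_to_csv_payload(rows):
    """Serialize feature rows (one list per record) into CSV-formatted bytes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    # Strip the trailing newline and encode, as in the single-sentence example.
    return buffer.getvalue().rstrip("\n").encode("utf-8")


def predict_rows(endpoint_name, rows):
    """Send the rows to the endpoint and return the raw response text.

    Assumes the endpoint accepts several CSV lines in a single request.
    """
    import boto3  # imported here so the helpers above work without AWS access

    runtime = boto3.Session().client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=rows_to_csv_payload(rows),
    )
    return response["Body"].read().decode("utf-8")


# Hypothetical sample: three features instead of twenty, to keep it short.
sample_csv = "f1,f2,f3\n0.5,1,3.2\n0.1,0,7.8\n"
rows = read_feature_rows(sample_csv)
payload = rows_to_csv_payload(rows)
print(payload)
```

With the endpoint from above, the call would presumably be `predict_rows(xgb_predictor.endpoint, rows)`. Is this the right way to do it, or does each record need to be sent in its own request?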