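I am trying to use the Factorization Machines algorithm in SageMaker, but it gives me a ValueError.
The example was originally written for sparse matrices, but my data is not sparse, so I changed it to write a dense tensor instead.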
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse import lil_matrix
Here is the code:
train_key = 'train.protobuf'
train_prefix = '{}/{}/'.format(prefix, 'train')
test_key = 'test.protobuf'
test_prefix = '{}/{}/'.format(prefix, 'test')
output_prefix = '{}/{}/output'.format(bucket, prefix)
def writeDatasetToProtobuf(X, y, bucket, prefix, key):
    import io,boto3
    import sagemaker.amazon.common as smac
    buf = io.BytesIO()
    smac.write_numpy_to_dense_tensor(buf, X.astype('float32'), y.astype('float32'))
    buf.seek(0)
    print(buf)
    obj = '{}/{}'.format(prefix, key)
    boto3.resource('s3').Bucket(bucket).Object(obj).upload_fileobj(buf)
    print('Wrote dataset: {}/{}'.format(bucket,obj))
writeDatasetToProtobuf(X_train.astype('float32'), y_train.astype('float32'), bucket, train_prefix, train_key)
writeDatasetToProtobuf(X_test.astype('float32'), y_test.astype('float32'), bucket, test_prefix, test_key)
print('Output: {}'.format(output_prefix))
ValueError: Labels must be a Vector
Answer 0 (score: 0)
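This error occurs when y_train and y_test have more than one dimension, for example shape (n, 1) instead of (n,).
Try turning the labels into a vector with: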
...
y = y.reshape(-1,)
smac.write_numpy_to_dense_tensor(buf, X.astype('float32'), y.astype('float32'))
...
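To see what the reshape changes, here is a minimal sketch using only NumPy. It assumes the labels arrive as a single column of shape (n, 1), which is a guess about your data since the question does not show the shapes. The helper name to_label_vector is hypothetical and not part of the SageMaker SDK.

```python
import numpy as np

def to_label_vector(y):
    """Hypothetical helper: return labels as a 1-D float32 vector."""
    y = np.asarray(y)
    # Dimensions of length 1 can be dropped safely; more than one
    # "real" dimension means there is not exactly one label per row.
    non_singleton = [d for d in y.shape if d != 1]
    if len(non_singleton) > 1:
        raise ValueError('expected one label per row, got shape {}'.format(y.shape))
    return y.reshape(-1).astype('float32')

# Assumed input: a single label column with two dimensions.
y_column = np.array([[0.0], [1.0], [1.0]])
y_vector = to_label_vector(y_column)
print(y_column.shape, '->', y_vector.shape)  # (3, 1) -> (3,)
```

Printing y_train.shape and y_test.shape before the call is a quick way to confirm whether this is the cause. If the labels really have several columns, flattening them would silently change the number of labels, which is why the sketch raises instead.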