Problem with protobuf in SageMaker

Time: 2019-07-02 15:09:57

Tags: python protocol-buffers amazon-sagemaker factorization

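I am trying to use the Factorization Machines algorithm in SageMaker, but it raises a ValueError.

The code was originally written for a sparse matrix, but my data is not sparse, so I changed it to write a dense tensor instead.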

Here is the code:

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse import lil_matrix


train_key      = 'train.protobuf'
train_prefix   = '{}/{}/'.format(prefix, 'train')
test_key       = 'test.protobuf'
test_prefix    = '{}/{}/'.format(prefix, 'test')
output_prefix  = '{}/{}/output'.format(bucket, prefix)
def writeDatasetToProtobuf(X, y, bucket, prefix, key):
    import io,boto3
    import sagemaker.amazon.common as smac
    buf = io.BytesIO()
    smac.write_numpy_to_dense_tensor(buf, X.astype('float32'), y.astype('float32'))
    buf.seek(0)
    print (buf)
    obj = '{}/{}'.format(prefix, key)
    boto3.resource('s3').Bucket(bucket).Object(obj).upload_fileobj(buf)
    print('Wrote dataset: {}/{}'.format(bucket,obj))

writeDatasetToProtobuf(X_train.astype('float32'), y_train.astype('float32'), bucket, train_prefix, train_key)    
writeDatasetToProtobuf(X_test.astype('float32'), y_test.astype('float32'), bucket, test_prefix, test_key)    

print('Output: {}'.format(output_prefix))

ValueError: Labels must be a Vector

1 answer:

Answer 0 (score: 0)

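This error occurs when y_train and y_test have more than one dimension, for example shape (n, 1) instead of (n,).

Try turning the labels into a 1-D vector with: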

...
y = y.reshape(-1,)
smac.write_numpy_to_dense_tensor(buf, X.astype('float32'), y.astype('float32'))
...
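
To check the shapes before uploading, something like the following may help. It is only a minimal sketch using NumPy: `as_label_vector` is a hypothetical helper name (not part of the SageMaker SDK), and the sample array stands in for `y_train`. It assumes the labels are either already 1-D or a single column or row.

```python
import numpy as np

def as_label_vector(y):
    """Hypothetical helper: flatten (n,), (n, 1) or (1, n) labels to a 1-D float32 vector."""
    y = np.asarray(y)
    # A single column or single row can be flattened without losing information.
    if y.ndim == 2 and 1 in y.shape:
        y = y.reshape(-1)
    # Anything else that is not 1-D is ambiguous, so fail early with a clear message.
    if y.ndim != 1:
        raise ValueError("expected 1-D labels, got shape {}".format(y.shape))
    return y.astype("float32")

# Stand-in for y_train: a column of labels with shape (3, 1).
y_column = np.array([[0.0], [1.0], [1.0]])
y_vector = as_label_vector(y_column)
print(y_column.shape, "->", y_vector.shape)  # (3, 1) -> (3,)
```

The result could then be passed as the labels argument to `smac.write_numpy_to_dense_tensor` in place of `y.astype('float32')`. Failing early on labels that genuinely have several columns seems safer than silently flattening them, since the flattened length would no longer match the number of rows in X.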