Question

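I am new to Python and am trying to normalize each index in a list using preprocessing.normalize. However, it fails with ValueError: setting an array element with a sequence.

I then found the cause: the np.array at each index has a different length (size).

Here is my code: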

result = []

for url in target_url :
    sensor = pd.read_csv(url, header=None, delimiter=r"\s+")
    result.append(sensor[2])

result = np.array(result)
# I want to resample here before it goes to normalize.
result = preprocessing.normalize(result, norm='l1')

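target_url holds the URLs from which I fetch sensor data on a web server, and each result is appended to the result list. The list is then converted to an array with np.array.

For example, len(result[0]) is 121598 and len(result[1]) is 1215601. I want to use resample to pad with NaN so that result[0] has the same length as result[1].

How can I do this?

Please help me.

Thanks in advance.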

Edit

After normalizing, I try to compute the correlation with corr().

Here is the code:

result = preprocessing.normalize(result, norm='l1')
ret = pd.DataFrame(result)
corMat = DataFrame(ret.T.corr())

Answer 1

Since you are using pandas to read the CSV, you are off to a good start. One approach is to use pd.concat to join the Series in the list (I assume sensor[2] is a Series) into a single DataFrame. Here is an example:

a = [pd.Series([1, 2, 3]), pd.Series([1, 2]), pd.Series([1, 2, 3, 4])]
pd.concat(a, axis=1)

which gives:

     0    1  2
0  1.0  1.0  1
1  2.0  2.0  2
2  3.0  NaN  3
3  NaN  NaN  4

For the example the OP provided, applying the same pd.concat call to the result list should suffice.
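
A minimal sketch of how this might fit the loop in the question. The short Series below are made-up stand-ins for the sensor[2] columns, since the real target_url data is not available here. As far as I know, preprocessing.normalize rejects input containing NaN, so this sketch does the L1 scaling directly in pandas instead; that substitution is an assumption, not the only way to handle it.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the sensor[2] columns read from each URL.
result = [
    pd.Series([1.0, 2.0, 3.0]),
    pd.Series([2.0, 1.0]),
    pd.Series([1.0, 2.0, 3.0, 4.0]),
]

# One column per sensor; shorter series are padded with NaN at the end.
df = pd.concat(result, axis=1)

# L1-normalize each sensor (column): divide by the sum of absolute values.
# pandas skips NaN when summing, so the padding does not change the scale.
normalized = df / df.abs().sum()

# Pairwise correlation between sensors; NaN rows are dropped per pair.
cor_mat = normalized.corr()
print(cor_mat)
```

Note that the sensors end up as columns here, so corr() can be called without the transpose used in the question.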

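Depending on what the indexes of the Series look like and on your application, you can perform different kinds of concatenation. See the pandas docs on pd.concat.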
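
To illustrate one such choice: pd.concat accepts a join parameter, where "outer" (the default) keeps the union of index labels and pads with NaN, while "inner" keeps only the labels present in every Series. A small sketch with made-up data:

```python
import pandas as pd

a = [pd.Series([1, 2, 3]), pd.Series([1, 2]), pd.Series([1, 2, 3, 4])]

# Union of index labels: 4 rows, shorter series padded with NaN.
outer = pd.concat(a, axis=1, join="outer")

# Intersection of index labels: only rows 0 and 1 exist in every series.
inner = pd.concat(a, axis=1, join="inner")

print(outer.shape)  # (4, 3)
print(inner.shape)  # (2, 3)
```

With "inner", no NaN padding is introduced, at the cost of truncating every sensor to the shortest one.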
