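Neural network implementation for unsupervised clustering

Time: 2021-04-17 14:01:52

Tags: python tensorflow

I'm fairly new to neural networks and am trying to use one for unsupervised clustering. My data is in a dataframe with 5 different columns (features), and I want to get 4 classes out of it. The full model is below: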


from sklearn import preprocessing as pp
from sklearn.model_selection import train_test_split
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import log_loss
from sklearn.metrics import precision_recall_curve, average_precision_score
from sklearn.metrics import roc_curve, auc, roc_auc_score

import keras
from keras import backend as K
from keras.models import Sequential, Model
from keras.layers import Activation, Dense, Dropout , Flatten
from keras.layers import BatchNormalization, Input, Lambda
from keras import regularizers
from keras.losses import mse, categorical_crossentropy

model = Sequential()
model.add(Dense(32, activation='relu',input_shape=[5]))
model.add(Flatten())
model.add(Dense(units=32, activation='relu'))
model.add(Dense(units=16, activation='relu'))
model.add(Dense(units=4, activation='relu'))
model.add(Dense(4, activation = "softmax"))
model.compile(optimizer='adam',loss="categorical_crossentropy",metrics=['accuracy'])

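When I set the model up to produce 4 classes, I get this error message:

ValueError: Shapes (None, 5) and (None, 4) are incompatible

I can't tell what I'm doing wrong. I tried different loss functions and got the same error.

The error is raised when I feed the data to the model: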

out_class = model.fit(x=pd_pca_std,
                      y=pd_pca_std,
                      epochs=num_epochs,
                      batch_size=batch_size,
                      shuffle=True,
                      validation_data=(pd_pca_std, pd_pca_std),
                      verbose=1)

The values are:

batch_size = 33
num_epochs = 20
num_classes = 4
input_shape = (990000, 5)
output_shape = (990000, 4)

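2 answers:

Answer 0 (score: 1)

I'd suggest using 5 classes, or something related to 5 classes. Let me explain.

In neural networks and machine learning in general, TensorFlow performs matrix operations behind the scenes. Say I create the following: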

import numpy as np

x = np.random.random((3, 4))
y = np.random.random((3, 3))

np.dot(x, y)  # if I try multiplying 2 incompatible matrices, the program will fail :(

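What happens here is that the two matrices are incompatible under ordinary matrix arithmetic: for a product of two 2-D arrays, the number of columns of the first must equal the number of rows of the second. So I'd suggest either reshaping the offending matrix/array or trying different shapes in your program to see which one works.

You could also study some linear algebra, which covers the rules of matrix manipulation and arithmetic, but I won't go into that now. Instead, I'll leave a link on the topic so you know what to do in the future.

Here it is: https://www.mathlynx.com/online/LinAlg_Matrices_rules

Hope this helps. Have a nice day :)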
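As a small illustration of the shape rule described above, here is a NumPy sketch with made-up arrays (the names `a`, `b`, `c` are only for this example). It shows one product whose shapes agree, the failing case from above, and one way to make the shapes agree by transposing.

```python
import numpy as np

a = np.random.random((3, 4))   # 3 rows, 4 columns
b = np.random.random((4, 3))   # 4 rows, 3 columns

# Compatible: columns of `a` (4) equal rows of `b` (4), result is (3, 3)
product = np.dot(a, b)
print(product.shape)

c = np.random.random((3, 3))
try:
    # Incompatible: columns of `a` (4) do not equal rows of `c` (3)
    np.dot(a, c)
except ValueError as err:
    print("incompatible:", err)

# Transposing `a` gives (4, 3), which can be multiplied by the (3, 3) array
fixed = np.dot(a.T, c)
print(fixed.shape)
```

Whether transposing or reshaping is the right fix depends on what the rows and columns mean in your data; this only shows the mechanics.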

Answer 1 (score: 0)

Here is a skeleton of how I reproduced your problem and got it working:

from sklearn import preprocessing as pp
from sklearn.model_selection import train_test_split
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import log_loss
from sklearn.metrics import precision_recall_curve, average_precision_score
from sklearn.metrics import roc_curve, auc, roc_auc_score

import numpy as np

import keras
from keras import backend as K
from keras.models import Sequential, Model
from keras.layers import Activation, Dense, Dropout , Flatten
from keras.layers import BatchNormalization, Input, Lambda
from keras import regularizers
from keras.losses import mse, categorical_crossentropy

X = np.zeros((990000, 5))  # placeholder input; replace with your own (n_samples, 5) array
y = np.ones((990000, 4))   # placeholder target; replace with your own (n_samples, 4) array

batch_size = 33 
num_epochs = 20 
num_classes = 4

model = Sequential()
model.add(Dense(32, activation='relu',input_shape=X.shape[1:])) #Input shape = 5
model.add(Flatten())
model.add(Dense(units=32, activation='relu'))
model.add(Dense(units=16, activation='relu'))
model.add(Dense(units=4, activation='relu'))
model.add(Dense(y.shape[1], activation = "softmax")) #Output = y.shape[1] = 4
model.compile(optimizer='adam',loss="categorical_crossentropy",metrics=['accuracy'])

model.summary() #Will show you a summary of the model

model.fit(x=X,
          y=y,
          epochs=num_epochs,
          batch_size=batch_size,
          shuffle=True,
          validation_data=(X, y),  # you may want to use separate data for validation
          verbose=1)
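The skeleton works because the target `y` has the same number of columns (4) as the final softmax layer. In the question, `pd_pca_std` with 5 columns is passed as both `x` and `y`, so the target is (None, 5) while the model outputs (None, 4). The sketch below imitates that check in plain NumPy. It is not Keras's implementation: `softmax`, `categorical_crossentropy`, and the integer `labels` are hypothetical stand-ins written for this example, and where real labels would come from in an unsupervised setting is not addressed here.

```python
import numpy as np

def softmax(logits):
    # Row-wise softmax, shifted by the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Illustrative stand-in: target and prediction must have identical shapes
    if y_true.shape != y_pred.shape:
        raise ValueError(f"Shapes {y_true.shape} and {y_pred.shape} are incompatible")
    y_pred = np.clip(y_pred, eps, 1.0)
    return -(y_true * np.log(y_pred)).sum(axis=1)

rng = np.random.default_rng(0)
features = rng.normal(size=(6, 5))          # 6 samples, 5 features (stand-in for pd_pca_std)
y_pred = softmax(rng.normal(size=(6, 4)))   # stand-in for the 4-unit softmax output

# Hypothetical integer class labels 0..3, turned into one-hot rows of width 4
labels = np.array([0, 1, 2, 3, 0, 1])
y_true = np.eye(4)[labels]

# Shapes match: (6, 4) against (6, 4), one loss value per sample
loss = categorical_crossentropy(y_true, y_pred)
print(loss.shape)

try:
    # Same mistake as in the question: a 5-column target against a 4-column output
    categorical_crossentropy(features, y_pred)
except ValueError as err:
    print(err)
```

The one-hot step (`np.eye(4)[labels]`) is one common way to get a target of width 4 from integer class ids.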