Dimension problem in a neural network

Time: 2020-04-22 12:44:19

Tags: python machine-learning neural-network

This is what I have so far:

import numpy as np
import pandas as pd

df = pd.read_csv('spam.csv')

import nltk 
import string
from nltk.corpus import stopwords

def text_process(mess):
    nopunc = [char for char in mess if char not in string.punctuation]
    nopunc = ''.join(nopunc)
    return [word for word in nopunc.split() if word.lower() not in stopwords.words('english')]

df.head()
#   Label  EmailText
# 0 ham    Go until jurong point, crazy.. Available only ...
# 1 ham    Ok lar... Joking wif u oni...
# 2 spam   Free entry in 2 a wkly comp to win FA Cup fina...
# 3 ham    U dun say so early hor... U c already then say...


df['EmailText'].head(5).apply(text_process)

from sklearn.feature_extraction.text import CountVectorizer
bow_transformer = CountVectorizer(analyzer=text_process).fit(df['EmailText'])

messages_bow = bow_transformer.transform(df['EmailText'])
sparsity = (100.0 * messages_bow.nnz / (messages_bow.shape[0] * messages_bow.shape[1]))


from sklearn.feature_extraction.text import TfidfTransformer
tfidf_transformer = TfidfTransformer().fit(messages_bow)
messages_tfidf = tfidf_transformer.transform(messages_bow)

X = messages_tfidf
y = df['Label']

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

#Neural network
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def derivative(x):
    return x * (1.0 - x)

dim1 = len(X_train[0])  # meant to be the number of input features
dim2 = 100              # number of hidden units; 100 was picked arbitrarily


np.random.seed(1)
weight0 = 2 * np.random.random((dim1, dim2)) - 1
weight1 = 2 * np.random.random((dim2, 1)) - 1

for epoch in range(2000):
    layer_0 = X_train
    layer_1 = sigmoid(np.dot(layer_0,weight0))
    layer_2 = sigmoid(np.dot(layer_1,weight1))

This is the error I get, and I can't get past it:

ValueError: shapes (4457,11304) and (1,100) not aligned: 11304 (dim 1) != 1 (dim 0)
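The message says that the inner dimensions of the two operands differ: the data has 11304 columns, but the first weight matrix has only 1 row, which suggests that dim1 evaluated to 1 rather than to the number of features. The following is a minimal sketch of that situation with small made-up arrays (the 6 x 8 size and the name bad_weight0 are hypothetical, chosen only for illustration), assuming X_train is a 2-D array that exposes .shape:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for X_train: 6 samples, 8 features
# (the post has 4457 samples and 11304 features).
X_train = rng.random((6, 8))
dim2 = 4  # number of hidden units, arbitrary here

# Mimics the failing setup: the weight matrix has 1 row instead of 8.
bad_weight0 = 2 * rng.random((1, dim2)) - 1
try:
    np.dot(X_train, bad_weight0)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print(err)

# Reading the feature count from shape[1] makes the inner dimensions agree:
# (6, 8) dot (8, 4) -> (6, 4).
dim1 = X_train.shape[1]
weight0 = 2 * rng.random((dim1, dim2)) - 1
hidden_input = np.dot(X_train, weight0)
print(hidden_input.shape)
```

If X_train is a SciPy sparse matrix, as the TF-IDF output above normally is, X_train.dot(weight0) or X_train @ weight0 is likely the safer way to multiply than np.dot.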

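I would like to know how to set the dimension (dim2) and what determines its value. Since I was following a tutorial, I just picked "100" arbitrarily, but I assume dim2 represents something. Could anyone explain this to me?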
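In a network laid out like the one above, dim2 is the number of units in the hidden layer. Only dim1 (the number of input features) and the output width (1 here) are fixed by the data; dim2 is a hyperparameter that is usually tuned by trying several values and comparing results on held-out data. The sketch below is an illustration under assumptions, not the tutorial's code: the forward helper and the 5 x 12 input are hypothetical. It shows that changing dim2 changes only the hidden layer's width while the output shape stays the same.

```python
import numpy as np

def sigmoid(x):
    # Logistic function with a parenthesized denominator: 1 / (1 + e^-x).
    return 1.0 / (1.0 + np.exp(-x))

def forward(X, dim2, seed=1):
    """Forward pass of a one-hidden-layer network with dim2 hidden units."""
    rng = np.random.default_rng(seed)
    dim1 = X.shape[1]                            # input features, fixed by the data
    weight0 = 2 * rng.random((dim1, dim2)) - 1   # (dim1, dim2)
    weight1 = 2 * rng.random((dim2, 1)) - 1      # (dim2, 1)
    layer_1 = sigmoid(X @ weight0)               # (n_samples, dim2)
    layer_2 = sigmoid(layer_1 @ weight1)         # (n_samples, 1)
    return layer_1, layer_2

# Hypothetical data: 5 samples with 12 features each.
X = np.random.default_rng(0).random((5, 12))

shapes = {}
for dim2 in (3, 100):
    layer_1, layer_2 = forward(X, dim2)
    shapes[dim2] = (layer_1.shape, layer_2.shape)
    print(dim2, layer_1.shape, layer_2.shape)
```

A larger dim2 gives the network more capacity but also more weights to train (dim1 * dim2 + dim2 in this layout), which matters with a feature count as large as 11304.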

0 answers:

No answers yet