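I am trying to run this Kaggle kernel in my Spyder IDE. Since I am not using a Jupyter notebook, I can't use %matplotlib inline, but I'm fairly sure that is unrelated to my problem.
I read the data and plot it once with seaborn, as shown in the kernel, and get the expected output: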
# Load the data
train = pd.read_csv("./input/train.csv")
test = pd.read_csv("./input/test.csv")
Y_train = train["label"]
# Drop 'label' column
X_train = train.drop(labels = ["label"],axis = 1)
# free some space
del train
# print and plot digit count
g = sns.countplot(Y_train)
#print (Y_train.value_counts())
I then added the next lines from the kernel:
# Normalize the data
X_train = X_train / 255.0
test = test / 255.0
# Reshape image in 3 dimensions (height = 28px, width = 28px , canal = 1)
X_train = X_train.values.reshape(-1,28,28,1)
test = test.values.reshape(-1,28,28,1)
# Encode labels to one hot vectors (ex : 2 -> [0,0,1,0,0,0,0,0,0,0])
Y_train = to_categorical(Y_train, num_classes = 10)
# Set the random seed
random_seed = 2
# Split the train and the validation set for the fitting
X_train, X_val, Y_train, Y_val = train_test_split(X_train, Y_train, test_size = 0.1, random_state=random_seed)
# Some examples
g = plt.imshow(X_train[0][:,:,0])
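The kernel's output is:
But for some reason mine is:
I don't understand why this changes my original plot (only a squashed image is shown now) and why the digit image is not displayed.
Here is my entire code so far (with the %matplotlib inline line removed, plus related changes such as added prints):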
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
import itertools
import keras # needed by system_info(), which reads keras.__version__
from keras.utils.np_utils import to_categorical # convert to one-hot-encoding
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
from keras.optimizers import RMSprop
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ReduceLROnPlateau
np.random.seed(2)
sns.set(style='white', context='notebook', palette='deep')
def system_info():
    print('keras: %20s' % keras.__version__)
    print('numpy: %20s' % np.__version__)
    print('pandas: %19s' % pd.__version__)
    print('seaborn: %18s' % sns.__version__)
# Load the data
train = pd.read_csv("./input/train.csv")
test = pd.read_csv("./input/test.csv")
Y_train = train["label"]
# Drop 'label' column
X_train = train.drop(labels = ["label"],axis = 1)
# free some space
del train
# print and plot digit count
g = sns.countplot(Y_train) # <------ 1st plot. Works fine if the code ends here
#print (Y_train.value_counts())
# Check the data - check for missing data
#print(X_train.isnull().any().describe())
#print(test.isnull().any().describe())
# Normalize the data
X_train = X_train / 255.0
test = test / 255.0
# Reshape image in 3 dimensions (height = 28px, width = 28px , canal = 1)
X_train = X_train.values.reshape(-1,28,28,1)
test = test.values.reshape(-1,28,28,1)
# Encode labels to one hot vectors (ex : 2 -> [0,0,1,0,0,0,0,0,0,0])
Y_train = to_categorical(Y_train, num_classes = 10)
# Set the random seed
random_seed = 2
# Split the train and the validation set for the fitting
X_train, X_val, Y_train, Y_val = train_test_split(X_train, Y_train, test_size = 0.1, random_state=random_seed)
# Some examples
g = plt.imshow(X_train[0][:,:,0]) # <------- 2nd plot. Removing this gives me back the bar chart, but not the expected digit image
Answer 0 (score: 1)
You are drawing one plot on top of the other. Try the following:
# beginning of your code here...
# open new figure and make first plot
plt.figure()
g = sns.countplot(Y_train)
# rest of your code here
# open second figure and make second plot
plt.figure()
s = plt.imshow(X_train[0][:,:,0])
# showing your plots will show both plots separately
plt.show()
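To see why the extra plt.figure() calls matter, here is a minimal sketch that runs without the Kaggle CSV files. It rests on a few assumptions that are not in the original code: random arrays stand in for Y_train and X_train[0][:,:,0], plt.bar replaces sns.countplot so that seaborn is not required, and the Agg backend is selected only so the script runs without a display. The first half shows that two consecutive pyplot calls land in the same Axes. The second half shows that opening a new figure keeps them apart.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line inside Spyder
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data (hypothetical, replaces the Kaggle CSV contents)
rng = np.random.default_rng(2)
labels = rng.integers(0, 10, size=200)  # plays the role of Y_train
image = rng.random((28, 28))            # plays the role of X_train[0][:, :, 0]
values, counts = np.unique(labels, return_counts=True)

# Case 1: no plt.figure() in between, so both calls use the current Axes
plt.close("all")
plt.bar(values, counts)                 # substitute for sns.countplot(Y_train)
shared_ax = plt.gca()
plt.imshow(image)                       # drawn into the same Axes as the bars
same_axes = plt.gca() is shared_ax
n_figures_shared = len(plt.get_fignums())

# Case 2: a new figure before each plot keeps them separate
plt.close("all")
fig1 = plt.figure()
plt.bar(values, counts)
fig2 = plt.figure()
plt.imshow(image)
n_figures_separate = len(plt.get_fignums())

print(same_axes, n_figures_shared, n_figures_separate)
```

In Spyder you would then call plt.show() once at the end, as in the snippet above. If you would rather have both plots side by side in one window, creating the axes with plt.subplots(1, 2) and passing ax=... to sns.countplot and calling ax.imshow directly should also work.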