Problem plotting data with matplotlib

Time: 2016-11-24 11:55:02

Tags: python matplotlib data-science

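I have just finished the predictions for a project I am working on. I would like to visualize them with some graphs, but I am having trouble finding a suitable chart type because my data set is very large. My code is below, followed by the results for one column. I want to plot a single column first to see how it works. I've tried a bar chart, but it came out looking odd: just one solid blue bar. So I am not sure which kind of chart suits this kind of information.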

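# Load the training and test data
# (train and test must have matching columns)
import numpy as np
import pandas as pd
from sklearn import preprocessing
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB

train = pd.read_csv('C:/Users/Michael/Desktop/train.csv/train.csv', parse_dates=['Dates'])
test = pd.read_csv('C:/Users/Michael/Desktop/test.csv/test.csv', parse_dates=['Dates'])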

# TRAINING data
#Convert crime labels to numbers
df_crime = preprocessing.LabelEncoder()
crime = df_crime.fit_transform(train.Category)
#Get binarized weekdays, districts, and hours using dummy variables
days = pd.get_dummies(train.DayOfWeek)
district = pd.get_dummies(train.PdDistrict)
hour = train.Dates.dt.hour
hour = pd.get_dummies(hour)
#Build new array
train_data = pd.concat([hour, days, district], axis=1)
train_data['crime']=crime
#train_data.head()

#Repeat for test data
days = pd.get_dummies(test.DayOfWeek)
district = pd.get_dummies(test.PdDistrict)

hour = test.Dates.dt.hour
hour = pd.get_dummies(hour) 

test_data = pd.concat([hour, days, district], axis=1)

features = ['Friday', 'Monday', 'Saturday', 'Sunday', 'Thursday', 'Tuesday',
 'Wednesday', 'BAYVIEW', 'CENTRAL', 'INGLESIDE', 'MISSION',
 'NORTHERN', 'PARK', 'RICHMOND', 'SOUTHERN', 'TARAVAL', 'TENDERLOIN']

training, testing = train_test_split(train_data, train_size=.60) 





#bernoulliNB
# fit on the training split, then score on the held-out split
model_B = BernoulliNB()
model_B.fit(training[features], training['crime'])
predicted2 = np.array(model_B.predict_proba(testing[features]))
log_loss(testing['crime'], predicted2)
# predicting on the test data, using the Bernoulli model
predicted3 = model_B.predict_proba(test_data[features])

#Write results
result=pd.DataFrame(predicted3, columns=df_crime.classes_)

# This is an example of one of the columns that I would like to graph
result['SUICIDE']
0         0.000432
1         0.000432
2         0.000760
3         0.000903
4         0.000903
5         0.001089
6         0.000903
7         0.000903
8         0.000550
9         0.000744
10        0.000903
11        0.000550
12        0.000550
13        0.000744
14        0.000744
15        0.000219
16        0.001089
17        0.000903
18        0.000760
19        0.000760
20        0.000760
21        0.000550
22        0.000744
23        0.000903
24        0.000760
25        0.000787
26        0.000760
27        0.000265
28        0.000903
29        0.001089

1 answer:

Answer 0 (score: 0)

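What you expect from the output is quite vague, but I think you should take a look at the seaborn package, in particular the tutorial section on visualising univariate datasets. It should give you several examples and some ideas for how to visualize your output.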
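As a starting point for a single column, a histogram is one common way to look at a univariate distribution. A bar chart with one bar per row probably merges into a solid block when there are many rows, whereas a histogram groups the values into bins first. Below is a minimal sketch using plain matplotlib and only the 30 sample values of result['SUICIDE'] shown in the question. The bin count, the axis labels, and the output file name are assumptions chosen for illustration, not anything taken from the original code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

# The 30 sample values of result['SUICIDE'] shown in the question.
# With the real data, use: suicide = result['SUICIDE']
suicide = pd.Series([
    0.000432, 0.000432, 0.000760, 0.000903, 0.000903, 0.001089,
    0.000903, 0.000903, 0.000550, 0.000744, 0.000903, 0.000550,
    0.000550, 0.000744, 0.000744, 0.000219, 0.001089, 0.000903,
    0.000760, 0.000760, 0.000760, 0.000550, 0.000744, 0.000903,
    0.000760, 0.000787, 0.000760, 0.000265, 0.000903, 0.001089,
], name="SUICIDE")

fig, ax = plt.subplots(figsize=(6, 4))

# Group the probabilities into bins instead of drawing one bar per row.
# bins=10 is an arbitrary choice; a larger data set may need more bins.
counts, bin_edges, _ = ax.hist(suicide, bins=10)

ax.set_xlabel("Predicted probability of SUICIDE")
ax.set_ylabel("Number of rows")
ax.set_title("Distribution of predicted probabilities")
fig.tight_layout()

# Hypothetical output file name; call plt.show() instead when working interactively.
fig.savefig("suicide_hist.png")
```

The same call should work on the full column, since the histogram only needs the values and not their row index. If the probabilities turn out to take only a handful of distinct values, counting them with suicide.value_counts() and plotting those counts may be easier to read than a histogram.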