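I am working on classifying Arduino posts into hardware and software categories. I prepared the training set by hand. However, when I run the classifier on the test set, every post is predicted as "Hardware". Is there an error in the format of the training set? Is naiveBayes unable to take a whole sentence as the input for a prediction?

The training set format is: class "\t" pred "\t" set. The classifier uses the set column as the label and the pred column as the predictor. The class column is only used to create the set column.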
# Programmed in R
library(e1071)
train = read.table("train_set.csv", sep="\t", header=T)
test = read.table("test_one.csv", sep="\t", header=T)
train$set = "Hardware"
train[train$class==0,]$set = "Software"
train$set = as.factor(train$set)
model <- naiveBayes(set ~ pred, data = train)
pred <- predict(model, train[495:510,]) # displays predictions for rows of the training set
pred1 <- predict(model, test[1:10,]) # displays incorrect predictions for the test set
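The two assignments that build the `set` column can also be written as a single `factor()` call. This is a minimal sketch on a toy data frame (the four `class` values are made up and stand in for the `class` column of train_set.csv), assuming 1 means hardware and 0 means software as described in the question:

```r
# Toy stand-in for the `class` column of train_set.csv (values are invented).
train <- data.frame(class = c(1, 1, 0, 0))

# Map 1 -> "Hardware" and 0 -> "Software" in one step.
# Listing level 1 first keeps the level order Hardware, Software,
# the same order as.factor() produces alphabetically in the original code.
train$set <- factor(train$class, levels = c(1, 0),
                    labels = c("Hardware", "Software"))

# Cross-tabulate to check that each class maps to exactly one label.
table(train$class, train$set)
```

Spelling out the mapping this way avoids first labelling every row "Hardware" and then overwriting part of the column.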
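Training data set (delimiter = \t; only 4 of the 1000 rows are shown)
1 stands for hardware, 0 stands for software. In the program, a column named "set" is added to hold "Hardware" or "Software", corresponding to 1 and 0.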
class pred
1 Im making a simple Arduino web server and I want to keep it turned on all the time. So it must endure to stay working continuously. Im using an Arduino Uno with a Ethernet Shield.Its powered with a simple outlet power supply 5V @ 1A. My Questions: Will I have any problems leaving the Arduino turned on all the time? Is there some other Arduino board better recommended for this? Are there any precautions that I need to heed regarding this?
1 Put plainly: is there a way to get an HTTPS connection on the Arduino? I have been looking in to it and I have found it is impossible with the standard library and the Ethernet shield but is there a custom library that can do it? What about a coprocessor i.e. like the WiFi shield has? Anyone know if the Arduino yn has ssl?
0 The use of malloc and free seems pretty rare in the Arduino world. It is used in pure AVR C much more often but still with caution. Is it a really bad idea to use malloc and free with Arduino?
0 What do I need to build a shield capable of receiving 1080p video from USB camera timestamp each frame and send the frame to memory card?
Test data set
pred
arduino-uno web-server ethernet i'm making a simple arduino web server and i want to keep it turned on all the time. so it must endure to stay working continuously. i'm using an arduino uno with a ethernet shield.it's powered with a simple outlet power supply 5v @ 1a. my questions: will i have any problems leaving the arduino turned on all the time? is there some other arduino board better recommended for this? are there any precautions that i need to heed regarding this?
I made a circuit which in my intentions would allow me to toggle a LED dimming loop. Problem is that once I push the button the first time pushing it a second time doesnt toggle the LED loop off. Here is the code: const int LED = 9; // the pin for the LEDconst int BUTTON = 7;int val = LOW;int old_val = LOW;int state = 0;int i = 0;void setup{ pinModeLED OUTPUT; pinModeBUTTON INPUT;}void loop{ val = digitalReadBUTTON; if val == HIGH && old_val==LOW { state = 1 - state; delay10; } old_val = val; if state == 1 { for i = 0; i < 255; i++ // loop from 0 to 254 fade in { analogWriteLED i; // set the LED brightness delay10; // Wait 10ms because analogWrite // is instantaneous and we would // not see any change } for i = 255; i > 0; i-- // loop from 255 to 1 fade out { analogWriteLED i; // set the LED brightness delay10; // Wait 10m
Expected output: Hardware Software
Answer 0 (score: 0)
library(e1071)
library(tm)
library(MASS)
library(SnowballC)
train = read.table("train_set.csv", sep="\t", header=T)
test = read.table("test_set.csv", sep="\t", header=T)
#stopwords
mystopwords <- c(stopwords("english"),"week","arduino","words","need","get","will","want","know","work","also")
#corpus for train set
train.corpus <- Corpus(VectorSource(train$pred))
train.corpus <- tm_map(train.corpus, content_transformer(tolower))
train.corpus <- tm_map(train.corpus, removePunctuation)
train.corpus <- tm_map(train.corpus, stripWhitespace)
train.corpus <- tm_map(train.corpus, removeNumbers)
train.corpus <- tm_map(train.corpus, removeWords, mystopwords)
train.corpus <- tm_map(train.corpus, stemDocument)
train.corpus <- tm_map(train.corpus, removeWords, "(http)\\w+")
train.corpus <- tm_map(train.corpus, removeWords, "\\b[a-zA-Z0-9]{10,100}\\b")
train.corpus.dtm <- DocumentTermMatrix(train.corpus, control = list(weighting = function(x) weightTfIdf(x, normalize = FALSE), stopwords = TRUE, removePunctuation=TRUE))
train.corpus.dtms <- removeSparseTerms(train.corpus.dtm, 0.98)
#Debugging
#TermDocumentMatrix(train.corpus)
#inspect(train.corpus.dtm)
#findFreqTerms(train.corpus.dtm, N) #N <- freq
#corpus for test set
test.corpus <- Corpus(VectorSource(test$pred))
test.corpus <- tm_map(test.corpus, content_transformer(tolower))
test.corpus <- tm_map(test.corpus, removePunctuation)
test.corpus <- tm_map(test.corpus, stripWhitespace)
test.corpus <- tm_map(test.corpus, removeNumbers)
test.corpus <- tm_map(test.corpus, removeWords, mystopwords)
test.corpus <- tm_map(test.corpus, stemDocument)
test.corpus <- tm_map(test.corpus, removeWords, "(http)\\w+")
test.corpus <- tm_map(test.corpus, removeWords, "\\b[a-zA-Z0-9]{10,100}\\b")
test.corpus.dtm <- DocumentTermMatrix(test.corpus, control = list(weighting = function(x) weightTfIdf(x, normalize = FALSE), stopwords = TRUE, removePunctuation=TRUE))
test.corpus.dtms <- removeSparseTerms(test.corpus.dtm, 0.98)
m <- as.matrix(train.corpus.dtms)
n <- as.matrix(test.corpus.dtms)
#Train model
model <- naiveBayes(m, as.factor(train$class))
#Prediction
results <- predict(model,n[1:10,])
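In the code above, the training and test document-term matrices are built independently, so their columns (terms) need not match. One possible way to keep them aligned is to run both sets through a single cleaning helper and pass the training vocabulary to the test matrix through the `dictionary` control option of `DocumentTermMatrix`. The sketch below is only an illustration: `clean_corpus` is a hypothetical helper, the four documents are made up, and stemming and TF-IDF weighting are left out for brevity.

```r
library(tm)

# Hypothetical helper: one cleaning pipeline shared by train and test text.
clean_corpus <- function(text, extra_stopwords = character(0)) {
  corpus <- VCorpus(VectorSource(text))
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, removeNumbers)
  corpus <- tm_map(corpus, removeWords,
                   c(stopwords("english"), extra_stopwords))
  tm_map(corpus, stripWhitespace)
}

# Made-up documents standing in for train$pred and test$pred.
train_docs <- c("Solder the resistor to the Ethernet shield.",
                "Is malloc safe inside the loop function?")
test_docs <- c("Which resistor protects the LED?",
               "Calling free after malloc in a function")

train_dtm <- DocumentTermMatrix(clean_corpus(train_docs, "arduino"))

# Restrict the test matrix to terms seen during training.
test_dtm <- DocumentTermMatrix(
  clean_corpus(test_docs, "arduino"),
  control = list(dictionary = Terms(train_dtm))
)

# Terms in the test matrix that the model never saw (should be none).
setdiff(Terms(test_dtm), Terms(train_dtm))
```

With this arrangement, `removeSparseTerms` would be applied to the training matrix only, and the test matrix would inherit whatever vocabulary survives.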
The next step is to add 10-fold cross-validation to this classifier to check its performance; I am stuck on that for now.
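For the 10-fold cross-validation step, one possible approach is to assign every row to a random fold, fit `naiveBayes` on nine folds, and score it on the remaining one. The following is a rough sketch, not a tested solution for this data set: `cv_naive_bayes` is a hypothetical helper, and the toy matrix is randomly generated only so that the example runs on its own.

```r
library(e1071)

# Hypothetical helper: per-fold accuracy of naiveBayes under k-fold CV.
# x: numeric feature matrix (one row per document), y: factor of labels.
cv_naive_bayes <- function(x, y, k = 10, seed = 1) {
  set.seed(seed)
  # Randomly assign each row to one of k folds of near-equal size.
  folds <- sample(rep(seq_len(k), length.out = nrow(x)))
  acc <- numeric(k)
  for (i in seq_len(k)) {
    held_out <- folds == i
    # Fit on the other k - 1 folds, predict on the held-out fold.
    model <- naiveBayes(x[!held_out, , drop = FALSE], y[!held_out])
    pred <- predict(model, x[held_out, , drop = FALSE])
    acc[i] <- mean(pred == y[held_out])
  }
  acc
}

# Randomly generated toy data standing in for the matrix `m` (not real results).
set.seed(42)
labels <- factor(rep(c(0, 1), each = 50))
toy <- cbind(wire = rnorm(100, mean = as.integer(labels)),
             code = rnorm(100, mean = -as.integer(labels)))

fold_acc <- cv_naive_bayes(toy, labels, k = 10)
mean(fold_acc)  # average accuracy over the 10 folds
```

On the real data the call would presumably be `cv_naive_bayes(m, as.factor(train$class))`. Note that `m` was built from the whole training set, so sparse-term removal and TF-IDF weights have already seen the held-out rows; a stricter evaluation would rebuild the document-term matrix inside each fold.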