I have just started learning R and I am getting an error in my loop.
This is the error I get:
Error in names(data1)[5] <- "TR" :
'names' attribute [5] must be the same length as the vector [4]
This is my code:
#read data
data1<-read.table("C_0.txt",header = T,sep=",")
#name the 5th col TR
names(data1)[5]<-"TR"
#calculate the length of data1
n<-nrow(data1)
#initialize first TR value to NA
data1[1,5]<-NA
for (i in 1:(n-1)){
if (data1[i,3]==data1[i+1,3]) {data1[i+1,5]<-data1[i,5]}
if (data1[i,3]< data1[i+1,3]) {data1[i+1,5]<- 1}
if(data1[i,3]> data1[i+1,3]) {data1[i+1,5]<- -1}
}
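This is the algorithm I want to write:
If the current price is higher than the previous price, mark +1 in the column named TR.
If the current price is lower than the previous price, mark -1 in the column named TR.
If the current price is the same as the previous price, carry over the previous row's TR value.
I set the first TR row to NA because there is no earlier price to compare it with.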
Here is the data from C_0.txt:
Date,Time,Price,Size
02/18/2014,05:06:13,49.6,200
02/18/2014,05:06:13,49.6,200
02/18/2014,05:06:13,49.6,200
02/18/2014,05:06:14,49.6,200
02/18/2014,05:06:14,49.6,193
02/18/2014,05:44:41,49.62,100
02/18/2014,06:26:36,49.52,100
02/18/2014,06:26:36,49.52,500
02/18/2014,07:09:29,49.6,100
02/18/2014,07:56:40,49.56,300
02/18/2014,07:56:40,49.55,400
02/18/2014,07:56:41,49.54,200
02/18/2014,07:56:43,49.55,100
02/18/2014,07:56:43,49.55,100
02/18/2014,07:56:50,49.55,100
02/18/2014,07:57:12,49.53,100
02/18/2014,07:57:12,49.51,2200
02/18/2014,07:57:12,49.51,100
02/18/2014,07:57:12,49.5,200
Thank you very much!
Answer 0 (score: 2)
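Adding a column to a data frame is fairly basic, and this question summarizes the ways to do it.
You cannot add a column to a data frame by extending its names attribute: names can only label columns that already exist, which is why R complains that the attribute has length 5 while the data frame has 4 columns. You have to create the column first, using one of the methods in this question.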
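As a minimal sketch of the difference (the one-row data frame `df` below is made up for illustration and only mirrors the four columns of C_0.txt), assigning to a new column name creates the column, whereas renaming a fifth column that does not exist fails:

```r
# Toy data frame with the same four columns as C_0.txt (illustrative values)
df <- data.frame(Date = "02/18/2014", Time = "05:06:13",
                 Price = 49.6, Size = 200)

# names(df)[5] <- "TR" would fail here, as in the question:
# there is no fifth column to name yet.

# Create the column first; assigning to a new name adds it.
df$TR <- NA_real_

names(df)
```

Using `NA_real_` rather than plain `NA` makes the new column numeric from the start, so later assignments of 1 and -1 do not change its type.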
As for the rest of the question, we can use diff and rle instead of a for loop.
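To see what these functions return before applying them to the real data, here is a small sketch on a made-up price vector `p` (not taken from C_0.txt):

```r
p <- c(10, 10, 10, 11, 9, 9)   # hypothetical prices

d1 <- diff(p)    # change from each element to the next
s  <- sign(d1)   # direction of each change: -1, 0 or 1
r  <- rle(s)     # consecutive equal directions collapsed into runs

s
r$values         # the direction of each run
r$lengths        # how many consecutive elements each run covers
```

`inverse.rle(r)` expands the runs back into the original vector, which is what allows the run lengths to be edited and then turned back into a column.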
d <- read.table(sep = ",", text =
"Date,Time,Price,Size
02/18/2014,05:06:13,49.6,200
02/18/2014,05:06:13,49.6,200
02/18/2014,05:06:13,49.6,200
02/18/2014,05:06:14,49.6,200
02/18/2014,05:06:14,49.6,193
02/18/2014,05:44:41,49.62,100
02/18/2014,06:26:36,49.52,100
02/18/2014,06:26:36,49.52,500
02/18/2014,07:09:29,49.6,100
02/18/2014,07:56:40,49.56,300
02/18/2014,07:56:40,49.55,400
02/18/2014,07:56:41,49.54,200
02/18/2014,07:56:43,49.55,100
02/18/2014,07:56:43,49.55,100
02/18/2014,07:56:50,49.55,100
02/18/2014,07:57:12,49.53,100
02/18/2014,07:57:12,49.51,2200
02/18/2014,07:57:12,49.51,100
02/18/2014,07:57:12,49.5,200", header = TRUE)
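# price change from the previous row; the first row has nothing to compare with
tmp <- c(NA, diff(d$Price))
# direction of the change: -1, 0 or 1
tmp <- sign(tmp)
# collapse consecutive equal directions into runs
tmp <- rle(tmp)
l <- tmp$lengths
v <- tmp$values
# runs of 0 mean "price unchanged": fold each one into the run before it
idx <- which(v == 0L)
# guard: with no zero runs, l[-idx] would drop every element
if (length(idx) > 0) {
  l[idx - 1] <- l[idx - 1] + l[idx]
  l <- l[-idx]
  v <- v[-idx]
}
d$TR <- inverse.rle(list(lengths = l, values = v))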
# Date Time Price Size TR
# 1 02/18/2014 05:06:13 49.60 200 NA
# 2 02/18/2014 05:06:13 49.60 200 NA
# 3 02/18/2014 05:06:13 49.60 200 NA
# 4 02/18/2014 05:06:14 49.60 200 NA
# 5 02/18/2014 05:06:14 49.60 193 NA
# 6 02/18/2014 05:44:41 49.62 100 1
# 7 02/18/2014 06:26:36 49.52 100 -1
# 8 02/18/2014 06:26:36 49.52 500 -1
# 9 02/18/2014 07:09:29 49.60 100 1
# 10 02/18/2014 07:56:40 49.56 300 -1
# 11 02/18/2014 07:56:40 49.55 400 -1
# 12 02/18/2014 07:56:41 49.54 200 -1
# 13 02/18/2014 07:56:43 49.55 100 1
# 14 02/18/2014 07:56:43 49.55 100 1
# 15 02/18/2014 07:56:50 49.55 100 1
# 16 02/18/2014 07:57:12 49.53 100 -1
# 17 02/18/2014 07:57:12 49.51 2200 -1
# 18 02/18/2014 07:57:12 49.51 100 -1
# 19 02/18/2014 07:57:12 49.50 200 -1
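If you would rather keep the loop from the question, a corrected version could look like the sketch below. It is not part of the approach above; `d2` is a hypothetical name, and the five rows are just a subset of C_0.txt chosen to cover a rise, a fall and a tie.

```r
# A few rows from C_0.txt, read inline so the example is self-contained
d2 <- read.table(header = TRUE, sep = ",", text =
"Date,Time,Price,Size
02/18/2014,05:06:14,49.6,193
02/18/2014,05:44:41,49.62,100
02/18/2014,06:26:36,49.52,100
02/18/2014,06:26:36,49.52,500
02/18/2014,07:09:29,49.6,100")

# Create the TR column before filling it; the first row stays NA
d2$TR <- NA_real_

for (i in seq_len(nrow(d2) - 1)) {
  change <- sign(d2$Price[i + 1] - d2$Price[i])
  # on a tie carry the previous mark forward, otherwise store +1 or -1
  d2$TR[i + 1] <- if (change == 0) d2$TR[i] else change
}

d2$TR
```

The only structural change from the original code is that the column is created with `d2$TR <- NA_real_` instead of being named through `names()`; the three `if` statements are folded into one `sign()` call.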