Question

I'm a bit confused. I have data like this in a data frame:

    index  times
1       1  56.60
2       1 150.75
3       1 204.41
4       2  44.71
5       2  98.03
6       2 112.20

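I know that the times for index 1 are biased, while the times for index 2 are not. I need to create a copy of the data frame with the bias removed from the index 1 samples. I have been trying several combinations of apply, by, and the like. The closest I got is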

by(lct, lct$index, function(x) { if(x$index == 1) x$times = x$times-50 else x$times = x$times } )

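which returns an object of class by, and that is unusable for my purposes. I need to write the data back to a csv file in the same format as the original file (index, times). Ideas?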

Answer 1

Something like this should work:

df$times[df$index == 1] <- df$times[df$index == 1] - 50

The trick here is to take the subset of df$times that matches your filter, and to realize that R can also assign to a subset.
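As a minimal sketch of that subset assignment on the question's own data, done on a copy so the original stays untouched. The names lct_fixed and biased are only illustrative, and the offset of 50 is taken from the attempt in the question:

```r
# Rebuild the question's data frame (called lct in the question).
lct <- data.frame(index = c(1, 1, 1, 2, 2, 2),
                  times = c(56.60, 150.75, 204.41, 44.71, 98.03, 112.20))

# Work on a copy; lct itself is not modified.
lct_fixed <- lct

# Logical mask that is TRUE for the biased rows (index 1).
biased <- lct_fixed$index == 1

# Assigning to a subset replaces only the selected elements.
lct_fixed$times[biased] <- lct_fixed$times[biased] - 50
```

Using the same mask on both sides of the assignment keeps the selected rows and the replacement values aligned.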

Alternatively, you can use ifelse:

df$times = ifelse(df$index == 1, df$times - 50, df$times)

And using it with dplyr:

library(dplyr)
df = data.frame(index = sample(1:5, 100, replace = TRUE), 
                value = runif(100)) %>% arrange(index)
df %>% mutate(value = ifelse(index == 1, value - 50, value))
#  index     value
#1     1 -49.95827
#2     1 -49.98104
#3     1 -49.44015
#4     1 -49.37316
#5     1 -49.76286
#6     1 -49.22133
#etc

Answer 2

How about:

index <- c(1, 1, 1, 2, 2, 2)
times <- c(56.60, 150.75, 204.41, 44.71, 98.03, 112.20)
df <- data.frame(index, times)
df$times <- ifelse(df$index == 1, df$times - 50, df$times)


> df
#index  times
#1     1   6.60
#2     1 100.75
#3     1 154.41
#4     2  44.71
#5     2  98.03
#6     2 112.20
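
The question also asks to write the result back to a csv file in the original (index, times) layout. A possible sketch using base R's write.csv, assuming the row names should not be written; out_file is a placeholder path, not something from the question:

```r
# Same data and correction as above.
df <- data.frame(index = c(1, 1, 1, 2, 2, 2),
                 times = c(56.60, 150.75, 204.41, 44.71, 98.03, 112.20))
df$times <- ifelse(df$index == 1, df$times - 50, df$times)

# Placeholder output path; replace with the real file name.
out_file <- tempfile(fileext = ".csv")

# row.names = FALSE keeps only the index and times columns in the file.
write.csv(df, out_file, row.names = FALSE)

# Read the file back to confirm the layout.
check <- read.csv(out_file)
```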

How to manipulate data by row in a data frame
