Recoding the values of a variable in a large dataset in R

Time: 2018-05-24 07:52:33

Tags: r

I am trying to find a short way to create a new variable conditional on another variable. More specifically, suppose the variable "x" has 5 categories for each of the letters A to G, 35 categories in total, and I want to create a new variable "y" that takes the integers 1 to 35 conditional on "x".

This is what I have been doing:

A1,A2,...,A5, B1,B2,...,B5,...,G5

2 Answers:

Answer 0 (score: 0)

Summary of the comments:

If the categories of x sort into the order in which you want to number them, use as.numeric(as.factor()):

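# example data
# stringsAsFactors = TRUE keeps x a factor on R >= 4.0, where the default changed to FALSE
df <- data.frame(x = as.vector(sapply(LETTERS[1:7], paste0, 1:5)), stringsAsFactors = TRUE)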

# new variable
df$y <- as.numeric(as.factor(df$x))
# note that you'd need to wrap it in as.character() if you want your numbers to be characters, 
# not integers

If your data are not sorted, you can use car::Recode():

# install package if not already present
install.packages("car")

# new variable
df$z <- car::Recode(df$x, paste(paste0("'", levels(df$x), "' = '", 1:35, "'"), collapse = "; "))
# if you want your numbers to be integers, not characters, use this for the paste0():
# paste0("'", levels(df$x), "' = ", 1:35)
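
If you would rather not depend on an extra package, a base R lookup with match() produces the same kind of mapping. This is only a sketch of an alternative, not the method from the answer; the `lookup` vector, the data frame `df2`, and the shuffled example data are assumptions made for illustration.

```r
# the 35 categories, listed in the order they should be numbered
lookup <- as.vector(sapply(LETTERS[1:7], paste0, 1:5))

# hypothetical example data in arbitrary order, with repeats
set.seed(1)
df2 <- data.frame(x = sample(lookup, 100, replace = TRUE))

# the position of each value in lookup is its integer code (NA if not found)
df2$y <- match(df2$x, lookup)
head(df2)
```

Because the numbering comes from the position in `lookup`, it does not depend on how the rows of the data are ordered or on how the labels happen to sort.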

Output:

> df
    x  y  z
1  A1  1  1
2  A2  2  2
3  A3  3  3
4  A4  4  4
5  A5  5  5
6  B1  6  6
7  B2  7  7
8  B3  8  8
9  B4  9  9
10 B5 10 10
11 C1 11 11
12 C2 12 12
13 C3 13 13
14 C4 14 14
15 C5 15 15
16 D1 16 16
17 D2 17 17
18 D3 18 18
19 D4 19 19
20 D5 20 20
21 E1 21 21
22 E2 22 22
23 E3 23 23
24 E4 24 24
25 E5 25 25
26 F1 26 26
27 F2 27 27
28 F3 28 28
29 F4 29 29
30 F5 30 30
31 G1 31 31
32 G2 32 32
33 G3 33 33
34 G4 34 34
35 G5 35 35

Object classes for the example code:

> sapply(df, class)
        x         y         z 
 "factor" "numeric"  "factor" 

Answer 1 (score: 0)

Following @Roland, this code should do it:

# create your data frame; x must be a factor for the next step
# (automatic before R 4.0, so stringsAsFactors = TRUE is set explicitly here)
df <- data.frame(x = sort(paste0(rep(LETTERS[1:7],5), 1:5)), stringsAsFactors = TRUE)
str(df)

# Returns the factor level as an integer
df$y <- as.integer(df$x)

One thing to watch out for when working with factors: check that your levels are in the right order (A1 -> 1, A2 -> 2, etc.).
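
One way to carry out that check is to print the levels next to their integer codes, and to pass levels = explicitly when the default sorted order is not the numbering you want. This is a minimal sketch; the labels below (including "A10") are hypothetical and do not come from the question.

```r
# hypothetical labels; "A10" sorts before "A2" as a string
x <- c("B2", "A10", "A2", "A1", "B2")

# default: the levels are the sorted unique values
f_default <- factor(x)
codes_default <- as.integer(f_default)

# explicit: state the order the integers should follow
wanted <- c("A1", "A2", "A10", "B2")
f_explicit <- factor(x, levels = wanted)
codes_explicit <- as.integer(f_explicit)

# inspect the level-to-integer mapping before trusting it
data.frame(level = levels(f_explicit), code = seq_along(levels(f_explicit)))
```

With the labels from the question (A1 to G5, single digits only), the default sorted order should already match the intended numbering, so the explicit levels matter mainly when labels contain multi-digit numbers or mixed case.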