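I don't think I'll find a similar version of this question, since I believe it is fairly unique, but if I'm wrong, please point me in the right direction. I'm working with the following vector, which needs to be converted to a data frame: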
myvec = structure(c(1.03, 2.3, -1.2, -0.09, -0.31, -0.51, 3.4, 3, 0.07,
0.02, 1.05, -0.02, 2.03), .Names = c("Intercept", "DEF-1017",
"DEF-1025", "DEF-103", "DEF-1043", "DEF-1046", "DEF-1048", "DEF-1076",
"OFF-1017", "OFF-1025", "OFF-103", "OFF-1046", "OFF-1076"))
head(myvec)
Intercept DEF-1017 DEF-1025 DEF-103 DEF-1043 DEF-1046
1.03 2.30 -1.20 -0.09 -0.31 -0.51
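The vector is supposed to hold offensive (OFF) and defensive (DEF) coefficients for 7 different users (users 1017, 1025, 103, 1043, 1046, 1048, 1076), but the offensive coefficient is missing for two of them (1043 and 1048). I need to convert it to a data frame with 4 columns (defensive ID, offensive ID, defensive coefficient, offensive coefficient). More specifically, I'd like to obtain the following data frame, with the missing values filled in as NA: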
mydf = structure(list(DEFID = c("DEF-1017", "DEF-1025", "DEF-103", "DEF-1043",
"DEF-1046", "DEF-1048", "DEF-1076"), OFFID = c("OFF-1017", "OFF-1025",
"OFF-103", NA, "OFF-1046", NA, "OFF-1076"), DEFVAL = c(2.3, -1.2,
-0.09, -0.31, -0.51, 3.4, 3), OFFVAL = c(0.07, 0.02, 1.05, NA,
-0.02, NA, 2.03)), .Names = c("DEFID", "OFFID", "DEFVAL", "OFFVAL"
), row.names = c(NA, -7L), class = "data.frame")
mydf
DEFID OFFID DEFVAL OFFVAL
1 DEF-1017 OFF-1017 2.30 0.07
2 DEF-1025 OFF-1025 -1.20 0.02
3 DEF-103 OFF-103 -0.09 1.05
4 DEF-1043 <NA> -0.31 NA
5 DEF-1046 OFF-1046 -0.51 -0.02
6 DEF-1048 <NA> 3.40 NA
7 DEF-1076 OFF-1076 3.00 2.03
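The Intercept value is dropped (not included in the table), and everything else is formatted as shown. Any help is greatly appreciated, thanks!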
Answer 0 (score: 0)
I use the tidyr package for tasks like this.
First, convert the vector to a data frame:
library(tidyverse)
df <- data_frame(names= names(myvec),
values=myvec)
Next, filter out the intercept and reshape with tidyr commands:
df %>% filter(names !="Intercept") %>%
extract(names, into=c("coeff", "user"), "([[:alnum:]]+)-([[:alnum:]]+)") %>%
spread(coeff, values)
# A tibble: 7 x 3
user DEF OFF
* <chr> <dbl> <dbl>
1 1017 2.30 0.07
2 1025 -1.20 0.02
3 103 -0.09 1.05
4 1043 -0.31 NA
5 1046 -0.51 -0.02
6 1048 3.40 NA
7 1076 3.00 2.03
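In more recent tidyr releases (1.0.0 and later), spread() is superseded by pivot_wider(). Here is a minimal sketch of the same reshaping step with the newer function. It rebuilds myvec from the question so it can run on its own; the object name wide is just an illustrative choice, not part of the original answer.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Same data as in the question, rebuilt so this snippet is self-contained
myvec <- setNames(
  c(1.03, 2.3, -1.2, -0.09, -0.31, -0.51, 3.4, 3, 0.07, 0.02, 1.05, -0.02, 2.03),
  c("Intercept", "DEF-1017", "DEF-1025", "DEF-103", "DEF-1043", "DEF-1046",
    "DEF-1048", "DEF-1076", "OFF-1017", "OFF-1025", "OFF-103", "OFF-1046",
    "OFF-1076")
)

wide <- tibble(names = names(myvec), values = unname(myvec)) %>%
  filter(names != "Intercept") %>%
  # split "DEF-1017" into coeff = "DEF" and user = "1017"
  extract(names, into = c("coeff", "user"), "([[:alnum:]]+)-([[:alnum:]]+)") %>%
  # one row per user, one column per coefficient type; absent pairs become NA
  pivot_wider(names_from = coeff, values_from = values)

wide
```

Users without an OFF entry (1043 and 1048) get NA in the OFF column, just as with spread().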
If you want the names and so on to match what you listed above exactly, it just takes a little more processing:
df %>% filter(names !="Intercept") %>%
extract(names, into=c("coeff", "user"), "([[:alnum:]]+)-([[:alnum:]]+)") %>%
spread(coeff, values) %>%
mutate(DEFID = paste("DEF", user, sep="-"),
OFFID = paste("OFF", user, sep="-")) %>%
rename(DEFVAL=DEF,
OFFVAL=OFF) %>%
select(DEFID, OFFID, DEFVAL, OFFVAL)
# A tibble: 7 x 4
DEFID OFFID DEFVAL OFFVAL
<chr> <chr> <dbl> <dbl>
1 DEF-1017 OFF-1017 2.30 0.07
2 DEF-1025 OFF-1025 -1.20 0.02
3 DEF-103 OFF-103 -0.09 1.05
4 DEF-1043 OFF-1043 -0.31 NA
5 DEF-1046 OFF-1046 -0.51 -0.02
6 DEF-1048 OFF-1048 3.40 NA
7 DEF-1076 OFF-1076 3.00 2.03
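Note that this output still shows OFF-1043 and OFF-1048 in the OFFID column, whereas the desired data frame in the question has NA there. A small variation, sketched below, sets OFFID to NA whenever the offensive coefficient is missing. This is an adjustment to the pipeline above rather than part of the original answer, and the name result is only illustrative.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Same data as in the question, rebuilt so this snippet is self-contained
myvec <- setNames(
  c(1.03, 2.3, -1.2, -0.09, -0.31, -0.51, 3.4, 3, 0.07, 0.02, 1.05, -0.02, 2.03),
  c("Intercept", "DEF-1017", "DEF-1025", "DEF-103", "DEF-1043", "DEF-1046",
    "DEF-1048", "DEF-1076", "OFF-1017", "OFF-1025", "OFF-103", "OFF-1046",
    "OFF-1076")
)

result <- tibble(names = names(myvec), values = unname(myvec)) %>%
  filter(names != "Intercept") %>%
  extract(names, into = c("coeff", "user"), "([[:alnum:]]+)-([[:alnum:]]+)") %>%
  spread(coeff, values) %>%
  mutate(
    DEFID = paste("DEF", user, sep = "-"),
    # only build an OFF id when an OFF coefficient actually exists
    OFFID = if_else(is.na(OFF), NA_character_, paste("OFF", user, sep = "-"))
  ) %>%
  select(DEFID, OFFID, DEFVAL = DEF, OFFVAL = OFF)

result
```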
Answer 1 (score: 0)
This does exactly what you want. I used split, substr and merge, and I think this is one of the simplest ways to produce the output you want.
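The code for this answer is not shown here, so the following is only a sketch of how a base-R approach built on substr, split and merge might look; it is an assumption, not the answerer's actual code. The intermediate names (coefs, long, parts, def, off) are illustrative.

```r
# Same data as in the question, rebuilt so this snippet is self-contained
myvec <- setNames(
  c(1.03, 2.3, -1.2, -0.09, -0.31, -0.51, 3.4, 3, 0.07, 0.02, 1.05, -0.02, 2.03),
  c("Intercept", "DEF-1017", "DEF-1025", "DEF-103", "DEF-1043", "DEF-1046",
    "DEF-1048", "DEF-1076", "OFF-1017", "OFF-1025", "OFF-103", "OFF-1046",
    "OFF-1076")
)

# Drop the intercept
coefs <- myvec[names(myvec) != "Intercept"]

# substr pulls out the "DEF"/"OFF" prefix and the user number after the dash
long <- data.frame(
  ID   = names(coefs),
  VAL  = unname(coefs),
  type = substr(names(coefs), 1, 3),
  user = substr(names(coefs), 5, nchar(names(coefs))),
  stringsAsFactors = FALSE
)

# split into one data frame per coefficient type
parts <- split(long[c("user", "ID", "VAL")], long$type)
def <- setNames(parts$DEF, c("user", "DEFID", "DEFVAL"))
off <- setNames(parts$OFF, c("user", "OFFID", "OFFVAL"))

# Left join on user: users without an OFF entry get NA in OFFID and OFFVAL
mydf <- merge(def, off, by = "user", all.x = TRUE)
mydf <- mydf[c("DEFID", "OFFID", "DEFVAL", "OFFVAL")]
mydf
```

Using all.x = TRUE keeps every defensive row, which is what leaves NA for users 1043 and 1048.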