My question is about standardizing column b. I need the data in a format that makes it easier to build plots.
Sample data and the resulting output:
a<- c("Jackson Brice / The Shocker","Flash Thompson", "Mr. Harrington","Mac Gargan","Betty Brant", "Ann Marie Hoag","Steve Rogers / Captain America", "Pepper Potts", "Karen")
b<- c("2:30", "2:15", "2", "1:15", "1:15", "1", ":55",":45", "v")
ab <- cbind.data.frame(a,b)
  a                              b
1 Jackson Brice / The Shocker    2:30
2 Flash Thompson                 2:15
3 Mr. Harrington                 2
4 Mac Gargan                     1:15
5 Betty Brant                    1:15
6 Ann Marie Hoag                 1
7 Steve Rogers / Captain America :55
8 Pepper Potts                   :45
9 Karen                          v
If possible, the values in column b should end up in a time format that can be operated on.
Answer 0 (score: 1)
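I had to make some assumptions about what you are trying to do, e.g. the units and what you want to happen with the character values, but hopefully this function gives you something to work with.
The biggest challenge with times is that parsing them from text needs fairly clear rules. Because your values come in several different shapes, I had to put a few if statements in the function to make it work. Wherever possible, try to keep your time formats as consistent as you can.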
library(lubridate)
library(data.table)  # provides rbindlist()

formatTime <- function(x) {
  # Check for a ":" separator in the text
  if (grepl(":", x, fixed = TRUE)) {
    y <- unlist(strsplit(x, ":", fixed = TRUE))
    # If there is no value before the ":", put "00" in front of it
    if (y[1] == "") {
      z <- ms(paste("00", y[2], sep = ":"), quiet = TRUE)
    } else {
      z <- ms(paste(y, collapse = ":"), quiet = TRUE)
    }
  } else {
    # If there is no ":", append ":00" so the value is read as whole minutes
    z <- ms(paste(x, "00", sep = ":"), quiet = TRUE)
  }
  # If it did not parse with ms(), i.e. it was a character, assign zero time "0:00"
  if (is.na(z)) z <- ms("0:00")
  # Converted to a duration due to issues returning a period with lapply.
  # Return a data frame so lapply keeps the units and the column name.
  return(data.frame(time = as.duration(z)))
}

# Convert factor variable to character
ab$b <- as.character(ab$b)
ab <- cbind(ab, rbindlist(lapply(ab$b, formatTime)))
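I started out trying to use a period, but it would not come back correctly from the apply call, so I converted to a duration. This may display differently from your example, but it should work for plotting. If I have missed what you need, let me know and I will update the answer.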
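For comparison, the same parsing rules can be sketched in base R without lubridate, returning plain seconds that plot directly. This is only an illustration under the same assumption as above (values are minutes:seconds, and unparseable text counts as zero); `to_seconds` is a hypothetical helper name, not part of the original answer.

```r
# Hypothetical helper: convert "m:ss", "m" and ":ss" strings to seconds.
# Assumes minutes:seconds; anything that is not numeric becomes 0.
to_seconds <- function(x) {
  x <- as.character(x)
  # Values without a ":" are treated as whole minutes
  x <- ifelse(grepl(":", x, fixed = TRUE), x, paste0(x, ":00"))
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) {
    if (length(p) != 2) return(0)
    # An empty minutes field (e.g. ":55") means zero minutes
    if (p[1] == "") p[1] <- "0"
    vals <- suppressWarnings(as.numeric(p))
    if (anyNA(vals)) 0 else vals[1] * 60 + vals[2]
  }, numeric(1))
}

b <- c("2:30", "2:15", "2", "1:15", "1:15", "1", ":55", ":45", "v")
secs <- to_seconds(b)
```

The resulting numeric vector can be handed to a plotting call as it is, for example `barplot(secs, names.arg = a, las = 2)`.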
Answer 1 (score: 0)
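A solution can be built with tidyr::separate and tidyr::unite. The approach is to first replace any value containing alphabetic characters with 00:00:00, then separate the value into 3 parts. Using dplyr::mutate_at, all 3 columns are converted to the two-digit 00 format. Finally, the three columns are united again.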
library(tidyverse)

ab %>%
  mutate_if(is.factor, as.character) %>%  # Change any factor to character
  mutate(b = ifelse(grepl("[[:alpha:]]", b), "00:00:00", b)) %>%
  mutate(b = ifelse(grepl(":", b), b, paste(b, "00", sep = ":"))) %>%
  separate(b, into = c("b1", "b2", "b3"), sep = ":", fill = "left", extra = "drop") %>%
  # A purrr-style lambda is used here because funs() is deprecated in current dplyr
  mutate_at(vars(starts_with("b")),
            ~ sprintf("%02d", as.numeric(ifelse(is.na(.) | . == "", 0, .)))) %>%
  unite("b", starts_with("b"), sep = ":")

#   a                              b
# 1 Jackson Brice / The Shocker    00:02:30
# 2 Flash Thompson                 00:02:15
# 3 Mr. Harrington                 00:02:00
# 4 Mac Gargan                     00:01:15
# 5 Betty Brant                    00:01:15
# 6 Ann Marie Hoag                 00:01:00
# 7 Steve Rogers / Captain America 00:00:55
# 8 Pepper Potts                   00:00:45
# 9 Karen                          00:00:00
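The strings above are uniform but still text. If values that support arithmetic are needed, one possible follow-up is to turn them into seconds with base R. This is a minimal sketch and not part of the original answer; `b_fmt` is a hypothetical sample of the formatted column.

```r
# Hypothetical sample of the "HH:MM:SS" strings produced above
b_fmt <- c("00:02:30", "00:00:55", "00:00:00")

# Split each string into hour, minute and second fields (one row per value)
parts <- do.call(rbind, strsplit(b_fmt, ":", fixed = TRUE))

# Total seconds as a plain numeric vector
secs <- as.numeric(parts[, 1]) * 3600 +
  as.numeric(parts[, 2]) * 60 +
  as.numeric(parts[, 3])

# Optionally wrap the seconds in a difftime so the unit travels with the data
durations <- as.difftime(secs, units = "secs")
```

A numeric or difftime column like this can be sorted, summed or used as a plot axis, which the character column cannot.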
Data:
a<- c("Jackson Brice / The Shocker","Flash Thompson", "Mr. Harrington","Mac Gargan","Betty Brant",
"Ann Marie Hoag","Steve Rogers / Captain America", "Pepper Potts", "Karen")
b<- c("2:30", "2:15", "2", "1:15", "1:15", "1", ":55",":45", "v")
ab <- cbind.data.frame(a,b)