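There are a million and one tutorial sites on data wrangling and organization in R, but I'm not sure which one best fits my problem. I know how to do this easily in Python, but what is the equivalent simple way to do it in R?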
Say I have a data frame with an ROI column holding values such as a_01 and a_02, plus one value column per season and repetition (summer_1, winter_1, summer_2, winter_2).
But I want to rearrange the columns so that it looks like this:
ROI no season value
a 1 summer 81.33328
a 2 summer 15.34663
...
So far I have this:
library(stringr)
df$new <- str_split_fixed(df$ROI, "_", 2)
and so on.
How can I best approach this?
Answer 0 (score: 1)
We can use tidyverse:
library(tidyverse)
#split the 'ROI' into two columns
res <- separate(df, ROI, into = c("ROI", 'no'), convert = TRUE) %>%
#reshape from wide to long format
gather(season, value, summer_1:winter_2) %>%
#split the season column into two
separate(season, into = c('season', 'n')) %>%
#remove the columns that are not needed
select(-n)
head(res)
# ROI no season value
#1 a 1 summer 29.25740
#2 a 2 summer 22.48911
#3 a 3 summer 70.42230
#4 b 1 summer 51.88971
#5 b 2 summer 66.26196
#6 b 3 summer 92.04438
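In recent versions of tidyr, gather() is superseded by pivot_longer(). The following is a minimal sketch of the same reshaping with that function, assuming tidyr 1.0.0 or later is installed; df_demo and res3 are hypothetical names with made-up values, not the data from the question.

```r
library(tidyr)

# Hypothetical toy data with the same column layout as in the question
df_demo <- data.frame(
  ROI      = c("a_01", "a_02", "b_01"),
  summer_1 = c(1, 2, 3),
  winter_1 = c(4, 5, 6),
  summer_2 = c(7, 8, 9),
  winter_2 = c(10, 11, 12)
)

# Split 'ROI' into 'ROI' and 'no'; convert = TRUE turns "01" into 1
res3 <- separate(df_demo, ROI, into = c("ROI", "no"), sep = "_", convert = TRUE)

# Pivot the season columns to long format; the NA in names_to
# discards the numeric suffix after the underscore
res3 <- pivot_longer(res3, cols = -c(ROI, no),
                     names_to = c("season", NA), names_sep = "_",
                     values_to = "value")
```

Note that pivot_longer() emits the rows grouped by original row rather than by season column, so the row order differs from the gather() result; sort the result if the order matters.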
Another option is to split the column with cSplit and convert it to "long" format with melt from data.table:
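library(splitstackshape)
library(data.table)  # melt(), setnames() and := come from data.table
# split 'ROI' on "_", melt to long format, rename the id columns,
# then strip the "_1"/"_2" suffix from the season labels
res2 <- setnames(melt(cSplit(df, "ROI", sep="_"), id.var = c("ROI_1", "ROI_2"),
variable.name = "season"), 1:2, c("ROI", "no"))[, season := sub("_\\d+", "", season)][]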
head(res2)
# ROI no season value
#1: a 1 summer 29.25740
#2: a 2 summer 22.48911
#3: a 3 summer 70.42230
#4: b 1 summer 51.88971
#5: b 2 summer 66.26196
#6: b 3 summer 92.04438
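If installing extra packages is not an option, the same reshaping can be approximated in base R. This is only a sketch under the assumption that every column other than ROI holds values and is named season_number; df_demo, value_cols, and long are hypothetical names with made-up values.

```r
# Hypothetical toy data with the same column layout as in the question
df_demo <- data.frame(
  ROI      = c("a_01", "a_02", "b_01"),
  summer_1 = c(1, 2, 3),
  winter_1 = c(4, 5, 6),
  summer_2 = c(7, 8, 9),
  winter_2 = c(10, 11, 12)
)

# Every column except 'ROI' is treated as a value column
value_cols <- setdiff(names(df_demo), "ROI")

# Repeat the ROI labels once per value column, so they line up with
# the values stacked column by column by unlist()
roi_rep <- rep(df_demo$ROI, times = length(value_cols))

long <- data.frame(
  ROI    = sub("_.*$", "", roi_rep),              # part before "_"
  no     = as.integer(sub("^.*_", "", roi_rep)),  # part after "_", "01" -> 1
  season = sub("_\\d+$", "", rep(value_cols, each = nrow(df_demo))),
  value  = unlist(df_demo[value_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)
```

The rows come out column by column (all of summer_1 first, then winter_1, and so on), which is the same order that gather() and melt() produce.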
Data used in the examples above:
set.seed(24)
ROI <- c("a_01","a_02","a_03","b_01","b_02","b_03")
summer_1 <- runif(6, min=0, max=100)
winter_1 <- runif(6, min=0, max=100)
summer_2 <- runif(6, min=0, max=100)
winter_2 <- runif(6, min=0, max=100)
df <- data.frame(ROI,summer_1,winter_1,summer_2,winter_2)