Question

My dataset has 153 observations of 6 variables, but all of the variables are stored in a single column, like this:

Ozone.Solar.R.Wind.Temp.Month.Day
1                   41,190,7.4,67,5,1
2                     36,118,8,72,5,2
3                  12,149,12.6,74,5,3
4                  18,313,11.5,62,5,4
5                   NA,NA,14.3,56,5,5

I am looking for a way to split these variables into 6 separate columns, so that the result looks like this:

Ozone Solar Wind Temp Month Day
41    190   7.4  67   5     1    
36    118   8    72   5     2  
12    149   12.6 74   5     3  
18    313   11.5 62   5     4  
NA    NA    14.3 56   5     5

Thanks in advance for your help!

Answer 1

We can use separate as follows, without hard-coding any values.

tidyr::separate(df, names(df), sep = ",", into = strsplit(names(df), "\\.")[[1]])

#  Ozone Solar Wind Temp Month Day
#1    41   190  7.4   67     5   1
#2    36   118    8   72     5   2
#3    12   149 12.6   74     5   3
#4    18   313 11.5   62     5   4
#5    NA    NA 14.3   56     5   5

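Using base R only, we can split the comma-separated strings with strsplit, bind the resulting list row-wise with rbind, and assign the column names with setNames.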

setNames(do.call(rbind.data.frame, strsplit(as.character(df[[1]]), ",")), 
                 strsplit(names(df), "\\.")[[1]])
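
The base R result above holds every column as text rather than numbers, because strsplit returns strings. A minimal sketch, assuming numeric columns are wanted, runs each column through type.convert. The three-row df is a shortened copy of the sample data, and the names parts and out are only illustrative.

```r
# Shortened copy of the sample data: one column of comma-separated strings
df <- data.frame(
  Ozone.Solar.Wind.Temp.Month.Day = c("41,190,7.4,67,5,1",
                                      "36,118,8,72,5,2",
                                      "NA,NA,14.3,56,5,5"),
  stringsAsFactors = FALSE
)

# Split each row on commas, then bind the pieces into a character matrix
parts <- strsplit(as.character(df[[1]]), ",")
out <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)

# Take the column names from the single header, split on literal dots
names(out) <- strsplit(names(df), ".", fixed = TRUE)[[1]]

# Let R guess a type per column; the string "NA" becomes a missing value
out[] <- lapply(out, type.convert, as.is = TRUE)
str(out)
```

With as.is = TRUE, any column that cannot be read as a number stays character instead of becoming a factor.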

Data

df <- structure(list(Ozone.Solar.Wind.Temp.Month.Day = structure(c(4L, 
3L, 1L, 2L, 5L), .Label = c("12,149,12.6,74,5,3", "18,313,11.5,62,5,4", 
"36,118,8,72,5,2", "41,190,7.4,67,5,1", "NA,NA,14.3,56,5,5"), class = 
"factor")), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5"))

Answer 2

We can do this easily in base R with read.csv

out <- read.csv(text = df[[1]], header = FALSE, col.names = scan(text = names(df), 
             what = "", sep=".", quiet = TRUE), stringsAsFactors = FALSE)
out
#  Ozone Solar Wind Temp Month Day
#1    41   190  7.4   67     5   1
#2    36   118  8.0   72     5   2
#3    12   149 12.6   74     5   3
#4    18   313 11.5   62     5   4
#5    NA    NA 14.3   56     5   5
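
One caveat: the header printed in the question is Ozone.Solar.R.Wind.Temp.Month.Day, while the sample data in the answers uses Ozone.Solar.Wind.Temp.Month.Day. If the real header does contain Solar.R, splitting it on dots gives seven pieces for six fields. A minimal sketch, assuming that header and a hand-written vector of names (col_names is a hypothetical helper, not part of the original answers):

```r
# Header as printed in the question: "Solar.R" itself contains a dot
df <- data.frame(
  Ozone.Solar.R.Wind.Temp.Month.Day = c("41,190,7.4,67,5,1",
                                        "NA,NA,14.3,56,5,5"),
  stringsAsFactors = FALSE
)

# Splitting the header on "." yields one piece more than there are fields
pieces <- strsplit(names(df), ".", fixed = TRUE)[[1]]
length(pieces)

# Assumed workaround: spell out the six column names explicitly
col_names <- c("Ozone", "Solar.R", "Wind", "Temp", "Month", "Day")
out <- read.csv(text = df[[1]], header = FALSE, col.names = col_names,
                stringsAsFactors = FALSE)
out
```

Hard-coding the names gives up the "no hard-coded values" property of the answers, but it avoids guessing which dots are separators.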

Data

df <- structure(list(Ozone.Solar.Wind.Temp.Month.Day = c("41,190,7.4,67,5,1", 
 "36,118,8,72,5,2", "12,149,12.6,74,5,3", "18,313,11.5,62,5,4", 
 "NA,NA,14.3,56,5,5")), class = "data.frame", row.names = c("1", 
  "2", "3", "4", "5"))
