Creating a feature matrix using a regular expression?

Time: 2018-10-06 00:36:34

Tags: r regex dataframe matrix

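Suppose I have a data frame with 101 variables. I pick one, called Y, as the dependent variable, and the remaining 100, called X_1, X_2, ..., X_{100}, as the independent variables.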

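Now I would like to create a matrix containing the 100 independent variables. Is there a direct way to do this? Something like what I do when building a linear regression model, where I can simply use "." as a regular expression, i.e. lm(Y ~ ., _____)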

1 Answer:

Answer 0: (score: 0)

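You can use the grep function to pick out the column names that belong to the independent variables of the data frame, subset the data frame by those columns, and then convert the result to a matrix. See the code below: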

# simulation of the data frame with 100 measurements and 101 variables

n <- 100
# 1:(101 * n) fills all n rows; without the parentheses, 1:101 * n has only 101 values
df <- data.frame(matrix(1:(101 * n), ncol = 101))
names(df) <- c(paste0("X_", 1:100), "Y")

# extract matrix of Xs
m_x <- as.matrix(df[, grep("^X", names(df))])
dimnames(m_x)

Output:

[[1]]
NULL

[[2]]
  [1] "X_1"   "X_2"   "X_3"   "X_4"   "X_5"   "X_6"   "X_7"   "X_8"   "X_9"   "X_10"  "X_11"  "X_12"  "X_13"  "X_14"  "X_15" 
 [16] "X_16"  "X_17"  "X_18"  "X_19"  "X_20"  "X_21"  "X_22"  "X_23"  "X_24"  "X_25"  "X_26"  "X_27"  "X_28"  "X_29"  "X_30" 
 [31] "X_31"  "X_32"  "X_33"  "X_34"  "X_35"  "X_36"  "X_37"  "X_38"  "X_39"  "X_40"  "X_41"  "X_42"  "X_43"  "X_44"  "X_45" 
 [46] "X_46"  "X_47"  "X_48"  "X_49"  "X_50"  "X_51"  "X_52"  "X_53"  "X_54"  "X_55"  "X_56"  "X_57"  "X_58"  "X_59"  "X_60" 
 [61] "X_61"  "X_62"  "X_63"  "X_64"  "X_65"  "X_66"  "X_67"  "X_68"  "X_69"  "X_70"  "X_71"  "X_72"  "X_73"  "X_74"  "X_75" 
 [76] "X_76"  "X_77"  "X_78"  "X_79"  "X_80"  "X_81"  "X_82"  "X_83"  "X_84"  "X_85"  "X_86"  "X_87"  "X_88"  "X_89"  "X_90" 
 [91] "X_91"  "X_92"  "X_93"  "X_94"  "X_95"  "X_96"  "X_97"  "X_98"  "X_99"  "X_100"
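
Since the question mentions the "." shorthand from lm(), here is a minimal sketch of two alternatives that do not rely on a regular expression at all. It assumes the response column is literally named "Y" and that all predictors are numeric; df_small is a hypothetical toy data frame used only for illustration. Note that the "." in a formula is not a regex: it simply stands for "all columns not already in the formula".

```r
# Hypothetical toy data frame with the same naming scheme as above
set.seed(1)
df_small <- data.frame(X_1 = rnorm(5), X_2 = rnorm(5), X_3 = rnorm(5), Y = rnorm(5))

# Formula route: "." expands to every column except the response Y;
# "- 1" drops the intercept column that model.matrix() adds by default
m_formula <- model.matrix(Y ~ . - 1, data = df_small)

# Name-based route: keep every column whose name is not "Y"
m_setdiff <- as.matrix(df_small[, setdiff(names(df_small), "Y"), drop = FALSE])

colnames(m_formula)
dim(m_setdiff)
```

With purely numeric predictors both routes should hold the same values. If some predictors are factors, model.matrix() expands them into dummy columns, so the name-based route may be the safer choice when you want the columns unchanged.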