How can I read a dataset like this into R? The real file is larger; I have trimmed it down here only to save space.
"x1" "x2" "x3" "x4" "x5" "x6" "x7" "x8" "x9" "x10" "x11" "x12" "x13" "x14" "x15" "x16" "x17" "x18" "x19" "x42" "x43" "x44" "x45" "x46" "x47" "x48" "x49" "x50" "x51" "x52" "x53" "x54" "x55" "x56" "x57" "x58" "x59" "x60" "x61" "x62" "x63" "x64" "x65" "x66" "x67" "x68" "x69" "x70" "x71" "x72" "x73" "x74" "x75" "x76" "x77" "x78" "x79" "x80" "x81" "x82" "x83" "x84" "x85" "x86" "x87" "x88" "x89" "x90" "x91" "x92" "x93" "x94" "x95" "x96" "x97" "x98" "x99" "x100" "x101" "x102" "x103" "x104" "x105" "x106" "x107" "x108" "x109" "x110" "x111" "x112" "x113" "x114" "x115" "x116" "x117" "x118" "x119" "x120" "x121" "x122" "x123" "x124" "x201" "x202" "x203" "x204" "x205" "x206" "x207" "x208" "x209" "x210" "x211" "x212" "x213" "x214" "x215" "x216" "x217" "x218" "x219" "x220" "nature"
"1" 7 7 0 3 20205 486 19550 6769.2809 118 63 38 105 2 0 0.747 15655.4802 7 382.9968 348.7057 0 0 16 80 0 12123 1 0 0 0 0 0 0 0 1 1 0 0 0 0 17 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 13 1 0 0 0 1 5 0 9 0 1 0 1 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 5.90860829371 0.730683213637 "0"
"2" 13 13 0 2 37402 502 34626 10860.0676 115 49 40 93 2 0 0.9884 16870.0524 7 477.0312 397.7413 0 1 19 81 0 31780 0 1 1 0 0 0 0 0 1 0 0 0 0 0 19 1 2 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 17 0 0 1 0 0 2 0 1 5 1 0 1 2 0 0 0 0 0 2 0 0 1 0 0 0 2 1 2 0 8.32539208743 0.869155217211 "0"
"3" 8 7 0 2 132811 471 122729 6206.9286 222 86 108 196 1 1 0.948 6115.3969 7 295.067 221.8416 0 1 18 79 0 117765 0 0 0 0 0 0 0 0 1 0 0 0 0 0 17 1 1 0 0 0 0 0 0 0 2 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 2 0 0 0 1 0 0 1 0 0 0 0 0 0 5 0 0 1 0 0 0 0 0 1 0 5 0 1 0 0 0 29 1 98 2 0 0 1 0 0 1 0 0 0 0 0 0 1 0 1 1 0 1 0 0 10.5941656151 0.645706574667 "0"
"4" 15 15 0 3 231497 468 228811 9623.3898 347 134 167 321 1 0 1.4357 14400.1809 7 195.8632 207.8142 0 0 16 76 0 210360 0 0 0 1 0 0 0 1 0 0 0 0 0 0 16 1 1 0 0 0 0 0 0 0 2 0 1 0 0 0 1 0 0 0 0 1 1 3 1 5 5 1 1 0 5 0 0 0 0 0 1 0 0 3 2 0 0 0 0 1 0 5 0 0 0 7 0 1 0 0 0 262 0 71 1 0 0 0 1 0 2 0 0 0 1 0 1 0 1 0 0 0 1 0 0 4.88991089556 0.355427710536 "0"
"5" 153 161 0 2 3637632 377715 3416943 15250.239 34629 22108 12732 34931 1 0 355.1026 2494780.1981 2384 60.8852 89.4526 1 1 18 83 0 365 0 0 0 0 0 0 0 1 1 0 0 1 0 0 18 1 2 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 1 0 819 0 1 2 0 0 3 0 0 1 0 1 0 0 1 0 0 0 1 0 10.9453030622 0.304128072824 "0"
Answer 0: (score: 1)
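As long as your data file is not too large (say, under a few GB) and you have enough RAM, use read.table(). It is the base function underlying read.csv() and its relatives. Simply: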
data <- read.table(file=file.choose(), sep=" ", header = TRUE)
And Bob's your uncle.
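Note that file.choose() opens a simple dialog for picking your file, header=TRUE says that the first line of the dataset holds the column names (which appears to be your case), and sep=" " specifies your separator (this works as long as none of the data are strings containing spaces).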
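
To make the effect of these arguments concrete, here is a minimal sketch on a toy stand-in for the data above. The `toy` lines and their values are made up for illustration; only the shape matches the question: the header has one fewer field than each data row, because every row starts with a quoted row name. In that situation read.table() uses the first column as row names.

```r
# Toy stand-in for the asker's file (values are illustrative only).
# The header has 3 fields, each data row has 4: the extra leading
# field in each row is a quoted row name.
toy <- c(
  '"x1" "x2" "nature"',
  '"1" 7 5.9 "0"',
  '"2" 13 8.3 "0"',
  '"3" 8 10.6 "0"'
)

# text= parses in-memory lines; for a real file, pass file= instead
# (for example file = file.choose(), as in the answer).
data <- read.table(text = toy, header = TRUE, sep = " ")

dim(data)       # 3 rows, 3 columns
rownames(data)  # "1" "2" "3", taken from the first field of each row
str(data)       # inspect the column types that were inferred
```

If the real file mixes tabs or runs of several spaces, dropping sep=" " so that read.table() falls back to its default (any whitespace) may be more forgiving.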
If you have very large datasets, consider learning the slightly clunky but handy data.table package.
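
As a rough sketch of what that could look like, the data.table reader is fread(). The temporary file, the `id` column name, and the values below are hypothetical and use a complete header (one name per column); the file in the question has no name for its row-name column, so check how fread() treats that case before relying on it.

```r
library(data.table)

# Hypothetical toy file with a complete header, written to a temp path.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "id x1 x2 nature",
  "1 7 5.9 0",
  "2 13 8.3 0",
  "3 8 10.6 0"
), path)

# fread() usually detects the separator and header by itself; they are
# spelled out here only to mirror the read.table() call in the answer.
dt <- fread(path, sep = " ", header = TRUE)

dim(dt)    # number of rows and columns read
names(dt)  # column names taken from the header line
```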