Multiple data.frames with almost random selection criteria

Time: 2017-05-17 10:11:40

Tags: r select dataframe data-manipulation

This is a follow-up to the question "Extract multiple data.frames from one with selection criteria".

Assume the data is the same as in that example:

df <- data.frame(x1 = runif(1000), x2 = runif(1000), x3 = runif(1000), 
             split = sample( c('SPLITMEHERE', 'OBS'), 1000, replace=TRUE, prob=c(0.1, 0.9) ))

Basically, I need a more general solution than the one in the referenced example.

That is, in some months (each month is one .txt file) some counties have only one table, and therefore only one 'SPLITMEHERE'.

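For example, the fourth county contains only one table, which means that county's group should end at the first SPLITMEHERE rather than the second. The last county also consists of three tables, but that does not matter: because it comes last, I can easily merge the final group.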

The problem is that the single-table county is not always the fourth one; sometimes it is a different county.

Suppose we get this from the dataset above:

              x1          x2          x3       split
1    0.591940061 0.635445182 0.304498259         SPLITMEHERE
2    0.510158838 0.170956885 0.881018211         OBS
3    0.938369076 0.495642515 0.171227120         OBS
4    0.366153042 0.464698494 0.550931566         OBS
5    0.051998873 0.222881187 0.934175135         OBS
6    0.706940809 0.735885367 0.666272118         SPLITMEHERE
7    0.244219533 0.340480033 0.144009797         OBS
8    0.546891246 0.024010211 0.151338479         OBS
9    0.032659978 0.174774606 0.576820824         OBS
10   0.641988559 0.575596526 0.911188682         OBS
11   0.111024861 0.969227957 0.643551420         OBS
12   0.179469011 0.052698538 0.199299193         OBS
13   0.199203707 0.429210222 0.525920379         SPLITMEHERE
14   0.837223042 0.556442838 0.881305105         OBS
15   0.628854814 0.874139058 0.199226364         OBS
16   0.618989684 0.784011205 0.038021599         OBS
17   0.421893407 0.394786134 0.519100402         OBS
18   0.126453054 0.926114653 0.687669218         OBS
19   0.739393898 0.938428464 0.110824400         OBS
20   0.582882966 0.198520021 0.942501112         OBS
21   0.143852453 0.963329219 0.993098109         OBS
22   0.249366828 0.242881240 0.486960755         OBS
23   0.060602695 0.797436479 0.432171847         SPLITMEHERE
24   0.013947914 0.028245990 0.489656647         OBS
25   0.795170730 0.541771474 0.122952446         OBS
26   0.786673408 0.284252650 0.305914856         OBS
27   0.591369056 0.321041728 0.285482027         OBS
28   0.899577535 0.468031873 0.588038383         SPLITMEHERE
29   0.955853329 0.552076328 0.825239050         OBS
30   0.634738808 0.050917396 0.730090024         OBS
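
One building block that does not depend on how many tables a county has is to number the tables first: every `SPLITMEHERE` row starts a new table, so a running count of those rows gives each row its table number. The following is only a minimal sketch on a stand-in for the `split` column of the printout above; the names `split_col`, `table_id` and `obs_per_table` are made up for illustration.

```r
# Stand-in for the `split` column of the 30-row printout:
# marker rows at positions 1, 6, 13, 23 and 28, everything else is OBS.
split_col <- rep("OBS", 30)
split_col[c(1, 6, 13, 23, 28)] <- "SPLITMEHERE"

# Every marker row starts a new table, so the running count of markers
# is the table number of each row.
table_id <- cumsum(split_col == "SPLITMEHERE")

# Number of OBS rows per table (marker rows excluded).
obs_per_table <- table(table_id[split_col == "OBS"])
print(obs_per_table)
```

This only identifies the tables. Which tables belong to the same county still has to come from somewhere else.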

Suppose the printout contains three counties; I would like three data frames as follows:

df1

1    0.510158838 0.170956885 0.881018211         OBS
2    0.938369076 0.495642515 0.171227120         OBS
3    0.366153042 0.464698494 0.550931566         OBS
4    0.051998873 0.222881187 0.934175135         OBS
5    0.244219533 0.340480033 0.144009797         OBS
6    0.546891246 0.024010211 0.151338479         OBS
7    0.032659978 0.174774606 0.576820824         OBS
8    0.641988559 0.575596526 0.911188682         OBS
9    0.111024861 0.969227957 0.643551420         OBS
10   0.179469011 0.052698538 0.199299193         OBS

df2

1   0.837223042 0.556442838 0.881305105         OBS
2   0.628854814 0.874139058 0.199226364         OBS
3   0.618989684 0.784011205 0.038021599         OBS
4   0.421893407 0.394786134 0.519100402         OBS
5   0.126453054 0.926114653 0.687669218         OBS
6   0.739393898 0.938428464 0.110824400         OBS
7   0.582882966 0.198520021 0.942501112         OBS
8   0.143852453 0.963329219 0.993098109         OBS
9   0.249366828 0.242881240 0.486960755         OBS

df3

1   0.013947914 0.028245990 0.489656647         OBS
2   0.795170730 0.541771474 0.122952446         OBS
3   0.786673408 0.284252650 0.305914856         OBS
4   0.591369056 0.321041728 0.285482027         OBS
5   0.955853329 0.552076328 0.825239050         OBS
6   0.634738808 0.050917396 0.730090024         OBS
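
One possible way to get this grouping, sketched here under a loud assumption: the number of tables per county is known from some other source. The question does not say where that information would come from, so the vector `tables_per_county` below is hypothetical, as are the names `county_of_table`, `county_id` and `county_dfs`. The sketch also assumes the data starts with a `SPLITMEHERE` row, as in the printout.

```r
# Rebuild the layout of the 30-row printout (values are random; only the
# positions of the marker rows matter here).
set.seed(1)
df <- data.frame(x1 = runif(30), x2 = runif(30), x3 = runif(30),
                 split = "OBS", stringsAsFactors = FALSE)
df$split[c(1, 6, 13, 23, 28)] <- "SPLITMEHERE"

# ASSUMPTION: tables per county, supplied from outside the data.
# For the printout: county 1 has 2 tables, county 2 has 1, county 3 has 2.
tables_per_county <- c(2, 1, 2)

# Table number of each row: running count of marker rows.
table_id <- cumsum(df$split == "SPLITMEHERE")
stopifnot(df$split[1] == "SPLITMEHERE",
          sum(tables_per_county) == max(table_id))

# Map table number -> county number, then look it up for every row.
county_of_table <- rep(seq_along(tables_per_county), times = tables_per_county)
county_id <- county_of_table[table_id]

# Drop the marker rows, split the rest by county, and reset the row names.
keep <- df$split == "OBS"
county_dfs <- split(df[keep, ], county_id[keep])
county_dfs <- lapply(county_dfs, function(d) { rownames(d) <- NULL; d })

print(sapply(county_dfs, nrow))
```

With this layout, `county_dfs[[1]]`, `county_dfs[[2]]` and `county_dfs[[3]]` have 10, 9 and 6 rows, matching the shapes of df1, df2 and df3. If the single-table county moves from month to month, only `tables_per_county` needs to change; the rest of the sketch stays the same.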

Any ideas?

0 answers:

No answers yet