Conditional subsetting of a data frame in R

Time: 2016-11-29 18:58:26

Tags: r conditional

Let the data frame be:

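library(dplyr)  # provides %>%, group_by(), summarize(), filter() used below

set.seed(123)
# name (260 values) and chess (10 values) are recycled to match hobby (520 values)
df <- data.frame(name = sample(LETTERS, 260, replace = TRUE),
                 hobby = rep(c("outdoor", "indoor"), 260),
                 chess = rnorm(10))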

The condition I will use to extract from df is:

df_cond<-df %>% group_by(name,hobby) %>%
    summarize(count=n()) %>%
    mutate(sum.var=sum(count),sum.name=length(name)) %>%
    filter(sum.name==2) %>%
    mutate(min.var=min(count)) %>%
    mutate(use=ifelse(min.var==count,"yes","no")) %>%
    filter(grepl("yes",use))

I want to randomly extract the rows of df that match the {name, hobby, count} combinations in df_cond, along with the rest. I am having some trouble combining sample and %in%. Thanks for any leads!

Edit: for example:

head(df_cond)
    name   hobby count sum.var sum.name min.var   use
  <fctr>  <fctr> <int>   <int>    <int>   <int> <chr>
1      A  indoor     2       6        2       2   yes
2      B  indoor     8      16        2       8   yes
3      B outdoor     8      16        2       8   yes
4      C outdoor     6      14        2       6   yes
5      D  indoor    10      24        2      10   yes
6      E outdoor     8      18        2       8   yes

Using the data frame above, I want to randomly extract from df 2 rows (= count) with the combination A + indoor (row 1), 8 rows with the combination B + indoor (row 2), ... and so on.

I combined the answers from @denrous and @Jacob to get what I needed. Like so:

m2 <- df_cond %>%
  mutate(data = map2(name, hobby, function(x, y) {df %>% filter(name == x, hobby == y)})) %>%
  ungroup() %>%
  select(data) %>%
  unnest()

test <- m2 %>%
  group_by(name, hobby) %>%
  summarize(num.levels = length(unique(hobby))) %>%
  ungroup() %>%
  group_by(name) %>%
  summarize(total_levels = sum(num.levels)) %>%
  filter(total_levels > 1)

fin <- semi_join(m2, test)

3 Answers:

Answer 0 (score: 3)

If I understand correctly, you can use purrr to achieve what you want:

library(purrr)  # map2()

df_cond %>% 
  mutate(data = map2(name, hobby, function(x, y) {filter(df, name == x, hobby == y)})) %>% 
  mutate(data = map2(data, count, function(x, y) sample_n(x, size = y))) 

If you want the result in the same form as df:

library(tidyr)  # unnest()

df_cond %>% 
  mutate(data = map2(name, hobby, function(x, y) {df %>% filter(name == x, hobby == y)})) %>% 
  mutate(data = map2(data, count, function(x, y) sample_n(x, size = y))) %>% 
  ungroup() %>% 
  select(data) %>% 
  unnest(data)

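Answer 1 (score: 1)

Edited based on the OP's clarification.

There must be a better way, but I would use a loop:

library(dplyr)

master_df <- data.frame()

for (i in seq_len(nrow(df_cond))) {
  # Use names that differ from df's columns: inside filter(), `name == name`
  # compares the column with itself and is always TRUE
  target_name  <- as.character(df_cond$name[i])
  target_hobby <- as.character(df_cond$hobby[i])
  n_rows       <- df_cond$count[i]

  temp_df <- df %>% filter(name == target_name, hobby == target_hobby)
  temp_df <- sample_n(temp_df, n_rows)
  master_df <- rbind(master_df, temp_df)
}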
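
The same per-combination sampling can probably be written without growing a data frame inside a loop. The following is only a sketch in base R, not the answerer's method: toy_df, toy_cond and sample_group are hypothetical stand-ins for df, df_cond and the loop body, and the toy data is made up so that the block runs on its own.

```r
# Hypothetical stand-ins for df and df_cond (made-up data, not the question's)
set.seed(1)
toy_df <- data.frame(
  name  = rep(c("A", "B"), each = 6),
  hobby = rep(c("indoor", "outdoor"), times = 6),
  chess = rnorm(12),
  stringsAsFactors = FALSE
)
toy_cond <- data.frame(
  name  = c("A", "B"),
  hobby = c("indoor", "outdoor"),
  count = c(2L, 3L),
  stringsAsFactors = FALSE
)

# For one (name, hobby, count) combination, draw `n` matching rows at random
sample_group <- function(nm, hb, n) {
  rows <- which(toy_df$name == nm & toy_df$hobby == hb)
  # sample.int() on the number of candidates avoids sample()'s special
  # handling of a length-one numeric vector
  toy_df[rows[sample.int(length(rows), n)], , drop = FALSE]
}

# Apply the helper to every row of the condition table and stack the results
picked <- do.call(
  rbind,
  Map(sample_group, toy_cond$name, toy_cond$hobby, toy_cond$count)
)
```

Here picked holds 2 rows for A + indoor and 3 rows for B + outdoor. Like sample_n(), sample.int() raises an error if a count exceeds the number of matching rows.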

Answer 2 (score: 0)

It is not clear whether this is exactly what you want, but you may be looking for left_join:

df %>% 
    left_join(df_cond, by = "name")
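
One thing to watch with a join on name alone: a condition row gets attached to every hobby of that name, not only the hobby it refers to. The sketch below illustrates this on made-up data. It uses base R merge() in place of left_join() purely to stay dependency-free, and toy_df and toy_cond are hypothetical stand-ins for df and df_cond.

```r
# Hypothetical stand-ins for df and df_cond (made-up data)
toy_df <- data.frame(
  name  = c("A", "A", "B", "B"),
  hobby = c("indoor", "outdoor", "indoor", "outdoor"),
  chess = c(0.1, 0.2, 0.3, 0.4),
  stringsAsFactors = FALSE
)
toy_cond <- data.frame(
  name  = c("A", "B"),
  hobby = c("indoor", "outdoor"),
  count = c(1L, 1L),
  stringsAsFactors = FALSE
)

# Joining on name only: every toy_df row finds a partner, and the two
# hobby columns are kept apart as hobby.x (toy_df) and hobby.y (toy_cond)
by_name <- merge(toy_df, toy_cond, by = "name")

# Joining on both keys keeps only the (name, hobby) pairs listed in toy_cond
by_both <- merge(toy_df, toy_cond, by = c("name", "hobby"))

# Rows where the name-only join paired a row with the wrong hobby
mismatched <- by_name[by_name$hobby.x != by_name$hobby.y, ]
```

With this toy data the name-only join returns all 4 rows, 2 of them paired with the wrong hobby, while the two-key join returns the 2 intended rows. In dplyr the equivalent would presumably be passing by = c("name", "hobby") to the join.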