Flag rows between two values of a column in a group using dplyr

Time: 2018-12-06 06:55:08

Tags: r filter dplyr

My dummy data looks like this:

df = data.frame(name = c(rep("Anna",8),rep("Jenny",7)),
                id = c(100,100,100,100,100,100,100,100,250,250,250,250,250,250,250),
                time = c("t2","t3","t5","t1","t7","t2","t1","t5","t1","t2","t6","t2","t8","t6","t5"),
                stringsAsFactors = F)

> df
    name  id time
1   Anna 100   t2
2   Anna 100   t3
3   Anna 100   t5
4   Anna 100   t1
5   Anna 100   t7
6   Anna 100   t2
7   Anna 100   t1
8   Anna 100   t5
9  Jenny 250   t1
10 Jenny 250   t2
11 Jenny 250   t6
12 Jenny 250   t2
13 Jenny 250   t8
14 Jenny 250   t6
15 Jenny 250   t5

My expected output: for each id group, I want to use a flag variable to mark the values between and including t2 and t5 in the time variable. There will be multiple such cases within each group, and the code should be able to rule out broken cases, as in the example below:

> df
    name  id time Flag
1   Anna 100   t2    1
2   Anna 100   t3    1
3   Anna 100   t5    1
4   Anna 100   t1    0
5   Anna 100   t7    0
6   Anna 100   t2    1
7   Anna 100   t1    1
8   Anna 100   t5    1
9  Jenny 250   t1    0
10 Jenny 250   t2    0
11 Jenny 250   t6    0
12 Jenny 250   t2    1
13 Jenny 250   t8    1
14 Jenny 250   t6    1
15 Jenny 250   t5    1

I am asking for a dplyr-style approach because I may add more grouping variables in future for scalability. I searched for how to do this with dplyr functions, but it did not turn up much; I did find a Python equivalent in "Get rows between two values of a column using Python".

Edit1: There are multiple t2-t5 segments in each group that need to be flagged. Thanks to @ronak for pointing this out.

Thanks in advance.

2 Answers:

Answer 0: (score: 1)

There should be a better option, but this works:

library(tidyverse)

df %>%
  group_by(name) %>%
  mutate(flag  = +(row_number() %in% which(time == "t2"):which(time == "t5")))


#  name     id time   flag
#  <chr> <dbl> <chr> <dbl>
#1 Anna    100 t2        1
#2 Anna    100 t3        1
#3 Anna    100 t5        1
#4 Jenny   250 t1        0
#5 Jenny   250 t2        1
#6 Jenny   250 t3        1
#7 Jenny   250 t4        1
#8 Jenny   250 t5        1

This assumes there is only one "t2" and one "t5" in each group.

The same logic using base R ave:

as.numeric(with(df, ave(time, name, FUN = function(x) 
      +(1:length(x) %in% which(x == "t2"):which(x == "t5")))))
#[1] 1 1 1 0 1 1 1 1

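Edit

If you have multiple "t2" and "t5" values, you do not need to consider the groups, since you want to flag them anyway. We can use mapply to build a sequence of indices between each "t2" and its "t5", and set the flag to 1 at those indices.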

df$flag <- 0
df$flag[unlist(mapply(":", which(df$time == "t2"), which(df$time == "t5")))] <- 1

The dplyr version of the same is:

df %>%
  mutate(flag = +(row_number() %in% 
          unlist(map2(which(time == "t2"), which(time == "t5"), seq))))
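
The question's expected output also leaves an unfinished "t2" unflagged (rows 10 and 11, where a second "t2" arrives before any "t5"). The following is a minimal sketch of one way that could be handled; it is not taken from either answer. The helper name `flag_between()` and its `start`/`end` arguments are hypothetical. It walks the `time` values of one group, remembers the most recent "t2", and flags the run once a "t5" closes it. It assumes the rows are already in the intended order within each group.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  name = c(rep("Anna", 8), rep("Jenny", 7)),
  id   = c(rep(100, 8), rep(250, 7)),
  time = c("t2","t3","t5","t1","t7","t2","t1","t5",
           "t1","t2","t6","t2","t8","t6","t5"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: flag every run that starts at `start` and ends at `end`.
# A new `start` seen before an `end` discards the earlier, unfinished run.
flag_between <- function(x, start = "t2", end = "t5") {
  flag <- integer(length(x))
  open <- NA_integer_                 # index of the most recent unmatched start
  for (i in seq_along(x)) {
    if (x[i] %in% start) {
      open <- i                       # (re)start a run here
    } else if (x[i] %in% end && !is.na(open)) {
      flag[open:i] <- 1L              # close the run and flag it, inclusive
      open <- NA_integer_
    }
  }
  flag
}

result <- df %>%
  group_by(name, id) %>%              # more grouping variables can be added here
  mutate(flag = flag_between(time)) %>%
  ungroup()

result$flag
# 1 1 1 0 0 1 1 1 0 0 0 1 1 1 1
```

Because the helper only ever sees one group at a time, a run can never span two groups, and an `end` with no open `start` is simply ignored.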

Answer 1: (score: -1)

Here is a simple approach you could consider:

library(dplyr)

df %>%
    mutate(flag = ifelse(time %in% c("t2", "t3", "t4", "t5"), 1, 0))

This flags the data as you described and is readable.
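
Matching on the values alone ignores row order, so it can disagree with a position-based flag. As a quick check, and assuming the example `time` column and the expected Flag column given in the question, the sketch below lists the rows where the two differ. The variable names are illustrative only.

```r
# time column and expected Flag column, copied from the question
time <- c("t2","t3","t5","t1","t7","t2","t1","t5",
          "t1","t2","t6","t2","t8","t6","t5")
expected <- c(1,1,1,0,0,1,1,1,0,0,0,1,1,1,1)

# Flag purely by value, as in the snippet above
by_value <- ifelse(time %in% c("t2", "t3", "t4", "t5"), 1, 0)

# Rows where the value-based flag differs from the expected output
mismatch <- which(by_value != expected)
mismatch
# 7 10 13 14
```

Rows 7, 13 and 14 sit inside a t2-t5 run but hold other values, while row 10 is a "t2" that never reaches a "t5" before the next "t2".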