Create new columns based on the characters of one column and a condition in another column

Date: 2018-04-22 16:12:39

Tags: r substring

I have a dataset with 2 columns:

Filnename                                  Subject
161014_1_A1_B1_1880129006_1801004016_1     A1
161214_1_A1_B1_1861317003_1801206008_1     B1
170202_1_A2_B1_1860415029_1750730086_2     A2

I want a third and a fourth column that contain the participant codes:

df$agent <- substr(df$Filnename, start = 16, stop = 25)
df$partner <- substr(df$Filnename, start = 27, stop = 36)

The problem is that I only want this when "A1" or "A2" is in the Subject column.

If there is "B1" or "B2", the opposite should happen:

df$partner <- substr(df$Filnename, start = 16, stop = 25)
df$agent <- substr(df$Filnename, start = 27, stop = 36)

The result should look like this:

Filnename                                  Subject    agent       partner
161014_1_A1_B1_1880129006_1801004016_1     A1         1880129006  1801004016
161214_1_A1_B1_1861317003_1801206008_1     B1         1801206008  1861317003
170202_1_A2_B1_1860415029_1750730086_2     A2         1860415029  1750730086

I hope the question is understandable, and thank you in advance.

1 Answer:

Answer 0: (score: 1)

We can use case_when:

library(dplyr)

df %>%
  mutate(
    # "A" subjects: the first code is the agent, the second is the partner
    agent = case_when(
      Subject %in% c("A1", "A2") ~ substr(Filnename, 16, 25),
      TRUE ~ substr(Filnename, 27, 36)
    ),
    # every other subject: the two roles are swapped
    partner = case_when(
      Subject %in% c("A1", "A2") ~ substr(Filnename, 27, 36),
      TRUE ~ substr(Filnename, 16, 25)
    )
  )
#                               Filnename Subject      agent    partner
#1 161014_1_A1_B1_1880129006_1801004016_1      A1 1880129006 1801004016
#2 161214_1_A1_B1_1861317003_1801206008_1      B1 1801206008 1861317003
#3 170202_1_A2_B1_1860415029_1750730086_2      A2 1860415029 1750730086
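For readers who prefer to avoid extra packages, the same conditional assignment can be sketched in base R with ifelse(). This is only an illustration, not part of the original answer: the helper names first, second and is_a are made up here, and the data frame is rebuilt from the question's sample rows so the snippet runs on its own.

```r
# Sample data copied from the question
df <- data.frame(
  Filnename = c("161014_1_A1_B1_1880129006_1801004016_1",
                "161214_1_A1_B1_1861317003_1801206008_1",
                "170202_1_A2_B1_1860415029_1750730086_2"),
  Subject = c("A1", "B1", "A2"),
  stringsAsFactors = FALSE
)

# The two fixed-width participant codes (hypothetical helper names)
first  <- substr(df$Filnename, 16, 25)
second <- substr(df$Filnename, 27, 36)

# TRUE where the subject is one of the "A" codes
is_a <- df$Subject %in% c("A1", "A2")

# "A" rows keep the order, all other rows swap the two codes
df$agent   <- ifelse(is_a, first, second)
df$partner <- ifelse(is_a, second, first)

print(df)
```

Like the case_when version, this relies on the codes always sitting at character positions 16 to 25 and 27 to 36.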

Another option is to rearrange the substrings based on the 'Subject' value:

# rows whose Subject is not an "A" code
i1 <- !grepl("A\\d+", df$Subject)
df$new <- df$Filnename
# swap the two 10-digit codes in those rows only
df$new[i1] <- sub("_(\\d{10})_(\\d{10})", "_\\2_\\1", df$Filnename[i1])

and then apply extract:

library(tidyr)

df %>%
  extract(new, into = c("agent", "partner"), ".*_([0-9]{10})_([0-9]{10}).*")
#                               Filnename Subject      agent    partner
#1 161014_1_A1_B1_1880129006_1801004016_1      A1 1880129006 1801004016
#2 161214_1_A1_B1_1861317003_1801206008_1      B1 1801206008 1861317003
#3 170202_1_A2_B1_1860415029_1750730086_2      A2 1860415029 1750730086
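To see what the sub() call in this approach does, it may help to run it on a single file name. The snippet below is a small illustration using the second row of the question's data; the variable names x and swapped are chosen here for the example.

```r
# One file name from the question whose Subject is "B1"
x <- "161214_1_A1_B1_1861317003_1801206008_1"

# Capture the two 10-digit codes and write them back in reverse order
swapped <- sub("_(\\d{10})_(\\d{10})", "_\\2_\\1", x)

print(swapped)
# [1] "161214_1_A1_B1_1801206008_1861317003_1"
```

Once the codes are swapped for the non-"A" rows, the first 10-digit code in every row is the agent, so one extraction pattern works for all rows.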

Or do all of the above in a single %>% pipeline:

library(dplyr)
library(tidyr)
library(stringr)

df %>%
  mutate(tmp = ifelse(str_detect(Subject, "A\\d+"),
                      Filnename,
                      str_replace(Filnename, "_(\\d{10})_(\\d{10})", "_\\2_\\1"))) %>%
  extract(tmp, into = c("agent", "partner"), ".*_([0-9]{10})_([0-9]{10}).*")
#                               Filnename Subject      agent    partner
#1 161014_1_A1_B1_1880129006_1801004016_1      A1 1880129006 1801004016
#2 161214_1_A1_B1_1861317003_1801206008_1      B1 1801206008 1861317003
#3 170202_1_A2_B1_1860415029_1750730086_2      A2 1860415029 1750730086
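If the fixed character positions or the 10-digit pattern feel too brittle, a possible variation is to split each file name on the underscore and pick the fields by index. This is a sketch under the assumption that every file name has the same underscore-delimited layout as the three sample rows, with the codes in the 5th and 6th fields; the names parts, first, second and is_a are hypothetical, and the data is rebuilt from the question so the snippet runs on its own.

```r
# Sample data copied from the question
df <- data.frame(
  Filnename = c("161014_1_A1_B1_1880129006_1801004016_1",
                "161214_1_A1_B1_1861317003_1801206008_1",
                "170202_1_A2_B1_1860415029_1750730086_2"),
  Subject = c("A1", "B1", "A2"),
  stringsAsFactors = FALSE
)

# Split every file name into its underscore-separated fields
parts <- strsplit(df$Filnename, "_", fixed = TRUE)

# Assumption: the participant codes are always the 5th and 6th fields
first  <- vapply(parts, `[`, character(1), 5, USE.NAMES = FALSE)
second <- vapply(parts, `[`, character(1), 6, USE.NAMES = FALSE)

# "A" subjects keep the order, everything else is swapped
is_a <- grepl("^A\\d+$", df$Subject)
df$agent   <- ifelse(is_a, first, second)
df$partner <- ifelse(is_a, second, first)

print(df)
```

This version does not depend on the codes being exactly 10 digits long, but it does break if a file name has a different number of fields.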
Data

df <- structure(list(Filnename = c("161014_1_A1_B1_1880129006_1801004016_1",
"161214_1_A1_B1_1861317003_1801206008_1", "170202_1_A2_B1_1860415029_1750730086_2"
), Subject = c("A1", "B1", "A2")), .Names = c("Filnename", "Subject"
), class = "data.frame", row.names = c(NA, -3L))
