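I need to split the column named RefSeq on the underscore (_) that occurs before NM, without splitting on the underscore between NM and the digits.
I need to put the output into new columns of the input.
I tried something like this: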
strsplit(as.character(TargetScan$RefSeq),"_")
Data
> head(TargetScan)
Gene miRNA Site cont.score cont.score.perc
1 A1CF hsa-let-7a-5p 8mer-1a -0.051 12
2 A1CF hsa-let-7b-5p 8mer-1a -0.051 12
3 A1CF hsa-let-7c-5p 8mer-1a -0.051 12
4 A1CF hsa-let-7d-5p 8mer-1a -0.062 12
5 A1CF hsa-let-7e-5p 8mer-1a -0.051 12
6 A1CF hsa-let-7f-5p 8mer-1a -0.051 12
RefSeq
1 NM_001198820_NM_014576_NM_138932_NM_001198819_NM_001198818_NM_138933
2 NM_001198820_NM_014576_NM_138932_NM_001198819_NM_001198818_NM_138933
3 NM_001198820_NM_014576_NM_138932_NM_001198819_NM_001198818_NM_138933
4 NM_001198820_NM_014576_NM_138932_NM_001198819_NM_001198818_NM_138933
5 NM_001198820_NM_014576_NM_138932_NM_001198819_NM_001198818_NM_138933
6 NM_001198820_NM_014576_NM_138932_NM_001198819_NM_001198818_NM_138933
Desired output
> head(TargetScan)
Gene miRNA Site cont.score cont.score.perc
1 A1CF hsa-let-7a-5p 8mer-1a -0.051 12
2 A1CF hsa-let-7b-5p 8mer-1a -0.051 12
3 A1CF hsa-let-7c-5p 8mer-1a -0.051 12
4 A1CF hsa-let-7d-5p 8mer-1a -0.062 12
5 A1CF hsa-let-7e-5p 8mer-1a -0.051 12
6 A1CF hsa-let-7f-5p 8mer-1a -0.051 12
new1 new2 new3 new4 new5 new6
1 NM_001198820 NM_014576 NM_138932 NM_001198819 NM_001198818 NM_138933
2 NM_001198820 NM_014576 NM_138932 NM_001198819 NM_001198818 NM_138933
3 NM_001198820 NM_014576 NM_138932 NM_001198819 NM_001198818 NM_138933
4 NM_001198820 NM_014576 NM_138932 NM_001198819 NM_001198818 NM_138933
5 NM_001198820 NM_014576 NM_138932 NM_001198819 NM_001198818 NM_138933
6 NM_001198820 NM_014576 NM_138932 NM_001198819 NM_001198818 NM_138933
Answer 0 (score: 3)
x <- as.character(TargetScan$RefSeq[1])  # assumed: x holds a single RefSeq string
strsplit(x, "(?<=\\d)_", perl = TRUE)[[1]]
#[1] "NM_001198820" "NM_014576" "NM_138932" "NM_001198819"
#[5] "NM_001198818" "NM_138933"
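This approach uses a lookbehind. With the pattern "(?<=\\d)_", we match only an underscore that is preceded by a digit, so the underscore after NM is left alone.
Wrapped in a function that produces the desired output: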
library(tidyr)
separate(TargetScan, RefSeq, paste0("new", 1:6), "(?<=\\d)_")
# Gene miRNA Site cont.score cont.score.perc new1 new2
# 1 A1CF hsa-let-7a-5p 8mer-1a -0.051 12 NM_001198820 NM_014576
# 2 A1CF hsa-let-7b-5p 8mer-1a -0.051 12 NM_001198820 NM_014576
# 3 A1CF hsa-let-7c-5p 8mer-1a -0.051 12 NM_001198820 NM_014576
# 4 A1CF hsa-let-7d-5p 8mer-1a -0.062 12 NM_001198820 NM_014576
# 5 A1CF hsa-let-7e-5p 8mer-1a -0.051 12 NM_001198820 NM_014576
# 6 A1CF hsa-let-7f-5p 8mer-1a -0.051 12 NM_001198820 NM_014576
# new3 new4 new5 new6
# 1 NM_138932 NM_001198819 NM_001198818 NM_138933
# 2 NM_138932 NM_001198819 NM_001198818 NM_138933
# 3 NM_138932 NM_001198819 NM_001198818 NM_138933
# 4 NM_138932 NM_001198819 NM_001198818 NM_138933
# 5 NM_138932 NM_001198819 NM_001198818 NM_138933
# 6 NM_138932 NM_001198819 NM_001198818 NM_138933
Answer 1 (score: 0)
I would try using gsub to replace the underscore before NM with a different character, and then call strsplit on the resulting values.
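A minimal sketch of that idea, not the answerer's original code. It assumes every ID starts with NM and that a space never occurs in the data, so a space is safe to use as the temporary delimiter:

```r
x <- "NM_001198820_NM_014576_NM_138932"

# Turn each "_NM" into " NM"; the underscore inside every ID is untouched.
y <- gsub("_NM", " NM", x, fixed = TRUE)

# Split on the temporary delimiter.
ids <- strsplit(y, " ", fixed = TRUE)[[1]]
ids
```

If other prefixes (NR_, XM_, ...) can appear, the lookbehind approach from the other answer is probably the safer choice, since it does not depend on the prefix.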
Answer 2 (score: 0)
Use a regular expression to match the text you want and be done with it. I suggest stringr::str_match_all.
library(stringr)
s <- c('NM_001198820_NM_014576_NM_138932_NM_001198819_NM_001198818_NM_138933',
'NM_001198820_NM_014576_NM_138932_NM_001198819_NM_001198818_NM_138933')
str_match_all(s, '([A-Za-z]{2}_\\d+)_?')
Output
[[1]]
[,1] [,2]
[1,] "NM_001198820_" "NM_001198820"
[2,] "NM_014576_" "NM_014576"
[3,] "NM_138932_" "NM_138932"
[4,] "NM_001198819_" "NM_001198819"
[5,] "NM_001198818_" "NM_001198818"
[6,] "NM_138933" "NM_138933"
[[2]]
[,1] [,2]
[1,] "NM_001198820_" "NM_001198820"
[2,] "NM_014576_" "NM_014576"
[3,] "NM_138932_" "NM_138932"
[4,] "NM_001198819_" "NM_001198819"
[5,] "NM_001198818_" "NM_001198818"
[6,] "NM_138933" "NM_138933"
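Afterwards, you can organize the data from the returned list into a data.frame. Note that the second column of each matrix contains the information you want.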
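One possible way to do that last step, as a rough sketch. The shortened input vector and the new1, new2, ... column names are my own choices for illustration, and rows with fewer IDs are padded with NA:

```r
library(stringr)

# Shortened, made-up input with an uneven number of IDs per element.
s <- c("NM_001198820_NM_014576_NM_138932",
       "NM_001198820_NM_014576")

m <- str_match_all(s, "([A-Za-z]{2}_\\d+)_?")

# Keep only the capture group (second column) of each match matrix.
ids <- lapply(m, function(mat) mat[, 2])

# Pad to a common length and bind the rows into a data.frame.
n <- max(lengths(ids))
out <- as.data.frame(
  do.call(rbind, lapply(ids, function(v) c(v, rep(NA_character_, n - length(v))))),
  stringsAsFactors = FALSE
)
names(out) <- paste0("new", seq_len(n))
out
```

The result can then be attached to the original data with cbind(TargetScan, out), provided s was taken from TargetScan$RefSeq in row order.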