Perl: append a substring inside a string

Time: 2019-08-15 20:09:40

Tags: perl

I'm using Perl, and I want to append a substring inside a string.

What I have:

<users used_id="222" user_base="0" user_name="Mike" city="Chicago"/>
<users used_id="333" user_base="0" user_name="Jim Beans" city="Ann Arbor"/>

What I want is to add ", Mr" after the user name, like this:

<users used_id="222" user_base="0" user_name="Mike, Mr" city="Chicago"/>
<users used_id="333" user_base="0" user_name="Jim Beans, Mr" city="Ann Arbor"/>

The problem is that I don't know how to do it. Here is what I have so far. No XML libraries, please.

#!/usr/bin/perl

use strict;
use warnings;

print "\nPerl Starting ... \n\n"; 

while (my $recordLine =<DATA>) 
{
    chomp($recordLine);
    #print "$recordLine ...\n";

    if (index($recordLine, "user_name") != -1) 
    {
        #Found user_name tag ... now append Mr. after the name at the end ... how?
        $recordLine =~ s/user_name=".*?"/user_name=" Mr"/g; 
        print "recordLine: $recordLine ...\n";

    }
}

print "\nPerl End ... \n\n"; 

__DATA__
<users used_id="222" user_base="0" user_name="Mike" city="Chicago"/>
<users used_id="333" user_base="0" user_name="Jim Beans" city="Ann Arbor"/>

1 answer:

Answer 0 (score: 3)

You're almost there.

The .*? sequence in your regex stands for some characters from the original input, and you also have to find a way to include those characters in the output.

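This is done with a capture group (a part of the regex enclosed in parentheses) in the pattern, plus a reference to $1 (which means the contents of the first capture group) in the replacement.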

$recordLine =~ s/user_name="(.*?)"/user_name="$1, Mr"/g;
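For a version that runs on its own, here is a minimal sketch of the capture-group approach applied to the two sample lines from the question. The helper name append_to_user_name and the @records array are made up for illustration and are not part of the original post. The sketch also uses [^"]* instead of .*? so the match cannot run past the closing quote; that is a stylistic choice, and .*? would behave the same on this data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): appends $suffix to the
# value of the user_name attribute on one line of input.
sub append_to_user_name {
    my ($line, $suffix) = @_;
    # The parentheses capture the current attribute value; $1 puts it back
    # into the replacement, followed by the suffix.
    $line =~ s/\buser_name="([^"]*)"/user_name="$1$suffix"/g;
    return $line;
}

# Sample records copied from the question.
my @records = (
    '<users used_id="222" user_base="0" user_name="Mike" city="Chicago"/>',
    '<users used_id="333" user_base="0" user_name="Jim Beans" city="Ann Arbor"/>',
);

print append_to_user_name($_, ', Mr'), "\n" for @records;
```

Lines that contain no user_name attribute pass through unchanged, so the index() check from the question is optional with this approach.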