How do I remove special characters from a data frame column in R?

Time: 2020-03-19 14:20:13

Tags: r

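Using the tidyverse, I want to remove the special characters from the Education column so that it shows just Masters or Bachelors. Since I am using the tidyverse, I would like a solution that uses the pipe and keeps the data in a data frame.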

2 Answers:

Answer 0: (score: 1)

This is what regular expressions are for:

gsub("[^A-Za-z]", "", c("Master’s ","Professional ","Bachelor’s"))

Which produces:

[1] "Masters"      "Professional" "Bachelors"   
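
Since the question asks for a piped solution that keeps the data frame, the same `gsub()` call can be placed inside `mutate()`. The following is a minimal sketch, assuming dplyr is installed; the data frame `df` and the result name `cleaned` are hypothetical and only mirror the sample values above.

```r
library(dplyr)

# Hypothetical sample data; \u2019 is the right single quotation mark (’)
df <- data.frame(
  Education = c("Master\u2019s ", "Professional ", "Bachelor\u2019s"),
  stringsAsFactors = FALSE
)

# Drop every character that is not an ASCII letter
# (this removes the apostrophes and the trailing spaces)
cleaned <- df %>%
  mutate(Education = gsub("[^A-Za-z]", "", Education))

print(cleaned$Education)
```

Note that this pattern also deletes spaces inside multi-word values, so "High School" would become "HighSchool". A pattern such as `"[^A-Za-z ]"` keeps spaces if that matters for your data.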

Answer 1: (score: 1)

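With dplyr and stringr:

library(dplyr)    # provides %>% and mutate()
library(stringr)  # provides str_replace()
data.frame(Education = c("Master’s ","Professional ","Bachelor’s")) %>% 
   mutate(Education = str_replace(Education,"’",""))

This prints:
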
      Education
1      Masters 
2 Professional 
3     Bachelors
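
Two details of this answer are worth noting: `str_replace()` replaces only the first match in each string, and the printed values still carry their trailing spaces. The sketch below is one possible variation, assuming dplyr and stringr are installed; it uses `str_remove_all()` to drop every apostrophe (curly or straight) and `str_trim()` to strip surrounding whitespace. The names `df` and `cleaned` are hypothetical.

```r
library(dplyr)
library(stringr)

# Hypothetical sample data; \u2019 is the right single quotation mark (’)
df <- data.frame(
  Education = c("Master\u2019s ", "Professional ", "Bachelor\u2019s"),
  stringsAsFactors = FALSE
)

cleaned <- df %>%
  mutate(
    Education = Education %>%
      str_remove_all("[\u2019']") %>%  # remove every curly or straight apostrophe
      str_trim()                       # drop leading and trailing whitespace
  )

print(cleaned$Education)
```

Unlike the letters-only regular expression in the first answer, this approach leaves other characters such as internal spaces and digits untouched, which may be preferable when only the apostrophes are unwanted.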