Removing special characters from files on Linux

Time: 2011-04-06 22:44:12

Tags: linux special-characters

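I have a lot of *.java and *.xml files, but someone wrote some of the comments and strings in Spanish. I have been searching the web for a way to remove those characters.

As an example, I tried find . -type f -exec sed 's/[áíéóúñ]//g' DefaultAuthoritiesPopulator.java. How can I remove these characters from the many other files in the subfolders?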

4 answers:

Answer 0: (score: 0)

If this is really what you want, you can use find almost exactly the way you were already using it.

find -type f \( -iname '*.java' -or -iname '*.xml' \) -execdir sed -i 's/[áíéóúñ]//g' '{}' ';'

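Differences:

  • If no path is given, the path . is implied.
  • This command only operates on *.java and *.xml files.
  • -execdir is safer than -exec (read the man page).
  • -i tells sed to modify the file given as its argument in place. Read the man page to see how to make backups with it.
  • {} is the placeholder that find replaces with the path of each file.
  • ; is part of the find syntax for -exec / -execdir: it marks the end of the command.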
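
The -i bullet above mentions backups. A minimal sketch of that, assuming GNU find and GNU sed (where a suffix attached to -i keeps a copy of the original file). The scratch directory and the file name Demo.java are made up so the command can be tried safely:

```shell
# Scratch tree with one sample file; the octal escapes are the UTF-8 bytes of ó and á.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub"
printf '// Autenticaci\303\263n b\303\241sica\n' > "$workdir/src/sub/Demo.java"

# -i.bak edits each file in place and keeps the original as <name>.bak (GNU sed).
find "$workdir/src" -type f \( -iname '*.java' -or -iname '*.xml' \) \
    -execdir sed -i.bak 's/[áíéóúñ]//g' '{}' ';'

cat "$workdir/src/sub/Demo.java"      # edited file, accented letters removed
cat "$workdir/src/sub/Demo.java.bak"  # untouched copy of the original
```

Once the result looks right, the .bak files can be deleted with a second find run.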

Answer 1: (score: 0)

Almost there :)

find . -type f -exec sed -i 's/[áíéóúñ]//g' {} \;
                         ^^                 ^^

From sed(1):

   -i[SUFFIX], --in-place[=SUFFIX]
          edit files in place (makes backup if extension supplied)

From find(1):

   -exec command ;
          Execute command; true if 0 status is returned.  All
          following arguments to find are taken to be arguments to
          the command until an argument consisting of `;' is
          encountered.  The string `{}' is replaced by the current
          file name being processed everywhere it occurs in the
          arguments to the command, not just in arguments where it
          is alone, as in some versions of find.  Both of these
          constructions might need to be escaped (with a `\') or
          quoted to protect them from expansion by the shell.  See
          the EXAMPLES section for examples of the use of the -exec
          option.  The specified command is run once for each
          matched file.  The command is executed in the starting
          directory.   There are unavoidable security problems
          surrounding use of the -exec action; you should use the
          -execdir option instead.
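
Because sed -i rewrites files immediately, it may be worth listing the affected files first. A small sketch of such a dry run, using made-up file names in a scratch directory; -exec is used here only to read files, and the {} + form hands many file names to a single grep:

```shell
# Scratch tree: one file with an accented letter (ñ as UTF-8 bytes), one without.
preview_dir=$(mktemp -d)
printf 'String s = "ma\303\261ana";\n' > "$preview_dir/Accented.java"
printf 'String s = "tomorrow";\n'      > "$preview_dir/Plain.java"

# grep -l prints only the names of files that contain at least one match.
matches=$(find "$preview_dir" -type f -name '*.java' -exec grep -l '[áíéóúñ]' {} +)
printf '%s\n' "$matches"
```

Only the files printed here would be changed by the sed command.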

Answer 2: (score: 0)

tr is the tool for the job:

NAME
       tr - translate or delete characters

SYNOPSIS
       tr [OPTION]... SET1 [SET2]

DESCRIPTION
       Translate, squeeze, and/or delete characters from standard input, writing to standard
       output.

       -c, -C, --complement
              use the complement of SET1

       -d, --delete
              delete characters in SET1, do not translate

       -s, --squeeze-repeats
              replace each input sequence of a repeated character that is listed in SET1  with  a
              single occurrence of that character

Piping your input through tr -d áíéóúñ will probably do what you want.
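
Since tr only reads standard input and writes standard output, applying it to a file needs a temporary copy. A minimal sketch with a made-up file name. One caveat to keep in mind: GNU tr works on bytes rather than characters, so on UTF-8 text it deletes the individual bytes of the listed letters and can damage other accented characters that share a byte with them:

```shell
# Sample input; the octal escapes are the UTF-8 bytes of ñ and é.
tr_dir=$(mktemp -d)
printf 'ma\303\261ana, caf\303\251\n' > "$tr_dir/notes.txt"

# tr cannot edit in place: write to a temporary file, then replace the original.
tr -d 'áíéóúñ' < "$tr_dir/notes.txt" > "$tr_dir/notes.txt.tmp" &&
    mv "$tr_dir/notes.txt.tmp" "$tr_dir/notes.txt"

cat "$tr_dir/notes.txt"
```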

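Answer 3: (score: 0)

Why remove only the characters with diacritics? It may be worth removing every character whose code is outside the 0-127 range. If you are sure the files should not contain anything beyond 7-bit ASCII, the removal expression would be s/[\x80-\xFF]//g (GNU sed escape syntax, run under LC_ALL=C so that the range is matched byte by byte).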
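
A sketch of how that could be applied to a whole tree, assuming GNU sed for both -i and the \xHH escapes. The scratch directory and Sample.xml are made up for the demonstration:

```shell
# Sample file containing ñ, ¿ and é as UTF-8 byte sequences.
ascii_dir=$(mktemp -d)
printf 'ni\303\261o, \302\277qu\303\251?\n' > "$ascii_dir/Sample.xml"

# LC_ALL=C makes sed treat the input as single bytes, so the range
# \x80-\xFF covers every byte outside 7-bit ASCII.
LC_ALL=C find "$ascii_dir" -type f \( -name '*.java' -o -name '*.xml' \) \
    -exec sed -i 's/[\x80-\xFF]//g' {} +

cat "$ascii_dir/Sample.xml"
```

Note that this deletes every non-ASCII character, including ones such as ¿ that the original character list did not cover.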