Cmd to powershell replace - special characters

Date: 2017-01-05 15:04:47

Tags: powershell batch-file ascii apostrophe

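I am creating a script that copies a file, renames it, and then looks inside it to remove certain special characters. One of those characters is some kind of ASCII apostrophe that I can't type on the keyboard. I can copy and paste it, but then the replace function doesn't work.

Open the file > search for the weird apostrophe ’ and replace it with nothing. I would prefer to replace it with a normal apostrophe, but I don't know how that is done, and right now the biggest problem is that I can't get the script to "see" this weird apostrophe, which ends up in the automatically generated file I am modifying. Any help is much appreciated. Thanks :)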

Apostrophe in the file: ’

Normal apostrophe: '

Here is a small part of the batch file I am using for testing.

    @echo off

    set YYMMDD=%DATE:~-2,2%%DATE:~-7,2%%DATE:~-10,2%
    set DDMMYYYY=%DATE:~-10,2%%DATE:~-7,2%%DATE:~-4,4%
    set YYYY-MM-DD=%DATE:~-4,4%-%DATE:~-7,2%-%DATE:~-10,2%

    powershell -Command "(gc 'C:\LOCATION\Client_List_%DDMMYYYY%.csv') -replace '’', '' | Out-File 'C:\LOCATION\Client_List_%DDMMYYYY%.csv'"

    Echo Done

2 answers:

Answer 0 (score: 1)

set "fileIn=C:\LOCATION\Client_List_%DDMMYYYY%.csv"
set "fileOu=C:\LOCATION\Client_List_%DDMMYYYY%.csv"
powershell -c "(gc '%fileIn%').Replace('‘‘','').Replace('’’','')|Out-File '%fileOu%'"

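The weird apostrophe is U+2019 Right Single Quotation Mark, which is supposedly a closing quote. It could be paired with different opening quotes; in the example above, that is U+2018 Left Single Quotation Mark.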

Get-Help 'about_Quoting_Rules'

  

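Quotation marks are used to specify a literal string. You can enclose a string in single quotation marks (') or double quotation marks (").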

In fact, PowerShell accepts more than two different quotation marks:

  • double quotes: " “ ” „
  • single quotes: ' ‘ ’ ‚ ‛

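AFAIK, all of these quotation marks are present in most Windows ANSI code pages (1252, 1250, 1257, 1253, 1251, 1254, 1255, 1256, 1258), so they can be used literally in an ANSI-saved .bat script, except for the last one, U+201B Single High-Reversed-9 Quotation Mark. In that case, use $([char]0x201B) instead of '‛‛', as follows: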

rem        cast [char] to `[string]`    ↓↓↓↓↓↓↓↓
powershell -c "(gc '%fileIn%').Replace( [string]$([char]0x201B) , '')"
rem                                             ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑

or as follows:

rem [char] can't be empty so specify `[string]`           ↓↓↓↓↓↓↓↓
powershell -c "(gc '%fileIn%').Replace( $([char]0x201B) , [string]'')"
rem                                     ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑
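
The question also asks about replacing the character with a normal apostrophe instead of deleting it. Here is a minimal PowerShell sketch of that variant, not a tested drop-in for the real file. Both characters are built from their code points, so nothing depends on how the script itself is saved. The temp file name and the sample content are made up for the demonstration, and the file is assumed to be UTF-8; adjust -Encoding to whatever the real file uses.

```powershell
# Demo file in the temp folder (hypothetical name and content).
$path  = Join-Path ([System.IO.Path]::GetTempPath()) 'Client_List_demo.csv'
$weird = [string][char]0x2019   # RIGHT SINGLE QUOTATION MARK
$plain = [string][char]0x0027   # APOSTROPHE
Set-Content -LiteralPath $path -Value ('O' + $weird + 'Brien,1') -Encoding UTF8

# Parentheses make Get-Content finish reading before the same file is rewritten.
$fixed = (Get-Content -LiteralPath $path -Encoding UTF8).Replace($weird, $plain)
Set-Content -LiteralPath $path -Value $fixed -Encoding UTF8

$result = Get-Content -LiteralPath $path -Encoding UTF8
$result
```

From the batch file, the same idea would presumably be `.Replace([string][char]0x2019, [string][char]0x27)` inside the `powershell -c "..."` line, since that expression contains no quotation marks at all.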

Analysis and explanation

The next PowerShell snippet shows an excerpt from the Unicode database (characters whose names end with Quotation Mark or contain Apostrophe):

PS D:\> 0x22,0x27,0x00AB,0x00BB,0x2018,0x2019,0x201A,0x201B,0x201C,0x201D,0x201E,0x201F,
  0x2039,0x203A,0x2E42,0x301D,0x301E,0x301F,0x055A | Get-CharInfo | Format-Table -AutoSize

Char CodePoint                Category Description                               
---- ---------                -------- -----------                               
   " U+0022           OtherPunctuation Quotation Mark                            
   ' U+0027           OtherPunctuation Apostrophe                                
   « U+00AB    InitialQuotePunctuation Left-Pointing Double Angle Quotation Mark 
   » U+00BB      FinalQuotePunctuation Right-Pointing Double Angle Quotation Mark
   ‘ U+2018    InitialQuotePunctuation Left Single Quotation Mark                
   ’ U+2019      FinalQuotePunctuation Right Single Quotation Mark               
   ‚ U+201A            OpenPunctuation Single Low-9 Quotation Mark               
   ‛ U+201B    InitialQuotePunctuation Single High-Reversed-9 Quotation Mark     
   “ U+201C    InitialQuotePunctuation Left Double Quotation Mark                
   ” U+201D      FinalQuotePunctuation Right Double Quotation Mark               
   „ U+201E            OpenPunctuation Double Low-9 Quotation Mark               
   ‟ U+201F    InitialQuotePunctuation Double High-Reversed-9 Quotation Mark     
   ‹ U+2039    InitialQuotePunctuation Single Left-Pointing Angle Quotation Mark 
   › U+203A      FinalQuotePunctuation Single Right-Pointing Angle Quotation Mark
   ⹂ U+2E42           OtherNotAssigned Undefined                                 
   〝 U+301D            OpenPunctuation Reversed Double Prime Quotation Mark      
   〞 U+301E           ClosePunctuation Double Prime Quotation Mark               
   〟 U+301F           ClosePunctuation Low Double Prime Quotation Mark           
   ՚ U+055A           OtherPunctuation Armenian Apostrophe                       

(Output of a modified Get-CharInfo cmdlet.) The original Get-CharInfo module can be downloaded from http://poshcode.org/5234.
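
Get-CharInfo is a third-party module. If it is not at hand, the first three columns of the table can be approximated with built-in .NET calls, as in the rough sketch below. It does not attempt to show character names, and the list of code points is just a small sample chosen for illustration.

```powershell
# Sample code points: apostrophe plus three typographic single quotes.
$codePoints = 0x0027, 0x2018, 0x2019, 0x201B

$info = foreach ($cp in $codePoints) {
    $ch = [char]$cp
    [pscustomobject]@{
        Char      = $ch
        CodePoint = 'U+{0:X4}' -f $cp
        # Unicode general category as reported by .NET
        Category  = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($ch)
    }
}

$info | Format-Table -AutoSize
```

The same loop can be pointed at the characters of a line read from the CSV file to find out which code point the "weird" character really is.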

The next PowerShell script completes the above results by showing some valid (and, in my locale, some invalid) combinations of quotation marks:

$arrSingleQuotes = 
 ''' U+0027 Apostrophe '''                                ,
 ‘‘‘ U+2018 Left Single Quotation Mark ‘‘‘                ,
 ’’’ U+2019 Right Single Quotation Mark ’’’               ,
 ‚‚‚ U+201A Single Low-9 Quotation Mark ‚‚‚               ,
 ‛‛‛ U+201B Single High-Reversed-9 Quotation Mark ‛‛‛     ,
 ‘‘‘ U+2018 (Left/Right) Single Quotation Mark U+2019 ’’’ ,
 ’’’ U+2019 (Right/Left) Single Quotation Mark U+2018 ‘‘‘
'$arrSingleQuotes (any combination)'
 $arrSingleQuotes

$arrDoubleQoutes = 
 """ U+0022 Quotation Mark """                            ,
 “““ U+201C Left Double Quotation Mark “““                ,
 ””” U+201D Right Double Quotation Mark ”””               ,
 „„„ U+201E Double Low-9 Quotation Mark „„„               ,
 “““ U+201C (Left/Right) Double Quotation Mark U+201D ””” ,
 ””” U+201D (Right/Left) Double Quotation Mark U+201C “““
'$arrDoubleQoutes (any combination)'
 $arrDoubleQoutes

$noQuotes = @"
 « U+00AB Left-Pointing Double Angle Quotation Mark
 » U+00BB Right-Pointing Double Angle Quotation Mark
 ‟ U+201F Double High-Reversed-9 Quotation Mark
 ⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK
 ‹ U+2039 Single Left-Pointing Angle Quotation Mark
 › U+203A Single Right-Pointing Angle Quotation Mark
〝 U+301D Reversed Double Prime Quotation Mark
 〞U+301E Double Prime Quotation Mark
 〟U+301F Low Double Prime Quotation Mark
 ՚ U+055A Armenian Apostrophe                       
"@
'$noQuotes'
 $noQuotes

Output

PS D:\> D:\PShell\SO\41488245_quotes.ps1

$arrSingleQuotes (any combination)
' U+0027 Apostrophe '
‘ U+2018 Left Single Quotation Mark ‘
’ U+2019 Right Single Quotation Mark ’
‚ U+201A Single Low-9 Quotation Mark ‚
‛ U+201B Single High-Reversed-9 Quotation Mark ‛
‘ U+2018 (Left/Right) Single Quotation Mark U+2019 ’
’ U+2019 (Right/Left) Single Quotation Mark U+2018 ‘

$arrDoubleQoutes (any combination)
" U+0022 Quotation Mark "
“ U+201C Left Double Quotation Mark “
” U+201D Right Double Quotation Mark ”
„ U+201E Double Low-9 Quotation Mark „
“ U+201C (Left/Right) Double Quotation Mark U+201D ”
” U+201D (Right/Left) Double Quotation Mark U+201C “

$noQuotes
 « U+00AB Left-Pointing Double Angle Quotation Mark
 » U+00BB Right-Pointing Double Angle Quotation Mark
 ‟ U+201F Double High-Reversed-9 Quotation Mark
 ⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK
 ‹ U+2039 Single Left-Pointing Angle Quotation Mark
 › U+203A Single Right-Pointing Angle Quotation Mark
〝 U+301D Reversed Double Prime Quotation Mark
 〞U+301E Double Prime Quotation Mark
 〟U+301F Low Double Prime Quotation Mark
 ՚ U+055A Armenian Apostrophe                       

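Note that ⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK is present in the Unicode database and is rendered correctly in PowerShell ISE.

Addendum: I found more candidates for quotation marks (showing only the results obtained from the Excerpt_From_UnicodeDataTxt.ps1 script):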

Answer 1 (score: 0)

I think it is a weird backtick character. At least, that is how it behaves.

If I do this:

$text = "Weird ’ Normal ' Backtick ` Weird ’ "
$text.Replace("’","")

it gives me this:

Weird  Normal ' Backtick Weird

Does this work?

powershell -Command "(gc 'C:\LOCATION\Client_List_%DDMMYYYY%.csv').replace('’’', '') |
 Out-File 'C:\LOCATION\Client_List_%DDMMYYYY%.csv'"

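Doubling a normal backtick makes the script treat the character literally. Doubling the weird apostrophe seems to do the same thing; at least, it worked in my tests.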
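
A likely reason the doubling works, consistent with the script output in the other answer: PowerShell's parser appears to treat the typographic single quotes as quote characters, so inside a single-quoted string they are escaped by doubling, just like a plain apostrophe. The sketch below is one way to check this without pasting the character into a script file. The source text is built from the code point and parsed with [scriptblock]::Create; the sample string is made up.

```powershell
$q = [string][char]0x2019   # the weird apostrophe, built from its code point

# Same source text as typing a single-quoted string that contains the
# weird apostrophe twice in a row between the letters a and b.
$source = "'a" + $q + $q + "b'"

# Parse and run that source text; the value of the string literal comes back.
$result = & ([scriptblock]::Create($source))

$result.Length      # number of characters in the parsed string
[int]$result[1]     # code point of the middle character, in decimal
```

If the parser behaves as described, the doubled character collapses to a single U+2019, which is why `'’’'` in the command above would match one occurrence in the file.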