How do I replace text followed by a comma (,) or a period (.) using a tab-delimited file (txt)?

Time: 2018-06-26 10:14:43

Tags: autohotkey

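I am new to AutoHotkey. I have a script that helps me shorten unwanted words, and I run into a problem when I try to replace text that is followed by a comma or a period. Here is my script, which reads its replacements from a tab-delimited file: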

#NoEnv
#SingleInstance force
SetWorkingDir, %A_ScriptDir%
SendMode, Input
; -- Ctrl + SPACE -> Select all text + replace whole words only + title case
^SPACE::
NonCapitalized := "a|an|in|is|of|the|this|with" ; List of words that shouldn't be capitalized, separated by pipes
ReplacementsFile := "replacements.txt" ; Path to replacements file (tab delimited file with 2 columns, UTF-8-BOM, CR+LF)

Send, ^a ; Selects all text
Gosub, SelectToClip ; Copies the selected text to the clipboard
FileRead, Replacements, % ReplacementsFile ; Reads the replacements file
If ErrorLevel ; Error message if file is not found
{
MsgBox, % "File not found: " ReplacementsFile
Return
}

StringUpper, Clipboard, Clipboard, T ; Whole clipboard to title case
Clipboard := RegExReplace(Clipboard, "i)(?<![!?.]) \b(" NonCapitalized ")\b", " $L1") ; Changes to lowercase all words from the list "NonCapitalized", except those preceded by new line/period/exclamation mark/question mark
pos := 0
While pos := RegExMatch(Replacements, "m`a)^([^\t]+)\t(.*)$", FoundReplace, pos + 1) ; Gets all replacements from the tab delimited file
Clipboard := RegExReplace(Clipboard, "i)\b" FoundReplace1 "\b", FoundReplace2) ; Replaces all occurrences in the clipboard

; add exceptions
Clipboard := StrReplace(Clipboard, "Vice President,", "")
Clipboard := StrReplace(Clipboard, "Director,", "")
Clipboard := StrReplace(Clipboard, "Senior Vice President,", "")

; = End of exceptions

Clipboard := RegExReplace(Clipboard, "^\s+|\s+(?=([\s,;:.]))|\s$") ; Removes extra spaces
Send, ^v ; Pastes the clipboard
Return

SelectToClip:
Clipboard := ""
Send, ^c
ClipWait, 0
If ErrorLevel
Exit
Sleep, 50
Return

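My question is: how can I put text that is immediately followed by a comma (,) or a period (.) into the tab-delimited file, instead of adding more lines to the AHK file? As you can see, the script does not treat the comma and the period as part of the text.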

Thank you very much for your time and help!

1 answer:

Answer 0: (score: 0)

  1. Please indent; otherwise your code is much harder to read.

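  2. In regular expressions, the \b assertion requires a sequence of a word character and a non-word character. This makes your code unable to handle search strings that begin or end with a comma or a period, both of which are non-word characters.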

      

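    ... \b and \B, because they are defined in terms of \w and \W.
      ...
      A word boundary is a position in the subject string where the current character and the previous character do not both match \w or \W (i.e. one matches \w and the other matches \W), or the start or end of the string if the first or last character matches \w, respectively.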
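
To see the difference in isolation, here is a minimal sketch in AutoHotkey v1 syntax. The sample text and the variable names (`text`, `needle`, `withB`, `withLook`) are made up for illustration. The second pattern swaps `\b` for lookarounds, which is one common workaround rather than the only possible one.

```autohotkey
#NoEnv
text   := "Jane Doe, Director, Acme"   ; hypothetical sample text
needle := "Director,"                  ; search string ending in a non-word character

; With \b on both sides: the trailing \b needs a word character right after the
; comma, but a space follows, so nothing matches and the text stays unchanged.
withB := RegExReplace(text, "i)\b" needle "\b", "")

; With lookarounds: only require that no word character touches the match.
; \Q...\E makes the needle literal, so "." is not treated as regex syntax.
withLook := RegExReplace(text, "i)(?<!\w)\Q" needle "\E(?!\w)", "")

MsgBox % "With \b: " withB "`nWith lookarounds: " withLook
```

The lookaround version still behaves like a whole-word match for ordinary words, because a match cannot start or end in the middle of a run of word characters.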

The following has been tested and works:

#NoEnv
#SingleInstance force
SetWorkingDir %A_ScriptDir%
SendMode Input
; -- Ctrl + SPACE -> Select all text + replace whole words only + title case
^SPACE::
FunctionNameOfYourChoice() {
    ; Using static vars allows you to avoid reading the file over and over on each key press.
    Static NonCapitalized   := "a|an|in|is|of|the|this|with" ; List of words that shouldn't be capitalized, separated by pipes
         , ReplacementsFile := "replacements.txt" ; Path to replacements file (tab delimited file with 2 columns, UTF-8-BOM, CR+LF)
         , Replacements     := ReadReplacements(ReplacementsFile)

    Send ^a ; Selects all text
    SelectToClip() ; Copies the selected text to the clipboard
    If (Replacements = "") { ; Error message if the file could not be read or is empty (FileRead leaves its output variable blank on failure)
        MsgBox % "File not found: " ReplacementsFile
        Return
    }

    ; 3. StringUpper is deprecated in v2.
    ; 4. Better to work on a plain variable than on the clipboard in terms of performance and reliability.
    cbCnt := Format("{:T}", Clipboard)   ; Whole clipboard to title case
    ; Changes to lowercase all words from the list "NonCapitalized", except those preceded by new line/period/exclamation mark/question mark
    cbCnt := RegExReplace(cbCnt, "i)(?<![!?.]) \b(" NonCapitalized ")\b", " $L1")
    ; Goes through each pair of search and replacement strings
    Loop Parse, Replacements, `n, `r
        FoundReplace := StrSplit(A_LoopField, "`t")
        ; Replaces all occurrences in the clipboard
        , cbCnt := RegExReplace(cbCnt, "i)(?<!\w)\Q" FoundReplace.1 "\E(?!\w)", FoundReplace.2)   ; 5.
    cbCnt := RegExReplace(cbCnt, "(?<=\w-)([a-z])", "$U1")   ; 6.
/*
    ; Now the following can be included in the replacements.txt file.
    cbCnt := StrReplace(cbCnt, "Vice President,")
    cbCnt := StrReplace(cbCnt, "Director,")
    cbCnt := StrReplace(cbCnt, "Senior Vice President,")
*/
    ; Removes extra spaces
    ; This also removes all newlines. Are you sure you want to do this?
    Clipboard := RegExReplace(cbCnt, "^\s+|\s+(?=([\s,;:.]))|\s$")
    Send ^v ; Pastes the clipboard
}

SelectToClip() {
    Clipboard := ""
    Send ^c
    ClipWait 0.5   ; Specifying 0 wouldn't be a very good idea.
    If ErrorLevel
        Exit
    Sleep 50
}

ReadReplacements(path) {
    FileRead, Replacements, % path
    Return Replacements
}
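
As a rough illustration of moving the former exceptions into replacements.txt, the sketch below builds the file contents in memory instead of reading them from disk. The entries, the sample text, and the names `pair` and `text` are assumptions for the example only. An empty second column deletes the match, and the longer phrase is listed first so that "Vice President," does not consume part of "Senior Vice President,".

```autohotkey
#NoEnv
; Hypothetical contents of replacements.txt: one "search<TAB>replacement" pair per line.
Replacements := "Senior Vice President,`t`r`n"
Replacements .= "Vice President,`t`r`n"
Replacements .= "Director,`t`r`n"
Replacements .= "Incorporated`tInc."

text := "John Smith, Senior Vice President, Acme Incorporated"   ; made-up sample

Loop Parse, Replacements, `n, `r
{
    pair := StrSplit(A_LoopField, "`t")   ; pair.1 = search text, pair.2 = replacement
    if (pair.1 = "")                      ; skip blank lines
        continue
    ; Same lookaround pattern as in the answer: the search text is matched literally
    text := RegExReplace(text, "i)(?<!\w)\Q" pair.1 "\E(?!\w)", pair.2)
}
MsgBox % text
```

Note that the replacement column is still interpreted by RegExReplace, so a `$` in it would be read as a backreference.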


Edit

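  1. Yes, there was a typo in the second regular expression (in its first assertion), which has now been corrected. The problem with "and" should no longer occur.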

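  2. I added another RegExReplace as a less-than-elegant stopgap for the problem you described with hyphenated words. Note, however, that this is inherently a non-trivial problem, because the capitalization of such words depends on semantics.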
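
A small sketch of what that extra RegExReplace does, and why it is only a stopgap. The input string is a made-up example of text in which only the first letter of each space-separated word has been capitalized.

```autohotkey
#NoEnv
; Hypothetical input after a title-casing step
text := "Well-known State-of-the-art Design"

; (?<=\w-) requires "word character + hyphen" right before the match;
; $U1 inserts group 1 converted to uppercase.
fixed := RegExReplace(text, "(?<=\w-)([a-z])", "$U1")

MsgBox % fixed   ; shows: Well-Known State-Of-The-Art Design
```

"Well-Known" comes out as wanted, but "of" and "the" inside the compound are capitalized as well, even though they are on the NonCapitalized list. Deciding which parts of a hyphenated word to capitalize needs more than a pattern match.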