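What I want to achieve is to match all words in a text, but ignore lines that start with 4 spaces (at the beginning of a new line).
Example
Text file in which to find words: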
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat.
    This must NOT be matched. Because it has 4 whitespaces at the beginning.
Lorem ipsum dolor sit amet. Ut enim ad minim veniam.
So the words in the following line should not be treated as matches for the pattern:
    This must NOT be matched. Because it has 4 whitespaces at the beginning.
Code
This is my regular expression, which finds all words:
\\b[A-Za-z]+\\b
I know that Java's regex syntax has an "except" operator, the ^ symbol, but I only know how to use it in simpler expressions.
Answer 0: (score: 2)
Maybe the following snippet could serve as a basis for what you want to achieve.
import java.util.Arrays;

String[] lines = {"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do",
        "eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut",
        "enim ad minim veniam, quis nostrud exercitation ullamco laboris",
        "nisi ut aliquip ex ea commodo consequat.",
        "",
        "    This must NOT be matched. Because it has 4 whitespaces at the beginning.",
        "",
        "Lorem ipsum dolor sit amet. Ut enim ad minim veniam."};
for (String line : lines) {
    // Skip every line that begins with four spaces.
    if (!line.startsWith("    ")) {
        // Split on runs of punctuation and whitespace to get the words.
        String[] words = line.split("[\\p{IsPunctuation}\\p{IsWhite_Space}]+");
        System.out.println("words = " + Arrays.toString(words));
    }
}
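If you would rather keep the \\b[A-Za-z]+\\b pattern from the question instead of splitting, the same line filter can be combined with a Matcher. This is only a minimal sketch: the class name WordFinder, the method findWords and the shortened sample text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordFinder {
    // The word pattern from the question, written as a Java string literal.
    private static final Pattern WORD = Pattern.compile("\\b[A-Za-z]+\\b");

    // Hypothetical helper: collects every word from the lines
    // that do not start with four spaces.
    static List<String> findWords(String text) {
        List<String> words = new ArrayList<>();
        for (String line : text.split("\\R")) { // \R matches any line break
            if (line.startsWith("    ")) {
                continue; // indented line: skip it entirely
            }
            Matcher m = WORD.matcher(line);
            while (m.find()) {
                words.add(m.group());
            }
        }
        return words;
    }

    public static void main(String[] args) {
        String text = "Lorem ipsum dolor.\n"
                + "\n"
                + "    This must NOT be matched.\n"
                + "\n"
                + "Ut enim ad minim veniam.";
        // prints [Lorem, ipsum, dolor, Ut, enim, ad, minim, veniam]
        System.out.println(findWords(text));
    }
}
```

Unlike split, this does not produce empty strings for blank lines, because only actual matches are collected.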
PS: the split regex is borrowed from this answer.
Answer 1: (score: 1)
The following should do it:
(?<!\\s{4})\\b[A-Za-z]+\\b
It starts with a negative lookbehind, so it will not match anything that is directly preceded by four whitespace characters.
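Note that the lookbehind only looks at the four characters directly before each word, so on an indented line it rejects the first word but still accepts the rest of the line. The sketch below shows that behaviour next to one possible single-regex alternative: a multiline pattern whose first branch consumes a whole indented line and whose second branch captures ordinary words in group 1. The alternative pattern, the class name and the method names are assumptions made for illustration; they are not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IndentAwareWords {
    // The lookbehind pattern from the answer, as a Java string literal.
    static final Pattern LOOKBEHIND =
            Pattern.compile("(?<!\\s{4})\\b[A-Za-z]+\\b");

    // Hypothetical alternative: branch 1 swallows a complete line that starts
    // with four spaces, branch 2 captures an ordinary word in group 1.
    static final Pattern SKIP_INDENTED =
            Pattern.compile("^ {4}.*$|\\b([A-Za-z]+)\\b", Pattern.MULTILINE);

    static List<String> withLookbehind(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = LOOKBEHIND.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    static List<String> skippingIndentedLines(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = SKIP_INDENTED.matcher(text);
        while (m.find()) {
            // group(1) is null when the indented-line branch matched.
            if (m.group(1) != null) {
                words.add(m.group(1));
            }
        }
        return words;
    }

    public static void main(String[] args) {
        String text = "Lorem ipsum\n    This must NOT\ndolor sit";
        // prints [Lorem, ipsum, must, NOT, dolor, sit]
        System.out.println(withLookbehind(text));
        // prints [Lorem, ipsum, dolor, sit]
        System.out.println(skippingIndentedLines(text));
    }
}
```

The alternation works because find() scans from left to right: the start of an indented line is reached before any word inside it, so the first branch consumes the whole line before the word branch gets a chance.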