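Tonight I'm trying to parse the words in a file, and I want to strip out all punctuation while keeping upper- and lowercase letters and spaces.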
String alpha = word.replaceAll("[^a-zA-Z]", "");
This replaces everything, including the spaces.
Run on a text file containing
Testing, testing, 1, one, 2, two, 3, three.
the output becomes
TESTINGTESTINGONETWOTHREE
However, when I change it to
String alpha = word.replaceAll("[^a-zA-Z\\s]", "");
the output does not change.
Here is the full code:
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class UpperCaseScanner {
    public static void main(String[] args) throws FileNotFoundException {
        //First, define the filepath the program will look for.
        String filename = "file.txt"; //Filename
        String targetFile = "";
        String workingDir = System.getProperty("user.dir");
        targetFile = workingDir + File.separator + filename; //Full filepath.
        //System.out.println(targetFile); //Debug code, prints the filepath.
        Scanner fileScan = new Scanner(new File(targetFile));
        while(fileScan.hasNext()){
            String word = fileScan.next();
            //Replace non-alphabet characters with empty char.
            String alpha = word.replaceAll("[^a-zA-Z\\s]", "");
            System.out.print(alpha.toUpperCase());
        }
        fileScan.close();
    }
}
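file.txt has a single line, which reads Testing, testing, 1, one, 2, two, 3, three.
My goal is for the output to read Testing Testing One Two Three
Am I just doing something wrong in the regex, or is there something else I need to do? In case it's relevant, I'm using 32-bit Eclipse 2.0.2.2.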
Answer 0 (score: 3)
System.out.println(str.replaceAll("\\p{P}", "")); //Removes Unicode punctuation only
System.out.println(str.replaceAll("[^a-zA-Z]", "")); //Removes whitespace, punctuation and digits
System.out.println(str.replaceAll("[^a-zA-Z\\s]", "")); //Removes punctuation and digits, keeps whitespace
System.out.println(str.replaceAll("\\s+", "")); //Removes whitespace only
System.out.println(str.replaceAll("\\p{Punct}", "")); //Removes ASCII punctuation only
System.out.println(str.replaceAll("\\W", "")); //Removes whitespace and punctuation, keeps digits and underscores
System.out.println(str.replaceAll("\\p{Punct}+", "")); //Same result as \p{Punct}, matched in runs
System.out.println(str.replaceAll("\\p{Punct}|\\d", "")); //Removes ASCII punctuation and digits
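To see how these patterns differ on the sample line from the question, here is a minimal sketch that runs each one against the same string. The class name `RegexComparison` and the helper `strip` are made up for illustration; only `String.replaceAll` comes from the snippets above.

```java
public class RegexComparison {
    // Hypothetical helper: removes every match of the given regex from the input.
    static String strip(String input, String regex) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String str = "Testing, testing, 1, one, 2, two, 3, three.";

        // The same patterns as in the list above.
        String[] patterns = {
            "\\p{P}",
            "[^a-zA-Z]",
            "[^a-zA-Z\\s]",
            "\\s+",
            "\\p{Punct}",
            "\\W",
            "\\p{Punct}+",
            "\\p{Punct}|\\d"
        };

        // Brackets around the result make leading, trailing and doubled spaces visible.
        for (String pattern : patterns) {
            System.out.printf("%-16s -> [%s]%n", pattern, strip(str, pattern));
        }
    }
}
```

On this input, `[^a-zA-Z\\s]` leaves doubled spaces where the digits used to be, so a follow-up `replaceAll("\\s+", " ")` is needed if single spaces are wanted.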
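Answer 1 (score: 2)
I was able to get the output you're looking for with the code below. I wasn't sure whether you wanted runs of whitespace collapsed into a single space, which is why I added a second replaceAll call that converts multiple spaces into one.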
public class RemovePunctuation {
    public static void main(String[] args) {
        String input = "Testing, testing, 1, one, 2, two, 3, three.";
        String alpha = input.replaceAll("[^a-zA-Z\\s]", "").replaceAll("\\s+", " ");
        System.out.println(alpha);
    }
}
This outputs:
Testing testing one two three
If you want the first character of each word capitalized (as shown in your question), you can do this:
public class Foo {
    public static void main(String[] args) {
        String input = "Testing, testing, 1, one, 2, two, 3, three.";
        String alpha = input.replaceAll("[^a-zA-Z\\s]", "").replaceAll("\\s+", " ");
        System.out.println(alpha);
        StringBuilder upperCaseWords = new StringBuilder();
        String[] words = alpha.trim().split("\\s");
        for(String word : words) {
            //Skip empty tokens so charAt(0) cannot throw on blank input.
            if(word.isEmpty()) continue;
            String upperCase = Character.toUpperCase(word.charAt(0)) + word.substring(1) + " ";
            upperCaseWords.append(upperCase);
        }
        //trim() drops the space appended after the last word.
        System.out.println(upperCaseWords.toString().trim());
    }
}
Which outputs:
Testing testing one two three
Testing Testing One Two Three
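One note on the code in the question: with its default delimiter, Scanner.next() returns whitespace-delimited tokens, so the word you get back never contains a space. That is why adding \\s to the character class changes nothing there, and System.out.print then writes the cleaned tokens back to back. A minimal sketch of the same loop that re-inserts the spaces itself follows; the class name `TokenCleaner` and the helper `cleanTokens` are made up for illustration, and a Scanner over a String stands in for the file.

```java
import java.util.Scanner;
import java.util.StringJoiner;

public class TokenCleaner {
    // Hypothetical helper: strips non-letters from each whitespace-delimited token,
    // capitalizes the first letter, and joins the surviving tokens with single spaces.
    static String cleanTokens(Scanner scanner) {
        StringJoiner joined = new StringJoiner(" ");
        while (scanner.hasNext()) {
            String alpha = scanner.next().replaceAll("[^a-zA-Z]", "");
            if (alpha.isEmpty()) {
                continue; // Tokens such as "1," contain no letters at all.
            }
            joined.add(Character.toUpperCase(alpha.charAt(0)) + alpha.substring(1));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Assumption: in the real program this would be new Scanner(new File(targetFile)).
        try (Scanner scanner = new Scanner("Testing, testing, 1, one, 2, two, 3, three.")) {
            System.out.println(cleanTokens(scanner));
        }
    }
}
```

Reading whole lines with nextLine() and applying the two replaceAll calls from this answer would work as well; the token-based version just keeps the structure of the original loop.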
Answer 2 (score: 1)
I believe Java supports
\p{Punct}
which removes all punctuation.
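One caveat, as far as I understand the Pattern documentation: without the UNICODE_CHARACTER_CLASS flag, \p{Punct} covers only the ASCII punctuation characters, while \p{P} matches the Unicode punctuation category, and neither one touches digits. A small sketch of the difference, with the class and method names made up for illustration:

```java
public class PunctClasses {
    // Hypothetical helper: removes ASCII punctuation via the POSIX class \p{Punct}.
    static String stripAsciiPunct(String s) {
        return s.replaceAll("\\p{Punct}", "");
    }

    // Hypothetical helper: removes characters in the Unicode category P (punctuation).
    static String stripUnicodePunct(String s) {
        return s.replaceAll("\\p{P}", "");
    }

    public static void main(String[] args) {
        String[] samples = {
            "Testing, testing, 1, one.",  // plain ASCII punctuation plus a digit
            "cost: $5+tax",               // '$' and '+' are symbols, not category P
            "\u201Cquoted\u201D"          // curly quotes are outside ASCII
        };
        for (String sample : samples) {
            System.out.println(stripAsciiPunct(sample) + " | " + stripUnicodePunct(sample));
        }
    }
}
```

For the question's input the two behave the same, but the digits survive either way, so something like "\\p{Punct}|\\d" is still needed to drop them.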