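Currently, after calling ListFiles(), I have a list of file names read from a directory, and I need to use them as input. These are the XML files I get.
The code I use to get the list of file names is: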
String dirPath = "D:\\Input_Split_xml";
File dir = new File(dirPath);
String[] files = dir.list();
for (String aFile : files)
{
System.out.println("file names are "+aFile);
}
Currently the loop prints every file name, one at a time, through "aFile":
file names are 51090323-005_low_level.xml
file names are 90406990_low_level.xml
file names are 90406991_low_level.xml
file names are TC_CADBOM_51090323-005_low_level_BOM.xml
file names are TC_CADBOM_90406990_low_level_BOM.xml
file names are TC_CADBOM_90406991_low_level_BOM.xml
file names are TC_CADDESIGN_51090323-005_low_level.xml
file names are TC_CADDESIGN_90406990_low_level.xml
file names are TC_CADDESIGN_90406991_low_level.xml
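Now I need to sort these file names as follows, so that they can be used as input for parsing the XML files.
1) For example: based on the number "51090323-005", I need to group all the file names that belong to that number, then feed them in one after another and use each one to get the node count of that XML. In other words, these are the 3 XML files under that number, so I will collect all of them and process them one by one.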
a)51090323-005_low_level.xml
b)TC_CADBOM_51090323-005_low_level_BOM.xml
c)TC_CADDESIGN_51090323-005_low_level.xml
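Experts, I need your help to solve this.
Answer 0 (score: 1)
This function returns a map in which each entry corresponds to one group of related files. A regular expression makes it easy to validate the file name pattern and extract the number part (see group(1)).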
// key=number, value=array of matching files, sorted
public static Map<String, File[]> process(String fileLocation) {
Map<String, File[]> fileMap = new HashMap<>();
// Escape the dot so that it matches a literal "." and not any character
Pattern startFileNamePattern = Pattern.compile("([0-9-]+)_low_level\\.xml");
File dir = new File(fileLocation);
File[] startFiles = dir.listFiles((File file, String name) -> startFileNamePattern.matcher(name).matches());
for (File f : startFiles) {
Matcher m = startFileNamePattern.matcher(f.getName());
if (m.matches()) {
String number = m.group(1);
File[] allFiles = dir.listFiles((File arg0, String name) -> name.contains(number));
Arrays.sort(allFiles);
fileMap.put(number, allFiles);
}
}
return fileMap;
}
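A possible variation on the function above, offered as a sketch and not as the answerer's method: name.contains(number) can over-match when one number is a substring of another (a hypothetical 9040699 would also pick up the 90406990 files), and the directory is listed once per start file. The version below groups plain file names in a single pass. The class name FileNameGrouper, the method groupByNumber, and the regular expression are assumptions derived only from the nine sample names in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the naming scheme is inferred from the sample file names.
final class FileNameGrouper {

    // Assumed shape: optional TC_CADBOM_ or TC_CADDESIGN_ prefix, the number,
    // "_low_level", an optional "_BOM" suffix, then ".xml".
    private static final Pattern NAME = Pattern.compile(
            "^(?:TC_CAD(?:BOM|DESIGN)_)?([0-9-]+)_low_level(?:_BOM)?\\.xml$");

    // key = number, value = sorted names of the files that carry exactly that number
    static Map<String, List<String>> groupByNumber(List<String> fileNames) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String name : fileNames) {
            Matcher m = NAME.matcher(name);
            if (m.matches()) {
                groups.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(name);
            }
            // Names that do not fit the assumed scheme are skipped.
        }
        groups.values().forEach(Collections::sort);
        return groups;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "TC_CADDESIGN_90406990_low_level.xml",
                "90406990_low_level.xml",
                "TC_CADBOM_90406990_low_level_BOM.xml",
                "51090323-005_low_level.xml");
        // Print each group followed by its members
        groupByNumber(names).forEach((number, group) -> {
            System.out.println("Group " + number);
            group.forEach(n -> System.out.println("  " + n));
        });
    }
}
```

With a real directory, the input could be built from dir.list() after checking that the result is not null. The TreeMap keeps the groups in a stable, sorted order.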
Answer 1 (score: 0)
Convert String[] files to a List and remove the entries that do not contain the number.
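// Copy into an ArrayList: Arrays.asList alone returns a fixed-size list, so removeIf would throw UnsupportedOperationException
List<String> fileNames = new ArrayList<>(Arrays.asList(files));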
public static List<String> groupFiles(String number, List<String> fileNames){
fileNames.removeIf(n -> (!n.contains(number)));
return fileNames;
}
Output:
[51090323-005_low_level.xml, TC_CADBOM_51090323-005_low_level_BOM.xml, TC_CADDESIGN_51090323-005_low_level.xml]
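One thing to keep in mind with groupFiles: removeIf edits the list in place, so on a mutable list a second call for a different number would find the other files already gone. A minimal non-mutating sketch using streams is shown below; the class name NonMutatingFilter and the method filesFor are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: filters without touching the caller's list.
final class NonMutatingFilter {

    // Returns a new list with the names that contain the given number.
    static List<String> filesFor(String number, List<String> fileNames) {
        return fileNames.stream()
                .filter(name -> name.contains(number))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "51090323-005_low_level.xml",
                "90406990_low_level.xml",
                "TC_CADBOM_51090323-005_low_level_BOM.xml",
                "TC_CADBOM_90406990_low_level_BOM.xml");
        // The same input list can be reused for every number.
        System.out.println(filesFor("51090323-005", names));
        System.out.println(filesFor("90406990", names));
    }
}
```

Because the input is never modified, this also works directly on the fixed-size list returned by Arrays.asList.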
Also, if you need to obtain the numbers programmatically, you can use something like this:
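public static List<String> getNumbers(List<String> fileNames){
List<String> numbers = new ArrayList<>();
// Drop names that are empty or do not start with a digit (this modifies fileNames in place)
fileNames.removeIf(n -> n.isEmpty() || !Character.isDigit(n.charAt(0)));
fileNames.forEach(name -> {
// Take the first 8 characters of the name
numbers.add(name.substring(0, 8));
});
return numbers;
}
Output:
[51090323, 90406990, 90406991]
This removes from the list the file names that do not start with a digit, then takes an 8-character substring from each remaining name. Note that a fixed length of 8 drops the "-005" suffix of 51090323-005.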
Answer 2 (score: 0)
Adding to Cray's answer: you can get the number with
String prefix = aFile.split("_")[0];
if (Character.isDigit(prefix.charAt(0))) {
// prefix contains a number that we can filter.
}
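To show where the snippet above could lead, here is a small sketch that collects the distinct numeric prefixes from the whole array. It assumes, as the sample names suggest, that the base files start with the number followed by an underscore. The class name PrefixNumbers and the method extractNumbers are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper built around the split("_")[0] idea.
final class PrefixNumbers {

    // Returns the distinct prefixes that start with a digit, in sorted order.
    static Set<String> extractNumbers(String[] files) {
        Set<String> numbers = new TreeSet<>();
        for (String aFile : files) {
            String prefix = aFile.split("_")[0];
            // Guard against an empty prefix (for example a name starting with "_")
            if (!prefix.isEmpty() && Character.isDigit(prefix.charAt(0))) {
                numbers.add(prefix);
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        String[] files = {
                "51090323-005_low_level.xml",
                "90406990_low_level.xml",
                "TC_CADBOM_51090323-005_low_level_BOM.xml",
                "TC_CADDESIGN_90406990_low_level.xml"
        };
        System.out.println(extractNumbers(files));
    }
}
```

Unlike a fixed-length substring, the split keeps the whole prefix, so 51090323-005 is returned with its "-005" part. The TC_ files yield the prefix "TC" and are skipped by the digit check.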
Answer 3 (score: 0)
for (String aFile : files)
{
if(aFile.contains("51090323-005")) {
System.out.println("file names are " + aFile);
}
}
Output:
file names are 51090323-005_low_level.xml
file names are TC_CADBOM_51090323-005_low_level_BOM.xml
file names are TC_CADDESIGN_51090323-005_low_level.xml
// Extract the numbers
// This HashSet will contain all the numbers. HashSet -> To avoid duplicate numbers
Set<String> baseFiles = new HashSet<>();
System.out.println("Files numbers:");
// Iterate all files to extract the numbers
// Assumption: the base file has the number at the beginning, so we use a pattern that tries to match numbers at the start of the name
for (String aFile : files)
{
// Create a pattern that matches strings beginning with digits and/or "-"
// The matcher splits the string into groups based on the given pattern
Matcher matcher = Pattern.compile("^([0-9-]+)(.*)").matcher(aFile);
// Verify if the string has the wanted pattern
if(matcher.matches()) {
// Group 0 is the original string
// Group 1 is the number
// Group 2 the rest of the filename
String number = matcher.group(1);
System.out.println(number);
// Add the number to the HashSet
baseFiles.add(number);
}
}
// Iterate all the numbers to create the groups
for (String baseFile : baseFiles)
{
System.out.println("Group " + baseFile);
// Search the filenames that contain the given number
for (String aFile : files)
{
// Verify if the current filename has the given number
if(aFile.contains(baseFile)) {
System.out.println("file names are " + aFile);
}
}
}
Output:
Files numbers:
51090323-005
90406990
90406991
Group 90406991
file names are 90406991_low_level.xml
file names are TC_CADBOM_90406991_low_level_BOM.xml
file names are TC_CADDESIGN_90406991_low_level.xml
Group 51090323-005
file names are 51090323-005_low_level.xml
file names are TC_CADBOM_51090323-005_low_level_BOM.xml
file names are TC_CADDESIGN_51090323-005_low_level.xml
Group 90406990
file names are 90406990_low_level.xml
file names are TC_CADBOM_90406990_low_level_BOM.xml
file names are TC_CADDESIGN_90406990_low_level.xml
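The question also asks for the node count of each XML once the files are grouped. None of the answers covers that step, so the following is only a sketch under one assumption: that "node count" means the number of element nodes in the document. It uses the standard JAXP DOM parser; the class name XmlElementCounter, the method countElements, and the sample XML are invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: counts the element nodes of one XML document.
final class XmlElementCounter {

    // Parses the stream and returns how many elements it contains (root included).
    static int countElements(InputStream xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(xml);
            // "*" matches every element in the document
            return doc.getElementsByTagName("*").getLength();
        } catch (Exception e) {
            // Wrap parser and I/O failures so that callers need no checked exceptions
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample content, not taken from the real files
        String sample = "<bom><item id=\"1\"/><item id=\"2\"><part/></item></bom>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println("elements: " + countElements(in)); // prints "elements: 4"
    }
}
```

For the grouped files, each name could be opened with new FileInputStream(new File(dirPath, aFile)) inside a try-with-resources block and passed to countElements. If text, attribute, or comment nodes should count as well, the document tree would have to be walked instead.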