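Each column represents a different variable in a large data set. I am trying to extract every number and place it into an array for each row.
Underscores represent spacing: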
2 ___ 2 ___ 2 _______ 3 ___ 1 ___ 19
1 ___ 3 ___ 2 _______ 3 ___ 3 ___ 19
1 ___ 3 ___ 4 _______ 3 ___ 1 ___ 19
6 ___ 3 ___ 6 _______ 5 _______ 13
5 ___ 2 ___ 5 _______ 5 _______ 13
5 ___ 4 ___ 4 ___ 7 ___ 4 _______ 13
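spaceForNew indicates how many characters remain before the next variable is found. This is different from the current variable.
I am using the following code: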
public static int[] remaining(String Line)throws IOException
{
int[] data = new int[7];
int pointer = 0;
int spaceForNew = 0;
for(int i = 0;i<=Line.length()-1;i++)
{
if(i<Line.length()-1)
{
if((i == spaceForNew)&&(pointer<6))
{
//two digit
if((Line.charAt(i)=='1')&&(Line.charAt(i+1)=='0'))
{
data[pointer] = 10;
spaceForNew+=3;
pointer++;
//one digit
}else if((Line.charAt(i)!= ' ')&&(Line.charAt(i+1)!='0')){
data[pointer] = Integer.parseInt(Character.toString(Line.charAt(i)));
spaceForNew+=2;
pointer++;
}else if((Line.charAt(i)==' ')&&(data[pointer]==0)){
data[pointer]=-1;
spaceForNew++;
pointer++;
}
}
}else {
if(pointer==6)
{
data[pointer]=Integer.parseInt(Character.toString(Line.charAt(i)));
}
}
}
return data;
}
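The code above is ugly and not very intuitive. It seems to handle much of the data, but the way it fails appears to be random. Any suggestions at all would be appreciated.
Answer 0 (score: 0)
UPD: try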
String line = "10 8 10 1 8";
String[] split = line.split(" ");
int[] array = new int[7];
for (int i = 0; i < split.length; i++) {
array[i] = split[i].trim().isEmpty() ? -1 : Integer.parseInt(split[i].trim());
}
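The spacing in the snippet above may not have survived formatting, so here is a minimal sketch of the same split-based idea. It assumes (this is not confirmed by the question) that values are separated by exactly three spaces and that a missing value is one character wide, so a single gap shows up as seven spaces. The class name ThreeSpaceSplit and the columns parameter are made up for illustration. Note the limitation: with three or more adjacent missing values the split produces one empty cell too many, so this only suits data like the sample rows.

```java
import java.util.Arrays;

class ThreeSpaceSplit {

    // Splits on every run of exactly three spaces; a cell that is empty
    // after trimming is treated as a missing value and left at -1.
    static int[] parse(String line, int columns) {
        int[] row = new int[columns];
        Arrays.fill(row, -1);
        String[] cells = line.split(" {3}");
        for (int i = 0; i < cells.length && i < columns; i++) {
            String cell = cells[i].trim();
            if (!cell.isEmpty()) {
                row[i] = Integer.parseInt(cell);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        // Underscores stand for spaces, as in the question.
        String line = "2___2___2_______3___1___19".replace('_', ' ');
        System.out.println(Arrays.toString(parse(line, 7)));
        // prints [2, 2, 2, -1, 3, 1, 19]
    }
}
```

Pre-filling the row with -1 means short lines are padded automatically and no index can run past the array.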
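Answer 1 (score: 0)
You can use a regular expression to parse the line:
(\d+| )(?:   )?
This basically says: give me all the digits or a single space, optionally followed by 3 spaces.
You will get a list of strings, each of which either parses to a number or is a single space. You can treat the single space as missing data; it still acts as a placeholder, so your columns stay aligned.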
Integer[] parsed = new Integer[7];
String thing = "2   2   2       3   1   19";
Pattern pattern = Pattern.compile("(\\d+| )(?:   )?");
Matcher m = pattern.matcher(thing);
int index = 0;
while (m.find()) {
if (!" ".equals(m.group(1)))
parsed[index] = Integer.parseInt(m.group(1));
else
parsed[index] = -1; //or what ever your missing data value should be.
index++;
}
Arrays.asList(parsed).forEach(System.out::println);
Edit: fixed. group(0) is the whole match, followed by any capture groups. So group(1) gets the first capture group, which is just the digits or the single space.
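For reuse across many lines, the same pattern can be wrapped in a method. This is only a sketch: the class name RegexRowParser and the columns parameter are hypothetical, and it assumes the three-space separator described above. The loop stops once the row is full, so an unexpectedly long line cannot overflow the array.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexRowParser {

    // One cell: a run of digits or a single space (missing value),
    // optionally followed by the three-space separator.
    private static final Pattern CELL = Pattern.compile("(\\d+| )(?: {3})?");

    static int[] parse(String line, int columns) {
        int[] row = new int[columns];
        Arrays.fill(row, -1);               // -1 marks missing data
        Matcher m = CELL.matcher(line);
        int index = 0;
        while (index < columns && m.find()) {
            if (!" ".equals(m.group(1))) {
                row[index] = Integer.parseInt(m.group(1));
            }
            index++;
        }
        return row;
    }

    public static void main(String[] args) {
        // Underscores stand for spaces, as in the question.
        String line = "6___3___6_______5_______13".replace('_', ' ');
        System.out.println(Arrays.toString(parse(line, 7)));
        // prints [6, 3, 6, -1, 5, -1, 13]
    }
}
```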
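Answer 2 (score: 0)
You need to know the exact pattern of each line. I assume that every "column" has a fixed width, otherwise the numbers would not line up like this.
For example, assuming each column is three characters wide (digits and/or spaces) and the column separator is one space wide, your pattern could look like this: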
[ \d]{3} |[ \d]{1,3}
Now, using Pattern::compile, Pattern::matcher and Matcher::find, you can search for all the numbers present in the current line. Assuming lines is a List<String> in which each element is one line:
// Precompile pattern. This matches either a cell followed by a space, or,
// if we are at the end of the line, a variable number of spaces and/or
// digits.
Pattern pattern = Pattern.compile("[ \\d]{3} |[ \\d]{1,3}");
List<List<Integer>> matrix = lines.stream()
.map(pattern::matcher)
.map(matcher -> {
List<Integer> ints = new ArrayList<>();
while (matcher.find()) {
String element = matcher.group().trim();
ints.add(!element.isEmpty() ? Integer.valueOf(element) : -1);
}
return ints;
})
.collect(Collectors.toList());
Using the MatcherStream provided by dimo414:
Pattern pattern = Pattern.compile("[ \\d]{3} |[ \\d]{1,3}");
List<List<Integer>> matrix = lines.stream()
.map(line -> MatcherStream.find(pattern, line)
.map(String::trim)
.map(element -> !element.isEmpty() ? Integer.valueOf(element) : -1)
.collect(Collectors.toList()))
.collect(Collectors.toList());
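If you are on Java 9 or later, the standard library's Matcher.results() returns a stream of matches, so the third-party helper is not strictly needed. A minimal sketch under the same assumption as above (three-character columns, one-space separator); the class name ResultsStreamParser is made up for illustration:

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ResultsStreamParser {

    // Same pattern as above: a three-character cell followed by a space,
    // or a final cell of one to three characters at the end of the line.
    private static final Pattern CELL = Pattern.compile("[ \\d]{3} |[ \\d]{1,3}");

    static List<Integer> parseLine(String line) {
        return CELL.matcher(line).results()     // Stream<MatchResult>, Java 9+
                .map(MatchResult::group)
                .map(String::trim)
                .map(cell -> Integer.valueOf(cell.isEmpty() ? -1 : Integer.parseInt(cell)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Underscores stand for spaces; the third column is blank.
        String line = "__1___2_______4".replace('_', ' ');
        System.out.println(parseLine(line));
        // prints [1, 2, -1, 4]
    }
}
```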
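Answer 3 (score: 0)
I suppose that, in theory, values could be missing anywhere in the space-delimited data of any given file line, even several in a row. This would include the start, the middle and the end of a line.
Examples might be (as in your example, underscores represent spaces):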
2___2___2_______3___1___19
1___3___2_______3___3___19
____3___4_______3___1___19
____5___7___4___3___8____
6___3___6_______5_______13
5___2___5_______________13
5___4___4___7___4_______16
10___6___10___3___8_______1
2___10___0___8___4___0___1
2___10___0___8___4________
4___12___0___9___6
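The nice thing here is that the data in the file appears to be formatted in a fixed-space pattern. Knowing this, it is possible to replace each missing value with a specific integer that is far away from the values actually contained in the file's data lines. I think "-1" (which is what you are using) works well for this, provided that -1 can never occur as a genuine value in the file, or at least never matters in further processing of the data because its possible presence has already been accounted for. That, of course, is something you will have to decide.
Once the missing values in any given data line have been replaced with -1, the line can be split on whitespace, the array elements converted to integers, and the results placed into an integer array.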
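For a single line, those three steps can be sketched as follows. This is a condensed illustration, not the full method: the class name MissingValueFiller is hypothetical, and it assumes values are separated by three spaces, so that every missing one-character value leaves a run of four extra spaces.

```java
import java.util.Arrays;

class MissingValueFiller {

    static int[] parseLine(String line, int columns) {
        int[] row = new int[columns];
        Arrays.fill(row, -1);                      // pad short lines with -1
        if (line.trim().isEmpty()) {
            return row;                            // blank line: nothing to parse
        }
        String filled = line;
        // Step 1: a line starting with a space is missing its first value.
        if (filled.startsWith(" ")) {
            filled = "-1" + filled.substring(1);
        }
        // Step 2: every remaining run of four spaces hides one missing value.
        filled = filled.replaceAll(" {4}", " -1");
        // Step 3: split on whitespace and convert each piece to an int.
        String[] parts = filled.trim().split(" +");
        for (int i = 0; i < parts.length && i < columns; i++) {
            row[i] = Integer.parseInt(parts[i]);
        }
        return row;
    }

    public static void main(String[] args) {
        // Underscores stand for spaces, as in the sample data.
        String line = "5___2___5_______________13".replace('_', ' ');
        System.out.println(Arrays.toString(parseLine(line, 7)));
        // prints [5, 2, 5, -1, -1, -1, 13]
    }
}
```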
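If you want to place each data row (file line) into an integer array, then allow me to suggest a two-dimensional integer (int[][]) array. I think you will find it easier to work with, since the entire data file can be held in that one array. Then let a Java method create the array for you, for example:
Read the entire file, line by line, into a String[] array: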
List<String> list = new ArrayList<>();
try (Scanner reader = new Scanner(new File("FileExample.txt"))) {
while (reader.hasNextLine()) {
String line = reader.nextLine();
if (line.equals("")) { continue; }
list.add(line);
}
}
catch (FileNotFoundException ex) {
Logger.getLogger("FILE NOT FOUND!").log(Level.SEVERE, null, ex);
}
// Convert list to String Array
String[] stringData = list.toArray(new String[0]);
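The FileExample.txt file contains exactly the same data as shown above, except that the underscores are spaces in the file. Once the code above has run, the String[] array variable named stringData will hold all the data lines of the file. We now pass this array to the next method, named stringDataToIntArray() (for lack of a better name), to create the 2D integer array (data[][]):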
/**
* Creates a 2D Integer (int[][]) Array from data lines contained within the
* supplied String Array.<br><br>
*
* @param stringData (1D String[] Array) The String array where each element
* contains lines of fixed space delimited numerical values, for example each
* line would look something like:<pre>
*
* "2 1 3 4 5 6 7" </pre>
*
* @param replaceMissingWith (String) One or more numerical values could be
* missing from any elemental line within the supplied stringData array. What
* you supply as an argument to this parameter will be used in place of that
* missing value. <br>
*
* @param desiredNumberOfColumns (Integer (int)) The number of columns desired
* in each row of the returned 2D Integer Array. Make sure desiredNumberOfColumns
* contains a value greater than 0 and less than (Integer.MAX_VALUE - 4). You
* will most likely run out of JVM memory if you go that big! Be reasonable,
* although almost any unsigned integer value can be supplied (and you're
* encouraged to test this) the largest number of data columns contained within
* the data file should suffice.<br>
*
* @return (2D Integer (int[][]) Array) A two dimensional Integer Array derived
* from the supplied String Array of fixed space delimited line data.
*/
public int[][] stringDataToIntArray(final String[] stringData,
final String replaceMissingWith, final int desiredNumberOfColumns) {
int requiredArrayLength = desiredNumberOfColumns;
// Make sure the replaceMissingWith parameter actually contains something.
if (replaceMissingWith == null || replaceMissingWith.trim().equals("")) {
System.err.println("stringDataToIntArray() Method Error! The "
+ "replaceMissingWith parameter requires a valid argument!");
return null;
}
/* Make sure desiredNumberOfColumns contains a value greater than 0 and
less than (Integer.MAX_VALUE - 4). */
if (desiredNumberOfColumns < 1 || desiredNumberOfColumns > (Integer.MAX_VALUE - 4)) {
System.err.println("stringDataToIntArray() Method Error! The "
+ "desiredNumberOfColumns parameter requires any value "
+ "from 1 to " + (Integer.MAX_VALUE - 4) + "!");
return null;
}
// The 2D Array to return.
int[][] data = new int[stringData.length][requiredArrayLength];
/* Iterate through each elemental data line contained within
the supplied String Array. Process each line and replace
any missing values... */
for (int i = 0; i < stringData.length; i++) {
String line = stringData[i];
// Replace the first numerical value with replaceMissingWith if missing:
if (line.startsWith(" ")) {
line = replaceMissingWith + line.substring(1);
}
// Replace remaining missing numerical values if missing:
line = line.replaceAll("\\s{4}", " " + replaceMissingWith);
// Split the string of numerical values based on whitespace:
String[] lineParts = line.split("\\s+");
/* Ensure we have the correct Required Array Length (ie: 7):
If we don't then at this point we were missing values at
the end of the input string (line). Append replaceMissingWith
to the end of line until a split satisfies the requiredArrayLength: */
while (lineParts.length < requiredArrayLength) {
line+= " " + replaceMissingWith;
lineParts = line.split("\\s+");
}
/* Fill the data[][] integer array. Convert each string numerical
value to an Integer (int) value for current line: */
for (int j = 0; j < requiredArrayLength; j++) {
data[i][j] = Integer.parseInt(lineParts[j]);
}
}
return data;
}
To use this method (once you have read the data file and placed its contents into a String array):
int[][] data = stringDataToIntArray(stringData, "-1", 7);
// Display the 2D data Array in Console...
for (int i = 0; i < data.length; i++) {
System.out.println(Arrays.toString(data[i]));
}
If you have processed the sample file data provided above, the console output window should contain:
[2, 2, 2, -1, 3, 1, 19]
[1, 3, 2, -1, 3, 3, 19]
[-1, 3, 4, -1, 3, 1, 19]
[-1, 5, 7, 4, 3, 8, -1]
[6, 3, 6, -1, 5, -1, 13]
[5, 2, 5, -1, -1, -1, 13]
[5, 4, 4, 7, 4, -1, 16]
[10, 6, 10, 3, 8, -1, 1]
[2, 10, 0, 8, 4, 0, 1]
[2, 10, 0, 8, 4, -1, -1]
[4, 12, 0, 9, 6, -1, -1]
If you only want the first three columns of each file line, your call would be:
int[][] data = stringDataToIntArray(stringData, "-1", 3);
and the output would be:
[2, 2, 2]
[1, 3, 2]
[-1, 3, 4]
[-1, 5, 7]
[6, 3, 6]
[5, 2, 5]
[5, 4, 4]
[10, 6, 10]
[2, 10, 0]
[2, 10, 0]
[4, 12, 0]
And if you want 12 data columns for each file line, your call would be:
int[][] data = stringDataToIntArray(stringData, "-1", 12);
and the output would be:
[2, 2, 2, -1, 3, 1, 19, -1, -1, -1, -1, -1]
[1, 3, 2, -1, 3, 3, 19, -1, -1, -1, -1, -1]
[-1, 3, 4, -1, 3, 1, 19, -1, -1, -1, -1, -1]
[-1, 5, 7, 4, 3, 8, -1, -1, -1, -1, -1, -1]
[6, 3, 6, -1, 5, -1, 13, -1, -1, -1, -1, -1]
[5, 2, 5, -1, -1, -1, 13, -1, -1, -1, -1, -1]
[5, 4, 4, 7, 4, -1, 16, -1, -1, -1, -1, -1]
[10, 6, 10, 3, 8, -1, 1, -1, -1, -1, -1, -1]
[2, 10, 0, 8, 4, 0, 1, -1, -1, -1, -1, -1]
[2, 10, 0, 8, 4, -1, -1, -1, -1, -1, -1, -1]
[4, 12, 0, 9, 6, -1, -1, -1, -1, -1, -1, -1]
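The -1 values appended at the end of each array are there because the method detected that those columns do not exist in the data line; since 12 is the number of columns you asked for, the missing data had to be appended.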