Question

I currently have a text file with the following contents:

1 Commercial & Enterprise  5   SLICE    59.99  IP MICRO 
2 Commercial & Enterprise  5   SLICE    59.99  MULTI-USE SWITCH
.
.
.
.
18 Government & Military   6   TCP      15.00  TCP

I am trying to split each line so that I can end up with the following:

Product number:  18
Category:        Government & Military
Product name:  TCP
Units in stock: 6
Price: $15.00
Total value: $90.00
Fee: $4.50
Total value: $94.50

I currently have the following code:

while ((line = lineReader.readLine()) != null) {

            StringTokenizer tokens = new StringTokenizer(line, "\t");

            p = new ActionProduct();
            add(p);
            String category = p.getCategory();
            String name = p.getName();
            category = tokens.nextToken();
            int item = p.getItem();
            double price = p.getPrice();
            int units = p.getUnits();

            while (tokens.hasMoreTokens()) {
            item = Integer.parseInt(tokens.nextToken());
            price = Double.parseDouble(tokens.nextToken());
            units = Integer.parseInt(tokens.nextToken());
            }

            System.out.println("Category: " + category);
            System.out.println("Product number:  " + item);
            System.out.println("Product name:  " + name);
            System.out.println("Units in stock: "+ units);
            System.out.println("Price: $" + String.format("%.2f", price)); 
            System.out.println("Total value: $" + String.format("%.2f",p.value()));
            System.out.println("Fee: $" + String.format("%.2f", p.fee()));

            System.out.println("Total value: $" + String.format("%.2f", value()));
        }

This is the output I get:

Category: 1 Commercial & Enterprise  5   SLICE    59.99  IP MICRO             
Product number:  0
Product name:  null
Units in stock: 0
Price: $0.00
Total value: $0.00
Fee: $0.00
Total value: $0.00
Category: 2 Commercial & Enterprise  5   SLICE    59.99  MULTI-USE SWITCH     
Product number:  0
Product name:  null
Units in stock: 0
Price: $0.00
Total value: $0.00
Fee: $0.00
Total value: $0.00

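So my question is... what do I have to do to split the line so that I can print each value from my text file individually? Thanks in advance, I would really appreciate some direction!

Here is my text file: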

1 Commercial & Enterprise  5   SLICE    59.99  IP MICRO             
2 Commercial & Enterprise  5   SLICE    59.99  MULTI-USE SWITCH     
3 Commercial & Enterprise  4   SLICE    59.99  2100                 
4 Commercial & Enterprise  6   SLICE    59.99  IP                   
5 Commercial & Enterprise  4   HDX      45.00  HYBRID CARRIER       
6 Commercial & Enterprise  10  TRANSip  45.00  IP Technology Suite  
7 Commercial & Enterprise  5   GUI      30.00  LINK COMMAND SYS     
8 Commercial & Enterprise  5   GUI      30.00  MAUI                 
9 Commercial & Enterprise  6   RCP      20.00  RCP                  
10 Government & Military   5   SLICE    60.00  IP MICRO             
11 Government & Military   5   SLICE    60.00  MULTI-USE SWITCH     
12 Government & Military   4   SLICE    60.00  2100                 
13 Government & Military   6   SLICE    55.00  IP                   
14 Government & Military   4   HDX.C    35.00  HYBRID CARRIER       
15 Government & Military   10  TRANSip  30.00  IP Technology Suite  
16 Government & Military   5   GUI      20.00  LINK COMMAND SYS     
17 Government & Military   5   GUI      20.00  MAUI                 
18 Government & Military   6   TCP      15.00  TCP

Answer 1

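Since you want to split text according to an arbitrary pattern, this is exactly what regular expressions are for: use a regex parser to tokenize the input, then process the tokens as needed.

Put simply, you read the file, pass each line to a regex tokenizer, and then process the tokens (i.e. the strings).

An example regex pattern for your data would be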

[0-9]+[\s]+[a-zA-Z\s\Q&\E]+[\s]+[0-9]+[\s]+[a-zA-Z]+[\s]+[0-9\Q.\E]+[\s]+[a-zA-Z0-9]+
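
As a rough sketch of the tokenizing idea, here is one way it could look in Java with java.util.regex. Note that this is not the pattern shown above: it is a looser variant with capture groups, because the sample data contains values with spaces, hyphens and dots (MULTI-USE SWITCH, HDX.C). The class name RegexLineParser and the meaning given to each column are assumptions read off the desired output in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexLineParser {
    // Groups (column meanings are an assumption based on the question):
    // 1 = product number, 2 = category, 3 = units, 4 = item code,
    // 5 = price, 6 = product name.
    // The category ends at the first run of two or more whitespace characters.
    private static final Pattern LINE = Pattern.compile(
            "^(\\d+)\\s+(.+?)\\s{2,}(\\d+)\\s+(\\S+)\\s+(\\d+\\.\\d+)\\s+(.+?)\\s*$");

    /** Returns the six fields of one line, or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[6];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1); // group 0 is the whole match
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] f = parse("18 Government & Military   6   TCP      15.00  TCP");
        int units = Integer.parseInt(f[2]);
        double price = Double.parseDouble(f[4]);
        System.out.println("Product number:  " + f[0]);
        System.out.println("Category:        " + f[1]);
        System.out.println("Product name:  " + f[5]);
        System.out.println("Units in stock: " + units);
        System.out.println("Price: $" + String.format("%.2f", price));
        System.out.println("Total value: $" + String.format("%.2f", units * price));
    }
}
```

In the read loop from the question, parse(line) would replace the StringTokenizer; a null result means the line did not have the expected shape.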

You can build patterns quickly and effectively with a tool such as

http://gskinner.com/RegExr/

Further reading:

http://en.wikipedia.org/wiki/Regular_expression

http://docs.oracle.com/javase/tutorial/essential/regex/

http://docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/Pattern.html

Answer 2

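Look at the data carefully. Will you be getting more data, or is this the only file?

If you will be getting more data, you need some kind of specification, so that you can be sure your parser will keep working.

If the data sits at fixed positions, you can use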

String part = line.substring(beginIndex, endIndex)

This data file is almost fixed-position, except that the product number grows in width.
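
To make the fixed-position idea concrete, here is a minimal sketch. The offsets 27, 31, 40 and 47 were counted from the sample lines as posted and are an assumption: if the real file uses tabs or different spacing, they need adjusting. The class name FixedWidthLineParser and the column meanings are likewise only illustrative. The variable-width product number is cut at the first space rather than at a fixed index.

```java
class FixedWidthLineParser {
    // Column start offsets, counted from the sample lines in the question
    // (assumption: the file uses spaces and exactly this layout).
    static final int UNITS_START = 27;
    static final int CODE_START = 31;
    static final int PRICE_START = 40;
    static final int NAME_START = 47;

    /** Returns {number, category, units, code, price, name}, each trimmed. */
    static String[] parse(String line) {
        // The product number is one or two digits wide, so find where it ends.
        int firstSpace = line.indexOf(' ');
        return new String[] {
            line.substring(0, firstSpace),
            line.substring(firstSpace + 1, UNITS_START).trim(),
            line.substring(UNITS_START, CODE_START).trim(),
            line.substring(CODE_START, PRICE_START).trim(),
            line.substring(PRICE_START, NAME_START).trim(),
            line.substring(NAME_START).trim()
        };
    }

    public static void main(String[] args) {
        String[] f = parse("6 Commercial & Enterprise  10  TRANSip  45.00  IP Technology Suite");
        System.out.println("Product number:  " + f[0]);
        System.out.println("Category:        " + f[1]);
        System.out.println("Product name:  " + f[5]);
        System.out.println("Units in stock: " + Integer.parseInt(f[2]));
        System.out.println("Price: $" + String.format("%.2f", Double.parseDouble(f[4])));
    }
}
```

A line shorter than the last offset would make substring throw, so this only suits files whose layout is guaranteed.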

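Alternatively, you can try a regex or line.split(delimiter).

Don't use regular expressions until you really understand them.

If this is the only file, I think I would start with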

String[] parts = line.split("  "); // two spaces

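Then parse the values out of the string array you get back.

The first element, parts[0], contains both the product number and the category, but you can split that as well.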
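
Here is a small sketch of that split-based approach. One deviation, stated plainly: the sample lines have gaps of two to six spaces, and splitting on exactly two spaces would leave empty strings and leading blanks in the array, so this version splits on runs of two or more whitespace characters instead. The class name SplitLineParser and the column meanings are assumptions for illustration.

```java
class SplitLineParser {
    /** Returns {number, category, units, code, price, name}. */
    static String[] parse(String line) {
        // Split on runs of two or more whitespace characters, so the single
        // spaces inside "Commercial & Enterprise" or "IP MICRO" are kept.
        String[] parts = line.trim().split("\\s{2,}");
        // parts[0] holds the product number and the category together,
        // e.g. "18 Government & Military"; split it once at the first space.
        String[] head = parts[0].split(" ", 2);
        return new String[] { head[0], head[1], parts[1], parts[2], parts[3], parts[4] };
    }

    public static void main(String[] args) {
        String[] f = parse("18 Government & Military   6   TCP      15.00  TCP");
        int units = Integer.parseInt(f[2]);
        double price = Double.parseDouble(f[4]);
        System.out.println("Product number:  " + f[0]);
        System.out.println("Category:        " + f[1]);
        System.out.println("Product name:  " + f[5]);
        System.out.println("Units in stock: " + units);
        System.out.println("Price: $" + String.format("%.2f", price));
        System.out.println("Total value: $" + String.format("%.2f", units * price));
    }
}
```

This relies on every column gap being at least two spaces wide and on no value containing two consecutive spaces, which holds for the file as posted.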
