How do I select the values I need from a .txt file?

Time: 2016-10-30 01:43:01

Tags: java file knn

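I have a .txt file containing the features (estrato, size, age, bedrooms, bathrooms, floor, and label) of 205 apartments in Medellín, Colombia. I need to load the file and store each feature in its own variable so that I can pass them through a conversion method that normalizes the data, for later use in the distance formula for KNN classification. This is my current code: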

import java.util.*;
import java.io.*;
/**
 * collection of properties
 * using arraylist
 */
public class Apartments
{
    ArrayList<Property> apartmentList = new ArrayList<>();
    String estrato = " ";
    String size = " ";
    String age = " ";
    String beds = " ";
    String baths = " ";
    String floor = " ";
    String label = " ";

    public void convertFile () throws FileNotFoundException {
        Scanner input = new Scanner(new File("ApartmentsFullList.txt"));
        while(input.hasNextLine()){
            String line = input.nextLine();
            Scanner lineScan = new Scanner(line);
            for(int i = 0; i < line.length(); i++){
                char start = line.charAt(i);
                if(start == '{'){
                    estrato = lineScan.next();
                    size = lineScan.next();
                    age = lineScan.next();
                    beds = lineScan.next();
                    baths = lineScan.next();
                    floor = lineScan.next();
                }
            }
            System.out.println(estrato + " " + size + " " + age + " " + beds + " " +
                baths + " " + floor + " " + label);
        }
    }
}

And the following .txt file:

01-Los Balsos 1380m {6, 430,    4,  3,  4,  22, Luxury},
02-El Tesoro 1100m  {6, 263,    2,  3,  4,  21, Luxury},
03-Las Lomas 1030m  {6, 270,    2,  3,  4,  16, Luxury},
04-Las Lomas 1020m  {6, 270,    2,  3,  6,  18, Luxury},
05-Las Lomas 2600m  {6, 780,    3,  7,  9,  33, Luxury},
06-Los Balsos 1100m {6, 380,    3,  4,  5,  14, Luxury},
07-Laureles 1350m   {6, 494,    2,  3,  5,  17, Luxury},
08-Los Balsos 1100m {6, 445,    3,  4,  5,  10, Luxury},
09-Alejandria 1584m {6, 288,    1,  3,  4,  25, Luxury},
10-La Florida 1200m {6, 425,    3,  4,  6,  12, Luxury},
11-San Lucas 1200m  {6, 200,    1,  3,  4,  19, Luxury},
12-Los Balsos 1250m {6, 347,    3,  4,  6,  23, Luxury},
13-Las Lomas    1300m   {6, 350,    2,  6,  6,  9,  Luxury},
14-El Tesoro 1350m  {6, 240,    1,  3,  5,  11, Luxury},
15-San Lucas 2800m  {6, 760,    1,  5,  8,  31, Luxury},
16-El Tesoro 1300m  {6, 315,    2,  4,  5,  8,  Luxury},
17-Las Palmas 1820m {6, 318,    1,  4,  7,  24, Luxury},
18-Las Lomas 1500m  {6, 429,    1,  3,  5,  13, Luxury},
19-El Tesoro 2000m  {6, 500,    3,  5,  7,  11, Luxury},
20-San Lucas 1800m  {6, 603,    3,  5,  5,  17, Luxury},
21-El Tesoro 1300m  {6, 315,    1,  3,  5,  5,  Luxury},
22-El Tesoro 1300m  {6, 294,    1,  4,  5,  16, Luxury},
23-San Lucas 2400m  {6, 300,    1,  5,  5,  21, Luxury},
24-El Poblado 1180m {6, 382,    2,  3,  4,  12, Luxury},
25-Castropol 1190m  {6, 302,    1,  3,  4,  26, Luxury},
26-El Poblado 2500m {6, 500,    1,  5,  8,  30, Luxury},
27-La Calera 1600m  {6, 520,    3,  7,  7,  21, Luxury},
28-Castropol 1190m  {6, 302,    1,  4,  4,  32, Luxury},
29-Las Lomas 1530m  {6, 338,    2,  3,  5,  24, Luxury},
30-Superior 5100m   {6, 578,    1,  4,  6,  27, Luxury},    
31-Alejandria 930m  {6, 186,    1,  4,  4,  19, Upper},
32-Lleras 940m  {6, 175,    2,  2,  3,  11, Upper},
33-El Poblado 900m  {6, 181,    1,  3,  3,  6,  Upper},
34-San Lucas 800m   {6, 175,    2,  3,  4,  7,  Upper},
35-El Poblado 890m  {5, 175,    3,  2,  3,  5,  Upper},
36-San Lucas 900m   {6, 148,    1,  3,  4,  19, Upper},
37-El Tesoro 800m   {6, 180,    2,  3,  5,  21, Upper},
38-Los Balsos 930m  {6, 186,    2,  3,  4,  12, Upper},
39-Los Balsos 990m  {5, 300,    4,  3,  4,  4,  Upper},
40-El Poblado 960m  {6, 153,    1,  2,  3,  12, Upper},
41-El Poblado 900m  {6, 186,    2,  3,  4,  6,  Upper},
42-El Poblado 970m  {6, 225,    4,  3,  5,  2,  Upper},
43-San Lucas 920m   {5, 168,    1,  3,  4,  4,  Upper},
44-Laureles 920m    {5, 230,    3,  3,  5,  22, Upper},
45-Superior 880m    {5, 227,    2,  3,  4,  12, Upper},
46-El Poblado 980m  {5, 298,    4,  4,  5,  11, Upper},
47-La Calera 890m   {6, 251,    2,  3,  3,  9,  Upper},
48-Los Balsos 950m  {6, 285,    3,  4,  7,  3,  Upper},
49-San Lucas 860m   {6, 162,    1,  3,  4,  13, Upper},
50-Envigado 750m    {5, 153,    1,  3,  3,  7,  Upper},
51-M de Oro 830m    {6, 209,    2,  3,  5,  18, Upper},
52-Superior 850m    {6, 220,    2,  4,  4,  15, Upper},
53-Inferior 840m    {6, 171,    1,  4,  5,  14, Upper},
54-Inferior 860m    {5, 300,    4,  4,  6,  8,  Upper},
55-San Lucas 850m   {6, 175,    3,  3,  3,  11, Upper},
56-El Poblado 680m  {6, 290,    3,  4,  3,  17, Upper},
57-Tomatera 805m    {6, 251,    2,  4,  4,  4,  Upper},
58-El Poblado 850m  {6, 260,    2,  4,  4,  7,  Upper},
59-Vizcaya 875m {5, 265,    4,  3,  5,  16, Upper},
60-Castropol 750m   {5, 298,    4,  4,  5,  11, Upper},
61-Los Balsos 550m  {5, 113,    1,  2,  3,  13, UM},
62-El Poblado 599m  {6, 130,    1,  3,  2,  3,  UM},
63-El Poblado 550m  {6, 120,    1,  3,  3,  7,  UM},
64-La Linde 520m    {6, 109,    1,  2,  3,  5,  UM},
65-El Poblado 560m  {5, 120,    2,  2,  3,  7,  UM},
66-El Poblado 530m  {6, 172,    4,  3,  4,  8,  UM},
67-El Poblado 610m  {6, 144,    2,  3,  4,  7,  UM},
68-El Tesoro 555m   {6, 152,    2,  3,  4,  11, UM},
69-Villa Paula 520m {6, 165,    3,  3,  4,  1,  UM},
70-La Calera 630m   {6, 135,    1,  4,  4,  9,  UM},
71-Simesa 521m  {6, 124,    2,  3,  2,  16, UM},
72-El Tesoro 550m   {6, 215,    4,  3,  3,  2,  UM},
73-La Linde 520m    {6, 160,    3,  3,  4,  4,  UM},
74-La Florida 650m  {5, 290,    4,  4,  4,  1,  UM},
75-El Poblado 650m  {6, 187,    2,  4,  4,  8,  UM},
76-San Lucas 520m   {6, 277,    4,  3,  4,  1,  UM},
77-Castropol 550m   {6, 117,    1,  3,  4,  5,  UM},
78-San Lucas 570m   {6, 140,    3,  4,  4,  7,  UM},
79-Astorga 650m {6, 187,    2,  3,  5,  7,  UM},
80-El Tesoro 545m   {6, 136,    2,  2,  3,  6,  UM},
81-Las Lomas 590m   {6, 178,    3,  3,  4,  10, UM},
82-San Lucas 650m   {6, 128,    1,  2,  3,  1,  UM},
83-La Florida 650m  {6, 230,    4,  4,  4,  8,  UM},
84-Alejandria 600m  {6, 210,    3,  3,  4,  1,  UM},
85-Las Lomas 550m   {5, 139,    1,  3,  3,  9,  UM},
86-Los Parra 545m   {6, 125,    1,  2,  3,  14, UM},
87-El Poblado 500m  {6, 239,    4,  3,  3,  1,  UM},
88-Alejandria 595m  {6, 113,    1,  2,  3,  4,  UM},
89-Castellana 600m  {5, 200,    4,  3,  3,  2,  UM},
90-Envigado 540m    {5, 125,    1,  2,  3,  7,  UM},
91 – {5,    109,    3,  4,  3,  8,  Middle},
92- {5,     97, 3,  3,  3,  9,  Middle},
93- {4,     93, 4,  3,  2,  6,  Middle},
94- {6,     126,    3,  2,  3,  6,  Middle},
95- {6,     124,    4,  3,  2,  1,  Middle},
96- {5,     127,    3,  3,  3,  3,  Middle},
97- {5,     115,    4,  2,  1,  2,  Middle},
98- {5,     110,    4,  4,  2,  1,  Middle},
99- {4,     72, 3,  3,  2,  5,  Middle},
100- {5,    71, 1,  2,  1,  10, Middle},
101- {4,    72, 1,  2,  2,  19, Middle},
102- {4,    69, 3,  3,  2,  12, Middle},
103- {4,    90, 1,  3,  3,  24, Middle},
104- {3,    68, 4,  3,  2,  10, Middle},
105- {5,    76, 3,  1,  3,  13, Middle},
106- {6,    142,    4,  3,  4,  1,  Middle},
107- {6,    71, 4,  3,  2,  2,  Middle},
108- {3,    116,    4,  3,  5,  1,  Middle},
109- {6,    55, 1,  2,  2,  3,  Middle},
110- {6,    150,    4,  3,  3,  8,  Middle},
112- {4,    80, 3,  2,  3,  4,  Middle},
113- {3,    62, 3,  3,  2,  2,  Middle},
114- {6,    154,    4,  3,  3,  9,  Middle},
115- {4,    70, 3,  3,  2,  5,  Middle},
116- {6,    119,    4,  3,  4,  3,  Middle},
117- {3,    55, 3,  3,  2,  2,  Middle}
118- {4,    72, 1,  2,  2,  4,  Middle},
119- {5,    170,    3,  4,  3,  8,  Middle}
120- {5,    98, 1,  3,  2,  17, Middle},
121- {5,    75, 4,  2,  2,  4,  Middle},
122- {6,    76, 3,  2,  2,  3,  Middle},
123- {3,    98, 4,  3,  2,  1,  Middle},
124- {4,    64, 1,  2,  2,  9,  Middle},
125- {5,    108,    4,  3,  2,  3,  Middle},
126- {5,    125,    4,  4,  3,  1,  Middle},
127- {5,    124,    3,  3,  3,  3,  Middle},
128- {5,    106,    3,  3,  3,  5,  Middle},
129- {5,    118,    4,  3,  3,  6,  Middle},
130- {5,    88, 3,  3,  3,  1,  Middle},
131- {5,    79, 3,  3,  2,  6,  Middle},
132- {5,    120,    3,  3,  2,  4,  Middle},
133- {6,    78, 1,  2,  2,  29, Middle},
134- {5,    95, 1,  3,  3,  12, Middle},
135- {5,    78, 4,  3,  2,  3,  Middle},
136- {5,    80, 3,  3,  2,  4,  Middle},
137- {5,    78, 3,  2,  2,  1,  Middle},
138- {5,    120,    4,  3,  3,  4,  Middle},
139- {5,    92, 3,  3,  2,  7,  Middle},
140- {5,    110,    3,  3,  3,  2,  Middle},
141- {6,    124,    1,  3,  3,  1,  Middle},
142- {5,    96, 3,  3,  2,  9,  Middle},
143- {5,    98, 4,  3,  3,  3,  Middle},
144- {5,    92, 3,  2,  3,  8,  Middle},
145- {4,    32, 4,  1,  1,  2,  LM},
146- {3,    45, 3,  2,  1,  1,  LM},
147- {3,    50, 2,  3,  1,  2,  LM},
148- {3,    48, 2,  3,  2,  3,  LM},
149- {2,    48, 1,  3,  2,  1,  LM},
150- {2,    48, 1,  3,  2,  2,  LM},
151- {4,    45, 2,  2,  1,  13, LM},
152- {4,    45, 3,  3,  1,  11, LM},
153- {4,    30, 1,  1,  1,  24, LM},
154- {3,    45, 2,  3,  1,  13, LM},
155- {3,    60, 2,  3,  1,  4,  LM},
156- {3,    50, 2,  2,  1,  1,  LM},
157- {3,    48, 2,  3,  1,  3,  LM},
158- {3,    30, 1,  1,  1,  2,  LM},
159- {4,    30, 3,  2,  1,  1,  LM},
160- {3,    37, 1,  2,  1,  11, LM},
161- {3,    54, 2,  3,  2,  10, LM},
162- {3,    42, 1,  3,  1,  4,  LM},
163- {3,    45, 2,  3,  1,  3,  LM},
164- {3,    54, 3,  3,  2,  10, LM},
165- {3,    54, 2,  3,  2,  9,  LM},
166- {3,    54, 1,  2,  1,  4,  LM},
167- {3,    80, 4,  3,  2,  2,  LM},
168- {3,    50, 1,  3,  2,  13, LM},
169- {3,    34, 2,  1,  1,  3,  LM},
170- {3,    55, 1,  3,  1,  1,  LM},
171- {2,    45, 1,  2,  1,  5,  LM},
172- {3,    42, 3,  2,  1,  12, LM},
173- {3,    39, 1,  1,  1,  3,  LM},
174- {4,    45, 3,  2,  1,  7,  LM},
175- {3,    38, 1,  2,  1,  3,  LM},
176- {2,    42, 1,  2,  1,  4,  Lower},
177- {1,    43, 1,  3,  1,  4,  Lower},
178- {2,    25, 1,  1,  1,  1,  Lower},
179- {2,    42, 1,  2,  1,  1,  Lower},
180- {1,    49, 1,  2,  1,  1,  Lower},
181- {1,    46, 2,  3,  1,  2,  Lower},
182- {2,    18, 1,  1,  1,  4,  Lower},
183- {2,    45, 3,  3,  1,  2,  Lower},
184- {3,    18, 1,  1,  1,  3,  Lower},
185- {1,    40, 2,  2,  1,  2,  Lower},
186- {2,    40, 2,  2,  1,  4,  Lower},
187- {2,    41, 2,  3,  1,  2,  Lower},
188- {3,    50, 3,  2,  1,  1,  Lower},
189- {1,    44, 4,  2,  1,  1,  Lower},
190- {1,    55, 3,  3,  1,  2,  Lower},
191- {2,    17, 1,  1,  1,  1,  Lower},
192- {1,    40, 3,  2,  1,  1,  Lower},
193- {1,    41, 4,  2,  1,  4,  Lower},
194- {1,    40, 2,  2,  1,  3,  Lower},
195- {3,    16, 1,  1,  1,  2,  Lower},
196- {3,    45, 1,  2,  1,  1,  Lower},
197- {3,    48, 1,  3,  1,  2,  Lower},
198- {2,    43, 1,  3,  1,  4,  Lower},
199- {1,    50, 4,  2,  1,  1,  Lower},
200- {1,    60, 4,  2,  1,  1,  Lower},
201- {1,    40, 1,  2,  1,  2,  Lower},
202- {3,    32, 1,  1,  1,  1,  Lower},
203- {2,    44, 1,  2,  1,  3,  Lower},
204- {1,    40, 4,  2,  1,  1,  Lower},
205- {2,    36, 2,  2,  1,  5,  Lower}

But when I print, I get the whole text; all I need is what is inside the braces.

If anyone can answer my question for my specific case and/or has any guidance on implementing KNN in Java, that would be a great help! Thanks.
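
For the normalization and distance steps mentioned above, here is a minimal sketch. It assumes min-max scaling and Euclidean distance; the question names neither, so both are assumptions, and the class and method names (`KnnPrep`, `minMaxNormalize`, `euclidean`) are hypothetical.

```java
class KnnPrep {

    /** Rescales every column of data to [0, 1] in place (min-max normalization). */
    static void minMaxNormalize(double[][] data) {
        if (data.length == 0) {
            return;
        }
        int cols = data[0].length;
        for (int c = 0; c < cols; c++) {
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            for (double[] row : data) {
                min = Math.min(min, row[c]);
                max = Math.max(max, row[c]);
            }
            double range = max - min;
            for (double[] row : data) {
                // A constant column has zero range; map it to 0 to avoid dividing by zero.
                row[c] = range == 0 ? 0.0 : (row[c] - min) / range;
            }
        }
    }

    /** Euclidean distance between two feature vectors of equal length. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Three rows taken from the data file: estrato, size, age, beds, baths, floor.
        double[][] data = {
            {6, 430, 4, 3, 4, 22},
            {5, 109, 3, 4, 3, 8},
            {2, 36, 2, 2, 1, 5}
        };
        minMaxNormalize(data);
        System.out.println(euclidean(data[0], data[2]));
    }
}
```

Scaling first matters because size ranges into the hundreds while estrato only goes from 1 to 6, so without it size would dominate the distance.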

1 Answer:

Answer 0 (score: -1)

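Remove this line:

Scanner lineScan = new Scanner(line);

and move it into the if block:

if (start == '{') {
    Scanner lineScan = new Scanner(line.substring(i));
    estrato = lineScan.next();
    size = lineScan.next();
    age = lineScan.next();
    beds = lineScan.next();
    baths = lineScan.next();
    floor = lineScan.next();
    break; // consider adding a break here
}

The problem is that you create the Scanner for the whole line before you have searched for where the { character is.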
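
Note that `Scanner.next()` splits on whitespace by default, so scanning from the brace still yields tokens such as `{6,` with the brace and commas attached, and the label is never read. As an alternative, here is a minimal sketch that takes the text between `{` and `}` and splits it on commas. The names `ApartmentLineParser`, `Row`, and `parseLine` are hypothetical and not part of the original code.

```java
class ApartmentLineParser {

    /** Holds the six numeric features and the class label of one apartment. */
    static class Row {
        final double[] features; // estrato, size, age, beds, baths, floor
        final String label;

        Row(double[] features, String label) {
            this.features = features;
            this.label = label;
        }
    }

    /** Returns the parsed row, or null if the line has no {...} section with 7 fields. */
    static Row parseLine(String line) {
        int open = line.indexOf('{');
        int close = line.indexOf('}', open + 1);
        if (open < 0 || close < 0) {
            return null;
        }
        // Split the text between the braces on commas, ignoring surrounding whitespace.
        String[] parts = line.substring(open + 1, close).trim().split("\\s*,\\s*");
        if (parts.length != 7) {
            return null;
        }
        double[] features = new double[6];
        for (int i = 0; i < 6; i++) {
            features[i] = Double.parseDouble(parts[i]);
        }
        return new Row(features, parts[6]);
    }

    public static void main(String[] args) {
        Row row = parseLine("01-Los Balsos 1380m {6, 430,    4,  3,  4,  22, Luxury},");
        System.out.println(row.features[1] + " " + row.label);
    }
}
```

In `convertFile`, each line read by `input.nextLine()` could be passed to `parseLine` and the non-null results collected in a list, which keeps the numeric features ready for normalization.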