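I need help with a problem. I'm trying to load my list of 2000 proxies from a text file, but my class only fills 1040 array indexes with the contents read from each line.
I don't know what to do. :(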
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
public class ProxyLoader {
private String[] lineSplit = new String[100000];
private static String[] addresses = new String[100000];
private static int[] ports = new int[100000];
public int i = 0;
public ProxyLoader() {
readData();
}
public synchronized String getAddr(int i) {
return this.addresses[i];
}
public synchronized int getPort(int i) {
return this.ports[i];
}
public synchronized void readData() {
try {
BufferedReader br = new BufferedReader(
new FileReader("./proxy.txt"));
String line = "";
try {
while ((line = br.readLine()) != null) {
lineSplit = line.split(":");
i++;
addresses[i] = lineSplit[0];
ports[i] = Integer.parseInt(lineSplit[1]);
System.out.println("Line Number [" + i + "] Adr: "
+ addresses[i] + " Port: " + ports[i]);
}
for (String s : addresses) {
if (s == null) {
s = "127.0.0.1";
}
}
for (int x : ports) {
if (x == 0) {
x = 8080;
}
}
} catch (IOException e) {
e.printStackTrace();
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
}
Answer 0 (score: 1)
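Let's start by tidying up your code; there are a number of problems that could be causing you trouble. Without the relevant part of your proxy file, though, we can't test or reproduce the behavior you're seeing. Consider creating and posting an SSCCE rather than just a snippet.
- You don't need synchronized. Reading from an array is safe in a multithreaded environment, and you should never be constructing multiple instances of ProxyLoader on different threads, so synchronized on readData() is just wasteful.
- Rather than large fixed-size arrays, store the data in a resizable collection such as an ArrayList or a Map.
- The public int i variable is dangerous. Presumably you use it to indicate the maximum number of lines loaded, but a size() method should be used for that instead. As a public instance variable, anyone using the class can change its value. i is also a poor name for the variable; max would be a better choice.
- You probably don't want readData() to be public, since calling it more than once does very odd things (it loads the file again, starting from i, and fills the arrays with duplicate data). The best idea is to load the data directly in the constructor (or in a private method called by the constructor), so the file is loaded exactly once per ProxyLoader instance.
- You initialize lineSplit with a large empty array and then replace it with the result of String.split(). This is confusing and wasteful; use a local variable to hold the split line instead.
I'd suggest the following implementation: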
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
public class ProxyLoader implements Iterable<ProxyLoader.Proxy> {
// Remove DEFAULT_PROXY if not needed
private static final Proxy DEFAULT_PROXY = new Proxy("127.0.0.1", 8080);
private static final String DATA_FILE = "./proxy.txt";
private ArrayList<Proxy> proxyList = new ArrayList<>();
public ProxyLoader() {
// Try-with-resources ensures file is closed safely and cleanly
try(BufferedReader br = new BufferedReader(new FileReader(DATA_FILE))) {
String line;
while ((line = br.readLine()) != null) {
String[] lineSplit = line.split(":");
Proxy p = new Proxy(lineSplit[0], Integer.parseInt(lineSplit[1]));
proxyList.add(p);
}
} catch (IOException e) {
System.err.println("Failed to open/read "+DATA_FILE);
e.printStackTrace(System.err);
}
}
// If you request a positive index larger than the size of the file, it will return
// DEFAULT_PROXY, since that's the behavior your original implementation
// essentially did. I'd suggest deleting DEFAULT_PROXY, having this method simply
// return proxyList.get(i), and letting it fail if you request an invalid index.
public Proxy getProxy(int i) {
if(i < proxyList.size()) {
return proxyList.get(i);
} else {
return DEFAULT_PROXY;
}
}
// Lets you safely get the maximum index, without exposing the list directly
public int getSize() {
return proxyList.size();
}
// lets you run for(Proxy p : proxyLoader) { ... }
@Override
public Iterator<Proxy> iterator() {
return proxyList.iterator();
}
// Inner static class just to hold data
// can be pulled out into its own file if you prefer
public static class Proxy {
// note these values are public; since they're final, this is safe.
// Using getters is more standard, but it adds a lot of boilerplate code
// somewhat needlessly; for a simple case like this, public final should be fine.
public final String address;
public final int port;
public Proxy(String a, int p) {
address = a;
port = p;
}
}
}
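One thing neither version guards against is a malformed line: a blank line or a line with no port makes lineSplit[1] or Integer.parseInt throw partway through the file, and loading stops there. Whether that is what happened with the 2000-line file is only a guess, since the file was not posted. A minimal sketch of a more tolerant parser follows; ProxyLineParser and its method names are hypothetical and not part of the implementation above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the answer above: parses "host:port" lines
// and skips the ones that are malformed instead of throwing.
final class ProxyLineParser {

    // Returns {host, port} as strings, or null when the line is not "host:port".
    static String[] parseLine(String line) {
        if (line == null) {
            return null;
        }
        String[] parts = line.trim().split(":");
        if (parts.length != 2 || parts[0].isEmpty()) {
            return null;
        }
        try {
            int port = Integer.parseInt(parts[1].trim());
            if (port < 1 || port > 65535) {
                return null;
            }
            return new String[] { parts[0], String.valueOf(port) };
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Collects the well-formed lines and records the 1-based numbers of the bad ones.
    static List<String[]> parseAll(List<String> lines, List<Integer> badLineNumbers) {
        List<String[]> proxies = new ArrayList<>();
        for (int n = 0; n < lines.size(); n++) {
            String[] parsed = parseLine(lines.get(n));
            if (parsed == null) {
                badLineNumbers.add(n + 1);
            } else {
                proxies.add(parsed);
            }
        }
        return proxies;
    }
}
```

Printing badLineNumbers after a load would show which lines of proxy.txt, if any, are not in host:port form.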
Answer 1 (score: 1)
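I've included some examples that may not exactly fit your use case, but they show some ways of writing code that is easier to maintain and read.
Code that is hard to read is hard to debug and maintain.
Java 7 and 8 let you read lines straight from the FileSystem, so there is no need to write most of this code to begin with: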
Path thePath = FileSystems.getDefault().getPath(location);
return Files.readAllLines(thePath, Charset.forName("UTF-8"));
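The two lines above are a fragment of a larger method. A self-contained sketch of the same idea might look like this; ReadAllLinesSketch is a hypothetical class name, and StandardCharsets.UTF_8 is used in place of Charset.forName("UTF-8"), which is equivalent.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical wrapper around Files.readAllLines (available since Java 7).
final class ReadAllLinesSketch {

    // Reads every line of a UTF-8 text file into memory and closes the file.
    static List<String> readLines(String location) throws IOException {
        Path path = Paths.get(location);
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }
}
```

Calling ReadAllLinesSketch.readLines("./proxy.txt") would return one list element per line of the file from the question. Because the whole file is held in memory, this suits files of a few thousand lines better than very large ones.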
If you have to read a lot of small files into lines and don't want to use FileSystem, or you are on Java 6 or Java 5, you would create a utility class like this:
public class IOUtils {
public final static String CHARSET = "UTF-8";
...
public static List<String> readLines(File file) {
try (FileReader reader = new FileReader(file)) {
return readLines(reader);
} catch (Exception ex) {
return Exceptions.handle(List.class, ex);
}
}
That calls the readLines overload that takes a Reader:
public static List<String> readLines(Reader reader) {
try (BufferedReader bufferedReader = new BufferedReader(reader)) {
return readLines(bufferedReader);
} catch (Exception ex) {
return Exceptions.handle(List.class, ex);
}
}
That in turn calls the readLines overload that takes a BufferedReader:
public static List<String> readLines(BufferedReader reader) {
List<String> lines = new ArrayList<>(80);
try (BufferedReader bufferedReader = reader) {
String line = null;
while ( (line = bufferedReader.readLine()) != null) {
lines.add(line);
}
} catch (Exception ex) {
return Exceptions.handle(List.class, ex);
}
return lines;
}
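Exceptions.handle in the snippets above comes from the author's boon library and is not shown here. As a rough JDK-only sketch of the same chain of overloads, assuming that rethrowing as UncheckedIOException is an acceptable substitute for it (LineReading is a hypothetical class name):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical JDK-only variant of the readLines overloads shown above.
final class LineReading {

    // Wraps any Reader in a BufferedReader and delegates to the overload below.
    static List<String> readLines(Reader reader) {
        return readLines(new BufferedReader(reader));
    }

    // Reads until readLine() returns null; the reader is closed when done.
    static List<String> readLines(BufferedReader reader) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = reader) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException ex) {
            // Stand-in for boon's Exceptions.handle: surface the failure unchecked.
            throw new UncheckedIOException(ex);
        }
        return lines;
    }
}
```

Java picks the most specific overload, so passing a BufferedReader goes straight to the second method, while any other Reader is wrapped first.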
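Apache has a set of utilities called Apache Commons (http://commons.apache.org/). It includes Lang, and it includes IO utils (http://commons.apache.org/proper/commons-io/). If you are on Java 5 or Java 6, either of these would be a good choice.
Back to our example: you can turn any location into a list of lines: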
public static List<String> readLines(String location) {
URI uri = URI.create(location);
try {
if ( uri.getScheme()==null ) {
Path thePath = FileSystems.getDefault().getPath(location);
return Files.readAllLines(thePath, Charset.forName("UTF-8"));
} else if ( uri.getScheme().equals("file") ) {
Path thePath = FileSystems.getDefault().getPath(uri.getPath());
return Files.readAllLines(thePath, Charset.forName("UTF-8"));
} else {
return readLines(location, uri);
}
} catch (Exception ex) {
return Exceptions.handle(List.class, ex);
}
}
FileSystem, Path, URI, etc. are all in the JDK.
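The dispatch in readLines(String) hinges on what URI.getScheme() returns for a given location string. The sketch below only illustrates that branching and does no I/O; SchemeSketch and classify are hypothetical names.

```java
import java.net.URI;

// Hypothetical illustration of the scheme-based branching used above.
final class SchemeSketch {

    // Classifies a location string the same way the dispatch above does.
    static String classify(String location) {
        String scheme = URI.create(location).getScheme();
        if (scheme == null) {
            return "plain path";
        } else if (scheme.equals("file")) {
            return "file URI";
        } else {
            return "other scheme: " + scheme;
        }
    }
}
```

A relative path such as src/test/resources/testfile.txt has no scheme, so it takes the first branch. Two caveats apply to this approach: URI.create throws IllegalArgumentException for strings containing characters that are illegal in a URI, such as spaces or backslashes, and a Windows-style path like C:/data/proxy.txt parses with the scheme C, so it would not be treated as a plain path.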
Continuing the example:
private static List<String> readLines(String location, URI uri) throws Exception {
try {
FileSystem fileSystem = FileSystems.getFileSystem(uri);
Path fsPath = fileSystem.getPath(location);
return Files.readAllLines(fsPath, Charset.forName("UTF-8"));
} catch (ProviderNotFoundException ex) {
return readLines(uri.toURL().openStream());
}
}
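The code above tries to read the URI through a FileSystem; if no provider can load it, it falls back to looking it up through a URL stream. URL, URI, File, FileSystem, etc. are all part of the JDK.
To turn the URL stream into a Reader and then into a list of strings, we use: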
public static List<String> readLines(InputStream is) {
try (Reader reader = new InputStreamReader(is, CHARSET)) {
return readLines(reader);
} catch (Exception ex) {
return Exceptions.handle(List.class, ex);
}
}
:)
Now let's get back to our example (we can now read lines from anywhere, including files):
public static final class Proxy {
private final String address;
private final int port;
private static final String DATA_FILE = "./files/proxy.txt";
private static final Pattern addressPattern = Pattern.compile("^(\\d{1,3}[.]{1}){3}[0-9]{1,3}$");
private Proxy(String address, int port) {
/* Validate address is not null. */
Objects.requireNonNull(address, "address should not be null");
/* Validate port is in range. */
if (port < 1 || port > 65535) {
throw new IllegalArgumentException("Port is not in range port=" + port);
}
/* Validate address is of the form 123.12.1.5 .*/
if (!addressPattern.matcher(address).matches()) {
throw new IllegalArgumentException("Invalid Inet address");
}
/* Now initialize our address and port. */
this.address = address;
this.port = port;
}
private static Proxy createProxy(String line) {
String[] lineSplit = line.split(":");
String address = lineSplit[0];
int port = parseInt(lineSplit[1]);
return new Proxy(address, port);
}
public final String getAddress() {
return address;
}
public final int getPort() {
return port;
}
public static List<Proxy> loadProxies() {
List <String> lines = IOUtils.readLines(DATA_FILE);
List<Proxy> proxyList = new ArrayList<>(lines.size());
for (String line : lines) {
proxyList.add(createProxy(line));
}
return proxyList;
}
}
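Notice that we have no mutable state. This prevents bugs and makes your code easier to debug and support.
Notice that our IOUtils.readLines reads the lines from the file system.
Notice the extra work in the constructor to make sure nobody initializes a Proxy instance with bad state. The tools for this (Objects, Pattern, etc.) are all in the JDK.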
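It helps to see exactly what the address check accepts. The pattern only verifies the dotted-quad shape: it rejects host names such as localhost, but it does not check that each octet is at most 255. The sketch below reuses the same expression; AddressPatternSketch and the stricter isIpv4 method are hypothetical additions, not part of the Proxy class above.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of what the address pattern accepts and rejects.
final class AddressPatternSketch {

    // Same expression as addressPattern in the Proxy class above.
    static final Pattern ADDRESS_PATTERN =
            Pattern.compile("^(\\d{1,3}[.]{1}){3}[0-9]{1,3}$");

    // True when the text has the dotted-quad shape; octet ranges are not checked.
    static boolean looksLikeIpv4(String text) {
        return ADDRESS_PATTERN.matcher(text).matches();
    }

    // Stricter variant (an addition, not in the answer): each octet must be 0..255.
    static boolean isIpv4(String text) {
        if (!looksLikeIpv4(text)) {
            return false;
        }
        for (String octet : text.split("[.]")) {
            if (Integer.parseInt(octet) > 255) {
                return false;
            }
        }
        return true;
    }
}
```

If the proxy file can contain host names as well as numeric addresses, this kind of validation would need to be relaxed, otherwise the constructor would throw on those lines.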
If you want a reusable ProxyLoader, it would look something like this:
public static class ProxyLoader {
private static final String DATA_FILE = "./files/proxy.txt";
private List<Proxy> proxyList = Collections.EMPTY_LIST;
private final String dataFile;
public ProxyLoader() {
this.dataFile = DATA_FILE;
init();
}
public ProxyLoader(String dataFile) {
this.dataFile = dataFile;
init();
}
private void init() {
List <String> lines = IO.readLines(dataFile);
proxyList = new ArrayList<>(lines.size());
for (String line : lines) {
proxyList.add(Proxy.createProxy(line));
}
}
public String getDataFile() {
return this.dataFile;
}
public static List<Proxy> loadProxies() {
return new ProxyLoader().getProxyList();
}
public List<Proxy> getProxyList() {
return proxyList;
}
...
}
public static class Proxy {
private final String address;
private final int port;
...
public Proxy(String address, int port) {
...
this.address = address;
this.port = port;
}
public static Proxy createProxy(String line) {
String[] lineSplit = line.split(":");
String address = lineSplit[0];
int port = parseInt(lineSplit[1]);
return new Proxy(address, port);
}
public String getAddress() {
return address;
}
public int getPort() {
return port;
}
}
Coding is great. Testing is divine! Here are some tests for the example, preceded by the classes they exercise.
public static class ProxyLoader {
private static final String DATA_FILE = "./files/proxy.txt";
private List<Proxy> proxyList = Collections.EMPTY_LIST;
private final String dataFile;
public ProxyLoader() {
this.dataFile = DATA_FILE;
init();
}
public ProxyLoader(String dataFile) {
this.dataFile = dataFile;
init();
}
private void init() {
List <String> lines = IO.readLines(dataFile);
proxyList = new ArrayList<>(lines.size());
for (String line : lines) {
proxyList.add(Proxy.createProxy(line));
}
}
public String getDataFile() {
return this.dataFile;
}
public static List<Proxy> loadProxies() {
return new ProxyLoader().getProxyList();
}
public List<Proxy> getProxyList() {
return proxyList;
}
}
public static class Proxy {
private final String address;
private final int port;
public Proxy(String address, int port) {
this.address = address;
this.port = port;
}
public static Proxy createProxy(String line) {
String[] lineSplit = line.split(":");
String address = lineSplit[0];
int port = parseInt(lineSplit[1]);
return new Proxy(address, port);
}
public String getAddress() {
return address;
}
public int getPort() {
return port;
}
}
Here is the alternative, all in one class. (I don't see much point in having a ProxyLoader.)
public static final class Proxy2 {
private final String address;
private final int port;
private static final String DATA_FILE = "./files/proxy.txt";
private static final Pattern addressPattern = Pattern.compile("^(\\d{1,3}[.]{1}){3}[0-9]{1,3}$");
private Proxy2(String address, int port) {
/* Validate address is not null. */
Objects.requireNonNull(address, "address should not be null");
/* Validate port is in range. */
if (port < 1 || port > 65535) {
throw new IllegalArgumentException("Port is not in range port=" + port);
}
/* Validate address is of the form 123.12.1.5 .*/
if (!addressPattern.matcher(address).matches()) {
throw new IllegalArgumentException("Invalid Inet address");
}
/* Now initialize our address and port. */
this.address = address;
this.port = port;
}
private static Proxy2 createProxy(String line) {
String[] lineSplit = line.split(":");
String address = lineSplit[0];
int port = parseInt(lineSplit[1]);
return new Proxy2(address, port);
}
public final String getAddress() {
return address;
}
public final int getPort() {
return port;
}
public static List<Proxy2> loadProxies() {
List <String> lines = IO.readLines(DATA_FILE);
List<Proxy2> proxyList = new ArrayList<>(lines.size());
for (String line : lines) {
proxyList.add(createProxy(line));
}
return proxyList;
}
}
Now we write the tests (tests and TDD help you work through problems like these):
@Test public void proxyTest() {
List<Proxy> proxyList = ProxyLoader.loadProxies();
assertEquals(
5, len(proxyList)
);
assertEquals(
"127.0.0.1", idx(proxyList, 0).getAddress()
);
assertEquals(
8080, idx(proxyList, 0).getPort()
);
//192.55.55.57:9091
assertEquals(
"192.55.55.57", idx(proxyList, -1).getAddress()
);
assertEquals(
9091, idx(proxyList, -1).getPort()
);
}
idx and friends are defined in my own helper lib, called boon. The idx method works like Python or Ruby slice notation.
@Test public void proxyTest2() {
List<Proxy2> proxyList = Proxy2.loadProxies();
assertEquals(
5, len(proxyList)
);
assertEquals(
"127.0.0.1", idx(proxyList, 0).getAddress()
);
assertEquals(
8080, idx(proxyList, 0).getPort()
);
//192.55.55.57:9091
assertEquals(
"192.55.55.57", idx(proxyList, -1).getAddress()
);
assertEquals(
9091, idx(proxyList, -1).getPort()
);
}
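For readers who don't have boon on the classpath, the len and idx calls in the tests above can be approximated as follows. These are hypothetical stand-ins written for illustration; the real helpers live in the author's library and may behave differently in edge cases.

```java
import java.util.List;

// Hypothetical stand-ins for the boon helpers used in the tests above.
final class ListsSketch {

    // Number of elements, like Python's len().
    static int len(List<?> list) {
        return list.size();
    }

    // Negative indexes count from the end, so -1 is the last element.
    static <T> T idx(List<T> list, int index) {
        int resolved = index < 0 ? list.size() + index : index;
        return list.get(resolved);
    }
}
```

With these, idx(proxyList, -1) returns the last proxy in the list, which is how the tests check the final line of the input file.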
My input file:
127.0.0.1:8080
192.55.55.55:9090
127.0.0.2:8080
192.55.55.56:9090
192.55.55.57:9091
Then there is my IOUtils (actually called IO).
Here are the tests, for those who care about IO (utils):
package org.boon.utils;
import com.sun.net.httpserver.Headers;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import org.junit.Test;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.*;
import java.util.regex.Pattern;
import static java.lang.Integer.parseInt;
import static org.boon.utils.Lists.idx;
import static org.boon.utils.Lists.len;
import static org.boon.utils.Maps.copy;
import static org.boon.utils.Maps.map;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
...
That gives you an idea of the imports involved.
public class IOTest {
....
This is a test that reads lines from a file on the file system.
@Test
public void testReadLines() {
File testDir = new File("src/test/resources");
File testFile = new File(testDir, "testfile.txt");
List<String> lines = IO.readLines(testFile);
assertLines(lines);
}
This is a helper method that asserts the file was read correctly.
private void assertLines(List<String> lines) {
assertEquals(
4, len(lines)
);
assertEquals(
"line 1", idx(lines, 0)
);
assertEquals(
"grapes", idx(lines, 3)
);
}
This is a test that shows reading a file from a String path.
@Test
public void testReadLinesFromPath() {
List<String> lines = IO.readLines("src/test/resources/testfile.txt");
assertLines(lines);
}
This test shows reading a file from a URI.
@Test
public void testReadLinesURI() {
File testDir = new File("src/test/resources");
File testFile = new File(testDir, "testfile.txt");
URI uri = testFile.toURI();
//"file:///....src/test/resources/testfile.txt"
List<String> lines = IO.readLines(uri.toString());
assertLines(lines);
}
This test shows that you can read the lines of a file from an HTTP server. First, the handler that serves the file:
static class MyHandler implements HttpHandler {
public void handle(HttpExchange t) throws IOException {
File testDir = new File("src/test/resources");
File testFile = new File(testDir, "testfile.txt");
String body = IO.read(testFile);
byte[] bytes = body.getBytes(IO.CHARSET);
// The declared length must be the byte count, not the character count
t.sendResponseHeaders(200, bytes.length);
OutputStream os = t.getResponseBody();
os.write(bytes);
os.close();
}
}
Here is the HTTP server test itself (it starts the HTTP server and reads from it).
@Test
public void testReadFromHttp() throws Exception {
HttpServer server = HttpServer.create(new InetSocketAddress(9666), 0);
server.createContext("/test", new MyHandler());
server.setExecutor(null); // creates a default executor
server.start();
Thread.sleep(1000);
List<String> lines = IO.readLines("http://localhost:9666/test");
assertLines(lines);
}
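The test above binds a fixed port (9666), sleeps for a second, and never stops the server. A possible JDK-only variation is sketched below: it binds port 0 so the OS picks a free port, and stops the server in a finally block. HttpLinesSketch and serveAndRead are hypothetical names, and the sketch reads the stream directly rather than going through IO.readLines.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: serve some text over HTTP locally and read it back as lines.
final class HttpLinesSketch {

    // Serves the given (non-empty) text at /test on a free local port,
    // fetches it back line by line, and always stops the server afterwards.
    static List<String> serveAndRead(String text) throws IOException {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/test", exchange -> {
            // The declared length is the byte count of the body.
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            URL url = URI.create("http://127.0.0.1:" + port + "/test").toURL();
            List<String> lines = new ArrayList<>();
            // Proxy.NO_PROXY keeps any JVM-wide HTTP proxy settings out of the way.
            try (InputStream in = url.openConnection(Proxy.NO_PROXY).getInputStream();
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
            return lines;
        } finally {
            server.stop(0);
        }
    }
}
```

HttpServer.start() returns once the listener is bound, so no sleep is needed before the first request in this sketch.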
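The proxy classes and the proxy cache tests shown earlier (ProxyLoader, Proxy, Proxy2, proxyTest and proxyTest2) are nested inside this same test class.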
}
You can see all of the source code for this example and for this utility class here:
https://github.com/RichardHightower/boon
https://github.com/RichardHightower/boon/blob/master/src/main/java/org/boon/utils/IO.java