I have multiple JSON objects in a text file, and I am trying to read the data from each object and store it in my database. Here is a sample of my data:
{"address":"4565 S Wayside Dr","categories":["Other Textile Goods","Textile Manufacturers"],"city":"Houston","country":"US","dateAdded":"2016-11-17T22:36:43Z","dateUpdated":"2017-09-12T14:29:52Z","keys":["us/tx/houston/4565swaysidedr/-1836686262"],"latitude":"29.6981","longitude":"-95.3212","name":"Radium Textiles LLC","phones":["7136901081"],"postalCode":"77087","province":"TX","sourceURLs":["http://www.citysearch.com/profile/607780624/houston_tx/radium_textiles_llc.html"],"id":"AVwdH8-KkufWRAb52ixf"}
{"address":"6104 Donoho St","categories":["Wholesale Textile Brokers","Textile Brokers"],"city":"Houston","country":"US","dateAdded":"2017-03-26T19:08:42Z","dateUpdated":"2017-03-26T19:08:42Z","imageURLs":["http://images1.citysearch.net/assets/guide/images/logo_citysearch_130x25.gif"],"keys":["us/tx/houston/6104donohost/-214331342"],"latitude":"29.677891","longitude":"-95.324173","name":"T A Textiles","phones":["7136452800"],"postalCode":"77033","province":"TX","sourceURLs":["http://www.superpages.com/bp/houston-tx/t-a-textiles-L2170967950.htm","http://houston.citysearch.com/profile/647921770/houston_tx/t_a_textiles.html"],"id":"AVwdbMI6IN2L1WUfvriy"}
{"address":"4544 S Pinemont Dr","categories":["Other Fabricated Textile Product Manufacturers","Textile Manufacturers","Other Textile Goods"],"city":"Houston","country":"US","dateAdded":"2016-11-17T22:33:12Z","dateUpdated":"2017-09-12T14:29:50Z","keys":["us/tx/houston/4544spinemontdr/-1836686262"],"latitude":"29.8369","longitude":"-95.5160","name":"Radium Textiles LLC","neighborhoods":["Fairbanks/ Northwest Crossing , Northwest Houston"],"phones":["7136901390"],"postalCode":"77041","province":"TX","sourceURLs":["http://www.citysearch.com/profile/694636660/houston_tx/radium_textiles_llc.html","http://www.yellowpages.com/houston-tx/mip/radium-textiles-456243882?lid=456243882"],"websites":["http://api.citygridmedia.com/content/places/v2/click?q=9YKflVKbY9NauPJdMy0B1gS1IhB4xv4EWw0zDoT-UWc_izWF3zs5PKGdfOHubWrvM0QwDCYwbOH2fdLi0dK5xArULcksCCbfR-WWAz9xD1AmGVAQZIom4U3n5R4DuRC8WJCtvJcNItEKoCSfzwapuGnmwGnHDpEGYXGjnN4u8zXqkiimSHFf4_dbqGRbVgNJczcRYGsO7BQjsEDjdlUTJ3CxVQB3K1438yd7WPe-AAAIJEq588kBWNDLbak0Vs-EUxvQmWKBKxWI5ahci9eDn5KNvXpHpqZUL_e0UVacwelpEs92aC0Q2f_N0ZyiviGOHw8dOG3WIXM3rnMIStdm3v06ddF7lICNJl77Z6Y_mtMiylGr2EYGE_lU-dhl6pZnJ92MqQhlZpOjEubWZv1Bd95b8A-INOGKto848V3VdJNGPJwFN_DkdeWGF8YMvDWgew1xs3RSeBeHcBqFzLqQkDbgIllvuxl9VON3HBMwPYjMZ0kqzhi02JRzW0rO_gItNZKuHfHb3rNrWctuJQ2Qvup-kEiLHf5Hya_5KCAgn6uOStAioAXszLKlglJqFMNQE39j6ieFhMg&placement=listing_profile&cs_user=unknown&cs_session=88473fea2af4b100b0e7993b2eafa4bedbe4234c"],"id":"AVwczWsPkufWRAb5zLcG"}
{"address":"7085 Alameda Ave","categories":["Other Textile Goods","Textile Manufacturers","Textile Finishers","Wholesale Textiles"],"city":"El Paso","country":"US","dateAdded":"2017-06-27T05:29:45Z","dateUpdated":"2017-09-06T17:24:47Z","keys":["us/tx/elpaso/7085alamedaave/-266489986"],"latitude":"31.7550","longitude":"-106.3926","name":"Midwest Textile Co","phones":["9158811790"],"postalCode":"79915","province":"TX","sourceURLs":["http://www.citysearch.com/profile/620236204/el_paso_tx/midwest_textile_co.html"],"id":"AVzoBujQLD2H7whiXdiR"}
I tried to parse it as follows:
InputStream resourceInputStream = context.getResourceAsStream("/WEB-INF/jsp/modules/data/20180427-businesses.txt");
String jsonString = IOUtils.toString(resourceInputStream, "UTF-8");
JSONObject json = (JSONObject) JSONSerializer.toJSON( jsonString );
String address = json.getString("address");
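But this does not work, because the data is not a single JSON string. The data is also not wrapped in an array, which makes things harder for me. I also tried creating a Java class with matching fields and mapping the JSON string directly to that class, but that did not work for me either.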
InputStream resourceInputStream = context.getResourceAsStream("/WEB-INF/jsp/modules/data/20180427-businesses.txt");
String jsonString = IOUtils.toString(resourceInputStream, "UTF-8");
ObjectMapper mapper = new ObjectMapper();
BusinessDataImportHB records = mapper.readValue(jsonString, BusinessDataImportHB.class);
where:
public class BusinessDataImportHB
{
    private List<BusinessRecord> records;

    public List<BusinessRecord> getRecords() {
        return records;
    }

    public void setRecords(List<BusinessRecord> records) {
        this.records = records;
    }
}
and
public class BusinessRecord {
    private String address;
    private List<String> categories;
    private String city;
    private String country;
    private Date dateAdded;
    private Date dateUpdated;
    private List<String> keys;
    private String latitude;
    private String longitude;
    private String name;
    private List<String> phones;
    private String postalCode;
    private String province;
    private List<String> websites;
    private String id;
    // (rest of the class not shown)
}
I cannot change the format of the data. What is the best approach I can use to parse the data and get the individual records?
Answer 0 (score: 3)
If each JSON object is on a single line, you can read the file line by line.
try (Stream<String> stream = Files.lines(Paths.get("..."))) {
    stream.forEach(line -> {
        JSONObject json = (JSONObject) JSONSerializer.toJSON(line);
        String address = json.getString("address");
    });
}
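The same idea can be sketched with JDK classes only, so that it runs without a JSON library on the classpath. The class name `JsonLines`, the method `parseEach`, and the `parseOne` callback are hypothetical names chosen for illustration; in real code `parseOne` could be something like `line -> (JSONObject) JSONSerializer.toJSON(line)`. Blank lines are skipped, on the assumption that the file may contain empty lines or end with a trailing newline.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class JsonLines {
    // Applies parseOne to every non-blank line and collects the results.
    // parseOne is a placeholder for whatever JSON library call you use.
    public static <T> List<T> parseEach(Stream<String> lines, Function<String, T> parseOne) {
        return lines
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .map(parseOne)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<String> sample = Stream.of(
                "{\"city\":\"Houston\"}",
                "",
                "{\"city\":\"El Paso\"}");
        // Identity "parser" for demonstration only: it keeps the raw JSON text.
        List<String> records = parseEach(sample, line -> line);
        System.out.println(records.size() + " records");
    }
}
```

With a real file, the `Stream<String>` would come from `Files.lines(...)` inside a try-with-resources block, as in the snippet above.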
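Answer 1 (score: 0)
If your records are one per line and the file is large, you may want to read and parse it line by line. You can use a try-with-resources block to make sure the resource is closed and does not leak.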
String pathname = "/home/william/test.txt"; // your file
try (Scanner sc = new Scanner(new FileInputStream(new File(pathname)))) {
    while (sc.hasNextLine()) {
        JSONObject json = (JSONObject) JSONSerializer.toJSON(sc.nextLine());
        // TODO do something with it
    }
} catch (Exception e) {
    // TODO
}
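With a single catch around the whole loop, the first malformed line ends the import. One possible alternative, sketched here with JDK classes only, is to catch failures per line and remember the line numbers. `LineImporter`, `importLines`, and the `handle` callback are hypothetical names; `handle` stands in for the parse-and-store step, and the sketch assumes it signals a bad line by throwing a `RuntimeException`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.function.Consumer;

class LineImporter {
    // Feeds each non-blank line to handle and returns the 1-based numbers
    // of the lines for which handle threw a RuntimeException.
    public static List<Integer> importLines(Scanner sc, Consumer<String> handle) {
        List<Integer> failedLines = new ArrayList<>();
        int lineNumber = 0;
        while (sc.hasNextLine()) {
            String line = sc.nextLine().trim();
            lineNumber++;
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            try {
                handle.accept(line);
            } catch (RuntimeException e) {
                failedLines.add(lineNumber); // keep going with the next line
            }
        }
        return failedLines;
    }

    public static void main(String[] args) {
        String text = "{\"id\":\"a\"}\nnot json\n{\"id\":\"b\"}\n";
        try (Scanner sc = new Scanner(text)) {
            // Stand-in check: a real handler would parse the line and store it.
            List<Integer> failed = importLines(sc, line -> {
                if (!line.startsWith("{")) {
                    throw new IllegalArgumentException("not a JSON object: " + line);
                }
            });
            System.out.println("failed lines: " + failed);
        }
    }
}
```

Whether to skip bad lines or stop at the first one depends on how much you trust the data; the sketch only shows the skipping variant.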