Please have a look at the following code:
public void createHash() throws IOException
{
    System.out.println("Hash Creation Started");
    StringBuffer hashIndex = new StringBuffer("");
    AmazonS3 s3 = new AmazonS3Client(new ClasspathPropertiesFileCredentialsProvider());
    Region usWest2 = Region.getRegion(Regions.US_EAST_1);
    s3.setRegion(usWest2);
    strBuffer = new StringBuffer("");
    try
    {
        //List all the Buckets
        List<Bucket> buckets = s3.listBuckets();
        for(int i=0;i<buckets.size();i++)
        {
            System.out.println("- "+(buckets.get(i)).getName());
        }
        //Downloading the Object
        System.out.println("Downloading Object");
        S3Object s3Object = s3.getObject(new GetObjectRequest("JsonBucket", "Articles_4.json"));
        System.out.println("Content-Type: " + s3Object.getObjectMetadata().getContentType());
        //Read the JSON File
        BufferedReader reader = new BufferedReader(new InputStreamReader(s3Object.getObjectContent()));
        while (true) {
            String line = reader.readLine();
            if (line == null) break;
            // System.out.println(" " + line);
            strBuffer.append(line);
        }
        JSONTokener jTokener = new JSONTokener(strBuffer.toString());
        jsonArray = new JSONArray(jTokener);
        System.out.println("Json array length: "+jsonArray.length());
        for(int i=0;i<jsonArray.length();i++)
        {
            JSONObject jsonObject1 = jsonArray.getJSONObject(i);
            //Add Title and Body Together to the list
            String titleAndBodyContainer = jsonObject1.getString("title")+" "+jsonObject1.getString("body");
            //Remove full stops and commas
            titleAndBodyContainer = titleAndBodyContainer.replaceAll("\\.(?=\\s|$)", " ");
            titleAndBodyContainer = titleAndBodyContainer.replaceAll(",", " ");
            titleAndBodyContainer = titleAndBodyContainer.toLowerCase();
            //Create a word list without duplicated words
            StringBuilder result = new StringBuilder();
            HashSet<String> set = new HashSet<String>();
            for(String s : titleAndBodyContainer.split(" ")) {
                if (!set.contains(s)) {
                    result.append(s);
                    result.append(" ");
                    set.add(s);
                }
            }
            //System.out.println(result.toString());
            //Re-Arranging everything into Alphabetic Order
            String testString = "acarus acarpous accession absently missy duckweed settling";
            String testHash = "058 057 05@ 03o dwr 6ug i^&";
            String[] finalWordHolder = (result.toString()).split(" ");
            Arrays.sort(finalWordHolder);
            //Navigate through text and create the Hash
            for(int arrayCount=0;arrayCount<finalWordHolder.length;arrayCount++)
            {
                Iterator iter = completedWordMap.entrySet().iterator();
                while(iter.hasNext())
                {
                    Map.Entry mEntry = (Map.Entry)iter.next();
                    String key = (String)mEntry.getKey();
                    String value = (String)mEntry.getValue();
                    if(finalWordHolder[arrayCount].equals(value))
                    {
                        hashIndex.append(key); //Adding Hash Keys
                        //hashIndex.append(" ");
                    }
                }
            }
            //System.out.println(hashIndex.toString().trim());
            jsonObject1.put("hash_index", hashIndex.toString().trim()); //Add the Hash to the JSON Object
            jsonObject1.put("primary_key", i); //Create the primary key
            jsonObjectHolder.add(jsonObject1); //Add the JSON Object to the JSON collection
            System.out.println("JSON Number: "+i);
        }
        System.out.println("Hash Creation Completed");
    }
    catch(Exception e)
    {
        e.printStackTrace();
    }
}
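I can't run this code on my local machine or on Amazon EC2; I get the following error:
I'm worried because this "test" runs on a 6 MB JSON file, whereas the real files will be terabytes in size. I'm using a Linux instance on EC2, but I'm not a Linux person. How can I get rid of this error?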
Answer 0 (score: 5)
You declare hashIndex outside the loop:
StringBuffer hashIndex = new StringBuffer("");
...
for(int i=0;i<jsonArray.length();i++) {
hashIndex.append(...);
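This means the StringBuffer is never reset: it keeps growing with every entry of the JSON array you iterate over, until it eventually blows up!
I think you meant to declare hashIndex inside the loop.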
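A minimal sketch of that fix, assuming a simplified setup: the class name PerRecordHashDemo, the method buildHashes, and the word-to-key map are hypothetical stand-ins (the question's completedWordMap maps keys to words and is scanned entry by entry). The sample pairs are borrowed from testString and testHash in the question, and pairing them up word by word is itself an assumption. The point it illustrates is only that the buffer is created fresh for every record, so it never accumulates keys from earlier records.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PerRecordHashDemo {

    // wordToKey is a hypothetical inverted form of completedWordMap (word -> hash key),
    // assumed here so that each word needs one lookup instead of a full scan.
    static List<String> buildHashes(List<String[]> records, Map<String, String> wordToKey) {
        List<String> hashes = new ArrayList<>();
        for (String[] sortedWords : records) {
            // Declared inside the loop: a new, empty buffer for each record.
            StringBuilder hashIndex = new StringBuilder();
            for (String word : sortedWords) {
                String key = wordToKey.get(word);
                if (key != null) {
                    hashIndex.append(key);
                }
            }
            hashes.add(hashIndex.toString());
        }
        return hashes;
    }

    public static void main(String[] args) {
        // Illustrative pairs only; the real mapping lives in completedWordMap.
        Map<String, String> wordToKey = new HashMap<>();
        wordToKey.put("acarus", "058");
        wordToKey.put("missy", "dwr");
        wordToKey.put("duckweed", "6ug");

        List<String[]> records = new ArrayList<>();
        records.add(new String[] {"acarus", "missy"});
        records.add(new String[] {"duckweed"});

        // Each record gets its own hash; nothing carries over between records.
        for (String hash : buildHashes(records, wordToKey)) {
            System.out.println(hash);
        }
    }
}
```

With the buffer declared outside the loop, as in the question, the second record's hash would also contain the first record's keys, and the buffer would keep growing with every record processed.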
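Answer 1 (score: 4)
Building a StringBuffer object just to pass it into JSONTokener is a very bad idea. That class has constructors that take a Reader or an InputStream directly, so your code should look like this: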
JSONTokener jTokener = new JSONTokener(new BufferedReader(new InputStreamReader(s3Object.getObjectContent())));
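Answer 2 (score: 0)
Your Java process has run out of heap memory. On a 32-bit system you can raise the heap to at most 4 GB (in practice the usable limit is usually lower); on a 64-bit system you can go higher. If you request more than 4 GB on a 32-bit system, Java will report an invalid value and exit.
Here is how to set the heap to 6 GB on a 64-bit system from the command line: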
java -Xmx6144M -d64
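To check whether the setting actually took effect, a small sketch like the following can help. The class name HeapCheck and the method maxHeapMiB are made up for illustration; Runtime.getRuntime().maxMemory() is the standard call that reports the most memory the JVM will try to use.

```java
public class HeapCheck {

    // Returns the JVM's maximum heap size in MiB, as reported by the runtime.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

Running it as `java -Xmx6144M HeapCheck` should report a value in the region of 6144, though the exact figure may be somewhat lower depending on the JVM and garbage collector. Note that recent JDKs no longer accept the -d64 flag, so it may need to be dropped there.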