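Neo4j: Java heap space error after executing a query with CREATE UNIQUE

Asked: 2016-04-17 12:52:01

Tags: java neo4j cypher heap

I am trying to test some queries on several Neo4j databases holding different amounts of data. When I test the queries on a small amount of data, everything works and the execution time is short. When I run them on a database with 2794 nodes and 94863 relationships, they run for a long time and then I get the following error in the Neo4j API: Java heap space Neo.DatabaseError.General.UnknownFailure

First query: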

    MATCH (u1:User)-[r1:Rated]->(m:Movie)<-[r2:Rated]-(u2:User)
    WITH 1.0*SUM(r1.Rate)/count(r1) as pX,
         1.0*SUM(r2.Rate)/count(r2) as pY, u1, u2
    MATCH (u1:User)-[r1:Rated]->(m:Movie)<-[r2:Rated]-(u2:User)
    WITH SUM((r1.Rate-pX)*(r2.Rate-pY)) as pomProm,
         SQRT(SUM((r1.Rate-pX)^2)) as sumX,
         SQRT(SUM((r2.Rate-pY)^2)) as sumY, pX, pY, u1, u2
    CREATE UNIQUE (u1)-[s:SIMILARITY1]-(u2)
    SET s.value = pomProm / (sumX * sumY)

Second query:

    MATCH (u1:User)-[r1:Rated]->(m:Movie)<-[r2:Rated]-(u2:User)
    WITH SUM(r1.Rate * r2.Rate) AS pomProm,
         SQRT(REDUCE(r1Pom = 0, i IN COLLECT(r1.Rate) | r1Pom + toInt(i^2))) AS r1V,
         SQRT(REDUCE(r2Pom = 0, j IN COLLECT(r2.Rate) | r2Pom + toInt(j^2))) AS r2V,
         u1, u2
    CREATE UNIQUE (u1)-[s:SIMILARITY2]-(u2)
    SET s.value = pomProm / (r1V * r2V)
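
For reference, the first query appears to compute a Pearson correlation and the second a cosine similarity over the movies that two users have both rated. The following is a minimal plain-Java sketch of those two formulas on small in-memory arrays. The class name `SimilaritySketch` and the sample ratings are illustrative assumptions, not part of the original code.

```java
public class SimilaritySketch {

    // Pearson correlation of two equally long rating arrays
    // (one entry per movie rated by both users).
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        // Yields NaN when either user gave the same rating to every shared movie.
        return cov / (Math.sqrt(varX) * Math.sqrt(varY));
    }

    // Cosine similarity of the same two rating arrays.
    static double cosine(double[] x, double[] y) {
        double dot = 0, normX = 0, normY = 0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            normX += x[i] * x[i];
            normY += y[i] * y[i];
        }
        return dot / (Math.sqrt(normX) * Math.sqrt(normY));
    }

    public static void main(String[] args) {
        // Hypothetical ratings of two users for three shared movies.
        double[] u1 = {1, 2, 3};
        double[] u2 = {3, 2, 1};
        System.out.println("pearson = " + pearson(u1, u2));
        System.out.println("cosine  = " + cosine(u1, u2));
    }
}
```

Both metrics need every pair of co-rated movies for every pair of users, which is why the intermediate result of the `MATCH` grows so quickly with the number of ratings.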

The data in the database was generated by the following Java code:

public enum Labels implements Label {
    Movie, User
}

public enum RelationshipLabels implements RelationshipType {
    Rated
}

public static void main(String[] args) throws IOException, BiffException {
    Workbook workbook = Workbook.getWorkbook(new File("C:/Users/User/Desktop/DP/dvdlist.xls"));
    Workbook names = Workbook.getWorkbook(new File("C:/Users/User/Desktop/DP/names.xls"));
    String path = new String("C:/Users/User/Documents/Neo4j/test7.graphDatabase");
    GraphDatabaseFactory dbFactory = new GraphDatabaseFactory();
    GraphDatabaseService db = dbFactory.newEmbeddedDatabase(path);
    int countMovies = 0;
    int numberOfSheets = workbook.getNumberOfSheets();
    IndexDefinition indexDefinition;
    try (Transaction tx = db.beginTx()) {
        Schema schema = db.schema();
        indexDefinition = schema.indexFor(DynamicLabel.label(Labels.Movie.toString()))
                .on("Name")
                .create();
        tx.success();
    }
    try (Transaction tx = db.beginTx()) {
        Schema schema = db.schema();
        indexDefinition = schema.indexFor(DynamicLabel.label(Labels.Movie.toString()))
                .on("Genre")
                .create();
        tx.success();
    }
    try (Transaction tx = db.beginTx()) {
        Schema schema = db.schema();
        indexDefinition = schema.indexFor(DynamicLabel.label(Labels.User.toString()))
                .on("Name")
                .create();
        tx.success();
    }
    try (Transaction tx = db.beginTx()) {

        for (int i = 0; i < numberOfSheets; i++) {
            Sheet sheet = workbook.getSheet(i);
            int numberOfRows = 6000;//sheet.getRows();
            for (int j = 1; j < numberOfRows; j++) {
                Cell cell1 = sheet.getCell(0, j);
                Cell cell2 = sheet.getCell(9, j);
                Node movie = db.createNode(Labels.Movie);
                movie.setProperty("Name", cell1.getContents());
                movie.setProperty("Genre", cell2.getContents());

                countMovies++;

            }

        }
        tx.success();
    } catch (Exception e) {
        System.out.println("Something goes wrong!");
    }

    Random random = new Random();
    int countUsers = 0;
    Sheet sheetNames = names.getSheet(0);
    Cell cell;
    Node user;

    int numberOfUsers = 1500;//sheetNames.getRows();
    for (int i = 0; i < numberOfUsers; i++) {
        cell = sheetNames.getCell(0, i);
        try (Transaction tx = db.beginTx()) {
            user = db.createNode(Labels.User);
            user.setProperty("Name", cell.getContents());
            List<Integer> listForUser = new ArrayList<>();

            for (int x = 0; x < 1000; x++) {
                int j = random.nextInt(countMovies);
                if (!listForUser.isEmpty()) {
                    if (!listForUser.contains(j)) {
                        listForUser.add(j);
                    }
                } else {
                    listForUser.add(j);
                }
            }
            for (int j = 0; j < listForUser.size(); j++) {
                Node movies = db.getNodeById(listForUser.get(j));
                int rate = 0;

                rate = random.nextInt(10) + 1;

                Relationship relationship = user.createRelationshipTo(movies, RelationshipLabels.Rated);
                relationship.setProperty("Rate", rate);

            }
            System.out.println("Number of user: " + countUsers);
            tx.success();
        } catch (Exception e) {
            System.out.println("Something goes wrong!");
        }
        countUsers++;
    }

    workbook.close();
}

}

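Does anyone know how to solve this problem? Is there a workaround for getting query results from a database with a large amount of data, or some improvement to the queries or settings? I would really appreciate it.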

2 answers:

Answer 0: (score: 2)

I had a similar problem (in version 4.1). The relevant properties can be found in conf/neo4j.conf, or you can select the active database -> Manage -> Settings and increase the memory settings there.

For more details on performance, see the documentation.

Answer 1: (score: 0)

You may need to configure the amount of memory available to Neo4j. You can configure the Neo4j server heap size by editing conf/neo4j-wrapper.conf:

wrapper.java.maxmemory=NUMBER_OF_MB_HERE

See this page for details.
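
If the database runs embedded in your own Java process, as in the loader code in the question, the heap is that of the host JVM and is normally set with the standard `-Xmx` JVM option rather than through the server configuration file. A minimal sketch for checking how much heap a JVM actually received follows. The class name `HeapCheck` is an illustrative assumption, and the value it prints describes only the process that runs it, not a separately started Neo4j server.

```java
public class HeapCheck {

    // Maximum heap this JVM will attempt to use, in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMb() + " MB");
    }
}
```

Running it once with and once without an option such as `-Xmx2g` shows whether the setting reached the process.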

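However, looking at your queries (which perform graph-global, all-pairs operations), you may want to consider executing them in batches. For example: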

// Find users with overlapping movie ratings
MATCH (u1:User)-[:Rated]->(:Movie)<-[:Rated]-(u2:User)
// only for users whose similarity has not yet been calculated
WHERE NOT exists((u1)-[:SIMILARITY]-(u2))
// consider only up to 50 pairs of users
WITH u1, u2 LIMIT 50
// compute similarity metric and set SIMILARITY relationship with coef
...

Then execute this query repeatedly until the similarity metric has been computed for all users with overlapping movie ratings.
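
The repeat-until-done loop can be sketched in plain Java as below. This is only an outline under stated assumptions: `runBatch` is a hypothetical callback that would execute the batched query once and return how many user pairs it handled, and the `main` method replaces the database with a simple counter so the example runs on its own.

```java
import java.util.function.IntSupplier;

public class BatchLoopSketch {

    // Calls runBatch until it reports that nothing was left to process,
    // and returns the total number of pairs processed.
    static int runUntilDone(IntSupplier runBatch) {
        int total = 0;
        int processed;
        while ((processed = runBatch.getAsInt()) > 0) {
            total += processed;
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for the database: 120 user pairs still lack a similarity value.
        int[] remaining = {120};
        int batchSize = 50; // mirrors the LIMIT 50 in the query above

        // Hypothetical batch step; a real one would run the Cypher query
        // in its own transaction and return the number of pairs it handled.
        IntSupplier fakeBatch = () -> {
            int n = Math.min(batchSize, remaining[0]);
            remaining[0] -= n;
            return n;
        };

        System.out.println("Pairs processed: " + runUntilDone(fakeBatch));
    }
}
```

Keeping each batch in a separate transaction bounds the amount of state held in memory at once, which is the point of the `LIMIT`.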