I am trying to run the following query to create nodes and relationships from a .csv file I have:
USING PERIODIC COMMIT 1000 LOAD CSV WITH HEADERS FROM 'file:///LoanStats3bEDITED.csv' AS line
//USING PERIODIC COMMIT 1000 makes sure we don't get a memory error
//creating the nodes with their properties
//member node
CREATE (member:Person{member_id:TOINT(line.member_id)})
//Personal information node
CREATE (personalInformation:PersonalInformation{addr_state:line.addr_state})
//recordHistory node
CREATE (recordHistory:RecordHistory{delinq_2yrs:TOFLOAT(line.delinq_2yrs),earliest_cr_line:line.earliest_cr_line,inq_last_6mths:TOFLOAT(line.inq_last_6mths),collections_12_mths_ex_med:TOFLOAT(line.collections_12_mths_ex_med),delinq_amnt:TOFLOAT(line.delinq_amnt),percent_bc_gt_75:TOFLOAT(line.percent_bc_gt_75), pub_rec_bankruptcies:TOFLOAT(line.pub_rec_bankruptcies), tax_liens:TOFLOAT(line.tax_liens)})
//Loan node
CREATE (loan:Loan{funded_amnt:TOFLOAT(line.funded_amnt),term:line.term, int_rate:line.int_rate, installment:TOFLOAT(line.installment),purpose:line.purpose})
//Customer Finances node
CREATE (customerFinances:CustomerFinances{emp_length:line.emp_length,verification_status_joint:line.verification_status_joint,home_ownership:line.home_ownership, annual_inc:TOFLOAT(line.annual_inc), verification_status:line.verification_status,dti:TOFLOAT(line.dti), annual_inc_joint:TOFLOAT(line.annual_inc_joint),dti_joint:TOFLOAT(line.dti_joint)})
//Accounts node
CREATE (accounts:Accounts{revol_util:line.revol_util,tot_cur_bal:TOFLOAT(line.tot_cur_bal)})
//creating the relationships
CREATE UNIQUE (member)-[:FINANCIAL{issue_d:line.issue_d,loan_status:line.loan_status, application_type:line.application_type}]->(loan)
CREATE UNIQUE (customerFinances)<-[:FINANCIAL]-(member)
CREATE UNIQUE (accounts)<-[:FINANCIAL{open_acc:TOINT(line.open_acc),total_acc:TOFLOAT(line.total_acc)}]-(member)
CREATE UNIQUE (personalInformation)<-[:PERSONAL]-(member)
CREATE UNIQUE (recordHistory)<-[:HISTORY]-(member)
However, I keep getting the following error:
Unable to rollback transaction
What does this mean, and how can I fix my query so that it runs successfully?
I am now getting the following error:
GC overhead limit exceeded
Answer 0 (score: 0)
I think you have forgotten a few things. Possible solutions:
Use the neo4j-import batch importer.
Split your query into smaller ones.
Create constraints to speed up the query.
Why do you need CREATE UNIQUE? If your CSV is clean you can use CREATE, or use MERGE instead.
I think it will also be faster if you execute the query in the shell rather than in the browser.
Get more RAM :-)
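As an illustration only, a minimal sketch of the constraint-plus-MERGE approach suggested above, using a trimmed-down subset of the question's properties (the full property lists would carry over the same way):

```cypher
// Create a uniqueness constraint first; this also gives MERGE an index
// to look up Person nodes by member_id (Neo4j 2.x/3.x syntax, matching
// the TOINT/TOFLOAT functions used in the question):
CREATE CONSTRAINT ON (p:Person) ASSERT p.member_id IS UNIQUE;

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///LoanStats3bEDITED.csv' AS line
// MERGE matches an existing Person or creates it, instead of CREATE UNIQUE
MERGE (member:Person {member_id: TOINT(line.member_id)})
CREATE (loan:Loan {funded_amnt: TOFLOAT(line.funded_amnt), term: line.term})
MERGE (member)-[:FINANCIAL {issue_d: line.issue_d}]->(loan);
```

Run the constraint statement once, in a separate transaction, before the import.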
Answer 1 (score: 0)
If you really do need uniqueness for the relationships, replace CREATE UNIQUE with MERGE.

Also, your repeated CREATE UNIQUE operations on member cause Cypher to insert the Eager operator, which materializes the full intermediate result before each of the 3 operations so that the query cannot end up in an endless loop. That is why PERIODIC COMMIT is not taking effect, causing the entire intermediate result to use too much memory.

You can also use the APOC library for batching.
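For example, a sketch of batching with APOC's apoc.periodic.iterate (assumes the APOC plugin is installed; the node properties are trimmed here for brevity, and the outer query streams one batch of rows at a time so no Eager materialization of the whole file is needed):

```cypher
CALL apoc.periodic.iterate(
  // First statement: stream the CSV rows
  "LOAD CSV WITH HEADERS FROM 'file:///LoanStats3bEDITED.csv' AS line RETURN line",
  // Second statement: run per row, committed in batches
  "MERGE (member:Person {member_id: TOINT(line.member_id)})
   CREATE (loan:Loan {funded_amnt: TOFLOAT(line.funded_amnt), term: line.term})
   MERGE (member)-[:FINANCIAL {issue_d: line.issue_d}]->(loan)",
  {batchSize: 1000, parallel: false}
);
```

Keep parallel: false here, since parallel batches that MERGE the same member node can deadlock.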