I am running the preprocessing code from the paper. The Yahoo dataset has 699,640,226 lines. When I run the code, I get this error:
> ```
> 2nd pass training: 359000000
> 2nd pass training: 360000000
> 2nd pass training: 361000000
> Traceback (most recent call last):
>   File "/usit/abel/u1/cnphuong/.local/opt/nomad/Scripts/convert.py", line 80, in <module>
>     train_values.append(float(tokens[2]))
> MemoryError
> ```
> 2. I ran it on servers with 32 GB and 60 GB of RAM, but I get the same error.
>
> ```python
> # now parse the data
> train_user_indices = list()
> train_item_indices = list()
> train_values = list()
> for index, line in enumerate(open(train_filename)):
>     if index % 1000000 == 0:
>         print "2nd pass training:", index
>     tokens = line.split(" ")
>     train_user_indices.append(user_indexer[tokens[0]])
>     train_item_indices.append(item_indexer[tokens[1]])
>     train_values.append(float(tokens[2]))
> ```
Please tell me the best way to load all of the data into the lists, since the authors were able to run it on this file (~11 GB, 699,640,226 lines).
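
One possible direction, offered as a sketch and not as the authors' actual method: in 64-bit CPython each `list` slot is an 8-byte pointer and each parsed rating is a separate `float` object of about 24 bytes, so three lists of roughly 700 million entries need tens of gigabytes before counting list growth and the indexer dictionaries. The standard-library `array` module stores raw C values instead. The function name `load_triples` and the sample data are hypothetical, the code is written for Python 3, and the `"f"` typecode assumes single precision is acceptable for the ratings.

```python
from array import array


def load_triples(lines, user_indexer, item_indexer):
    """Parse 'user item rating' lines into compact typed arrays."""
    user_indices = array("i")  # C int per entry (typically 4 bytes)
    item_indices = array("i")
    values = array("f")        # 4-byte C float; use "d" to keep double precision
    for line in lines:
        tokens = line.split(" ")
        user_indices.append(user_indexer[tokens[0]])
        item_indices.append(item_indexer[tokens[1]])
        values.append(float(tokens[2]))
    return user_indices, item_indices, values


# Tiny hypothetical sample standing in for the real training file.
sample_lines = ["u1 i1 4.0\n", "u2 i1 3.5\n", "u1 i2 5.0\n"]
user_indexer = {"u1": 0, "u2": 1}
item_indexer = {"i1": 0, "i2": 1}

users, items, vals = load_triples(sample_lines, user_indexer, item_indexer)
bytes_per_row = users.itemsize + items.itemsize + vals.itemsize
print(list(users), list(items), list(vals), bytes_per_row)
```

With 4-byte ints and floats this comes to about 12 bytes per row, or roughly 8.4 GB for 699,640,226 rows. For the real file, an open file object can be passed as `lines` so the rows are streamed one at a time.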