Question

I am running a Python script on a Linux server. It is based on the scikit-learn CountVectorizer. Parts of scikit-learn are written in Cython, so C extensions are in use.

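As long as the number of vectors is limited, everything works, but when the number grows, the script fails with a segmentation fault. I think the part of the code that fails is here: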

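from sklearn.feature_extraction.text import CountVectorizer

# tokenize, combine and embeddings are helpers defined elsewhere in the script.
def train(bodies, y_train, analyzetype, ngrammax, table, dim, features):
    vectorizer = CountVectorizer(input='content',
                                 analyzer='char',
                                 tokenizer=tokenize,
                                 ngram_range=(1, 4),
                                 lowercase=False)
    # The count matrix is combined with the embeddings and the extra features.
    X_train = combine(vectorizer.fit_transform(bodies),
                      embeddings(bodies, table, dim),
                      features)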

I have already set the stack size to unlimited using

ulimit -s unlimited

This did not solve the problem.
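A limit set with ulimit applies only to the shell it was typed in and to processes started from that shell, so it may be worth confirming what the Python process actually sees. The following is a minimal sketch using the standard-library resource module (Unix only); the helper name describe_limit is hypothetical and not part of the original script.

```python
import resource

def describe_limit(value):
    """Render an rlimit value, which is either a byte count or RLIM_INFINITY."""
    return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"

# Soft and hard stack limits as seen by this interpreter process.
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
stack_report = f"stack soft={describe_limit(soft)}, hard={describe_limit(hard)}"
print(stack_report)

# Address-space limit; a low value here can also make large allocations fail.
as_soft, as_hard = resource.getrlimit(resource.RLIMIT_AS)
print(f"address space soft={describe_limit(as_soft)}, hard={describe_limit(as_hard)}")
```

If the stack line does not report "unlimited", the script was probably started from a different shell or service than the one where ulimit was run.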

I also tried to trace the problem by displaying all the line numbers as they execute, but unfortunately I could not get that to work.
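For locating the Python line that is running when a C extension crashes, the standard-library faulthandler module is one option. This is a minimal sketch rather than something tried in the original post; the log file name crash_traceback.log is an assumption.

```python
import faulthandler

# Hypothetical log path. Keep the file object open for the life of the process,
# because faulthandler writes to its file descriptor at the moment of the crash.
crash_log = open("crash_traceback.log", "w")

# Dump the Python traceback of every thread on SIGSEGV, SIGFPE, SIGABRT,
# SIGBUS and SIGILL.
faulthandler.enable(file=crash_log, all_threads=True)

# ... call train(...) here. After a crash, the log lists the Python frames
# that were executing when the fault occurred.
```

The same handler can be switched on without editing the script by starting the interpreter with the -X faulthandler option or by setting the PYTHONFAULTHANDLER environment variable. The traceback shows only Python-level frames, so it narrows the fault down to a call such as fit_transform or combine, not to a line inside the C code.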

Segmentation fault in Python while concatenating vectors

0 answers: