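Saving the output of XGBoost's xgb.train to a log file using Python logging

Asked: 2017-10-07 12:12:40

Tags: python logging xgboost

I am trying to save the output of XGBoost's xgb.train to a log file with logging, but I cannot get the output recorded. How can I record it? I tried to follow existing Stack Overflow questions, but could not get it to work. I would appreciate a concrete example.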

import sys
import logging

# ---------------------------------------------- #
# Some logging settings
# ---------------------------------------------- #

import xgboost as xgb

import numpy as np
from sklearn.model_selection import KFold
from sklearn.datasets import load_digits

rng = np.random.RandomState(31337)

print("Zeros and Ones from the Digits dataset: binary classification")
digits = load_digits(n_class=2)  # n_class must be passed by keyword in recent scikit-learn
y = digits['target']
X = digits['data']
kf = KFold(n_splits=2, shuffle=True, random_state=rng)
for train_index, test_index in kf.split(X):

    param = {'max_depth':2, 'eta':0.3, 'silent':1, 'objective':'binary:logistic' }

    dtrain = xgb.DMatrix(X[train_index], y[train_index])
    dtest = xgb.DMatrix(X[test_index], y[test_index])

    # specify validations set to watch performance
    watchlist  = [(dtest,'eval'), (dtrain,'train')]
    num_round = 2
    bst = xgb.train(param, dtrain, num_round, watchlist)

# I want to record this output.
# Zeros and Ones from the Digits dataset: binary classification
# [0]   eval-error:0.011111 train-error:0.011111
# [1]   eval-error:0.011111 train-error:0.005556
# [0]   eval-error:0.016667 train-error:0.005556
# [1]   eval-error:0.005556 train-error:0

2 Answers:

Answer 0 (score: 2)

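xgboost prints its log directly to standard output, and you cannot change that behavior. However, the callbacks argument of xgb.train lets you record the results at the same moment the internal printing happens.

You can use callbacks to send the xgboost log to a logger. log_evaluation() returns a callback function that xgboost calls internally, and you add that callback function to callbacks.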

Answer 1 (score: 0)

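import sys
# %logstart is an IPython magic: run this in IPython or Jupyter, not in plain Python.
%logstart -o "test.log"
# Also send everything written to sys.stdout to test.log.
sys.stdout = open('test.log', 'a')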

import xgboost as xgb

import numpy as np
from sklearn.model_selection import KFold
from sklearn.datasets import load_digits

rng = np.random.RandomState(31337)

print("Zeros and Ones from the Digits dataset: binary classification")
digits = load_digits(n_class=2)  # n_class must be passed by keyword in recent scikit-learn
y = digits['target']
X = digits['data']
kf = KFold(n_splits=2, shuffle=True, random_state=rng)
for train_index, test_index in kf.split(X):

    param = {'max_depth':2, 'eta':0.3, 'silent':1, 'objective':'binary:logistic' }

    dtrain = xgb.DMatrix(X[train_index], y[train_index])
    dtest = xgb.DMatrix(X[test_index], y[test_index])

    # specify validations set to watch performance
    watchlist  = [(dtest,'eval'), (dtrain,'train')]
    num_round = 2
    bst = xgb.train(param, dtrain, num_round, watchlist)

This will start saving everything to the file test.log: the output as well as the input.
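Outside IPython, a similar effect can be sketched with the standard library alone by temporarily redirecting sys.stdout into a logger. The LoggerWriter class, the logger name, and the file name stdout_capture.log are assumptions made for illustration, and the print call stands in for the training call. This only captures text written through Python's sys.stdout; output that a native library writes straight to the process's file descriptor would not be caught.

```python
import contextlib
import logging


class LoggerWriter:
    """Hypothetical file-like object that sends each complete line to a logger."""

    def __init__(self, logger, level=logging.INFO):
        self.logger = logger
        self.level = level
        self._buffer = ""

    def write(self, text):
        # Collect text and emit one log record per finished line.
        self._buffer += text
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            if line:
                self.logger.log(self.level, line)
        return len(text)

    def flush(self):
        # Emit any trailing text that did not end with a newline.
        if self._buffer:
            self.logger.log(self.level, self._buffer)
            self._buffer = ""


logger = logging.getLogger("stdout_capture")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep records out of any root handlers
handler = logging.FileHandler("stdout_capture.log", mode="w")
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
logger.addHandler(handler)

writer = LoggerWriter(logger)
with contextlib.redirect_stdout(writer):
    # Stand-in for xgb.train(...); the line is taken from the question's sample output.
    print("[0]\teval-error:0.011111\ttrain-error:0.011111")
writer.flush()
```

Unlike reassigning sys.stdout permanently, redirect_stdout restores the original stream when the with block ends, so later prints appear on the console again.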