参数调优

参数调优#

(1)pycaret中封装的参数调优的工具, search_library包含

检索库	检索算法	备注
scikit-learn	random， grid
scikit-optimize/skopt	bayesian,	https://scikit-optimize.github.io/stable/
tune-sklearn	random(默认),grid, bayesian, hyperopt,bohb	https://github.com/ray-project/tune-sklearn
optuna	random,tpe(默认)	https://optuna.org/

sklearn#

sklearn.model_selection.GridSearchCV()，输入参数

estimator
param_grid
scoring: 可使用的scoring有 https://scikit-learn.org/stable/modules/model_evaluation.html#scoring-parameter
n_jobs并行设置
cv交叉验证

输出结果

best_estimator_：最好估计的estimator
best_params_

sklearn.model_selection.RandomizedSearchCV() 类似

skopt#

tune-sklearn#

https://github.com/ray-project/tune-sklearn 调用方式上和sklearn一致

optuna#

示例

optuna中有两个概念，一个是study即优化任务，trail根据生成的一次参数组合执行任务。

缺点：使用方式上和sklearn的有些差别。参数需要改成trail.sugget_xxx()的形式。

import optuna

def objective(trial):
    iris = sklearn.datasets.load_iris()

    n_estimators = trial.suggest_int('n_estimators', 2, 20)
    max_depth = int(trial.suggest_float('max_depth', 1, 32, log=True))

    clf = sklearn.ensemble.RandomForestClassifier(
        n_estimators=n_estimators, max_depth=max_depth)

    return sklearn.model_selection.cross_val_score(
        clf, iris.data, iris.target, n_jobs=-1, cv=3).mean()

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)

trial = study.best_trial

print('Accuracy: {}'.format(trial.value))
print("Best hyperparameters: {}".format(trial.params))

其他