Optimizing XGBoost with XGBoost
Typically, hyperparameter tuning is performed with a grid search over the space of hyperparameter values.
However, this space can be large, and training a model for every candidate configuration is time-consuming even with parallelization.
Randomized algorithms for hyperparameter search, such as RandomizedSearchCV and HalvingRandomSearchCV in scikit-learn, can provide satisfactory results at a low computational cost. Nevertheless, they can still miss a better configuration for the model.
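To make the randomized baseline concrete, here is a minimal sketch of RandomizedSearchCV. The search space and scikit-learn's GradientBoostingRegressor (standing in for XGBoost, to avoid an extra dependency) are assumptions for illustration, as is the synthetic dataset:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

# Toy regression data standing in for a real dataset.
X, y = make_regression(n_samples=500, n_features=8, noise=0.1, random_state=0)

# Hypothetical search space; real XGBoost spaces are usually larger.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [2, 3, 4, 5],
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
}

# Only n_iter of the 48 possible combinations are actually trained,
# drawn at random -- cheap, but a better configuration may be missed.
search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=10,
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

Only 10 of the 48 configurations are evaluated here, which is exactly the trade-off a surrogate model aims to improve on.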
Surrogate models enable a non-exhaustive search for optimal hyperparameters by guiding the search toward the most promising candidates.
In this notebook, we train an XGBoost model to predict the model score associated with a hyperparameter configuration. This way, we do not have to run a complete training to obtain the score.
The general algorithm can be summarized as follows: n configurations are drawn at random; the surrogate model estimates the expected gain of each one; the most promising configuration is then actually evaluated on the model to be trained; and the gain obtained is fed back to enrich the surrogate model.
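The loop above can be sketched as follows. This is a toy illustration, not the notebook's implementation: the `true_score` function is a hypothetical stand-in for an actual training run, and scikit-learn's GradientBoostingRegressor stands in for the XGBoost surrogate so the sketch runs without extra dependencies:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

def true_score(config):
    # Hypothetical stand-in for a full training run: a noisy function
    # of (max_depth, learning_rate), maximized near depth=5, lr=0.1.
    depth, lr = config
    return -((depth - 5) ** 2) - 10 * (lr - 0.1) ** 2 + rng.normal(0, 0.01)

def sample_configs(n):
    # Draw n random (max_depth, learning_rate) configurations.
    depths = rng.integers(1, 11, size=n)
    lrs = rng.uniform(0.01, 0.5, size=n)
    return np.column_stack([depths, lrs])

# Seed the surrogate with a few fully evaluated random configurations.
X_obs = sample_configs(5)
y_obs = np.array([true_score(c) for c in X_obs])

for _ in range(15):
    # Fit the surrogate on all (configuration, score) pairs seen so far.
    surrogate = GradientBoostingRegressor(random_state=0).fit(X_obs, y_obs)
    # Draw candidates and let the surrogate estimate their scores.
    candidates = sample_configs(50)
    best = candidates[np.argmax(surrogate.predict(candidates))]
    # Actually evaluate only the most promising one, then enrich the data.
    X_obs = np.vstack([X_obs, best])
    y_obs = np.append(y_obs, true_score(best))

best_config = X_obs[np.argmax(y_obs)]
print("best max_depth=%d, learning_rate=%.3f" % (best_config[0], best_config[1]))
```

Only 20 full evaluations are spent in total, while the surrogate screens 50 candidates per iteration for free.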
As a case study, we apply this surrogate-guided search to the California housing dataset and compare the results.
<Github repo>