
Conversation

@tveasey (Contributor) commented on Feb 17, 2021

Following on from #1733, we can get further speedups by line searching for the best feature bag fraction on data sets where we only need a fraction of the features per tree. For example, training time on Higgs 1M drops from 2585s to 1742s, and we actually get a small improvement in accuracy because our hyperparameter search region is better initialised.

This makes three changes:

  1. Adds a line search for the best initial feature bag fraction to use.
  2. Adds a small linear penalty, at most 1% of the minimum loss, to encourage larger downsample factors and smaller feature bag fractions (a rough sketch follows this list).
  3. Better handles the case where we have many features and relatively few training examples.
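To make the penalty in (2) concrete, here is a minimal C++ sketch, not the actual ml-cpp implementation: `estimateLoss`, the candidate grid, and the penalty scaling are hypothetical stand-ins. It evaluates the loss at a set of candidate feature bag fractions and adds a linear penalty, capped at 1% of the minimum observed loss, so that when two fractions give nearly the same loss the smaller one wins.

```cpp
#include <algorithm>
#include <functional>
#include <iostream>
#include <limits>
#include <vector>

// Hypothetical stand-in for the cross-validated loss at a given feature bag
// fraction; in ml-cpp this would come from actually training on a subsample.
double estimateLoss(double featureBagFraction) {
    // Toy loss surface with a shallow minimum near 0.5.
    double d{featureBagFraction - 0.5};
    return 1.0 + 0.2 * d * d;
}

// Grid-based stand-in for the line search: pick the candidate fraction which
// minimises loss plus a linear penalty capped at 1% of the minimum observed
// loss. The penalty only breaks near-ties, in favour of smaller fractions.
double lineSearchFeatureBagFraction(const std::vector<double>& candidates,
                                    const std::function<double(double)>& loss) {
    std::vector<double> losses;
    losses.reserve(candidates.size());
    for (double fraction : candidates) {
        losses.push_back(loss(fraction));
    }
    double minLoss{*std::min_element(losses.begin(), losses.end())};
    double maxFraction{*std::max_element(candidates.begin(), candidates.end())};

    double bestFraction{candidates[0]};
    double bestPenalisedLoss{std::numeric_limits<double>::max()};
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        // Linear in the fraction and at most 0.01 * minLoss overall.
        double penalty{0.01 * minLoss * candidates[i] / maxFraction};
        if (losses[i] + penalty < bestPenalisedLoss) {
            bestPenalisedLoss = losses[i] + penalty;
            bestFraction = candidates[i];
        }
    }
    return bestFraction;
}

int main() {
    std::vector<double> candidates{0.2, 0.35, 0.5, 0.65, 0.8};
    std::cout << "chosen feature bag fraction = "
              << lineSearchFeatureBagFraction(candidates, estimateLoss) << '\n';
}
```

In the same spirit, the penalty on the downsample factor runs in the opposite direction, nudging the search towards larger values when the loss surface is nearly flat.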

Some of the variable naming in CBoostedTreeFactory is also now misleading; I've split the renaming out into a separate non-functional commit. The functional commit is f9eacc2.

@valeriy42 (Contributor) left a comment:

LGTM

@tveasey (Contributor, Author) commented on Feb 19, 2021

CI on macOS is expected to fail at the moment, so I'll go ahead and merge.

@tveasey merged commit ebadbb5 into elastic:master on Feb 19, 2021
@tveasey deleted the line-search-feature-bag-fraction branch on February 19, 2021 at 10:48
tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request Feb 19, 2021
…on model training (elastic#1746)
