python - Scikit-learn Random Forest out of bag sample
I am trying to access the out-of-bag samples associated with each tree in a RandomForestClassifier, with no luck. I found other information, such as the Gini score and the split feature for each node, by looking here: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_tree.pyx
Does anyone know if it is possible to get the out-of-bag sample related to a tree? If not, maybe it is possible to get the 'in bag' sample (the subset of the dataset used for a specific tree) and then compute the OOB sample from the original data set?
Thanks in advance.
You can figure this out from the source code: look at how the private _set_oob_score method of the random forest works. Every tree estimator in scikit-learn has its own seed for the pseudo-random number generator, stored in the estimator.random_state field.
During the fit procedure every estimator learns on a subset of the training set. The indices of that subset are generated with a PRNG seeded from estimator.random_state, so the same seed reproduces the same subset.
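This should work:

    # Private helper: its module path and signature vary between scikit-learn
    # versions (newer releases use sklearn.ensemble._forest and take an extra
    # n_samples_bootstrap argument).
    from sklearn.ensemble.forest import _generate_unsampled_indices

    # X here is the training set of examples; rf is the fitted forest
    n_samples = X.shape[0]
    for tree in rf.estimators_:
        # At each iteration we obtain the out-of-bag samples for this tree.
        unsampled_indices = _generate_unsampled_indices(
            tree.random_state, n_samples)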
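To make the idea concrete, here is a minimal sketch of why a seed is enough to recover both the 'in bag' and the out-of-bag rows. It is not scikit-learn's code: it uses the standard library's random module rather than NumPy, and the names bootstrap_indices, out_of_bag_indices and tree_seeds are hypothetical, made up for this illustration. The assumption is only that each tree draws n_samples indices with replacement from a generator seeded by its own random_state.

```python
import random


def bootstrap_indices(seed, n_samples):
    """Draw n_samples row indices with replacement, fully determined by seed."""
    rng = random.Random(seed)
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def out_of_bag_indices(seed, n_samples):
    """Return the row indices that the bootstrap draw for this seed never picked."""
    in_bag = set(bootstrap_indices(seed, n_samples))
    return [i for i in range(n_samples) if i not in in_bag]


n_samples = 10
tree_seeds = [0, 1, 2]  # hypothetical stand-ins for each tree's random_state

for seed in tree_seeds:
    in_bag = bootstrap_indices(seed, n_samples)
    oob = out_of_bag_indices(seed, n_samples)
    # Re-running with the same seed gives the same split every time.
    print(seed, sorted(set(in_bag)), oob)
```

Because the draw is replayed from the seed, nothing has to be stored during fitting: the out-of-bag set is simply the complement of the replayed in-bag indices within range(n_samples).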