1 min read · May 17, 2020
Hi! Thanks a lot for the suggestions; I will definitely look into adding them to the article. As for multicollinearity/VIF, we usually don't check for it in machine learning models built for prediction (especially random forests): at each split the algorithm picks the best available feature, so a high VIF doesn't hurt the model's accuracy, and cross-validation measures out-of-sample performance directly in any case. You can read more about it here: https://stats.stackexchange.com/questions/168622/why-is-multicollinearity-not-checked-in-modern-statistics-machine-learning
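To make the "best split" point concrete, here is a rough sketch in plain Python. It is not a random forest, just the single greedy split step that tree-based models repeat, and everything in it (the synthetic data, the names `x1`, `x2`, `noise`, `best_split`) is made up for illustration. It adds a near-copy of a feature, which gives a very high pairwise VIF, and checks that the best achievable split is no worse.

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(X, y):
    """Return (gain, feature_index, threshold) of the best single split."""
    n = len(y)
    parent = gini(y)
    best = (0.0, None, None)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - child
            if gain > best[0]:
                best = (gain, j, t)
    return best

# Synthetic data (assumption): x2 is a noisy copy of x1, noise is irrelevant.
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(200)]
noise = [rng.gauss(0, 1) for _ in range(200)]
x2 = [a + 0.05 * rng.gauss(0, 1) for a in x1]
y = [1 if a > 0 else 0 for a in x1]

# VIF of x2 when regressed on x1 alone: 1 / (1 - r^2).
r = pearson(x1, x2)
vif = 1 / (1 - r ** 2)

# Best split without and with the collinear column.
gain_without, _, _ = best_split([[a, b] for a, b in zip(x1, noise)], y)
gain_with, _, _ = best_split([[a, b, c] for a, b, c in zip(x1, noise, x2)], y)

print("pairwise VIF of x2:", vif)
print("best gain without x2:", gain_without)
print("best gain with x2:", gain_with)
```

Because the split search simply takes the maximum over all candidate features, adding a collinear column can never lower the best impurity reduction. This only speaks to the quality of the split, not to how importance gets shared between the correlated columns.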