Regression
Introduction
In statistical terms, regression is the task of modelling the relationship between one or more independent variables and a dependent variable whose values lie in a continuous numeric domain, so that the dependent variable can be predicted for new observations.
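As a quick illustration, here is a minimal sketch of fitting an ordinary least squares line with scikit-learn. The synthetic data, variable names, and the use of scikit-learn are illustrative assumptions, not part of any particular dataset or method discussed later.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (illustrative): y is roughly 3x + 2 plus Gaussian noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))             # independent variable
y = 3 * X[:, 0] + 2 + rng.normal(0, 1, size=100)  # dependent variable

# Fit a straight line minimising squared error
model = LinearRegression()
model.fit(X, y)

print("slope:", model.coef_[0])        # expected to be close to 3
print("intercept:", model.intercept_)  # expected to be close to 2
```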
Assumptions
For regression to yield valid results on a dataset, the errors (residuals) of the fitted model must satisfy the following assumptions:
- Normality: The errors are assumed to be normally distributed.
- Independence: The errors must be independent of each other.
- Mean and Variance: The errors must have zero mean and constant variance (this property of having a constant variance is also called homoscedasticity).
These assumptions are usually verified with diagnostic tools such as Q-Q plots and the Shapiro-Wilk test, as sketched below.
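As a hedged sketch of these checks in Python (assuming scipy and matplotlib are available; the residuals here are simulated stand-ins for the residuals of an actual fitted model):

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Illustrative residuals; in practice use y - model.predict(X)
rng = np.random.default_rng(1)
residuals = rng.normal(0, 1, size=100)

# Shapiro-Wilk test: the null hypothesis is that the residuals are normal,
# so a small p-value is evidence against the normality assumption
stat, p_value = stats.shapiro(residuals)
print(f"Shapiro-Wilk statistic={stat:.3f}, p-value={p_value:.3f}")

# Q-Q plot: points lying close to the diagonal suggest normality
stats.probplot(residuals, dist="norm", plot=plt)
plt.title("Q-Q plot of residuals")
plt.show()

# Zero-mean check: the sample mean of the residuals should be near 0
print("mean of residuals:", residuals.mean())
```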
This chapter offers an introduction to various kinds of regression and their use cases:
- Linear Regression
- Logistic Regression
- Polynomial Regression
- Stepwise Regression
- Ridge Regression
- Lasso Regression
- ElasticNet Regression
- Support Vector Regression
- Decision Tree Regression
- Random Forest Regression
- Gradient Boosting & AdaBoost
- XGBoost Regression
- Bayesian Linear Regression
- Generalized Linear Model (GLM)
- Poisson Regression
- Negative Binomial Regression
- Cox Regression
- Multivariate Adaptive Regression Splines (MARS)
- Robust Regression
- Principal Components Regression (PCR)
- Partial Least Squares (PLS) Regression
- Tweedie Regression
- Quantile Regression
- Neural Network Regression
- Stochastic Gradient Descent Regression
- k-Nearest Neighbors Regression
- LightGBM Regression
- CatBoost Regression