Robust Linear Models
Robust linear models with support for the M-estimators listed under Norms.
See Module Reference for commands and arguments.
Examples
# Load modules and data
In [1]: import statsmodels.api as sm
In [2]: data = sm.datasets.stackloss.load()
In [3]: data.exog = sm.add_constant(data.exog)
# Fit model and print summary
In [4]: rlm_model = sm.RLM(data.endog, data.exog, M=sm.robust.norms.HuberT())
In [5]: rlm_results = rlm_model.fit()
In [6]: print(rlm_results.params)
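To give a feel for what fitting with Huber's T involves, here is a minimal sketch of iteratively reweighted least squares for a single regressor plus intercept. It is an illustration written in plain Python, not the statsmodels implementation: the helper names (huber_weight, mad_about_zero, weighted_line_fit, irls_huber), the data, the choice of a median absolute residual about zero as the scale, and the stopping rule are all assumptions made for this example.

```python
from statistics import median


def huber_weight(u, t=1.345):
    """Huber weight for a scaled residual u: 1 inside [-t, t], t/|u| outside."""
    return 1.0 if abs(u) <= t else t / abs(u)


def mad_about_zero(resid, c=0.6745):
    """Median absolute residual, rescaled to estimate sigma under normal errors."""
    return median(abs(r) for r in resid) / c


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x (closed form)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def irls_huber(x, y, t=1.345, maxiter=50, tol=1e-8):
    """Alternate between re-estimating the scale, the weights, and the fit."""
    weights = [1.0] * len(x)
    intercept, slope = weighted_line_fit(x, y, weights)  # start from OLS
    for _ in range(maxiter):
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        scale = mad_about_zero(resid)
        if scale == 0.0:  # at least half the points are fit exactly; stop
            break
        weights = [huber_weight(r / scale, t) for r in resid]
        new_intercept, new_slope = weighted_line_fit(x, y, weights)
        change = abs(new_intercept - intercept) + abs(new_slope - slope)
        intercept, slope = new_intercept, new_slope
        if change < tol:
            break
    return intercept, slope, weights


# Hypothetical data: y = 1 + 2x plus small noise, with the last point corrupted.
x = [float(i) for i in range(10)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.0, 0.0]
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, noise)]
y[-1] = 100.0  # gross outlier; the uncorrupted value would be 19.0

ols_intercept, ols_slope = weighted_line_fit(x, y, [1.0] * len(x))
rob_intercept, rob_slope, rob_weights = irls_huber(x, y)
print("OLS slope:     ", ols_slope)
print("Huber slope:   ", rob_slope)
print("outlier weight:", rob_weights[-1])
```

The ordinary least squares slope is pulled strongly toward the corrupted point, while the reweighted fit assigns that point a weight below 1 and stays much closer to the underlying slope of 2.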
Detailed examples can be found here:
Technical Documentation
References
- PJ Huber. ‘Robust Statistics’ John Wiley and Sons, Inc., New York. 1981.
- PJ Huber. 1973, ‘The 1972 Wald Memorial Lectures: Robust Regression: Asymptotics, Conjectures, and Monte Carlo.’ The Annals of Statistics, 1.5, 799-821.
- WN Venables, BD Ripley. ‘Modern Applied Statistics with S’ Springer, New York,
Module Reference
Model Results
RLMResults(model, params, …) | Class to contain RLM results
Norms
AndrewWave([a]) | Andrews’ wave for M-estimation.
Hampel([a, b, c]) | Hampel function for M-estimation.
HuberT([t]) | Huber’s T for M-estimation.
LeastSquares | Least squares rho for M-estimation and its derived functions.
RamsayE([a]) | Ramsay’s Ea for M-estimation.
RobustNorm | The parent class for the norms used for robust regression.
TrimmedMean([c]) | Trimmed mean function for M-estimation.
TukeyBiweight([c]) | Tukey’s biweight function for M-estimation.
estimate_location(a, scale[, norm, axis, …]) | M-estimator of location using self.norm and a current estimator of scale.
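The norms listed above differ mainly in how they weight large scaled residuals. The sketch below writes out the textbook rho, psi, and weight functions for Huber's T and the weight function for Tukey's biweight in plain Python. The function names are hypothetical and are not the library's methods; the tuning constants 1.345 and 4.685 are the values commonly used for these norms and are assumed here rather than taken from this page.

```python
def huber_rho(z, t=1.345):
    """Huber's objective: quadratic near zero, linear in the tails."""
    a = abs(z)
    return 0.5 * z * z if a <= t else t * a - 0.5 * t * t


def huber_psi(z, t=1.345):
    """Derivative of rho: the residual itself, clipped to [-t, t]."""
    if abs(z) <= t:
        return z
    return t if z > 0 else -t


def huber_weights(z, t=1.345):
    """psi(z) / z: full weight inside [-t, t], decaying as t/|z| outside."""
    return 1.0 if abs(z) <= t else t / abs(z)


def tukey_weights(z, c=4.685):
    """Tukey's biweight: smooth decay that reaches exactly zero beyond c."""
    if abs(z) > c:
        return 0.0
    return (1.0 - (z / c) ** 2) ** 2


# Compare how the two norms treat increasingly extreme scaled residuals.
for z in (0.5, 2.0, 5.0, 10.0):
    print(z, huber_weights(z), tukey_weights(z))
```

Huber's weights never reach zero, so every observation keeps some influence, whereas the biweight discards observations whose scaled residual exceeds c.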
Scale
Huber([c, tol, maxiter, norm]) | Huber’s proposal 2 for estimating location and scale jointly.
HuberScale([d, tol, maxiter]) | Huber’s scaling for fitting robust linear models.
mad(a[, c, axis, center]) | The Median Absolute Deviation along given axis of an array
hubers_scale | Huber’s scaling for fitting robust linear models.
stand_mad(a[, c, axis]) |
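As an illustration of why a median-based scale is useful in robust fitting, the following sketch computes a normalized median absolute deviation with the Python standard library. It is a stand-in written for this example, not the library's mad function: the defaults chosen here (centering on the sample median and dividing by the 0.75 normal quantile, about 0.6745) are assumptions.

```python
from statistics import NormalDist, median, stdev


def mad(values, c=None, center=None):
    """Median absolute deviation, divided by c to be comparable to sigma."""
    if c is None:
        c = NormalDist().inv_cdf(0.75)  # about 0.6745
    if center is None:
        center = median(values)
    return median(abs(v - center) for v in values) / c


# Hypothetical sample with one gross outlier.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
print("standard deviation:", stdev(sample))
print("MAD scale estimate:", mad(sample))
```

The single outlier inflates the standard deviation, while the MAD is determined by the middle of the sample and barely moves.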