# Robust Least Squares and Outlier Detection

Fitting a known model robustly to data using Bayesian iteration. The two implementations use

* RANSAC
* M-Estimates

The robust part is implemented, fitting the function is not. Model
fitting is borrowed from **scipy.optimize.minimize**. Feel free to use a different model fitting method.

## Pre-requisites

**numpy** is the only pre-requisite for **robust_lsq.py**.
…such as **scipy.optimize.minimize**. Please see example…

## Setup
Please run **test.py** for an example of fitting a straight line
to data robustly with Bayesian sampling.

## How does it work?
The key idea is to determine the samples that fit the model best.
Bayesian updates are used. Bayes' rule is given by:

    P(data | model) = P(model | data) * P(data) / P(model)

which is applied here as a normalized update:

    P(data | model) := normalize(P(model | data) * P(data))

Note:
1. P(model) is a constant and can be ignored.
1. In the next iteration P(data | model) becomes P(data).
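As a small numeric illustration of this update (all numbers are made up), with per-point likelihoods standing in for P(model | data):

```python
import numpy as np

# Three data points with a uniform prior P(data). The likelihoods are
# made-up numbers standing in for P(model | data) per point; the constant
# P(model) cancels under normalization.
p_data = np.array([1 / 3, 1 / 3, 1 / 3])
likelihood = np.array([0.9, 0.8, 0.05])   # the third point fits the model poorly

p_updated = likelihood * p_data
p_updated /= p_updated.sum()              # normalize; this becomes P(data) next round
```

After normalization the poorly fitting point carries almost no probability mass, which is exactly how the iteration downweights outliers.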

### ALGORITHM
From an implementation perspective, these are the steps:
1. Build P(data) as a uniform distribution (or with prior knowledge) over the data.
1. Sample n samples from the data distribution.
1. Fit the model to the selected n samples.
   Essentially we are selecting (sampling) the best model given the data.
   This is the P(model | data) step.
1. Estimate a probability distribution: P(data | model).
   1. These are the errors of the data given the selected model.
   1. It is wise to use a function such as arctan(1/errors)
      so that errors are not amplified into a useless probability distribution.
1. Compute P(data) with the update: P(data | model) = normalize(P(data | model) * P(data)).
   1. Normalize the probability distribution.
   1. This is the Bayesian update step.
1. Go to step 2 and iterate until the desired convergence of P(data).

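The steps above can be sketched as follows. This is a minimal illustration with numpy only; the function and parameter names are invented for this sketch, and **np.polyfit** stands in for the model-fitting step that the repo delegates to scipy:

```python
import numpy as np

def bayesian_robust_fit(x, y, n=20, iters=30, seed=0):
    """Sketch of the loop above for a straight-line model y = a*x + b."""
    rng = np.random.default_rng(seed)
    p_data = np.full(len(x), 1.0 / len(x))           # step 1: uniform P(data)
    for _ in range(iters):
        idx = rng.choice(len(x), size=n, replace=False, p=p_data)   # step 2
        a, b = np.polyfit(x[idx], y[idx], 1)         # step 3: fit the sampled points
        errors = np.abs(y - (a * x + b))             # step 4: errors under the model
        p_model = np.arctan(1.0 / (errors + 1e-12))  # damp the errors
        p_data = p_model * p_data                    # step 5: Bayesian update
        p_data /= p_data.sum()                       # normalize
    return (a, b), p_data

# Toy data: a straight line plus a few gross outliers.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 100)
y = 2.0 * x + 1.0 + 0.05 * rng.normal(size=100)
y[::17] += 25.0                                      # inject gross outliers
(a, b), p_data = bayesian_robust_fit(x, y)
```

On the toy data the recovered slope and intercept land close to (2, 1), and the injected outliers end up with near-zero probability in `p_data`.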
### RANSAC
For a RANSAC flavor of Bayesian robust fitting, k samples are selected to fit the model.
#### In classical RANSAC:
1. The minimum number of samples (k) needed to fit a model is used.
1. k samples are randomly selected p times.
1. The set of samples whose model best fits all the data is selected.

#### In this Bayesian flavor:
1. k samples are selected and fit using least squares (or anything else).
1. Samples are selected from a probability distribution estimated using Bayesian updates.

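For contrast, a minimal sketch of the classical procedure (names are illustrative; a line needs k = 2 points):

```python
import numpy as np

def classical_ransac_line(x, y, k=2, p=200, tol=1.0, seed=0):
    """Classical RANSAC sketch: fit k minimal samples p times, keep the best consensus."""
    rng = np.random.default_rng(seed)
    best_fit, best_count = None, -1
    for _ in range(p):
        idx = rng.choice(len(x), size=k, replace=False)      # random minimal sample
        a, b = np.polyfit(x[idx], y[idx], 1)                 # line through the k points
        count = int(np.sum(np.abs(y - (a * x + b)) < tol))   # consensus set size
        if count > best_count:
            best_fit, best_count = (a, b), count
    return best_fit, best_count
```

Note the difference from the Bayesian flavor: here the k samples are drawn uniformly every trial, and the trials are independent; nothing learned about the data carries over between iterations.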
### M-Estimates
This is similar to RANSAC except that, when fitting the model, all samples are used
but are weighted according to their probability distribution.
The probability distribution (the weights) is updated using Bayesian updates.

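A minimal sketch of this variant (names are illustrative; **np.polyfit**'s `w` parameter multiplies the residuals, hence the square root of the weights):

```python
import numpy as np

def m_estimate_fit(x, y, iters=30):
    """M-estimate sketch: every point contributes, weighted by P(data)."""
    p_data = np.full(len(x), 1.0 / len(x))       # uniform prior over the data
    for _ in range(iters):
        # polyfit minimizes sum((w * residual)**2), so pass sqrt of the weights.
        a, b = np.polyfit(x, y, 1, w=np.sqrt(p_data))
        errors = np.abs(y - (a * x + b))
        p_data = np.arctan(1.0 / (errors + 1e-12)) * p_data  # Bayesian update
        p_data /= p_data.sum()                               # normalize
    return (a, b), p_data
```

Because every point stays in the fit, there is no sampling step; the outliers are suppressed purely by their shrinking weights.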
### Outlier detection
The probability distribution over the data P(data) provides a way to
perform outlier detection. Simply apply a threshold over this distribution.

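For instance (with made-up probabilities), flagging the points whose probability falls below half the uniform level:

```python
import numpy as np

# A converged P(data) over six points (made-up numbers summing to 1);
# points 2 and 4 received almost no probability mass during fitting.
p_data = np.array([0.24, 0.25, 0.001, 0.26, 0.002, 0.247])

threshold = 0.5 / len(p_data)             # half the uniform probability; an arbitrary choice
outliers = np.where(p_data < threshold)[0]
```

The threshold itself is a free parameter; half the uniform probability is just one reasonable starting point.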
## License