
Commit 90e1c8c

updated readme
1 parent f605031 commit 90e1c8c

File tree

5 files changed: +86 -38 lines


.idea/workspace.xml

+28-30 (generated file; diff not rendered)

models.pyc

0 Bytes (binary file, not shown)

readme.md

+52-4
@@ -1,10 +1,10 @@
-# Robust Least Squares
-Fitting a known model robustly to data. The two implementations use
+# Robust Least Squares and Outlier Detection
+Fitting a known model robustly to data using Bayesian iteration. The two implementations use
 * RANSAC
 * M-Estimates
 
 The robust part is implemented, fitting the function is not. Model
-fitting is borrowed from the scipy.minimize.
+fitting is borrowed from scipy.optimize.minimize. Feel free to use a different model-fitting method.
 
 ## Pre-requisites
 **numpy** is the only pre-requisite for **robust_lsq.py**.
@@ -27,10 +27,58 @@ such as **scipy.optimize.minimize**. Please see example
 
 ## Setup
 Please run **test.py** for an example of fitting a straight line
-to data robustly.
+to data robustly with Bayesian sampling.
 
 ## How does it work?
+The key idea is to determine the samples that fit the model best.
+Bayesian updates are used. Bayes' rule is:
+
+P(data|model) = P(model|data) * P(data) / P(model)
+
+P(data|model) := normalization(P(model|data) * P(data))
+
+Note:
+1. P(model) is a constant and can be ignored.
+1. In the next iteration P(data|model) becomes P(data).
+
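As a concrete illustration of this update (the numbers here are invented for the example), one pass over four data points looks like:

```python
import numpy as np

p_data = np.array([0.25, 0.25, 0.25, 0.25])           # current P(data), uniform to start
p_model_given_data = np.array([0.9, 0.8, 0.7, 0.1])   # how well each point agrees with the fitted model
p_data = p_model_given_data * p_data                   # P(model|data) * P(data)
p_data /= p_data.sum()                                 # normalization -> [0.36, 0.32, 0.28, 0.04]
```

The fourth point, which disagrees with the model, loses probability mass with every such pass.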
+### ALGORITHM
+From an implementation perspective, these are the steps (a sketch in code follows the list):
+1. Build P(data) as a uniform distribution (or one encoding prior knowledge) over the data.
+1. Sample n points from the data distribution.
+1. Fit the model to the selected n points.
+   Essentially we are selecting (sampling) the best model given the data.
+   This is the P(model|data) step.
+1. Estimate a probability distribution P(data|model).
+   1. These are the errors of the data given the selected model.
+   1. It is wise to use a function such as arctan(1/errors)
+      so errors are not amplified and do not produce a useless probability distribution.
+1. Compute P(data) with the update: P(data|model) = normalize(P(model|data) * P(data)).
+   1. Normalize the probability distribution.
+   1. This is the Bayesian update step.
+
+1. Go to step 2 and iterate until the desired convergence of P(data).
+
+
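The sketch below is one minimal numpy realization of these steps; the straight-line model and all names are illustrative and are not the API of robust_lsq.py.

```python
import numpy as np

def fit_line(points):
    # Least-squares line fit to the sampled (x, y) points; any fitter works here.
    return np.polyfit(points[:, 0], points[:, 1], 1)

def line_errors(points, params):
    # Absolute residuals of every point under the current model.
    return np.abs(np.polyval(params, points[:, 0]) - points[:, 1])

def bayesian_robust_fit(X, iterations=100, n_samples=10, seed=0):
    rng = np.random.default_rng(seed)
    probabilities = np.ones(len(X)) / len(X)                 # 1. uniform P(data)
    for _ in range(iterations):
        idx = rng.choice(len(X), size=n_samples,
                         replace=False, p=probabilities)     # 2. sample n points from P(data)
        params = fit_line(X[idx])                            # 3. fit the model: the P(model|data) step
        errors = line_errors(X, params)                      # 4. errors of all data under the model
        likelihood = np.arctan(1.0 / (errors + 1e-12))       #    bounded score, large for small errors
        probabilities *= likelihood                          # 5. Bayesian update ...
        probabilities /= probabilities.sum()                 #    ... and normalization
    return params, probabilities                             # 6. after convergence
```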
 ### RANSAC
+For a RANSAC flavor of Bayesian robust fitting, k samples are selected to fit the model.
+#### In classical RANSAC:
+1. The minimum number of samples (k) needed to fit the model is used.
+1. k samples are randomly selected p times.
+1. The set of samples whose model best fits all the data is selected.
+
+#### In this Bayesian flavor:
+1. k samples are selected and fit using least squares (or anything else).
+1. Samples are drawn from a probability distribution estimated with Bayesian updates.
+
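On the sampling side, the only change from classical RANSAC is where the k indices come from; a small sketch with dummy data and illustrative names:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))               # dummy data, for illustration only
probabilities = np.ones(len(X)) / len(X)    # current P(data) from the Bayesian updates
k = 2                                       # minimum number of samples for a line

# Classical RANSAC: every point is equally likely to be drawn.
classical_idx = rng.choice(len(X), size=k, replace=False)

# Bayesian flavor: draw from P(data), so probable inliers are picked
# more and more often as the updates sharpen the distribution.
bayesian_idx = rng.choice(len(X), size=k, replace=False, p=probabilities)
```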
+### M-Estimates
+This is similar to RANSAC except that, when fitting the model, all samples are used
+but are weighted according to their probability distribution.
+The probability distribution (the weights) is updated using Bayesian updates.
+
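For the weighting itself, any fitter that accepts per-sample weights will do; for example np.polyfit (toy data, not the repository's fitting code):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)   # toy line data
probabilities = np.ones(x.size) / x.size                  # current P(data), used as weights

# Every sample takes part in the fit, weighted by its probability.
params = np.polyfit(x, y, 1, w=probabilities)
```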
+### Outlier detection
+The probability distribution over the data, P(data), provides a way to
+perform outlier detection: simply apply a threshold to this distribution.
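For instance (the threshold value here is arbitrary and data dependent):

```python
import numpy as np

probabilities = np.ones(100) / 100        # P(data) after the iterations above
threshold = 0.5 / len(probabilities)      # e.g. half of the uniform mass
outliers = probabilities < threshold      # boolean mask of suspected outliers
inlier_idx = np.flatnonzero(~outliers)
```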
 
 
 ## license

robust_lsq.py

+6-4
@@ -7,7 +7,7 @@
 import numpy as np
 
 def robust_lsq_ransac(model_error_func, model_fit_func, X,
-                      iterations=1000, fit_samples=2, fit_with_best_n=None, priors=None):
+                      iterations=1000, fit_samples=2, fit_with_best_n=None, priors=None, norm_func=np.arctan):
 
     if priors == None:
         probabilities = np.ones(len(X))
@@ -25,7 +25,7 @@ def robust_lsq_ransac(model_error_func, model_fit_func, X,
         params = model_fit_func(X_subset)
         errors = model_error_func(X, params)
 
-        current_prob[:] = 1 / np.arctan(1 + errors[:])
+        current_prob[:] = 1 / norm_func(1 + errors[:])
         probabilities *= current_prob
         probabilities /= np.sum(probabilities)
 
@@ -38,7 +38,7 @@ def robust_lsq_ransac(model_error_func, model_fit_func, X,
 
 
 def robust_lsq_m_estimates(model_error_func, model_fit_func, X,
-                           iterations=1000, priors=None):
+                           iterations=1000, priors=None, norm_func=lambda x: 1 / (1 + x ** 0.1)):
     if priors == None:
         probabilities = np.ones(len(X))
     else:
@@ -54,7 +54,9 @@ def robust_lsq_m_estimates(model_error_func, model_fit_func, X,
         if np.sum(errors) < best_errors:
             best_param = params
             best_errors = np.sum(errors)
-        current_prob[:] = 1 / (1 + errors[:] ** 0.1)
+        # current_prob[:] = 1 / (1 + errors[:] ** 0.1)
+
+        current_prob[:] = norm_func(errors[:])
         # current_prob = 1/np.arctan(1+errors)
         probabilities *= current_prob
         probabilities /= np.sum(probabilities)
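A hedged usage sketch of the new norm_func keyword: the line model, the data, and np.log1p as an alternative scoring function are all made up for illustration; the call signature matches the diff above, but the return value is not part of this commit, so it is left unpacked.

```python
import numpy as np
from robust_lsq import robust_lsq_ransac

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 100)
y = 3.0 * x - 2.0 + rng.normal(scale=0.5, size=x.size)
y[::10] += 20.0                                   # inject a few gross outliers
X = np.column_stack([x, y])

def line_fit(points):
    # model_fit_func: fit a line to the sampled subset and return its parameters.
    return np.polyfit(points[:, 0], points[:, 1], 1)

def line_error(points, params):
    # model_error_func: per-sample residuals over the whole data set.
    return np.abs(np.polyval(params, points[:, 0]) - points[:, 1])

# norm_func maps (1 + error) to a score that is inverted inside the loop;
# np.arctan is the new default, np.log1p is a slower-growing alternative.
result = robust_lsq_ransac(line_error, line_fit, X,
                           iterations=200, fit_samples=2,
                           norm_func=np.log1p)
# What robust_lsq_ransac returns is defined in robust_lsq.py, not in this diff.
```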

robust_lsq.pyc

227 Bytes (binary file, not shown)
