.. C-Munipack - User's manual

   Copyright 2012 David Motl
   
   Permission is granted to copy, distribute and/or modify this document under the 
   terms of the GNU Free Documentation License, Version 1.2 or any later version published 
   by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and 
   no Back-Cover Texts.

   $Id: robust_mean.rst,v 1.2 2015/07/12 07:44:57 dmotl Exp $

.. index::
   pair: robust; mean

.. _robust-mean: 

Robust mean
===========

The computation of robust mean is based on a re-descending *\Psi* type 
M-estimator, that uses Hampel's influence function, see [hampel86]_ and [huber81]_:

.. math::
   :label: hampel

   \Psi(x) =
	\begin{cases}
	x 													& \textrm{if } 0 \leq |x| \leq a \\
	a \operatorname{sign}(x)							& \textrm{if } a \leq |x| \leq b \\
	\frac{a(c-|x|)}{(c-b)}\operatorname{sign}(x)		& \textrm{if } b \leq |x| \leq c \\
	0													& \textrm{if } c \leq |x| \\
	\end{cases}

where *a = 1.7*, *b = 3.4* and *c = 8.5*. The function :math:`\Psi(x)` and its first derivative
:math:`\Psi'(x)` is plotted on Figure :ref:`fig:hampel`. 
                         
.. _fig:hampel:
                     
.. figure:: images/hampel.png
   :align: center
   :alt: Hampel's influence function

   Hampel's influence function. (a) influence function :math:`\Psi`; (b) first derivative :math:`\Psi'`

The starting point for iterations is determined by median as an estimate of location
and median of absolute deviations (MAD) for an estimation of scale *s*. 

.. math::
   :label: MAD

   s = MAD(X)/0.6745


The optimum solution is found by means of the Newton-Raphson optimizing algorithm. 
In iteration $j$ a new estimation of location :math:`\mu_{j}` is determined using the following 
formula:

.. math::
   :label: mu_j

   \mu_{j} = \mu_{j-1} + s\,\frac{\sum \Psi(r_i)}{\sum \Psi'(r_i)}

where :math:`r_i` is a residual

.. math::
   :label: r_i
   
   r_i = \frac{x_i - \mu}{s}

If the input vector has normal distribution, the estimation of variance is:

.. math::
   :label: sigma_2

   \sigma^2 = \frac{n}{n-1}\,n\,\frac{\sum\Psi^2(r_i)}{(\sum\Psi'(r_i))^2}

where *n* is the number of samples. The term :math:`\frac{n}{n-1}` is Bessel's correction.

The iterations are stopped when one of the following conditions is satisfied:

* The number of iterations exceed 100 or
* the absolute difference :math:`|\mu_{j} - \mu_{j-1}|` is less than :math:`10^{-4}\sigma`, where :math:`\sigma` is 
  an square root of variance :math:`\sigma^2` computed at each step or
* the absolute difference :math:`|\mu_{j} - \mu_{j-1}|` is less than :math:`10^{-7}`.
