Commit b36997e1 authored by davidkep's avatar davidkep

split up documentation

parent 407e2b26
@@ -87,12 +87,13 @@ highlight_language = 'c++'
html_theme_options = {
'fixed_sidebar': True,
'sidebar_includehidden': True,
'sidebar_collapse': True,
'page_width': '80%'
}
html_sidebars = {
'**': [
'about.html',
'localtoc.html',
# 'localtoc.html',
'navigation.html',
'relations.html',
'searchbox.html'
:tocdepth: 20
#############################
API Documentation for nsoptim
#############################
@@ -12,167 +14,8 @@ The argument :math:`x` is the *coefficient*, and the result of any optimization
The library makes extensive use of templating to avoid dynamic polymorphism and improve runtime performance.
*********
Optimizer
*********
An *Optimizer* is the most important entity in *nsoptim*, as it computes the optimum.
Different optimizers can handle different functions and employ different methods, but every *Optimizer* implements the same basic public interface.
General Optimizer interface
===========================
``nsoptim::Optimizer``
^^^^^^^^^^^^^^^^^^^^^^
.. doxygenclass:: nsoptim::Optimizer
.. cpp:type:: nsoptim::Optimizer::LossFunction = T
The loss function type.
.. cpp:type:: nsoptim::Optimizer::PenaltyFunction = U
The penalty function type.
.. cpp:type:: nsoptim::Optimizer::Coefficients = V
The coefficients type.
MM Optimizer
============
The MM (minimization by majorization) algorithm is a meta-algorithm that can optimize a very general class of objective functions.
The algorithm works by successively minimizing a *surrogate* function which majorizes the true objective function at the current point :math:`x`.
A function :math:`h` majorizes function :math:`f` at :math:`x` if
* :math:`h` is greater than :math:`f` everywhere, i.e., :math:`h(x') \geq f(x')` for all :math:`x'` in the domain, and
* the functions coincide at :math:`x`, i.e., :math:`h(x) = f(x)`.
The MM optimizer can be used if the loss and/or penalty function implements a method that returns a convex surrogate, as described below.
Convex Surrogate
^^^^^^^^^^^^^^^^
.. cpp:function:: template <typename Coefficients> typename LossFunction::ConvexSurrogateType LossFunction::ConvexSurrogate(const Coefficients& where)
Return a convex surrogate loss function which majorizes the true loss function at `where`.
Penalty functions can provide a similar member to return a convex surrogate.
The MM optimizer has several configuration parameters that are set on construction by supplying a :cpp:class:`nsoptim::MMConfiguration` object.
``nsoptim::MMConfiguration``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. doxygenstruct:: nsoptim::MMConfiguration
:members:
``nsoptim::MMOptimizer``
^^^^^^^^^^^^^^^^^^^^^^^^
.. doxygenclass:: nsoptim::MMOptimizer
:members:
Linearized Alternating Direction Method of Multipliers (ADMM) Optimizer
=======================================================================
* Supported loss functions: :cpp:class:`LsLoss`, :cpp:class:`WeightedLsLoss`
* Supported penalty functions: :cpp:class:`EnPenalty`, :cpp:class:`AdaptiveEnPenalty`
Linearized ADMM works for objective functions that can be written as :math:`l(A x) + p(x)` and solves the problem
.. math::

   \operatorname*{arg\,min}_{z, x}\, l(z) + p(x)
   \quad\quad
   \text{subject to }\; A x - z = 0
Especially if :math:`A` is "wide" (i.e., has more columns than rows), the proximal operator for this problem is usually much quicker to compute than for :math:`\tilde l (x) = l(A x)`.
More information on the properties of the algorithm can be found in :ref:`[1] <ref-attouch-2010>`.
The linearized ADMM algorithm requires a proper implementation of the proximal operator that can handle the given loss and penalty functions.
A proximal operator must implement the following interface:
Proximal Operator
^^^^^^^^^^^^^^^^^
.. cpp:class:: ProxOp
.. cpp:function:: void loss(const LossFunction& loss)
Change the loss to `loss`.
.. cpp:function:: arma::vec operator()(const arma::vec& u, const arma::vec& prev, const double intercept, \
const double lambda, nsoptim::Metrics* metrics)
Get the value of the proximal operator of the function scaled by `lambda`, evaluated at `u`.
The argument `prev` is the previous value returned by the proximal operator, and `intercept` is the current
value of the intercept, or 0 if the loss does not use an intercept term.
.. cpp:function:: double ComputeIntercept(const arma::vec& fitted)
Compute the intercept term, given the fitted values.
.. cpp:function:: double StepSize(const PenaltyFunction& penalty, const double norm_x)
Get the step size required for the currently set loss and the given penalty.
``nsoptim::AdmmLinearConfiguration``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. doxygenstruct:: nsoptim::AdmmLinearConfiguration
:members:
``nsoptim::GenericLinearizedAdmmOptimizer``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. doxygenclass:: nsoptim::GenericLinearizedAdmmOptimizer
:members:
.. _optimizer-admm-varstep:
Alternating Direction Method of Multipliers (ADMM) Optimizer with Variable Step-Size
=====================================================================================
* Supported loss functions: :cpp:class:`LsLoss`, :cpp:class:`WeightedLsLoss`
* Supported penalty functions: :cpp:class:`EnPenalty`, :cpp:class:`AdaptiveEnPenalty`
This implementation operates directly on the objective function :math:`l(x) + p(x)`, but adjusts the step size
according to :ref:`[2] <ref-bartels-2017>`.
``nsoptim::AdmmVarStepConfiguration``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. doxygenstruct:: nsoptim::AdmmVarStepConfiguration
:members:
``nsoptim::AdmmVarStepOptimizer``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. doxygenclass:: nsoptim::AdmmVarStepOptimizer
:members:
Dual Augmented Lagrangian (DAL)
===============================
This optimizer implements the dual augmented Lagrangian algorithm according to :ref:`[3] <ref-tomioka-2011>`.
It supports only sparse coefficients.
``nsoptim::DalEnConfiguration``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. doxygenstruct:: nsoptim::DalEnConfiguration
:members:
``nsoptim::DalEnOptimizer``
^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. doxygenclass:: nsoptim::DalEnOptimizer
:members:
**********
References
**********
.. _ref-attouch-2010:
[1] Attouch, H., Bolte, J., Redont, P., and Soubeyran, A. (2010). Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality. *Mathematics of Operations Research*, 35(2):438–457.
.. _ref-bartels-2017:
[2] Bartels, S. and Milicevic, M. (2017). *Alternating direction method of multipliers with variable step sizes*. `arXiv:1704.06069 <https://arxiv.org/abs/1704.06069>`_.
.. _ref-tomioka-2011:

[3] Tomioka, R., Suzuki, T., and Sugiyama, M. (2011). Super-linear convergence of dual augmented Lagrangian algorithm for sparsity regularized estimation. *Journal of Machine Learning Research*, 12:1537–1586.

.. toctree::
   :maxdepth: 4

   overview
   optimizer
#########
Optimizer
#########
An *Optimizer* is the most important entity in *nsoptim*, as it computes the optimum.
Different optimizers can handle different functions and employ different methods, but every *Optimizer* implements the same basic public interface.
***************************
General Optimizer interface
***************************
``nsoptim::Optimizer``
======================
.. doxygenclass:: nsoptim::Optimizer
.. cpp:type:: nsoptim::Optimizer::LossFunction = T
The loss function type.
.. cpp:type:: nsoptim::Optimizer::PenaltyFunction = U
The penalty function type.
.. cpp:type:: nsoptim::Optimizer::Coefficients = V
The coefficients type.
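Generic code can rely on these three type aliases to interoperate with any optimizer. A minimal sketch (the ``OptimizerTraits`` helper is illustrative, not part of the library):

.. code-block:: cpp

   // Illustrative helper, not part of nsoptim: extract the associated
   // types from any class modelling the Optimizer interface.
   template <typename Optimizer>
   struct OptimizerTraits {
     using LossFunction = typename Optimizer::LossFunction;
     using PenaltyFunction = typename Optimizer::PenaltyFunction;
     using Coefficients = typename Optimizer::Coefficients;
   };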
************
MM Optimizer
************
The MM (minimization by majorization) algorithm is a meta-algorithm that can optimize a very general class of objective functions.
The algorithm works by successively minimizing a *surrogate* function which majorizes the true objective function at the current point :math:`x`.
A function :math:`h` majorizes function :math:`f` at :math:`x` if
* :math:`h` is greater than :math:`f` everywhere, i.e., :math:`h(x') \geq f(x')` for all :math:`x'` in the domain, and
* the functions coincide at :math:`x`, i.e., :math:`h(x) = f(x)`.
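For example, for fixed :math:`x \neq 0` the quadratic function

.. math::

   h(x') = \frac{(x')^2}{2 |x|} + \frac{|x|}{2}

majorizes :math:`f(x') = |x'|` at :math:`x`: the difference :math:`h(x') - |x'| = \left( |x'| - |x| \right)^2 / (2 |x|)` is non-negative everywhere and zero at :math:`x' = x`.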
The MM optimizer can be used if the loss and/or penalty function implements a method that returns a convex surrogate, as described below.
Convex Surrogate
================
.. cpp:function:: template <typename Coefficients> typename LossFunction::ConvexSurrogateType LossFunction::ConvexSurrogate(const Coefficients& where)
Return a convex surrogate loss function which majorizes the true loss function at `where`.
Penalty functions can provide a similar member to return a convex surrogate.
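As a self-contained illustration (these are not nsoptim classes), a scalar absolute-value loss could expose the quadratic majorizer from the example above:

.. code-block:: cpp

   #include <cmath>

   // Illustrative only -- not an nsoptim class. The surrogate
   // h(x') = 0.5 * weight * x'^2 + offset is convex in x'.
   struct QuadraticSurrogate {
     double weight;  // curvature of the majorizer
     double offset;  // constant making h touch the loss at `where`
     double operator()(double x) const { return 0.5 * weight * x * x + offset; }
   };

   // Scalar |x| loss exposing h(x') = x'^2 / (2 |where|) + |where| / 2.
   struct AbsoluteLoss {
     using ConvexSurrogateType = QuadraticSurrogate;
     double operator()(double x) const { return std::abs(x); }
     ConvexSurrogateType ConvexSurrogate(double where) const {
       // Assumes where != 0; the majorizer is undefined at 0.
       return {1.0 / std::abs(where), std::abs(where) / 2};
     }
   };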
The MM optimizer has several configuration parameters that are set on construction by supplying a :cpp:class:`nsoptim::MMConfiguration` object.
``nsoptim::MMConfiguration``
============================
.. doxygenstruct:: nsoptim::MMConfiguration
:members:
``nsoptim::MMOptimizer``
========================
.. doxygenclass:: nsoptim::MMOptimizer
:members:
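Schematically, the MM iteration then simply alternates between building the convex surrogate and minimizing it. The following loop is a sketch of the principle only, not of nsoptim's actual implementation:

.. code-block:: cpp

   #include <cmath>

   // Sketch of the MM principle; `solve` is any routine that minimizes
   // the convex surrogate returned by the loss.
   template <typename Loss, typename Coefs, typename Solver>
   Coefs MinimizeByMajorization(const Loss& loss, Coefs x, Solver solve,
                                const int max_it, const double eps) {
     for (int it = 0; it < max_it; ++it) {
       const auto surrogate = loss.ConvexSurrogate(x);
       const Coefs x_new = solve(surrogate);
       // Descent is guaranteed because
       // loss(x_new) <= surrogate(x_new) <= surrogate(x) = loss(x).
       if (std::abs(loss(x) - loss(x_new)) < eps) {
         return x_new;
       }
       x = x_new;
     }
     return x;
   }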
***********************************************************************
Linearized Alternating Direction Method of Multipliers (ADMM) Optimizer
***********************************************************************
* Supported loss functions: :cpp:class:`LsLoss`, :cpp:class:`WeightedLsLoss`
* Supported penalty functions: :cpp:class:`EnPenalty`, :cpp:class:`AdaptiveEnPenalty`
Linearized ADMM works for objective functions that can be written as :math:`l(A x) + p(x)` and solves the problem
.. math::

   \operatorname*{arg\,min}_{z, x}\, l(z) + p(x)
   \quad\quad
   \text{subject to }\; A x - z = 0
Especially if :math:`A` is "wide" (i.e., has more columns than rows), the proximal operator for this problem is usually much quicker to compute than for :math:`\tilde l (x) = l(A x)`.
More information on the properties of the algorithm can be found in :ref:`[1] <ref-attouch-2010>`.
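For orientation, a textbook form of the linearized ADMM iterates for this split, with step sizes :math:`0 < \tau \leq \mu / \lVert A \rVert_2^2` (the exact updates used by this implementation may differ), is

.. math::

   x^{k+1} &= \operatorname{prox}_{\tau p}\left( x^k - \frac{\tau}{\mu} A^\top (A x^k - z^k + u^k) \right) \\
   z^{k+1} &= \operatorname{prox}_{\mu l}\left( A x^{k+1} + u^k \right) \\
   u^{k+1} &= u^k + A x^{k+1} - z^{k+1}

where :math:`u` is the scaled dual variable and :math:`\operatorname{prox}_{\tau p}(v) = \operatorname*{arg\,min}_x\, p(x) + \frac{1}{2 \tau} \lVert x - v \rVert_2^2` is the proximal operator.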
The linearized ADMM algorithm requires a proper implementation of the proximal operator that can handle the given loss and penalty functions.
A proximal operator must implement the following interface:
Proximal Operator
=================
.. cpp:class:: ProxOp
.. cpp:function:: void loss(const LossFunction& loss)
Change the loss to `loss`.
.. cpp:function:: arma::vec operator()(const arma::vec& u, const arma::vec& prev, const double intercept, \
const double lambda, nsoptim::Metrics* metrics)
Get the value of the proximal operator of the function scaled by `lambda`, evaluated at `u`.
The argument `prev` is the previous value returned by the proximal operator, and `intercept` is the current
value of the intercept, or 0 if the loss does not use an intercept term.
.. cpp:function:: double ComputeIntercept(const arma::vec& fitted)
Compute the intercept term, given the fitted values.
.. cpp:function:: double StepSize(const PenaltyFunction& penalty, const double norm_x)
Get the step size required for the currently set loss and the given penalty.
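A minimal sketch of this interface for the plain LS loss :math:`l(z) = \frac{1}{2n} \lVert y - z \rVert_2^2` is given below. It is illustrative only (the ``loss`` and ``StepSize`` members are omitted for brevity), and the operators shipped with nsoptim may differ:

.. code-block:: cpp

   #include <armadillo>

   // Illustrative proximal operator for l(z) = ||y - z||^2 / (2 n);
   // not one of the operators shipped with nsoptim.
   // nsoptim::Metrics is taken from the library's headers.
   class LsProxOp {
    public:
     explicit LsProxOp(const arma::vec& y) : y_(y) {}

     // prox_{lambda l}(u) = argmin_z l(z) + ||z - u||^2 / (2 lambda),
     // which for this loss is (lambda (y - intercept) + n u) / (n + lambda).
     arma::vec operator()(const arma::vec& u, const arma::vec& /* prev */,
                          const double intercept, const double lambda,
                          nsoptim::Metrics* /* metrics */) const {
       const double n = static_cast<double>(y_.n_elem);
       return (lambda * (y_ - intercept) + n * u) / (n + lambda);
     }

     // For the LS loss, the optimal intercept is the mean residual.
     double ComputeIntercept(const arma::vec& fitted) const {
       return arma::mean(y_ - fitted);
     }

    private:
     arma::vec y_;
   };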
``nsoptim::AdmmLinearConfiguration``
====================================
.. doxygenstruct:: nsoptim::AdmmLinearConfiguration
:members:
``nsoptim::GenericLinearizedAdmmOptimizer``
===========================================
.. doxygenclass:: nsoptim::GenericLinearizedAdmmOptimizer
:members:
.. _optimizer-admm-varstep:
*************************************************************************************
Alternating Direction Method of Multipliers (ADMM) Optimizer with Variable Step-Size
*************************************************************************************
* Supported loss functions: :cpp:class:`LsLoss`, :cpp:class:`WeightedLsLoss`
* Supported penalty functions: :cpp:class:`EnPenalty`, :cpp:class:`AdaptiveEnPenalty`
This implementation operates directly on the objective function :math:`l(x) + p(x)`, but adjusts the step size
according to :ref:`[2] <ref-bartels-2017>`.
``nsoptim::AdmmVarStepConfiguration``
=====================================
.. doxygenstruct:: nsoptim::AdmmVarStepConfiguration
:members:
``nsoptim::AdmmVarStepOptimizer``
=================================
.. doxygenclass:: nsoptim::AdmmVarStepOptimizer
:members:
*******************************
Dual Augmented Lagrangian (DAL)
*******************************
This optimizer implements the dual augmented Lagrangian algorithm according to :ref:`[3] <ref-tomioka-2011>`.
It supports only sparse coefficients.
``nsoptim::DalEnConfiguration``
===============================
.. doxygenstruct:: nsoptim::DalEnConfiguration
:members:
``nsoptim::DalEnOptimizer``
===========================
.. doxygenclass:: nsoptim::DalEnOptimizer
:members:
**********
References
**********
.. _ref-attouch-2010:
[1] Attouch, H., Bolte, J., Redont, P., and Soubeyran, A. (2010). Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality. *Mathematics of Operations Research*, 35(2):438–457.
.. _ref-bartels-2017:
[2] Bartels, S. and Milicevic, M. (2017). *Alternating direction method of multipliers with variable step sizes*. `arXiv:1704.06069 <https://arxiv.org/abs/1704.06069>`_.
.. _ref-tomioka-2011:
[3] Tomioka, R., Suzuki, T., and Sugiyama, M. (2011). Super-linear convergence of dual augmented lagrangian algorithm for sparsity regularized estimation. *Journal of Machine Learning Research*, 12:1537–1586.
########
Overview
########
Algorithms and utilities to optimize non-smooth functions, i.e., to solve
.. math::

   \operatorname*{arg\,min}_{x} l(x) + p(x)
where :math:`l(x)` is a smooth function, called the *loss function*, and :math:`p(x)` is a non-smooth function, called the *penalty function*.
The argument :math:`x` is the *coefficient*, and the result of any optimization is the *optimum*.
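A canonical instance is the elastic net problem, where the loss is least squares and the penalty combines the :math:`\ell_1` and squared :math:`\ell_2` norms (the exact parametrization used by the library's elastic net penalty may differ):

.. math::

   l(x) = \frac{1}{2 n} \lVert y - X x \rVert_2^2,
   \qquad
   p(x) = \lambda \left( \alpha \lVert x \rVert_1 + \frac{1 - \alpha}{2} \lVert x \rVert_2^2 \right)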
The library makes extensive use of templating to avoid dynamic polymorphism and improve runtime performance.