### Merge branch 'feature/documentation'

* feature/documentation:
clean up index and overview
split up documentation
Update theme
fix typo in requirements file for readthedocs
add built documentation
parents 886c63bb 0c6924ff
inst/doc/Doxyfile 0 → 100644
inst/doc/conf.py 0 → 100644
:tocdepth: 20

#############################
API Documentation for nsoptim
#############################

nsoptim is a C++ template library for non-smooth optimization, building upon the armadillo library for linear algebra.

.. toctree::
   :maxdepth: 4

   overview
   optimizer
#########
Optimizer
#########

An *Optimizer* is the most important entity in *nsoptim*, as it computes the optimum. Different optimizers can handle different functions and have different methods, but every *Optimizer* implements the basic public interface for optimizers.

***************************
General Optimizer interface
***************************

nsoptim::Optimizer
==================

.. doxygenclass:: nsoptim::Optimizer

.. cpp:type:: nsoptim::Optimizer::LossFunction = T

   The loss function type.

.. cpp:type:: nsoptim::Optimizer::PenaltyFunction = U

   The penalty function type.

.. cpp:type:: nsoptim::Optimizer::Coefficients = V

   The coefficients type.

************
MM Optimizer
************

The MM (minimization by majorization) algorithm is a meta-algorithm that can optimize a very general class of objective functions. The algorithm works by successively minimizing a *surrogate* function which majorizes the true objective function at the current point :math:`x`. A function :math:`h` majorizes the function :math:`f` at :math:`x` if

* :math:`h` is greater than :math:`f` everywhere, i.e., :math:`h(x') \geq f(x')` for all :math:`x'` in the domain, and
* the functions coincide at :math:`x`, i.e., :math:`h(x) = f(x)`.

The MM optimizer can be used whenever the loss and/or penalty function implement a method to get a convex surrogate.

Convex Surrogate
================

.. cpp:function:: template <typename Coefficients> typename LossFunction::ConvexSurrogateType LossFunction::ConvexSurrogate(const Coefficients& where)

   Return a convex surrogate loss function which majorizes the true loss function at *where*.

Penalty functions can provide a similar member to return a convex surrogate.

The MM optimizer has several configuration parameters that are set on construction by supplying a :cpp:class:`nsoptim::MMConfiguration` object.

nsoptim::MMConfiguration
========================

.. doxygenstruct:: nsoptim::MMConfiguration
   :members:

nsoptim::MMOptimizer
====================

.. doxygenclass:: nsoptim::MMOptimizer
   :members:

**************************************************************************
Linearized Alternating Direction Method of Multipliers (ADMM) Optimizer
**************************************************************************

* Supported loss functions: :cpp:class:`LsLoss`, :cpp:class:`WeightedLsLoss`
* Supported penalty functions: :cpp:class:`EnPenalty`, :cpp:class:`AdaptiveEnPenalty`

Linearized ADMM works for objective functions that can be written as :math:`l(A x) + p(x)` and solves the problem

.. math::

   \operatorname*{arg\,min}_{z, x}\, l(z) + p(x) \quad\quad \text{subject to }\; A x - z = 0

Especially if :math:`A` is "wide" (i.e., has more columns than rows), the proximal operator for this problem is usually much quicker to compute than for :math:`\tilde l (x) = l(A x)`. More information on the properties of the algorithm can be found in :ref:`ref-attouch-2010`.

The linearized ADMM algorithm requires a proper implementation of the proximal operator that can handle the given loss and penalty functions. A proximal operator needs to follow this interface:

Proximal Operator
=================

.. cpp:class:: ProxOp

   .. cpp:function:: void loss(const LossFunction& loss)

      Change the loss to *loss*.

   .. cpp:function:: arma::vec operator()(const arma::vec& u, const arma::vec& prev, const double intercept, \
                                          const double lambda, nsoptim::Metrics* metrics)

      Get the value of the proximal operator of the function scaled by *lambda*, evaluated at *u*. The argument *prev* is the previous value returned by the proximal operator, and *intercept* is the current value of the intercept, or 0 if the loss does not use an intercept term.

   .. cpp:function:: double ComputeIntercept(const arma::vec& fitted)

      Compute the intercept term, given the fitted values.

   .. cpp:function:: double StepSize(const PenaltyFunction& penalty, const double norm_x)

      Get the step size required for the set loss and the given penalty.

nsoptim::AdmmLinearConfiguration
================================

.. doxygenstruct:: nsoptim::AdmmLinearConfiguration
   :members:

nsoptim::GenericLinearizedAdmmOptimizer
=======================================

.. doxygenclass:: nsoptim::GenericLinearizedAdmmOptimizer
   :members:

.. _optimizer-admm-varstep:

*************************************************************************************
Alternating Direction Method of Multipliers (ADMM) Optimizer with Variable Step-Size
*************************************************************************************

* Supported loss functions: :cpp:class:`LsLoss`, :cpp:class:`WeightedLsLoss`
* Supported penalty functions: :cpp:class:`EnPenalty`, :cpp:class:`AdaptiveEnPenalty`

This implementation operates directly on the objective function :math:`l(x) + p(x)`, but adjusts the step size according to :ref:`ref-bartels-2017`.

nsoptim::AdmmVarStepConfiguration
=================================

.. doxygenstruct:: nsoptim::AdmmVarStepConfiguration
   :members:

nsoptim::AdmmVarStepOptimizer
=============================

.. doxygenclass:: nsoptim::AdmmVarStepOptimizer
   :members:

*******************************
Dual Augmented Lagrangian (DAL)
*******************************

The dual augmented Lagrangian algorithm according to :ref:`ref-tomioka-2011`. Supports only sparse coefficients.

nsoptim::DalEnConfiguration
===========================

.. doxygenstruct:: nsoptim::DalEnConfiguration
   :members:

nsoptim::DalEnOptimizer
=======================

.. doxygenclass:: nsoptim::DalEnOptimizer
   :members:

**********
References
**********

.. _ref-attouch-2010:

Attouch, H., Bolte, J., Redont, P., and Soubeyran, A. (2010). Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality. *Mathematics of Operations Research*, 35(2):438–457.

.. _ref-bartels-2017:

Bartels, S. and Milicevic, M. (2017). *Alternating direction method of multipliers with variable step sizes*. arXiv:1704.06069.

.. _ref-tomioka-2011:

Tomioka, R., Suzuki, T., and Sugiyama, M. (2011). Super-linear convergence of dual augmented Lagrangian algorithm for sparsity regularized estimation. *Journal of Machine Learning Research*, 12:1537–1586.
########
Overview
########

The library provides algorithms and utilities to optimize non-smooth functions, i.e., to solve

.. math::

   \operatorname*{arg\,min}_{x} l(x) + \phi(x)

where :math:`l(x)` is a smooth function (called the *loss function*) and :math:`\phi(x)` is a non-smooth function (called the *penalty function*). The argument :math:`x` is the *coefficient*, and the result of any optimization is the *optimum*.

************************************
Dependencies and System Requirements
************************************

The library requires at least a C++ compiler compatible with C++11 and an installation of the armadillo library for linear algebra.
 breathe \ No newline at end of file
... ...
@@ -27,27 +27,17 @@ namespace nsoptim {
 //! Configuration options for the variable-stepsize ADMM algorithm.
 //!
-//! The members have the following meaning:
-//! max_it ... maximum number of iterations allowed.
-//! tau ... the step size. If negative (the default), use the square L_2 norm of x.
-//! tau_lower_mult ... lower bound for the step size, defined as a multiple of tau.
-//! tau_adjustment_lower ... lower bound of the step-size adjustment factor.
-//! tau_adjustment_upper ... upper bound of the step-size adjustment factor.
 struct AdmmVarStepConfiguration {
-  int max_it;
-  double tau;
-  double tau_lower_mult;
-  double tau_adjustment_lower;
-  double tau_adjustment_upper;
+  int max_it;  //!< Maximum number of iterations allowed.
+  double tau;  //!< Initial step size. If negative (the default), use the square L_2 norm of x.
+  double tau_lower_mult;  //!< Lower bound for the step size, defined as a multiple of tau.
+  double tau_adjustment_lower;  //!< Lower bound of the step-size adjustment factor.
+  double tau_adjustment_upper;  //!< Upper bound of the step-size adjustment factor.
 };

 //! Configuration options for the linearized ADMM algorithm.
-//!
-//! The members have the following meaning:
-//! max_it ... maximum number of iterations allowed.
 struct AdmmLinearConfiguration {
-  int max_it;
+  int max_it;  //!< Maximum number of iterations allowed.
 };

 namespace admm_optimizer {
... ...
@@ -409,18 +399,6 @@ using ProximalOperator = typename std::conditional<
 //! Compute the EN regression estimate using the alternating direction method of multipliers (ADMM)
 //! with linearization. This optimizer uses the given proximal operator class ProxOp.
 //!
-//! A proximal operator needs to implement the following methods:
-//!  void loss(const LossFunction& loss) ... change the loss function to loss.
-//!  arma::vec operator()(const vec& u, const vec& prev, const double intercept, const double lambda,
-//!                       Metrics* metrics)
-//!    ... get the value of the proximal operator of the function scaled by lambda, evaluated at u. The argument
-//!        prev is the previous value returned by the proximal operator and intercept is the current value
-//!        of the intercept or 0, if the loss does not use an intercept term.
-//!  double ComputeIntercept(const arma::vec& fitted) ... compute the intercept term, given the fitted values.
-//!  double StepSize(const PenaltyFunction& penalty, const double norm_x) ... get the step size required for the
-//!    loss function if the penalty is as given.
-//!
 //! See LsProximalOperator and WeightedLsProximalOperator for example implementations of the proximal operator.
 template class GenericLinearizedAdmmOptimizer : public Optimizer {
  public:
... ...
@@ -636,16 +614,16 @@ class GenericLinearizedAdmmOptimizer : public Optimizer
... ...
@@ -28,22 +28,18 @@ namespace nsoptim {
 //! Configuration options for the DAL algorithm.
-//!
-//! The members have the following meaning:
-//! max_it ... maximum number of iterations allowed.
-//! max_inner_it ... maximum number of inner iterations allowed.
-//! eta_start_numerator_conservative ... conservative setting for the numerator when computing the initial value of
-//!                                      the proximity parameters (by numerator / lambda)
-//! eta_start_numerator_aggressive ... aggressive setting for the numerator when computing the initial value of
-//!                                    the proximity parameters (by numerator / lambda)
-//! lambda_relchange_aggressive ... maximum relative change in lambda that allows the use of the aggressive numerator.
-//! eta_multiplier ... multiplier to scale the proximity parameters at each outer iteration.
 struct DalEnConfiguration {
+  //! Maximum number of iterations allowed.
   int max_it;
+  //! Maximum number of inner iterations allowed.
   int max_inner_it;
+  //! Conservative setting for the numerator when computing the initial value of the proximity parameters.
   double eta_start_numerator_conservative;
+  //! Aggressive setting for the numerator when computing the initial value of the proximity parameters.
   double eta_start_numerator_aggressive;
+  //! Maximum relative change in lambda that allows the use of the aggressive numerator.
   double lambda_relchange_aggressive;
+  //! Multiplier to scale the proximity parameters at each outer iteration.
   double eta_multiplier;
 };
... ...
... ...
@@ -23,20 +23,26 @@ namespace nsoptim {
 //! Configuration options for the MM algorithm.
-//!
-//! The members have the following meaning:
-//! max_it ... maximum number of iterations allowed.
-//! tightening ... the type of tightening for the inner optimization.
-//! adaptive_tightening_steps ... the number of tightening steps if using adaptive tightening.
 struct MMConfiguration {
+  //! Type of tightening for the inner optimization.
+  enum class TighteningType {
+    //! No tightening, i.e., always use the configured numeric tolerance for the inner optimization.
+    kNone = 0,
+    //! At each iteration make the inner optimization tighter by a constant factor, up until the minimum inner
+    //! tolerance level is reached.
+    kExponential = 1,
+    //! Start with a large inner tolerance and reduce it by a constant factor as soon as the parameter change in the
+    //! outer optimization is less than the inner tolerance.
+    kAdaptive = 2
+  };
+  //! Maximum number of iterations allowed.
   int max_it;
+  //! Type of tightening for the inner optimization.
   TighteningType tightening;
+  //! Number of tightening steps if using adaptive tightening.
   int adaptive_tightening_steps;
 };
... ...
... ...
@@ -19,9 +19,9 @@ namespace nsoptim {
 template class Optimizer {
  public:
-  using LossFunction = T;
-  using PenaltyFunction = U;
-  using Coefficients = V;
+  using LossFunction = T;  //!< Loss function type.
+  using PenaltyFunction = U;  //!< Penalty function type.
+  using Coefficients = V;  //!< Coefficients type.
   using Optimum = nsoptim::Optimum;

   static_assert(traits::is_loss_function::value,
... ...