optimizer
penalty_functions
loss_functions
metrics
#######
Metrics
#######
During development and experimentation with numerical algorithms it is often helpful to collect performance metrics.
Every algorithm collects metrics specific to its inner workings.
These metrics can, for example, be used to determine good default values for tuning parameters.
****************
nsoptim::Metrics
****************
.. doxygenclass:: nsoptim::_metrics_internal::Metrics
:members:
.. doxygenstruct:: nsoptim::Metric
:members:
*************
Configuration
*************
Collecting metrics comes with significant overhead and therefore hurts performance.
The library can be configured to collect standard metrics, detailed metrics, or no metrics at all.
- default: Collect standard metrics.
- *NSOPTIM_METRICS_DISABLED*: Completely disable collection of metrics.
- *NSOPTIM_METRICS_ENABLED*: Enable regular collection of metrics.
- *NSOPTIM_METRICS_DETAILED*: Collect detailed metrics. This comes with significant overhead and is not recommended for production.
.. doxygenclass:: nsoptim::MMOptimizer
:members:
*************************************************
Augmented Least Angle Regression (LARS) Optimizer
*************************************************
* Supported loss functions: :cpp:class:`LsRegressionLoss`, :cpp:class:`WeightedLsRegressionLoss`
* Supported penalty functions: :cpp:class:`EnPenalty`, :cpp:class:`AdaptiveEnPenalty`
Augmented LARS uses the Least Angle Regression algorithm :ref:`[3] <ref-efron-2004>` for LASSO regression on an augmented data set.
The Ridge-penalty is absorbed into an augmented data set :math:`(\tilde y, \tilde X)` by
.. math::

   \tilde X = \begin{pmatrix} X \\ \lambda I_{p \times p} \end{pmatrix}
   \,,\quad
   \tilde y = (y_1, \dotsc, y_n, 0, \dotsc, 0)

The augmented LARS optimizer is numerically very stable and does not need any tuning.
The algorithm solves the systems of linear equations exactly (up to numerical precision) along the regularization path, hence its computational complexity is of the same order as that of ordinary least squares:
it scales cubically with the number of variables and quadratically with the number of observations.
For Ridge-only penalties, the algorithm is specialized for faster computation.
``nsoptim::AugmentedLarsOptimizer``
===================================
.. doxygenclass:: nsoptim::AugmentedLarsOptimizer
:members:
************************************************************************
Linearized Alternating Direction Method of Multipliers (ADMM) Optimizer
************************************************************************
Especially if :math:`X` is "wide" (i.e., has more columns than rows), the proximal operator for this problem is usually much quicker to compute than for :math:`\tilde l (b) = l(A b)`.
More information on the properties of the algorithm can be found in :ref:`[1] <ref-attouch-2010>`.
The linearized ADMM algorithm requires a proper implementation of the proximal operator that can handle the given loss function.
A proximal operator must implement the following interface:
Proximal Operator
=================
Change the loss to ``loss``.
.. cpp:function:: arma::vec operator()(const arma::vec& u, const arma::vec& prev, const double intercept, \
const double scale, nsoptim::Metrics* metrics)
Get the value of the proximal operator of the function scaled by ``scale``, evaluated at ``u``.
The argument ``prev`` is the previous value returned by the proximal operator, and ``intercept`` is the current
value of the intercept (or 0 if the loss does not use an intercept term).
.. cpp:function:: double OperatorScaling()
Get the operator scaling.
.. cpp:function:: double PenaltyScaling()
Get the scaling applied to the penalty parameter.
``nsoptim::AdmmLinearConfiguration``
====================================
*******************************
Dual Augmented Lagrangian (DAL)
*******************************
The dual augmented Lagrangian algorithm according to :ref:`[4] <ref-tomioka-2011>`.
Supports only sparse coefficients.
**********
References
**********
[2] Bartels, S. and Milicevic, M. (2017). *Alternating direction method of multipliers with variable step sizes*. `arXiv:1704.06069 <https://arxiv.org/abs/1704.06069>`_.
.. _ref-efron-2004:
[3] Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least angle regression. *The Annals of Statistics*, 32(2):407–499.
.. _ref-tomioka-2011:
[4] Tomioka, R., Suzuki, T., and Sugiyama, M. (2011). Super-linear convergence of dual augmented lagrangian algorithm for sparsity regularized estimation. *Journal of Machine Learning Research*, 12:1537–1586.
#endif
#include <nsoptim_forward.hpp>
#include <nsoptim/armadillo.hpp>
#include <nsoptim/rcpp_integration.hpp>
#include <nsoptim/objective.hpp>
#include <nsoptim/container.hpp>
#include <nsoptim/optimizer.hpp>
#endif // NSOPTIM_HPP_
#ifndef NSOPTIM_ARMADILLO_HPP_
#define NSOPTIM_ARMADILLO_HPP_
#ifndef ARMA_USE_CXX11
# define ARMA_USE_CXX11 1
#endif
#ifndef ARMA_DONT_USE_OPENMP
# define ARMA_DONT_USE_OPENMP 1
#endif
#ifdef __clang__
# pragma clang diagnostic push
#ifdef HAVE_RCPP
// For RCPP
# include <RcppArmadillo.h>
#else
// For stand-alone
# include <armadillo>
//
// armadillo_forward.hpp
// nsoptim
//
// Created by David Kepplinger on 2018-11-30.
// Copyright © 2018 David Kepplinger. All rights reserved.
//
#ifndef NSOPTIM_ARMADILLO_FORWARD_HPP_
#define NSOPTIM_ARMADILLO_FORWARD_HPP_
#define ARMA_USE_CXX11 1
#define ARMA_DONT_USE_OPENMP 1
#ifdef __clang__
# pragma clang diagnostic push
# pragma clang diagnostic ignored "-Weverything"
#endif
#ifdef HAVE_RCPP
// For RCPP
# include <RcppArmadilloForward.h>
#else
// For stand-alone
# include <armadillo>
#endif // HAVE_RCPP
#ifdef __clang__
# pragma clang diagnostic pop
#endif
#endif // NSOPTIM_ARMADILLO_FORWARD_HPP_
#ifndef NSOPTIM_CONFIG_HPP_
#define NSOPTIM_CONFIG_HPP_
#define NSOPTIM_METRICS_LEVEL 1
#ifdef NSOPTIM_METRICS_DETAILED
# undef NSOPTIM_METRICS_LEVEL
# define NSOPTIM_METRICS_LEVEL 2
#endif
#ifdef NSOPTIM_METRICS_ENABLED
# undef NSOPTIM_METRICS_LEVEL
# define NSOPTIM_METRICS_LEVEL 1
#endif
//
// forward.hpp
// nsoptim
//
// Created by David Kepplinger on 2018-11-30.
// Copyright © 2018 David Kepplinger. All rights reserved.
//
#ifndef NSOPTIM_CONTAINER_FORWARD_HPP_
#define NSOPTIM_CONTAINER_FORWARD_HPP_
#include <nsoptim/config.hpp>
#include <nsoptim/armadillo_forward.hpp>
namespace nsoptim {
//! Full definition at nsoptim/container/regression_coefficients.hpp
template <class T> class RegressionCoefficients;
//! Full definition at nsoptim/container/data.hpp
class PredictorResponseData;
namespace _metrics_internal {
//! Full definition at nsoptim/container/metrics.hpp
template<int> class Metrics;
} // namespace _metrics_internal
//! Export the correct Metrics collection based on the configuration.
using Metrics = _metrics_internal::Metrics<NSOPTIM_METRICS_LEVEL>;
} // namespace nsoptim
#endif // NSOPTIM_CONTAINER_FORWARD_HPP_
#include <nsoptim/config.hpp>
namespace nsoptim {
//! A named metric object.
template<typename T>
struct Metric {
//! Construct a named metric object.
Metric(const std::string& _name, const T _value) : name(_name), value(_value) {}
//! Default move constructor.
Metric(Metric&&) = default;
//! Default move assignment.
Metric& operator=(Metric&&) = default;
//! Default copy constructor.
Metric(const Metric&) = default;
//! Default copy assignment.
Metric& operator=(const Metric&) = default;
std::string name; //!< Name of the metric.
T value; //!< Value of the metric.
};
//! A list of Metrics
struct MetricsContainer {
MetricList<std::string> str_;
};
namespace _metrics_internal {
//! An iterator that iterates sequentially over two std::forward_list objects.
template<typename T>
class MetricsIterator : public std::iterator<std::forward_iterator_tag, const Metric<T>> {
list_iterator it_;
};
//! Enumeration of states for metrics.
enum class HasMetrics {
kNo,
kYes
};
//! A no-op collection of metrics, used when metrics collection is disabled.
template<int>
class Metrics {
//! A dummy iterator proxy over an empty set.
template<typename T>
class IteratorProxy {
public:
constexpr T const * begin() const noexcept { return nullptr; }
constexpr T const * end() const noexcept { return nullptr; }
};
public:
Metrics() noexcept {}
//! Construct a named collection of metrics.
explicit Metrics(const std::string&) noexcept {}
//! Get the name(space) of this Metrics object.
//!
//! @return The name(space).
constexpr char const * name() const noexcept {
return "";
}
//! Create a sub-collection of metrics and add it to this collection.
//!
//! @param name Name of the new sub-collection of metrics.
//! @return Reference to the newly created metrics.
Metrics& CreateSubMetrics(const std::string&) noexcept {
return *this;
}
//! Add a sub-collection of Metrics to the collection.
//!
//! @param metrics A collection of metrics to be added to this metrics collection.
void AddSubMetrics(const Metrics&) const noexcept {}
//! Add a floating-point metric to the collection.
//!
//! @param name Name of the metric.
//! @param value Value for the metric.
void AddMetric(const std::string&, const double) const noexcept {}
//! Add an integer metric to the collection.
//!
//! @param name Name of the metric.
//! @param value Value for the metric.
void AddMetric(const std::string&, const int) const noexcept {}
//! Add a string metric to the collection.
//!
//! @param name Name of the metric.
//! @param value Value for the metric.
void AddMetric(const std::string&, const std::string&) const noexcept {}
//! Add a floating-point detail to the collection.
//!
//! @param name Name of the detailed metric.
//! @param value Value for the detailed metric.
void AddDetail(const std::string&, const double) const noexcept {}
//! Add an integer detail to the collection.
//!
//! @param name Name of the detailed metric.
//! @param value Value for the detailed metric.
void AddDetail(const std::string&, const int) const noexcept {}
//! Add a string detail to the collection.
//!
//! @param name Name of the detailed metric.
//! @param value Value for the detailed metric.
void AddDetail(const std::string&, const std::string&) const noexcept {}
//! Iterate over all floating-point metrics and details in this collection.
//!
//! @return A proxy providing a forward-iterator over @ref Metric "Metric<double>" objects.
constexpr IteratorProxy<Metric<double>> DoubleMetrics() const noexcept {
return IteratorProxy<Metric<double>>();
}
//! Iterate over all integer metrics and details in this collection.
//!
//! @return A proxy providing a forward-iterator over @ref Metric "Metric<int>" objects.
constexpr IteratorProxy<Metric<int>> IntegerMetrics() const noexcept {
return IteratorProxy<Metric<int>>();
}
//! Iterate over all string metrics and details in this collection.
//!
//! @return A proxy providing a forward-iterator over @ref Metric "Metric<std::string>" objects.
constexpr IteratorProxy<Metric<std::string>> StringMetrics() const noexcept {
return IteratorProxy<Metric<std::string>>();
}
//! Iterate over all nested metrics in this collection.
//!
//! @return A proxy providing a forward-iterator over Metrics objects.
constexpr IteratorProxy<Metrics> SubMetrics() const noexcept {
return IteratorProxy<Metrics>();
}
};
class Metrics<2> {
std::forward_list<Metrics> sub_metrics_;
};
} // namespace _metrics_internal
} // namespace nsoptim
#endif // NSOPTIM_CONTAINER_METRICS_HPP_
//
// forward.hpp
// nsoptim
//
// Created by David Kepplinger on 2018-11-30.
// Copyright © 2018 David Kepplinger. All rights reserved.
//
#ifndef NSOPTIM_OBJECTIVE_FORWARD_HPP_
#define NSOPTIM_OBJECTIVE_FORWARD_HPP_
#include <nsoptim/config.hpp>
#include <nsoptim/armadillo_forward.hpp>
namespace nsoptim {
//! Full definition at nsoptim/objective/ls_regression_loss.hpp
class WeightedLsRegressionLoss;
class LsRegressionLoss;
//! Full definition at nsoptim/objective/en_penalty.hpp
class EnPenalty;
class LassoPenalty;
class RidgePenalty;
//! Full definition at nsoptim/objective/adaptive_en_penalty.hpp
class AdaptiveEnPenalty;
class AdaptiveLassoPenalty;
} // namespace nsoptim
#endif // NSOPTIM_OBJECTIVE_FORWARD_HPP_
#include <memory>
#include <nsoptim/optimizer/optimum.hpp>
#include <nsoptim/optimizer/auglars.hpp>
#include <nsoptim/optimizer/mm.hpp>
#include <nsoptim/optimizer/dal.hpp>
#include <nsoptim/optimizer/admm.hpp>
//
// optim_augridge.hpp
// nsoptim
//
// Created by David Kepplinger on 2019-01-02.
// Copyright © 2019 David Kepplinger. All rights reserved.
//
#ifndef NSOPTIM_OPTIMIZER_AUGMENTED_RIDGE_HPP_
#define NSOPTIM_OPTIMIZER_AUGMENTED_RIDGE_HPP_
#include <exception>
#include <type_traits>
#include <nsoptim/armadillo.hpp>
#include <nsoptim/container/regression_coefficients.hpp>
#include <nsoptim/container/data.hpp>
#include <nsoptim/optimizer/optimizer_base.hpp>
#include <nsoptim/optimizer/optimum.hpp>
#include <nsoptim/objective/ls_regression_loss.hpp>
#include <nsoptim/objective/en_penalty.hpp>
#include <nsoptim/traits/traits.hpp>
namespace nsoptim {
//! Compute the Ridge regression estimate using standard linear algebra on the augmented response vector and predictor
//! matrix.
template<class LossFunction = WeightedLsRegressionLoss>
class AugmentedRidgeOptimizer : public Optimizer<LossFunction, RidgePenalty, RegressionCoefficients<arma::vec>> {
using Base = Optimizer<LossFunction, RidgePenalty, RegressionCoefficients<arma::vec>>;
using LossFunctionPtr = std::unique_ptr<LossFunction>;
using RidgePenaltyPtr = std::unique_ptr<RidgePenalty>;
static_assert(traits::is_ls_regression_loss<LossFunction>::value, "LossFunction must be a LS-type loss.");
public:
using Coefficients = typename Base::Coefficients;
using Optimum = typename Base::Optimum;
//! Initialize the optimizer without a loss or penalty function.
AugmentedRidgeOptimizer() noexcept : loss_(nullptr), penalty_(nullptr), new_data_(false) {}
//! Initialize the optimizer using the given (weighted) LS loss function and the Ridge penalty.
//!
//! @param loss a weighted LS loss function.
//! @param penalty the Ridge penalty.
AugmentedRidgeOptimizer(const LossFunction& loss, const RidgePenalty& penalty) noexcept
: loss_(LossFunctionPtr(new LossFunction(loss))),
penalty_(RidgePenaltyPtr(new RidgePenalty(penalty))),
new_data_(true) {
PrepareAugmentedData();
}
//! Copy constructor.
//!
//! The copied optimizer receives its own copy of the loss and penalty functions, so changing the loss or
//! penalty on one optimizer (using the `loss()` or `penalty()` methods) does not affect the other.
AugmentedRidgeOptimizer(const AugmentedRidgeOptimizer& other) noexcept
: loss_(other.loss_? LossFunctionPtr(new LossFunction(*other.loss_)) : nullptr),
penalty_(other.penalty_ ? RidgePenaltyPtr(new RidgePenalty(*other.penalty_)) : nullptr),
augmented_data_(other.augmented_data_), new_data_(other.new_data_),
mean_x_(other.mean_x_), mean_y_(other.mean_y_) {}
//! Copy assignment.
//!
//! Like the copy constructor, assignment gives this optimizer its own copy of the other optimizer's
//! loss and penalty functions. (A defaulted copy assignment would be deleted because of the
//! `std::unique_ptr` members.)
AugmentedRidgeOptimizer& operator=(const AugmentedRidgeOptimizer& other) {
loss_ = other.loss_ ? LossFunctionPtr(new LossFunction(*other.loss_)) : nullptr;
penalty_ = other.penalty_ ? RidgePenaltyPtr(new RidgePenalty(*other.penalty_)) : nullptr;
augmented_data_ = other.augmented_data_;
new_data_ = other.new_data_;
mean_x_ = other.mean_x_;
mean_y_ = other.mean_y_;
return *this;
}
//! Default move constructor.
AugmentedRidgeOptimizer(AugmentedRidgeOptimizer&& other) = default;
//! Default move assignment operator.
AugmentedRidgeOptimizer& operator=(AugmentedRidgeOptimizer&& other) = default;
~AugmentedRidgeOptimizer() = default;
void Reset() {}
LossFunction& loss() const {
if (!loss_) {
throw std::logic_error("no loss set");
}
return *loss_;
}
void loss(const LossFunction& loss) noexcept {
new_data_ = !loss_ || (&loss.data() != &loss_->data());
loss_.reset(new LossFunction(loss));
if (new_data_) {
PrepareAugmentedData();
}
}
RidgePenalty& penalty() const {
if (!penalty_) {
throw std::logic_error("no penalty set");
}
return *penalty_;
}
void penalty(const RidgePenalty& penalty) noexcept {
penalty_.reset(new RidgePenalty(penalty));
}
AugmentedRidgeOptimizer::Optimum Optimize() {
using IsWeightedTag = typename traits::is_weighted<LossFunction>::type;
if (!loss_) {
throw std::logic_error("no loss set");
}
if (!penalty_) {
throw std::logic_error("no penalty set");
}
const PredictorResponseData& data = loss_->data();
if (new_data_) {
UpdateAugmentedData(IsWeightedTag{});
}
new_data_ = false;
UpdateRidgePart(IsWeightedTag{});
RegressionCoefficients<arma::vec> coefs;
bool success;
if (penalty_->lambda() > 0) {
success = arma::solve(coefs.beta, augmented_data_.cx(), augmented_data_.cy());
} else {
success = arma::solve(coefs.beta, augmented_data_.cx().head_rows(data.n_obs()),
augmented_data_.cy().head_rows(data.n_obs()));
}
// Compute the intercept
if (loss_->IncludeIntercept()) {
coefs.intercept = ComputeIntercept(coefs, IsWeightedTag{});
}
return MakeOptimum(*loss_, *penalty_, coefs, success ? OptimumStatus::kOk : OptimumStatus::kError,
success ? "" : "Could not solve system of linear equations");
}
AugmentedRidgeOptimizer::Optimum Optimize(const RegressionCoefficients<arma::vec>&) {
return Optimize();
}
private:
//! Prepare the augmented data by creating a large enough predictor matrix with 0's and a large enough response
//! vector with 0's and compute the means of the response and the predictors.
void PrepareAugmentedData() {
const PredictorResponseData& data = loss_->data();
augmented_data_.x().zeros(data.n_obs() + data.n_pred(), data.n_pred());
augmented_data_.y().zeros(data.n_obs() + data.n_pred());
mean_x_ = arma::mean(data.cx(), 0);
mean_y_ = arma::mean(data.cy());
}
void UpdateAugmentedData(std::true_type) {
const PredictorResponseData& data = loss_->data();
if (loss_->IncludeIntercept()) {
// The observations have different weights. Make the predictor matrix orthogonal to the centered response.
arma::mat weighted_x = data.cx().each_row() - mean_x_;
weighted_x.each_col() %= loss_->sqrt_weights();
augmented_data_.x().head_rows(data.n_obs()) = weighted_x -
loss_->sqrt_weights() * loss_->sqrt_weights().t() * weighted_x / data.n_obs();
// Apply weights to the response
augmented_data_.y().head_rows(data.n_obs()) = loss_->sqrt_weights() % (data.cy() - mean_y_);
} else {
augmented_data_.x().head_rows(data.n_obs()) = data.cx().each_col() % loss_->sqrt_weights();
augmented_data_.y().head_rows(data.n_obs()) = data.cy() % loss_->sqrt_weights();
}
}
void UpdateAugmentedData(std::false_type) {
const PredictorResponseData& data = loss_->data();
if (loss_->IncludeIntercept()) {
// The weights are all identical. Center the predictors and the response.
augmented_data_.x().head_rows(data.n_obs()) = data.cx().each_row() - mean_x_;
augmented_data_.y().head_rows(data.n_obs()) = data.cy() - mean_y_;
} else {
augmented_data_.x().head_rows(data.n_obs()) = data.cx();
augmented_data_.y().head_rows(data.n_obs()) = data.cy();
}
}
void UpdateRidgePart(std::true_type) {
const PredictorResponseData& data = loss_->data();
if (penalty_->lambda() > 0) {
augmented_data_.x().tail_rows(data.n_pred()).diag().fill(
std::sqrt(data.n_obs() * penalty_->lambda() / loss_->mean_weight()));
}
}
void UpdateRidgePart(std::false_type) {
const PredictorResponseData& data = loss_->data();
if (penalty_->lambda() > 0) {
augmented_data_.x().tail_rows(data.n_pred()).diag().fill(
std::sqrt(data.n_obs() * penalty_->lambda()));
}
}
double ComputeIntercept(const RegressionCoefficients<arma::vec>& coefs, std::true_type) const {
const PredictorResponseData& data = loss_->data();
const arma::vec residuals = data.cy() - data.cx() * coefs.beta;
return arma::mean(loss_->sqrt_weights() % loss_->sqrt_weights() % residuals);
}
double ComputeIntercept(const RegressionCoefficients<arma::vec>& coefs, std::false_type) const {
const PredictorResponseData& data = loss_->data();
return arma::mean(data.cy() - data.cx() * coefs.beta - coefs.intercept);
}
LossFunctionPtr loss_;
RidgePenaltyPtr penalty_;
PredictorResponseData augmented_data_;
bool new_data_;
arma::rowvec mean_x_;
double mean_y_;
};
} // namespace nsoptim
#endif // NSOPTIM_OPTIMIZER_AUGMENTED_RIDGE_HPP_
#include <type_traits>
#include <memory>
#include <algorithm>
#include <limits>
#include <nsoptim/armadillo.hpp>
#include <nsoptim/traits/traits.hpp>
//! @param eps the convergence threshold.
//! @param x a pointer to the solution vector (`x` in the equation). Is used as the starting point for the PCG method.
//! @return the number of iterations or `kPhiStepDirInversionFailed` if not converged.
inline int SolvePcg(const arma::mat &A, const arma::vec &b, const arma::mat& precond,
const double eps, arma::vec* x) noexcept;
//! Proxy class to manage the actual data.
//! The default implementation simply proxies to the data used by the loss function.
//! @param eps the convergence threshold.
//! @param x a pointer to the solution vector (`x` in the equation). Is used as the starting point for the PCG method.
//! @return the number of iterations or `kPhiStepDirInversionFailed` if not converged.
inline int SolvePcg(const arma::mat &A, const arma::vec &b, const arma::mat& precond, const double eps,
arma::vec* x) noexcept {
using arma::norm;
using arma::dot;
arma::uword iter = 0;
auto normb = norm(b);
arma::vec r = b - A * (*x);
if (normb < std::numeric_limits<double>::epsilon()) {
normb = 1;
}
*x += alpha * p;
const arma::vec new_r = r - alpha * tmp;
// r -= alpha * tmp;
if (norm(new_r) <= thresh) {
return static_cast<int>(iter);
}
const arma::vec z = precond * new_r;
const auto beta = dot(z, new_r - r) / abs;
abs = dot(new_r, z);
p = z + beta * p;
r = new_r;
} while (++iter <= b.n_elem);
return kPhiStepDirInversionFailed;
}
class ExponentialTightening : public InnerToleranceTightening<Optimizer> {
virtual ~ExponentialTightening() noexcept = default;
void Tighten(const double outer_change) noexcept override {
const double new_tol = this->current_tolerance() * multiplier_;
this->PropagateUpdate(std::max(this->inner_tolerance(), std::min(new_tol, outer_change)));
}
void FastTighten() noexcept override {
//! @return information about the optimum.
Optimum Optimize(const Coefficients& start) {
coefs_ = start;
optimizer_.Reset();
return Optimize(config_.max_it);
}
// Often, the results from the convex surrogate can be used to quickly evaluate the loss/penalty function.
double objf_value = loss_->Evaluate(coefs_) + penalty_->Evaluate(coefs_);
double rel_difference = 0;
bool final_iterations = false;
int iter = 0;
while (iter++ < max_it) {
// UpdateInnerConvergenceTolerance(rel_tol, IsIterativeAlgorithmTag{});
auto&& iter_metrics = metrics->CreateSubMetrics("mm_iteration");
// Compute the minimizer of the convex surrogates.
auto optimum = optimizer_.Optimize();
if (optimum.metrics) {
iter_metrics.AddSubMetrics(std::move(*optimum.metrics));
optimum.metrics.reset();
}
// Check for any problems.
if (optimum.status == OptimumStatus::kError) {
metrics->AddDetail("final_rel_difference", rel_difference);
metrics->AddDetail("final_inner_tol", tightener->current_tolerance());
metrics->AddMetric("iter", iter);
return MakeOptimum(*loss_, *penalty_, coefs_, std::move(metrics), OptimumStatus::kError,
// Check for convergence.
rel_difference = Difference(coefs_, optimum.coefs, LossHasDifferenceOp{}, PenaltyHasDifferenceOp{});
iter_metrics.AddDetail("iter", iter);
iter_metrics.AddDetail("rel_difference", rel_difference);
iter_metrics.AddDetail("inner_tol", tightener->current_tolerance());
// The solutions are good enough. If we can make the inner optimizer tighter, add a few iterations and then
// return the obtained optimum.
if (rel_difference < convergence_tolerance_) {
if (final_iterations || !tightener->CanTightenFurther()) {
coefs_ = std::move(optimum.coefs);
metrics->AddDetail("final_rel_difference", rel_difference);
metrics->AddDetail("final_inner_tol", tightener->current_tolerance());
metrics->AddMetric("iter", iter);
// The value of the convex surrogate should be close enough (actually it should be equal) to the
// true objective. Therefore, there is no need to re-evaluate the loss.
return MakeOptimum(*loss_, *penalty_, coefs_, optimum.objf_value, std::move(metrics));
} else {
final_iterations = true;
tightener->FullyTighten();
}
}
// If the inner convergence tolerance is already ridiculously small. Probably zig-zagging around
// the optimum.
if (!tightener->CanTightenFurther()) {
metrics->AddDetail("final_rel_difference", rel_difference);
metrics->AddDetail("final_inner_tol", tightener->current_tolerance());
metrics->AddMetric("iter", iter);
return MakeOptimum(*loss_, *penalty_, coefs_, objf_value, std::move(metrics),
OptimumStatus::kOk, "Possibly imprecise solution.");
}
iter_metrics.AddDetail("tighten_faster", "yes");
tightener->FastTighten();
continue;
}
tightener->Tighten(rel_difference);
// Update the convex surrogates for the internal optimizer.
try {
optimizer_.loss(loss_->GetConvexSurrogate(coefs_));
} catch(...) {
metrics->AddMetric("iter", iter);
metrics->AddDetail("final_rel_difference", rel_difference);
metrics->AddDetail("final_inner_tol", tightener->current_tolerance());
return MakeOptimum(*loss_, *penalty_, coefs_, std::move(metrics), OptimumStatus::kWarning,
"MM-algorithm did not converge");
}
optimizer_.penalty(penalty_->GetConvexSurrogate(coefs_));
// Retain value of the objective function to check for improvement.
objf_value = loss_->Evaluate(coefs_) + penalty_->Evaluate(coefs_);
}
metrics->AddMetric("iter", iter);
metrics->AddDetail("final_rel_difference", rel_difference);
metrics->AddDetail("final_inner_tol", tightener->current_tolerance());
return MakeOptimum(*loss_, *penalty_, coefs_, std::move(metrics), OptimumStatus::kWarning,
#ifndef NSOPTIM_RCPP_INTEGRATION_HPP_
#define NSOPTIM_RCPP_INTEGRATION_HPP_
#include <nsoptim_forward.hpp>
#include <nsoptim/armadillo.hpp>
#ifdef HAVE_RCPP
namespace Rcpp {
//! Wrap a sparse vector into a Matrix::sparseVector object
template <typename T>
#define NSOPTIM_FORWARD_HPP_
#include <nsoptim/config.hpp>
#include <nsoptim/armadillo_forward.hpp>
#include <nsoptim/container/forward.hpp>
#include <nsoptim/objective/forward.hpp>
#ifdef HAVE_RCPP
namespace Rcpp {