# ifndef CPPAD_CORE_OPTIMIZE_HPP # define CPPAD_CORE_OPTIMIZE_HPP /* -------------------------------------------------------------------------- CppAD: C++ Algorithmic Differentiation: Copyright (C) 2003-20 Bradley M. Bell CppAD is distributed under the terms of the Eclipse Public License Version 2.0. This Source Code may also be made available under the following Secondary License when the conditions for such availability set forth in the Eclipse Public License, Version 2.0 are satisfied: GNU General Public License, Version 2.0 or later. ---------------------------------------------------------------------------- */ # define CPPAD_CORE_OPTIMIZE_PRINT_RESULT 0 /* $begin optimize$$ $spell enum jac bool Taylor CppAD cppad std const onetape op optimizer $$ $section Optimize an ADFun Object Tape$$ $head Syntax$$ $icode%f%.optimize() %$$ $icode%f%.optimize(%options%) %$$ $icode%flag% = %f%.exceed_collision_limit() %$$ $head Purpose$$ The operation sequence corresponding to an $cref ADFun$$ object can be very large and involve many operations; see the size functions in $cref seq_property$$. The $icode%f%.optimize%$$ procedure reduces the number of operations, and thereby the time and the memory, required to compute function and derivative values. $head f$$ The object $icode f$$ has prototype $codei% ADFun<%Base%> %f% %$$ $head options$$ This argument has prototype $codei% const std::string& %options% %$$ The default for $icode options$$ is the empty string. If it is present, it must consist of one or more of the options below separated by a single space character. $subhead no_conditional_skip$$ The $code optimize$$ function can create conditional skip operators to improve the speed of conditional expressions; see $cref/optimize/CondExp/Optimize/$$. If the sub-string $code no_conditional_skip$$ appears in $icode options$$, conditional skip operations are not be generated. This may make the optimize routine use significantly less memory and take less time to optimize $icode f$$. If conditional skip operations are generated, it may save a significant amount of time when using $icode f$$ for $cref forward$$ or $cref reverse$$ mode calculations; see $cref number_skip$$. $subhead no_compare_op$$ If the sub-string $code no_compare_op$$ appears in $icode options$$, comparison operators will be removed from the optimized function. These operators are necessary for the $cref compare_change$$ functions to be meaningful. On the other hand, they are not necessary, and take extra time, when the compare_change functions are not used. $subhead no_print_for_op$$ If the sub-string $code no_compare_op$$ appears in $icode options$$, $cref PrintFor$$ operations will be removed form the optimized function. These operators are useful for reporting problems evaluating derivatives at independent variable values different from those used to record a function. $subhead no_cumulative_sum_op$$ If this sub-string appears, no cumulative sum operations will be generated during the optimization; see $cref optimize_cumulative_sum.cpp$$. $subhead collision_limit=value$$ If this substring appears, where $icode value$$ is a sequence of decimal digits, the optimizer's hash code collision limit will be set to $icode value$$. When the collision limit is reached, the expressions with that hash code are removed and a new lists of expressions with that has code is started. The larger $icode value$$, the more identical expressions the optimizer can recognize, but the slower the optimizer may run. The default for $icode value$$ is $code 10$$. $head Re-Optimize$$ Before 2019-06-28, optimizing twice was not supported and would fail if cumulative sum operators were present after the first optimization. This is now supported but it is not expected to have much benefit. If you find a case where it does have a benefit, please inform the CppAD developers of this. $head Efficiency$$ If a $cref/zero order forward/forward_zero/$$ calculation is done during the construction of $icode f$$, it will require more memory and time than required after the optimization procedure. In addition, it will need to be redone. For this reason, it is more efficient to use $codei% ADFun<%Base%> %f%; %f%.Dependent(%x%, %y%); %f%.optimize(); %$$ instead of $codei% ADFun<%Base%> %f%(%x%, %y%) %f%.optimize(); %$$ See the discussion about $cref/sequence constructors/FunConstruct/Sequence Constructor/$$. $head Taylor Coefficients$$ Any Taylor coefficients in the function object are lost; i.e., $cref/f.size_order()/size_order/$$ after the optimization is zero. (See the discussion about efficiency above.) $head Speed Testing$$ You can run the CppAD $cref/speed/speed_main/$$ tests and see the corresponding changes in number of variables and execution time. Note that there is an interaction between using $cref/optimize/speed_main/Global Options/optimize/$$ and $cref/onetape/speed_main/Global Options/onetape/$$. If $icode onetape$$ is true and $icode optimize$$ is true, the optimized tape will be reused many times. If $icode onetape$$ is false and $icode optimize$$ is true, the tape will be re-optimized for each test. $head Atomic Functions$$ There are some subtitle issue with optimized $cref atomic$$ functions $latex v = g(u)$$: $subhead rev_sparse_jac$$ The $cref atomic_two_rev_sparse_jac$$ function is be used to determine which components of $icode u$$ affect the dependent variables of $icode f$$. For each atomic operation, the current $cref/atomic_sparsity/atomic_two_option/atomic_sparsity/$$ setting is used to determine if $code pack_sparsity_enum$$, $code bool_sparsity_enum$$, or $code set_sparsity_enum$$ is used to determine dependency relations between argument and result variables. $subhead nan$$ If $icode%u%[%i%]%$$ does not affect the value of the dependent variables for $icode f$$, the value of $icode%u%[%i%]%$$ is set to $cref nan$$. $head Checking Optimization$$ If $cref/NDEBUG/Faq/Speed/NDEBUG/$$ is not defined, and $cref/f.size_order()/size_order/$$ is greater than zero, a $cref forward_zero$$ calculation is done using the optimized version of $icode f$$ and the results are checked to see that they are the same as before. If they are not the same, the $cref ErrorHandler$$ is called with a known error message related to $icode%f%.optimize()%$$. $head exceed_collision_limit$$ If the return value $icode flag$$ is true (false), the previous call to $icode%f%.optimize%$$ exceed the $cref/collision_limit/optimize/options/collision_limit=value/$$. $head Examples$$ $comment childtable without Example instead of Contents for header$$ $children% example/optimize/optimize_twice.cpp %example/optimize/forward_active.cpp %example/optimize/reverse_active.cpp %example/optimize/compare_op.cpp %example/optimize/print_for.cpp %example/optimize/conditional_skip.cpp %example/optimize/nest_conditional.cpp %example/optimize/cumulative_sum.cpp %$$ $table $rref optimize_twice.cpp$$ $rref optimize_forward_active.cpp$$ $rref optimize_reverse_active.cpp$$ $rref optimize_compare_op.cpp$$ $rref optimize_print_for.cpp$$ $rref optimize_conditional_skip.cpp$$ $rref optimize_nest_conditional.cpp$$ $rref optimize_cumulative_sum.cpp$$ $tend $end ----------------------------------------------------------------------------- */ # include /*! \file optimize.hpp Optimize a player object operation sequence */ namespace CppAD { // BEGIN_CPPAD_NAMESPACE /*! Optimize a player object operation sequence The operation sequence for this object is replaced by one with fewer operations but the same funcition and derivative values. \tparam Base base type for the operator; i.e., this operation was recorded using AD and computations by this routine are done using type Base. \param options \li If the sub-string "no_conditional_skip" appears, conditional skip operations will not be generated. This may make the optimize routine use significantly less memory and take significantly less time. \li If the sub-string "no_compare_op" appears, then comparison operators will be removed from the optimized tape. These operators are necessary for the compare_change function to be be meaningful in the resulting recording. On the other hand, they are not necessary and take extra time when compare_change is not used. */ template void ADFun::optimize(const std::string& options) { # if CPPAD_CORE_OPTIMIZE_PRINT_RESULT // size of operation sequence before optimizatiton size_t size_op_before = size_op(); # endif // place to store the optimized version of the recording local::recorder rec; // number of independent variables size_t n = ind_taddr_.size(); # ifndef NDEBUG size_t i, j, m = dep_taddr_.size(); CppAD::vector x(n), y(m), check(m); Base max_taylor(0); bool check_zero_order = num_order_taylor_ > 0; if( check_zero_order ) { // zero order coefficients for independent vars for(j = 0; j < n; j++) { CPPAD_ASSERT_UNKNOWN( play_.GetOp(j+1) == local::InvOp ); CPPAD_ASSERT_UNKNOWN( ind_taddr_[j] == j+1 ); x[j] = taylor_[ ind_taddr_[j] * cap_order_taylor_ + 0]; } // zero order coefficients for dependent vars for(i = 0; i < m; i++) { CPPAD_ASSERT_UNKNOWN( dep_taddr_[i] < num_var_tape_ ); y[i] = taylor_[ dep_taddr_[i] * cap_order_taylor_ + 0]; } // maximum zero order coefficient not counting BeginOp at beginning // (which is correpsonds to uninitialized memory). for(i = 1; i < num_var_tape_; i++) { if( abs_geq(taylor_[i*cap_order_taylor_+0] , max_taylor) ) max_taylor = taylor_[i*cap_order_taylor_+0]; } } # endif // create the optimized recording size_t exceed = false; switch( play_.address_type() ) { case local::play::unsigned_short_enum: exceed = local::optimize::optimize_run( options, n, dep_taddr_, &play_, &rec ); break; case local::play::unsigned_int_enum: exceed = local::optimize::optimize_run( options, n, dep_taddr_, &play_, &rec ); break; case local::play::size_t_enum: exceed = local::optimize::optimize_run( options, n, dep_taddr_, &play_, &rec ); break; default: CPPAD_ASSERT_UNKNOWN(false); } exceed_collision_limit_ = exceed; // number of variables in the recording num_var_tape_ = rec.num_var_rec(); // now replace the recording play_.get_recording(rec, n); // set flag so this function knows it has been optimized has_been_optimized_ = true; // free memory allocated for sparse Jacobian calculation // (the results are no longer valid) for_jac_sparse_pack_.resize(0, 0); for_jac_sparse_set_.resize(0,0); // free old Taylor coefficient memory taylor_.clear(); num_order_taylor_ = 0; cap_order_taylor_ = 0; // resize and initilaize conditional skip vector // (must use player size because it now has the recoreder information) cskip_op_.resize( play_.num_op_rec() ); // resize subgraph_info_ subgraph_info_.resize( ind_taddr_.size(), // n_ind dep_taddr_.size(), // n_dep play_.num_op_rec(), // n_op play_.num_var_rec() // n_var ); # ifndef NDEBUG if( check_zero_order ) { std::stringstream s; // // zero order forward calculation using new operation sequence check = Forward(0, x, s); // check results Base eps99 = Base(99) * CppAD::numeric_limits::epsilon(); for(i = 0; i < m; i++) if( ! abs_geq( eps99 * max_taylor , check[i] - y[i] ) ) { std::string msg = "Error during check of f.optimize()."; msg += "\neps99 * max_taylor = " + to_string(eps99 * max_taylor); msg += "\ncheck[i] = " + to_string(check[i]); msg += "\ny[i] = " + to_string(y[i]); CPPAD_ASSERT_KNOWN( abs_geq( eps99 * max_taylor , check[i] - y[i] ) , msg.c_str() ); } // Erase memory that this calculation was done so NDEBUG gives // same final state for this object (from users perspective) num_order_taylor_ = 0; } # endif # if CPPAD_CORE_OPTIMIZE_PRINT_RESULT // size of operation sequence after optimizatiton size_t size_op_after = size_op(); std::cout << "optimize: size_op: before = " << size_op_before << ", after = " << size_op_after << "\n"; # endif } } // END_CPPAD_NAMESPACE # undef CPPAD_CORE_OPTIMIZE_PRINT_RESULT # endif