Designing Optimal Control Laws for Underactuated Mechanical Systems

Introduction to Underactuated Mechanical Systems

Underactuated mechanical systems are a fundamental class of systems in which the number of independent control inputs is strictly less than the number of degrees of freedom. This inherent limitation makes their control far more challenging than that of fully actuated systems, yet they appear pervasively across modern engineering. Examples include robotic manipulators with passive joints, aerial vehicles such as quadrotors and fixed-wing aircraft, surface and underwater vessels, flexible structures, and walking or running robots. The study of optimal control laws for these systems is not merely an academic exercise—it is essential for achieving high performance, safety, and efficiency in real-world applications ranging from autonomous vehicles to industrial automation.

Underactuation can arise from design choices (e.g., reducing actuator weight or cost), physical constraints (e.g., a ship cannot apply direct side force), or environmental interaction (e.g., a walking robot’s foot contact with the ground). In all cases, the controller must exploit the system’s dynamics—including gravity, inertia, and coupling—to maneuver effectively. Designing an optimal control law that minimizes a performance criterion while respecting constraints is a rich and active area of research.

Understanding Underactuated Mechanical Systems

Core Characteristics

An underactuated system is defined by its configuration space and control space: if the configuration vector has dimension n and the control vector has dimension m, underactuation means m < n. This mismatch implies that the system cannot directly command all acceleration directions. Instead, the controller must rely on dynamic coupling between actuated and unactuated degrees of freedom. For instance, a classic inverted pendulum on a cart is underactuated: the cart provides force (one control input), but the system has two degrees of freedom (cart position and pendulum angle). The pendulum’s motion is influenced only through the cart’s acceleration.
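
As a minimal sketch of the m < n test, the snippet below counts how many acceleration directions lack a direct input by comparing the rank of an input matrix with the number of coordinates. The function name `degree_of_underactuation` and both example matrices are illustrative assumptions, not part of any library.

```python
import numpy as np

def degree_of_underactuation(B):
    """Return n - rank(B): the number of directions with no direct input."""
    B = np.atleast_2d(np.asarray(B, dtype=float))
    n = B.shape[0]
    return int(n - np.linalg.matrix_rank(B))

# Cart-pole: q = [cart position, pendulum angle]; the force acts on the cart only.
B_cartpole = np.array([[1.0], [0.0]])

# Hypothetical variant with an extra motor at the pivot (fully actuated).
B_full = np.eye(2)

print("cart-pole:", degree_of_underactuation(B_cartpole))
print("with pivot motor:", degree_of_underactuation(B_full))
```

A nonzero result means the controller has to reach the remaining directions through dynamic coupling.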

Prominent Examples

Cart-Pole System: A movable cart with a freely swinging pendulum. Used extensively in control theory education and as a benchmark for nonlinear control.
Rotary Inverted Pendulum (Furuta Pendulum): A pendulum attached to a rotating arm; the motor only controls the arm’s rotation, but the pendulum’s angle is unactuated.
Quadrotor UAV: Four propellers provide thrust and torque, yet a quadrotor has six degrees of freedom (position and orientation). It is underactuated because it cannot independently control all translational and rotational motions—it must tilt to generate horizontal acceleration.
Walking Robots (e.g., bipeds, quadrupeds): The floating base is not directly actuated, and during the stance phase foot contact with the ground imposes constraints. The actuated joints are fewer than the total degrees of freedom, so balance and gait must be planned carefully.
Autonomous Underwater Vehicles (AUVs): Often have fewer thrusters than degrees of freedom, making them underactuated, especially in hovering or maneuvering at low speed.
Flexible Structures: Large space structures or robotic arms with flexible links have infinitely many vibration modes but only a few actuators, demanding advanced control for damping and precision.

Why Underactuation Matters

Underactuated systems offer advantages in weight, cost, and energy efficiency. They also mimic biological locomotion: animals and humans are inherently underactuated, using passive dynamics and coordination to move elegantly. However, their dynamics are fundamentally nonlinear, and in many cases the unactuated equations of motion act as nonintegrable (second-order nonholonomic) constraints on acceleration, so a target configuration can be reached only along particular paths rather than by commanding each coordinate independently. This property makes optimal control design both critical and demanding.

Goals of Optimal Control Design

Stability

The primary goal of any control system is to ensure that the closed-loop system remains stable. For underactuated systems, stability often involves regulating the system to an equilibrium point (e.g., balancing a pendulum or hovering a quadrotor) or along a desired trajectory. Lyapunov-based methods are commonly employed to guarantee asymptotic or exponential stability in the sense of Lyapunov. Because underactuated dynamics are nonlinear, linearization is only valid locally; global stabilization may require energy-shaping or passivity-based control.

Performance

Performance is typically quantified by an objective or cost function that captures energy consumption, time to reach a target, tracking error, control effort, or a combination thereof. Optimal control seeks to minimize (or maximize) this cost subject to the system dynamics and constraints. For underactuated systems, trade-offs are unavoidable: a fast swing-up of a pendulum consumes more energy and may cause large overshoot, whereas a slow swing-up might be smooth but inefficient. The designer must choose appropriate weighting in the cost function to align with mission requirements.
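
The effect of cost weighting can be sketched with a discrete-time LQR design on a cart-pole linearized about the upright equilibrium. Everything below is an assumption made for illustration: the parameter values, the forward-Euler discretization, the weights, and the helper name `dlqr_gain`, which iterates the Riccati difference equation rather than calling a dedicated solver.

```python
import numpy as np

# Assumed parameters; state is [x, phi, x_dot, phi_dot], phi = angle from upright.
m_c, m_p, l, g = 1.0, 0.1, 0.5, 9.81
A = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, m_p * g / m_c, 0.0, 0.0],
              [0.0, (m_c + m_p) * g / (m_c * l), 0.0, 0.0]])
B = np.array([[0.0], [0.0], [1.0 / m_c], [1.0 / (m_c * l)]])

dt = 0.01
Ad, Bd = np.eye(4) + dt * A, dt * B  # forward-Euler discretization

def dlqr_gain(Ad, Bd, Q, R, max_iters=20000, tol=1e-9):
    """Discrete-time LQR gain obtained by iterating the Riccati equation."""
    P = Q.copy()
    for _ in range(max_iters):
        K = np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
        P_next = Q + Ad.T @ P @ (Ad - Bd @ K)
        converged = np.max(np.abs(P_next - P)) <= tol * np.max(np.abs(P_next))
        P = P_next
        if converged:
            break
    return np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)

Q = np.diag([1.0, 10.0, 0.1, 0.1])       # state weights (assumed)
x0 = np.array([0.0, 0.1, 0.0, 0.0])      # 0.1 rad initial tilt
results = {}
for r in (0.01, 10.0):                    # cheap vs. expensive control effort
    K = dlqr_gain(Ad, Bd, Q, np.array([[r]]))
    rho = np.max(np.abs(np.linalg.eigvals(Ad - Bd @ K)))
    results[r] = {"K": K, "spectral_radius": rho,
                  "initial_force": (-K @ x0).item()}
    print(f"R={r}: spectral radius {rho:.4f}, "
          f"initial force {results[r]['initial_force']:.2f} N")
```

Both designs stabilize the linear model; the spectral radius and the initial force indicate how the choice of R trades convergence speed against control effort.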

Robustness

Real-world systems are subject to parameter uncertainties (e.g., mass, inertia, friction), external disturbances (wind, waves, payload variation), and unmodeled effects (e.g., actuator dynamics, structural flexibility, sensor noise). An optimal control law must perform reliably under these conditions. Approaches such as robust control theory, model predictive control (MPC) with constraints, or adaptive control can enhance robustness. The cost function can also include robustness metrics, such as H-infinity norms, to explicitly trade performance for stability margins.

Additional Objectives

Constraint Satisfaction: Many underactuated systems operate within physical limits—actuator saturation, joint limits, obstacle avoidance, or contact forces. Optimal control must ensure constraints are respected, often via barrier functions or constrained optimization.
Trajectory Planning: In many applications (e.g., robotic manipulation, drone flight), the controller must generate feasible, near-optimal trajectories that satisfy differential constraints (nonholonomic dynamics).
Energy Efficiency: Particularly important for battery-powered or unmanned systems. Optimal control can reduce energy consumption by exploiting natural dynamics (e.g., swing-up of a pendulum using pumping motion).

Methods for Designing Optimal Control Laws

Optimal Control Theory

The foundation of optimal control for underactuated systems lies in variational principles. Two cornerstones are Pontryagin’s Minimum Principle (PMP) and Dynamic Programming (which leads to the Hamilton-Jacobi-Bellman (HJB) equation).

Pontryagin’s Minimum Principle: PMP provides necessary conditions for optimality. It introduces the Hamiltonian, adjoint variables (costates), and boundary conditions. For underactuated systems with bounded inputs, PMP can be used to derive bang-bang or singular control arcs, which often appear in minimum-time or minimum-fuel problems. Solving the resulting two-point boundary value problem (BVP) numerically yields the optimal open-loop control. While powerful, PMP typically does not handle state constraints easily and may require homotopy methods.
Dynamic Programming: The HJB equation offers a sufficient condition for global optimality via a value function. However, for high-dimensional underactuated systems, solving the HJB PDE is computationally intractable (curse of dimensionality). Approximate dynamic programming (ADP) and neuro-dynamic programming have been explored to scale these methods, often using neural networks to represent the value function.
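
For reference, the two sets of conditions named above can be written out in their standard textbook form. The notation is assumed here: the state is x = (q, q̇) with first-order dynamics ẋ = f(x, u), L is a running cost, φ is a terminal cost, the final time T is fixed, the terminal state is free, and state constraints are ignored.

```latex
% Hamiltonian with costate \lambda
H(x, u, \lambda) = L(x, u) + \lambda^{\top} f(x, u)

% Pontryagin's Minimum Principle: necessary conditions along an optimal trajectory
\dot{x}^{*} = \frac{\partial H}{\partial \lambda}, \qquad
\dot{\lambda} = -\frac{\partial H}{\partial x}, \qquad
u^{*}(t) = \arg\min_{u \in \mathcal{U}} H\bigl(x^{*}(t), u, \lambda(t)\bigr), \qquad
\lambda(T) = \left.\frac{\partial \varphi}{\partial x}\right|_{x^{*}(T)}

% Hamilton-Jacobi-Bellman equation for the value function V(x, t)
-\frac{\partial V}{\partial t}
  = \min_{u \in \mathcal{U}} \left[ L(x, u)
  + \left(\frac{\partial V}{\partial x}\right)^{\top} f(x, u) \right], \qquad
V(x, T) = \varphi(x)
```

The PMP conditions form a boundary value problem in (x, λ) along a single trajectory, whereas the HJB equation is a partial differential equation over the whole state space, which is the source of the curse of dimensionality.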

Lyapunov-Based Control Design

Lyapunov’s direct method is a powerful tool for synthesizing stabilizing feedback control laws, even for underactuated systems. The idea is to construct a positive definite candidate Lyapunov function V(x) and then derive the control u so that its derivative along trajectories V̇(x) is negative definite (or semidefinite). For underactuated systems, the control may not appear directly in V̇ for all states, requiring energy-shaping or interconnection-and-damping assignment passivity-based control (IDA-PBC). In IDA-PBC, the target dynamics are chosen to be a port-Hamiltonian system with a desired energy function, and the control law is computed to match the system’s structure. This method has been successfully applied to mechanical systems like the Furuta pendulum and underwater vehicles.

Feedback Linearization

Feedback linearization seeks to transform a nonlinear system into a fully or partially linear one using a diffeomorphism and nonlinear state feedback. An underactuated system may be input-output linearizable but not fully linearizable, in which case internal dynamics remain. When the output is held at zero, these internal dynamics are called the zero dynamics, and they must be stable for the controller to work. For example, when the controlled output of a cart-pole is the pendulum angle relative to vertical, the cart motion becomes the internal dynamics and must remain bounded. A common approach combines feedback linearization for the output with a linear controller (e.g., LQR) on the linearized subsystem, while analyzing the zero dynamics separately. Near the equilibrium, this technique can achieve excellent performance, but it may fail globally if the zero dynamics become unstable.
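
One concrete instance is collocated partial feedback linearization of the cart-pole, in which the actuated coordinate (the cart position) is linearized and the pendulum becomes the internal dynamics. The sketch below assumes a frictionless point-mass model with the angle measured from the hanging position; the parameter values and the function names `cartpole_accel` and `pfl_force` are illustrative.

```python
import numpy as np

m_c, m_p, l, g = 1.0, 0.1, 0.5, 9.81  # assumed parameters

def cartpole_accel(theta, theta_dot, f):
    """Forward dynamics: return (x_ddot, theta_ddot); theta = 0 is hanging down."""
    s, c = np.sin(theta), np.cos(theta)
    M = np.array([[m_c + m_p, m_p * l * c],
                  [m_p * l * c, m_p * l**2]])
    rhs = np.array([f + m_p * l * theta_dot**2 * s,
                    -m_p * g * l * s])
    return np.linalg.solve(M, rhs)

def pfl_force(theta, theta_dot, v):
    """Force that makes the cart acceleration equal the commanded value v."""
    s, c = np.sin(theta), np.cos(theta)
    return (m_c + m_p * s**2) * v - m_p * s * (g * c + l * theta_dot**2)

# Check at one arbitrary state: the cart obeys x_ddot = v, while the pendulum
# follows the remaining internal dynamics theta_ddot = -(v cos(theta) + g sin(theta)) / l.
theta, theta_dot, v = 0.7, -1.2, 2.5
x_ddot, theta_ddot = cartpole_accel(theta, theta_dot, pfl_force(theta, theta_dot, v))
print(x_ddot, theta_ddot)
```

The outer loop then chooses v (for instance by a linear or energy-based law), and the pendulum equation shows why the internal dynamics still need separate analysis.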

Model Predictive Control (MPC)

Model predictive control solves a finite-horizon optimal control problem online, repeatedly, using a model of the system. For underactuated systems, nonlinear MPC (NMPC) is especially attractive because it can handle constraints, nonlinear dynamics, and multiple objectives simultaneously. The controller predicts future states over a horizon N and optimizes control inputs while respecting state and input constraints. At each sampling instant, only the first control move is applied, and the optimization is solved again with updated measurements. Advances in computational power and efficient solvers (e.g., sequential quadratic programming, real-time iteration schemes) have made NMPC feasible for systems like quadrotors, walking robots, and autonomous vehicles. The cost function in MPC can encode tracking error, energy consumption, and comfort. However, the need for accurate dynamics models and the computational burden remain challenges, especially for fast dynamics.
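
The receding-horizon loop can be sketched in a few lines for the simplest possible case: a linearized cart-pole with a quadratic cost and no constraints, where each horizon problem is solved exactly by a backward Riccati sweep. This is a deliberate simplification. Handling constraints or nonlinear dynamics, as NMPC does, would require a QP or NLP solver in place of `solve_horizon`. The parameters, weights, and horizon length are assumptions.

```python
import numpy as np

# Assumed parameters; state is [x, phi, x_dot, phi_dot], phi = angle from upright.
m_c, m_p, l, g = 1.0, 0.1, 0.5, 9.81
A = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, m_p * g / m_c, 0.0, 0.0],
              [0.0, (m_c + m_p) * g / (m_c * l), 0.0, 0.0]])
B = np.array([[0.0], [0.0], [1.0 / m_c], [1.0 / (m_c * l)]])
dt = 0.02
Ad, Bd = np.eye(4) + dt * A, dt * B      # forward-Euler prediction model

Q = np.diag([10.0, 10.0, 1.0, 1.0])      # stage and terminal weight (assumed)
R = np.array([[0.1]])
N = 100                                   # horizon: N * dt = 2 s

def solve_horizon(x, Ad, Bd, Q, R, N):
    """Solve the N-step problem from state x; return the planned input sequence."""
    P, gains = Q.copy(), []
    for _ in range(N):                    # backward sweep: K_{N-1}, ..., K_0
        K = np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
        P = Q + Ad.T @ P @ (Ad - Bd @ K)
        gains.append(K)
    gains.reverse()
    u_seq, x_pred = [], x.copy()
    for K in gains:                       # forward rollout of the model
        u = -K @ x_pred
        u_seq.append(u)
        x_pred = Ad @ x_pred + Bd @ u
    return np.array(u_seq)

x = np.array([0.0, 0.1, 0.0, 0.0])       # start with a 0.1 rad tilt
history = [x.copy()]
for _ in range(400):                      # 8 s of closed-loop simulation
    u_seq = solve_horizon(x, Ad, Bd, Q, R, N)
    u0 = u_seq[0]                         # apply only the first move
    x = Ad @ x + Bd @ u0                  # plant step (here identical to the model)
    history.append(x.copy())
history = np.array(history)
print("final state norm:", np.linalg.norm(history[-1]))
```

Because plant and model coincide here, re-solving at every step changes nothing; the benefit of the loop appears once disturbances, model mismatch, or constraints enter.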

Energy-Based Control

Many underactuated mechanical systems, particularly those with passive degrees of freedom, can be controlled by shaping their total energy. The classic example is the swing-up of a pendulum: by pumping energy into the system (accelerating the pivot or applying torque) at the right phase, the pendulum gains enough total energy to reach the inverted position. Energy-based controllers often rely on passivity, exploiting the fact that a mechanical system cannot store more energy than is supplied through its inputs. A Lyapunov function can be constructed from the squared difference between the actual and desired energy, with damping terms added as needed. Such controllers are typically robust and do not require full state trajectory planning; they simply drive the energy to a reference value, after which a local stabilizer (e.g., LQR) usually takes over near the target equilibrium.
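
A minimal sketch of energy pumping, using a torque-limited simple pendulum as a stand-in for the passive joint of a larger system. The control law u = -k θ̇ (E - E_des) gives dE/dt = u θ̇, so the energy error shrinks whenever the pendulum is moving. The parameters, the gain k, and the torque limit are assumed values chosen for illustration.

```python
import numpy as np

m, l, g = 1.0, 1.0, 9.81     # assumed pendulum parameters
u_max = 2.0                  # torque limit, well below m*g*l
k = 0.1                      # energy-feedback gain (hypothetical tuning)
E_des = m * g * l            # energy of the upright equilibrium

def energy(theta, omega):
    """Total energy; theta = 0 is hanging down."""
    return 0.5 * m * l**2 * omega**2 - m * g * l * np.cos(theta)

def controller(theta, omega):
    u = -k * omega * (energy(theta, omega) - E_des)
    return float(np.clip(u, -u_max, u_max))

def f(state):
    theta, omega = state
    u = controller(theta, omega)
    return np.array([omega, (-m * g * l * np.sin(theta) + u) / (m * l**2)])

def rk4_step(state, dt):
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

state = np.array([0.1, 0.0])  # small initial displacement, at rest
dt = 0.002
for _ in range(15000):        # 30 s of simulation
    state = rk4_step(state, dt)

energy_error = energy(*state) - E_des
print("final energy error [J]:", energy_error)
```

The controller regulates energy only, so the trajectory approaches the orbit that passes through the upright position without stopping there.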

Sliding Mode Control (SMC) and Variable Structure

SMC is a robust nonlinear method that forces the system to slide along a designed surface in state space. It is effective for underactuated systems with matched uncertainties, as it can handle nonlinearities and disturbances. However, chattering due to discontinuous switching can degrade performance. Modifications like higher-order SMC or boundary layer smoothing mitigate this. SMC can be combined with optimal control principles by designing the sliding surface to minimize an integral cost (e.g., sliding mode with LQR-optimal sliding surface). This is particularly useful for systems like robotic manipulators with passive joints.
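
The switching law with boundary-layer smoothing can be illustrated on a single actuated coordinate with a matched disturbance. This sketch shows only the sliding-surface mechanics; it does not address the coupling to unactuated coordinates. The surface slope, switching gain, boundary-layer width, and disturbance are all assumed values.

```python
import numpy as np

lam, K, phi = 2.0, 3.0, 0.05   # surface slope, switching gain, boundary-layer width
dt, steps = 0.001, 10000       # 10 s of simulation

def disturbance(t):
    """Matched disturbance with |d| <= 1 < K."""
    return np.sin(3.0 * t)

def smc_control(x, v):
    s = v + lam * x                                    # sliding variable
    return -lam * v - K * np.clip(s / phi, -1.0, 1.0)  # saturation replaces sign(s)

x, v = 1.0, 0.0
s_hist = []
for i in range(steps):
    a = smc_control(x, v) + disturbance(i * dt)        # x_ddot = u + d
    x, v = x + dt * v, v + dt * a                      # forward-Euler step
    s_hist.append(v + lam * x)
s_hist = np.array(s_hist)

print("final position:", x)
print("max |s| over the last second:", np.max(np.abs(s_hist[-1000:])))
```

With the saturation in place of a pure sign function, the state settles into a small neighborhood of the surface instead of chattering across it, at the cost of a nonzero residual error.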

Reinforcement Learning (RL) for Optimal Control

In recent years, reinforcement learning (especially deep RL) has emerged as a complementary approach to traditional optimal control for underactuated systems. Algorithms such as DDPG (Deep Deterministic Policy Gradient), PPO, and SAC learn policies (control laws) directly from interactions with the environment or from simulations. RL can handle complex nonlinear dynamics and unknown models, making it appealing for challenging underactuated tasks like acrobatic maneuvers, bipedal walking, or drone racing. However, sample efficiency, safety guarantees, and stability proofs remain open challenges. Many researchers combine RL with model-based optimal control (e.g., MPC-guided policy search) to benefit from both data-driven and model-based strengths.

Mathematical Formulation of the Optimal Control Problem

A typical optimal control problem for an underactuated mechanical system is formulated as follows. The system dynamics are given by the second-order differential equation:

M(q) q̈ + C(q, q̇) q̇ + G(q) = B(q) u

where q ∈ ℝⁿ are generalized coordinates, M(q) is the inertia matrix, C(q, q̇) contains Coriolis and centrifugal terms, G(q) represents gravitational forces, u ∈ ℝᵐ is the control input, and B(q) is the input mapping matrix. Underactuation implies that rank(B) = m < n. The optimal control seeks to minimize:

J = φ(q(T), q̇(T)) + ∫₀ᵀ L(q(t), q̇(t), u(t)) dt

subject to state constraints (e.g., joint limits, velocity bounds) and input constraints (e.g., actuator saturation). The terminal cost φ and running cost L are chosen by the designer. Solving this problem directly is generally intractable analytically for nonlinear underactuated systems, hence the reliance on numerical methods or approximate solutions via the techniques described above.
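
To make the formulation concrete, the sketch below instantiates M, C, G, and B for a cart-pole and evaluates a discretized version of J for a given input sequence. The model (frictionless, point-mass pendulum, angle measured from hanging), the quadratic form of the costs, the weights, and the function names are assumptions for illustration.

```python
import numpy as np

m_c, m_p, l, g = 1.0, 0.1, 0.5, 9.81  # assumed parameters

def manipulator_terms(q, qd):
    """Return M(q), C(q, qd), G(q), B for q = [x, theta], theta = 0 hanging down."""
    s, c = np.sin(q[1]), np.cos(q[1])
    M = np.array([[m_c + m_p, m_p * l * c],
                  [m_p * l * c, m_p * l**2]])
    C = np.array([[0.0, -m_p * l * qd[1] * s],
                  [0.0, 0.0]])
    G = np.array([0.0, m_p * g * l * s])
    B = np.array([[1.0], [0.0]])
    return M, C, G, B

def forward_dynamics(q, qd, u):
    """Solve M qdd = B u - C qd - G for the accelerations."""
    M, C, G, B = manipulator_terms(q, qd)
    return np.linalg.solve(M, B @ u - C @ qd - G)

def total_cost(q0, qd0, u_seq, dt, q_goal, r=0.1, w_f=10.0):
    """Rectangle-rule approximation of J with quadratic running and terminal costs."""
    q, qd, J = np.array(q0, dtype=float), np.array(qd0, dtype=float), 0.0
    for u in u_seq:
        err = np.concatenate([q - q_goal, qd])
        J += dt * 0.5 * (err @ err + r * float(u @ u))
        qdd = forward_dynamics(q, qd, u)
        q, qd = q + dt * qd, qd + dt * qdd   # forward-Euler step
    err = np.concatenate([q - q_goal, qd])
    return J + 0.5 * w_f * (err @ err)

q_goal = np.array([0.0, np.pi])              # upright pendulum, cart at the origin
u_zero = np.zeros((100, 1))                  # 1 s of zero input
J_rest = total_cost([0.0, 0.0], [0.0, 0.0], u_zero, 0.01, q_goal)
print("cost of doing nothing:", J_rest)
```

A trajectory optimizer would search over `u_seq` to reduce this number while respecting input and state limits.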

Challenges in Designing Optimal Control Laws for Underactuated Systems

Nonlinearity and Nonholonomy

Underactuated systems are inherently nonlinear. Their dynamics can exhibit complex behaviors such as bifurcations, chaotic motion, and singularities. Many are also nonholonomic: the achievable velocities at any configuration are confined to a subspace, and the constraints are not integrable (e.g., a rolling wheel or a snake robot). This complicates trajectory planning, because the controller cannot move the system along an arbitrary path in configuration space; it must compose motions that satisfy the differential constraints. Optimal control must account for these constraints explicitly, often through numerical transcription methods such as multiple shooting or direct collocation.
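
In direct collocation, the dynamics enter the optimization as equality constraints called defects, one per pair of neighboring knot points. The sketch below computes trapezoidal defects for a simple pendulum and compares a dynamically consistent trajectory with a naive straight-line guess. The pendulum model, step size, and function names are assumptions; a full solver would pass these defects to a nonlinear programming routine.

```python
import numpy as np

g_over_l = 9.81  # assumed pendulum with unit length and unit inertia

def f(x, u):
    theta, omega = x
    return np.array([omega, -g_over_l * np.sin(theta) + u])

def trapezoid_defects(X, U, h):
    """Defect between knots k and k+1: x_{k+1} - x_k - h/2 (f_k + f_{k+1})."""
    F = np.array([f(x, u) for x, u in zip(X, U)])
    return X[1:] - X[:-1] - 0.5 * h * (F[1:] + F[:-1])

def rk4(x, u, h):
    k1 = f(x, u)
    k2 = f(x + 0.5 * h * k1, u)
    k3 = f(x + 0.5 * h * k2, u)
    k4 = f(x + h * k3, u)
    return x + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

h, N = 0.01, 100
U = np.zeros(N + 1)                          # unforced motion

# Dynamically consistent trajectory from an accurate integrator.
X_sim = [np.array([1.0, 0.0])]
for k in range(N):
    X_sim.append(rk4(X_sim[-1], U[k], h))
X_sim = np.array(X_sim)

# Naive initial guess: straight line in angle, zero velocity.
X_guess = np.column_stack([np.linspace(1.0, 0.0, N + 1), np.zeros(N + 1)])

defect_sim = np.max(np.abs(trapezoid_defects(X_sim, U, h)))
defect_guess = np.max(np.abs(trapezoid_defects(X_guess, U, h)))
print("max defect, simulated trajectory:", defect_sim)
print("max defect, straight-line guess:", defect_guess)
```

The optimizer's job is to drive every defect to zero while minimizing the cost, which is how the differential constraints are enforced without ever integrating the dynamics forward.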

Input and State Constraints

Physical limitations such as motor torque limits, joint angle limits, and obstacle avoidance impose inequalities on both states and inputs. Enforcing these within an optimal control framework increases problem complexity. For example, a quadrotor cannot exceed its maximum thrust; a walking robot must keep its foot contact force within friction cones. Feasibility of the optimal solution must be guaranteed, which may require constraint softening or barrier functions.

Model Uncertainty and Disturbances

Accurate models of underactuated systems are often difficult to obtain. Friction, flexibility, actuator dynamics, and environmental interactions (e.g., air resistance, ground contacts) introduce uncertainties. Optimal control laws designed for an idealized model may perform poorly in reality. Robust optimal control or adaptive optimal control (e.g., using system identification paired with MPC) is therefore an active research area. The trade-off between robustness and optimality is particularly sharp.

Computational Complexity

Real-time optimal control for underactuated systems demands fast solvers. Nonlinear MPC requires solving a constrained optimization problem at every sampling instant; for high-dimensional systems (e.g., humanoid robots with dozens of joints), this can be computationally prohibitive. Model reduction, explicit MPC, and offline optimization (e.g., motion primitives) are common workarounds. The advent of GPU acceleration and specialized hardware (e.g., FPGA-based solvers) is pushing the boundaries, but complexity remains a major hurdle.

Global Optimum vs. Local Optima

The cost landscape for underactuated systems is often nonconvex and riddled with local minima. Gradient-based solvers applied to transcriptions such as direct collocation or single shooting may converge to suboptimal solutions unless the initial guess is good. Global optimization methods (e.g., particle swarm, simulated annealing) exist but are generally too slow for real-time use. Hybrid approaches that combine sampling-based planning (RRT, PRM) with local optimization (e.g., CHOMP, TrajOpt) are popular in motion planning for underactuated robots.

Case Studies and Applications

Quadrotor Trajectory Tracking

Quadrotors are a classic testbed for underactuated optimal control. A typical optimal controller minimizes jerk or snap (the third and fourth derivatives of position, respectively) to generate smooth, energy-efficient trajectories while respecting thrust limits. Nonlinear MPC is widely used: the quadrotor’s state (position, velocity, orientation, angular velocity) is predicted over a short horizon (0.5–2 seconds) while optimizing for tracking error and input effort. The underactuation is handled by tilting the thrust vector, which couples translation to rotation. Recent work has incorporated obstacle avoidance and wind disturbance rejection (e.g., using robust tube-based MPC).
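
For a single axis and a rest-to-rest move with zero velocity, acceleration, and jerk at both ends, the minimum-snap profile is a seventh-degree polynomial in normalized time. The sketch below evaluates it and computes the tilt angle a planar quadrotor would need at constant altitude, where the thrust direction must align with (ẍ, g). The 5 m distance, 3 s duration, and planar constant-altitude model are assumptions for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Normalized rest-to-rest minimum-snap profile s(tau), tau in [0, 1]:
# s = 35 tau^4 - 84 tau^5 + 70 tau^6 - 20 tau^7 (coefficients in ascending order).
s = Polynomial([0, 0, 0, 0, 35, -84, 70, -20])

def min_snap_axis(x0, x1, T, t):
    """Position, velocity, and acceleration along one axis for a move of duration T."""
    tau = np.asarray(t) / T
    d = x1 - x0
    pos = x0 + d * s(tau)
    vel = d * s.deriv(1)(tau) / T
    acc = d * s.deriv(2)(tau) / T**2
    return pos, vel, acc

g = 9.81
T = 3.0                                   # move duration [s] (assumed)
t = np.linspace(0.0, T, 301)
pos, vel, acc = min_snap_axis(0.0, 5.0, T, t)

# At constant altitude the thrust vector points along (acc, g), so the
# required tilt follows directly from the horizontal acceleration.
tilt = np.arctan2(acc, g)
max_tilt_deg = np.degrees(np.max(np.abs(tilt)))
print("peak tilt [deg]:", max_tilt_deg)
```

Shortening T raises the peak acceleration quadratically and the required tilt with it, which is one way thrust and attitude limits constrain the feasible trajectory duration.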

Bipedal Walking

Walking robots like the bipedal Cassie or Asimo are highly underactuated during the single support phase (only one foot in contact, ankle torques limited). Optimal control is used to plan and stabilize gait cycles. The problem is often formulated as a hybrid system (continuous dynamics + discrete foot impacts). Methods include direct transcription of the optimal control problem (e.g., using FROST or OCS2) combined with model predictive control for online adaptation. Energy-optimal gaits that minimize cost of transport (COT) are a major goal, achieved by exploiting passive dynamics through spring-mass models and optimizing ground reaction forces.

Autonomous Underwater Vehicles (AUVs)

Many AUVs use only a few thrusters (e.g., two rear and two vertical) to navigate six degrees of freedom. Optimal control laws for trajectory tracking or hovering must account for hydrodynamic drag, currents, and coupling between axes (e.g., pitch affects forward speed). Lyapunov-based methods and MPC have both been applied. The challenge of underactuation is acute at low speeds (where fins lose effectiveness), requiring careful design of oscillatory motion or using steady angles.

Future Directions and Research Trends

The field of optimal control for underactuated mechanical systems is evolving rapidly. Several key trends are shaping its future:

Learning-Based Control: Deep reinforcement learning and imitation learning are being integrated with model-based optimal control. Hybrid approaches (e.g., learning residual dynamics or cost functions) promise to combine the data efficiency of MPC with the adaptability of RL. Efforts to guarantee stability and safety via Lyapunov functions in the learning loop (e.g., neural Lyapunov control) are gaining traction.
Safety-Critical Control: Real-world deployment demands formal guarantees. Control barrier functions (CBFs) and control Lyapunov functions (CLFs) provide provable safety and stability, respectively. Combining CBF-CLF-based quadratic programs (QPs) with optimal control yields a framework that trades performance for safety in a computationally efficient manner.
Cooperative and Multi-Agent Systems: Underactuated systems often operate in swarms (e.g., drone formations, robot teams). Distributed optimal control, where each agent solves a local optimization problem while communicating with neighbors, is an active area. Underactuation adds complexity because each agent must coordinate its motion constraints.
Reduced-Order and Learned Models: Model reduction techniques (e.g., Proper Orthogonal Decomposition, Dynamic Mode Decomposition) and learned models (e.g., neural ordinary differential equations) enable real-time optimal control of high-dimensional underactuated systems by compressing the dynamics to a lower-dimensional latent space.
Differentiable Simulation: Tools such as MuJoCo (through its JAX-based MJX backend or finite-difference derivatives), PyTorch-based differentiable physics engines, and CasADi (through algorithmic differentiation) provide gradients through the dynamics, allowing end-to-end optimization of control laws. This facilitates simultaneous optimization of both trajectory and high-level parameters (e.g., morphology or cost weights).
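
The safety-filter idea from the Safety-Critical Control item above can be sketched in its simplest form. For a scalar input and a single barrier constraint, the quadratic program min (u - u_nom)² subject to the CBF condition has a closed-form solution, so no QP solver is needed. The model is a hypothetical one-dimensional kinematic system ẋ = u with the safe set x ≤ x_max; the gains and the nominal controller are assumed.

```python
import numpy as np

alpha = 2.0      # gain in the CBF condition h_dot + alpha * h >= 0 (assumed)
x_max = 1.0      # safe set: h(x) = x_max - x >= 0
x_goal = 2.0     # nominal target placed outside the safe set on purpose
dt = 0.01

def nominal_control(x):
    """Proportional controller that ignores the safety limit."""
    return 1.0 * (x_goal - x)

def cbf_filter(x, u_nom):
    """Closed-form solution of  min (u - u_nom)^2  s.t.  -u + alpha * (x_max - x) >= 0."""
    return min(u_nom, alpha * (x_max - x))

x = 0.0
xs = [x]
for _ in range(1000):                     # 10 s of simulation
    u = cbf_filter(x, nominal_control(x))
    x = x + dt * u                        # forward-Euler step of x_dot = u
    xs.append(x)
xs = np.array(xs)
print("largest x reached:", xs.max())
```

The filter leaves the nominal command untouched whenever it is safe and otherwise replaces it with the closest admissible input, so the state approaches the boundary without crossing it.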

Conclusion

Designing optimal control laws for underactuated mechanical systems is a challenging but immensely rewarding endeavor. The fundamental asymmetry between control inputs and degrees of freedom forces engineers to deeply understand the dynamics, harness passive phenomena, and employ sophisticated mathematical tools. From classic methods like Pontryagin’s principle and Lyapunov design to modern approaches such as nonlinear MPC and reinforcement learning, the toolkit continues to expand. While formidable challenges remain—especially regarding computational real-time implementation and robustness—the rapid advances in algorithms, hardware, and interdisciplinary research are pushing the boundaries of what is possible. These techniques are not only enabling extraordinary new capabilities in robotics, aerospace, and marine engineering but also enriching our fundamental understanding of control and dynamical systems.

For readers interested in a deeper dive, useful resources include Russ Tedrake's course notes Underactuated Robotics (from the MIT course of the same name) and the textbook Nonlinear Systems by H. K. Khalil. For optimal control theory, Stanford's EE363 provides lecture notes on linear-quadratic control and estimation. For practitioners, the Python code accompanying Underactuated Robotics offers hands-on implementations of many control laws discussed here.