In modern machine learning, optimization is almost synonymous with gradients. Adam, Adafactor, Lion, and SGD are all variants of the same idea: moving parameters downhill along a cheaply computed gradient.
However, many important problems exist where the gradient is either absent, meaningless, or too expensive to approximate. For these black-box optimization scenarios, CMA-ES (Covariance Matrix Adaptation Evolution Strategy) is a remarkably effective tool.
CMA-ES is a zeroth-order optimizer. It operates without knowing the local slope of the landscape, relying instead on the value of the objective at sampled points. This provides a principled method for navigating toward optimal solutions using only function evaluations.
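To make the ask-evaluate-tell structure concrete, here is a simplified sketch of an evolution strategy in the CMA-ES family. It is an assumption-laden toy, not the real algorithm: full CMA-ES adapts the covariance matrix and step size via evolution paths, whereas this sketch only recenters the mean on the best samples and decays the step size geometrically. The `sphere` objective and all constants are illustrative choices.

```python
import numpy as np

def sphere(x):
    # Black-box objective: only its value is observable, no gradient is used.
    return float(np.sum(x ** 2))

rng = np.random.default_rng(0)
dim, pop_size, n_parents = 5, 16, 8
mean = rng.normal(size=dim)   # center of the search distribution
sigma = 1.0                   # global step size
cov = np.eye(dim)             # covariance; full CMA-ES adapts this matrix

for generation in range(60):
    # Ask: sample candidate solutions from the current Gaussian.
    candidates = rng.multivariate_normal(mean, sigma ** 2 * cov, size=pop_size)
    # Evaluate: rank candidates purely by objective value (zeroth-order).
    fitness = np.array([sphere(c) for c in candidates])
    elite = candidates[np.argsort(fitness)[:n_parents]]
    # Tell: move the mean toward the best samples; full CMA-ES also
    # updates cov and sigma from the same ranked samples.
    mean = elite.mean(axis=0)
    sigma *= 0.95  # crude decay standing in for CMA-ES's path-based rule

print(sphere(mean))
```

Even this stripped-down loop finds the minimum of the sphere function, which illustrates the core point: ranking a population of samples is enough signal to make progress without a single gradient.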