Consider a real-valued, differentiable function f(x) defined on R (or more generally on R^n).
You have access to an oracle that, for any input x, returns both the function value f(x) and its derivative f'(x) (for n = 1) or gradient ∇f(x) (for n > 1).
Answer the following:
- Describe an iterative algorithm to find a (possibly local) maximum of f starting from an initial point x_0.
- Explain how the sign and magnitude of the derivative or gradient guide the direction of your updates.
- Discuss how you would choose the step size (learning rate) for each iteration.
- Describe reasonable stopping criteria and possible issues such as converging to local rather than global maxima or overshooting.
Focus on the high-level algorithm and reasoning; you do not need to provide code.