Reward maximization is a core concept in reinforcement learning: it is the goal of the RL agent. In RL, a reward is positive feedback received for taking an action that transitions the agent from one state to another. If the agent performs a good action by following an optimal policy, it receives a reward; if it performs a bad action, it receives a penalty (a negative reward). The agent's goal is to maximize the cumulative reward it collects by following an optimal policy, which is termed reward maximization.
The RL agent fundamentally operates on the reward-maximization hypothesis. That is why, at each step, the agent should choose the best possible action in order to maximize its reward.
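The idea above can be sketched in code. This is a minimal, hypothetical illustration (the environment, rewards, and Q-values are made up for the example): the agent's objective is the cumulative discounted reward, and a greedy policy picks the action with the highest estimated value.

```python
def discounted_return(rewards, gamma=0.9):
    """Cumulative discounted reward: G = r_0 + gamma*r_1 + gamma^2*r_2 + ...

    This is the quantity the agent tries to maximize.
    """
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

def greedy_action(q_values):
    """Pick the action with the highest estimated value (an optimal policy
    always selects the action expected to maximize future reward)."""
    return max(q_values, key=q_values.get)

# Good actions yield positive rewards; bad actions subtract from the total.
episode_rewards = [1.0, 1.0, -1.0, 1.0]
print(discounted_return(episode_rewards))

# Hypothetical action-value estimates for one state.
q = {"left": 0.2, "right": 0.8}
print(greedy_action(q))  # the agent chooses "right"
```

In practice, the Q-values themselves would be learned (for example, with Q-learning), but the principle is the same: the policy that maximizes the expected cumulative reward defines the agent's best behavior.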