Scaling

Feature scaling makes the optimization better conditioned and leads to faster and better convergence. So, in theory, when you are using normal equations, you don’t use numerical methods, like gradient descent. Therefore, feature scaling is not necessary.