Note that since both of these methods are point estimates (they yield a single value rather than a distribution), neither is fully Bayesian. A faithful Bayesian would use a model that yields a posterior distribution over all possible values of θ, but this is often intractable or computationally very expensive.

Now suppose we have a coin with unknown bias θ, the probability of landing heads. We want to estimate the bias by finding the value of θ that maximizes the likelihood of the observed tosses (or, when a prior is included, the posterior). We toss the coin n = 10 times, and 3 of the tosses come up heads.
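As a minimal sketch of the coin example, the snippet below takes the maximum-likelihood reading and assumes the tosses are independent, so the number of heads follows a binomial distribution. The names `log_likelihood`, `theta_grid`, and `theta_closed` are illustrative, not from the original text. It maximizes the log-likelihood over a grid of candidate θ values and compares the result with the closed-form estimate k / n.

```python
import math

def log_likelihood(theta, n, k):
    """Binomial log-likelihood of k heads in n tosses, up to a constant.

    The term log C(n, k) is dropped because it does not depend on theta
    and therefore does not change where the maximum is.
    """
    return k * math.log(theta) + (n - k) * math.log(1 - theta)

n, k = 10, 3  # 10 tosses, 3 heads (values from the example above)

# Candidate biases strictly between 0 and 1, in steps of 0.001.
grid = [i / 1000 for i in range(1, 1000)]

# Point estimate: the grid value with the highest log-likelihood.
theta_grid = max(grid, key=lambda t: log_likelihood(t, n, k))

# Closed-form maximizer of the binomial likelihood.
theta_closed = k / n

print(theta_grid, theta_closed)
```

Both approaches give a single number for θ, which is exactly what makes this a point estimate rather than a posterior distribution.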