This solution operates in O(n) time instead of O(n*log(n)) time, which
surprisingly isn't *that* big of a difference...
Consider a size of n of 10M...
1) ~10s
2) ~0.5s
So, yes, the O(n*log(n)) will take 100x longer to complete, but for an enormous
input size of 10M elements, it can still complete in under a minute. The
difference between that and the second, faster, algorithm, is just 9s.