Would it make sense to rebalance the base models to get a better score distribution? Everything on my top list seems to fall between 3.5 and 5.