I fine-tuned the rwkv-v7-0.4B model with LoRA and with DiSHA, but neither method loads successfully after merging: loading the merged checkpoint fails with `KeyError: 'blocks.24.att.x_r'`. Is the model architecture too small for these methods?
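For reference, here is a sanity check I can run on the merged state dict's keys. This is just a sketch: `out_of_range_block_keys` and the key list are made up for illustration, and I'm assuming the 0.4B model has 24 blocks indexed 0–23, which would make `blocks.24.*` one block past the end (i.e. the merge script may be assuming a deeper model):

```python
import re

def out_of_range_block_keys(state_dict_keys, n_layers):
    """Return keys whose block index is >= n_layers,
    i.e. keys the base model cannot contain."""
    pat = re.compile(r"blocks\.(\d+)\.")
    bad = []
    for k in state_dict_keys:
        m = pat.search(k)
        if m and int(m.group(1)) >= n_layers:
            bad.append(k)
    return bad

# Hypothetical keys from a merged checkpoint; if the base model has
# 24 blocks (0-23), "blocks.24.att.x_r" is out of range.
keys = ["blocks.23.att.x_r", "blocks.24.att.x_r"]
print(out_of_range_block_keys(keys, 24))  # → ['blocks.24.att.x_r']
```

If this reports out-of-range keys, the problem would be a layer-count mismatch between the merge script's config and the actual checkpoint, not the model being too small per se.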