For this larger run, we should still keep the JSON scans, but only those, not the CSVs.
Also, as soon as the data is good enough, we can push a ScanCode update to CD and see how we can trigger a rescan of the 5k popular packages with the new ScanCode options.