Commit

Fix data preprocessing bug with multiple JSON keys
junjzhang committed Dec 25, 2024
1 parent 2da43ef commit 96f966d
Showing 1 changed file with 2 additions and 3 deletions.
5 changes: 2 additions & 3 deletions tools/preprocess_data.py
@@ -11,8 +11,6 @@
 import time
 import gzip
 import glob
-import torch
-import numpy as np
 import multiprocessing
 try:
     import nltk
@@ -184,7 +182,8 @@ def process_json_file(self, file_name):
             self.print_processing_stats(i, proc_start, total_bytes_processed)

         fin.close()
-        builders[key].finalize(output_idx_files[key])
+        for key in builders.keys():
+            builders[key].finalize(output_idx_files[key])


 def get_args():
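
Note on the second hunk: before this commit, `finalize` was called once, using whatever value `key` still held at that point. The hunk does not show where `key` was bound; presumably it was left over from an earlier loop over the JSON keys, since Python keeps a loop variable bound after the loop ends. With more than one JSON key, only one builder would then be finalized. The sketch below reproduces that behaviour in isolation. `FakeBuilder`, `finalize_last_only`, `finalize_all`, `run`, and the `out_*.idx` names are hypothetical stand-ins for illustration, not code from `tools/preprocess_data.py`.

```python
class FakeBuilder:
    """Hypothetical stand-in for the real index builder; records finalize calls."""

    def __init__(self):
        self.finalized_to = None

    def finalize(self, idx_path):
        self.finalized_to = idx_path


def finalize_last_only(builders, output_idx_files):
    # Assumed old behaviour: `key` is whatever an earlier loop left behind.
    for key in builders:
        pass  # stands in for the per-document processing loop
    builders[key].finalize(output_idx_files[key])


def finalize_all(builders, output_idx_files):
    # Behaviour after the commit: every builder is finalized.
    for key in builders.keys():
        builders[key].finalize(output_idx_files[key])


def run(finalizer, keys=("text", "title")):
    builders = {k: FakeBuilder() for k in keys}
    idx_files = {k: f"out_{k}.idx" for k in keys}
    finalizer(builders, idx_files)
    return {k: b.finalized_to for k, b in builders.items()}


old_result = run(finalize_last_only)
new_result = run(finalize_all)
print(old_result)  # {'text': None, 'title': 'out_title.idx'}
print(new_result)  # {'text': 'out_text.idx', 'title': 'out_title.idx'}
```

In the old variant the `text` builder is never finalized, which matches the symptom the commit title describes for inputs with multiple JSON keys.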