Commit e54eb98

covid_hosp update merging speedup: n^2-->n (#1111)
1 parent e38f4da commit e54eb98

1 file changed: +2 -1 lines changed

src/acquisition/covid_hosp/common/utils.py

Lines changed: 2 additions & 1 deletion
@@ -160,7 +160,8 @@ def merge_by_key_cols(dfs, key_cols, logger=False):
 ## repeated concatenation in pandas is expensive, but (1) we don't expect
 ## batch sizes to be terribly large (7 files max) and (2) this way we can
 ## more easily capture the next iteration's updates to any new keys
-new_rows = df.loc[[i for i in df.index.to_list() if i not in result.index.to_list()]]
+result_index_set = set(result.index.to_list())
+new_rows = df.loc[[i for i in df.index.to_list() if i not in result_index_set]]
 result = pd.concat([result, new_rows])
 
 # convert the index rows back to columns
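
A note on why this change takes the filter from n^2 to n: in the old line, `result.index.to_list()` is rebuilt and scanned linearly for every `i`, while a set built once gives constant-time lookups on average. The sketch below is a minimal illustration using plain Python lists of hypothetical key tuples (`result_keys` and `df_keys` are made-up stand-ins for `result.index.to_list()` and `df.index.to_list()`, not real covid_hosp data). It assumes the index labels are hashable, which holds for tuples of strings and integers.

```python
# Hypothetical key tuples standing in for result.index.to_list() and df.index.to_list().
result_keys = [("h1", 20210101), ("h2", 20210101), ("h3", 20210101)]
df_keys = [("h2", 20210101), ("h4", 20210108), ("h5", 20210108)]

# Before: each `not in` scans the whole list, so the filter costs
# roughly len(df_keys) * len(result_keys) comparisons.
new_keys_via_list = [k for k in df_keys if k not in result_keys]

# After: build the set once, then each membership test is O(1) on average,
# so the filter costs roughly len(result_keys) + len(df_keys) operations.
result_key_set = set(result_keys)
new_keys_via_set = [k for k in df_keys if k not in result_key_set]

print(new_keys_via_set)
```

Both filters select the same keys in the same order, so the rows passed to `df.loc[...]` are unchanged; only the cost of the membership test differs.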
