Skip to content

Commit b6ee5c5

Browse files
sattvikctamassolteszanku255
authored
feat: multithreaded bulk import (#1077) (#1079)
* feat: Add BulkImport APIs and cron * chore: update pull request template * fix: Use the correct tenant config to create the proxy storage * fix: PR changes * fix: PR changes * fix: PR changes * fix: PR changes * fix: PR changes * fix: PR changes * fix: Update version * fix: PR changes * fix: PR changes * fix: Rename DeleteBulkImportUser API path * fix: disable bulk import for in-memory db * fix: a bug with createTotpDevices * fix: PR changes * feat: Add an api to import user in sync * feat: Add an api to get count of bulk import users * fix: PR changes * fix: Add error codes and plainTextPassword import * fix: PR changes * feat: multithreaded bulk import * fix: changelog update * fix: add new test * fix: fixing unreliable mutithreaded bulk import with mysql * fix: review fixes * fix: fixing failing tests * feat: bulkimport flow tests * feat: bulk import cron starter api * fix: tweaking params for faster import * fix: tests * checkpoint * fix: remove vacuuming * fix: minor tweaks * feat: bulk inserting the bulk migration data * fix: fast as a lightning * fix: restoring lost method * fix: reworked error handling to comform previous approach with messages * fix: fixing tests * fix: fixing failing tests, changing version * chore: update changelog * fix: fixing issues and failing tests * fix: review changes * fix: review fixes, reworking cron start/stop --------- Co-authored-by: Tamas Soltesz <[email protected]> Co-authored-by: Ankit Tiwari <[email protected]>
1 parent 038b5c8 commit b6ee5c5

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

51 files changed

+7747
-104
lines changed

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -37,6 +37,7 @@ highlighting the necessary changes)
3737
- If no such branch exists, then create one from the latest released branch.
3838
- [ ] If added a foreign key constraint on `app_id_to_user_id` table, make sure to delete from this table when deleting
3939
the user as well if `deleteUserIdMappingToo` is false.
40+
- [ ] If added a new recipe, then make sure to update the bulk import API to include the new recipe.
4041

4142
## Remaining TODOs for this PR
4243

CHANGELOG.md

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,46 @@ to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

88
## [Unreleased]
99

10+
## [9.4.0]
11+
12+
### Added
13+
- Adds property `bulk_migration_parallelism` for fine-tuning the worker threads number
14+
- Adds APIs to bulk import users
15+
- GET `/bulk-import/users`
16+
- POST `/bulk-import/users`
17+
- GET `/bulk-import/users/count`
18+
- POST `/bulk-import/users/remove`
19+
- POST `/bulk-import/users/import`
20+
- POST `/bulk-import/backgroundjob`
21+
- GET `/bulk-import/backgroundjob`
22+
- Adds `ProcessBulkImportUsers` cron job to process bulk import users
23+
- Adds multithreaded worker support for the `ProcessBulkImportUsers` cron job for faster bulk imports
24+
- Adds support for lazy importing users
25+
26+
### Migrations
27+
28+
```sql
29+
"CREATE TABLE IF NOT EXISTS bulk_import_users (
30+
id CHAR(36),
31+
app_id VARCHAR(64) NOT NULL DEFAULT 'public',
32+
primary_user_id VARCHAR(36),
33+
raw_data TEXT NOT NULL,
34+
status VARCHAR(128) DEFAULT 'NEW',
35+
error_msg TEXT,
36+
created_at BIGINT NOT NULL,
37+
updated_at BIGINT NOT NULL,
38+
CONSTRAINT bulk_import_users_pkey PRIMARY KEY(app_id, id),
39+
CONSTRAINT bulk_import_users__app_id_fkey FOREIGN KEY(app_id) REFERENCES apps(app_id) ON DELETE CASCADE
40+
);
41+
42+
CREATE INDEX IF NOT EXISTS bulk_import_users_status_updated_at_index ON bulk_import_users (app_id, status, updated_at);
43+
44+
CREATE INDEX IF NOT EXISTS bulk_import_users_pagination_index1 ON bulk_import_users (app_id, status, created_at DESC,
45+
id DESC);
46+
47+
CREATE INDEX IF NOT EXISTS bulk_import_users_pagination_index2 ON bulk_import_users (app_id, created_at DESC, id DESC);
48+
```
49+
1050
## [9.3.1]
1151
1252
- Includes exception class name in 500 error message

build.gradle

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -19,8 +19,7 @@ compileTestJava { options.encoding = "UTF-8" }
1919
// }
2020
//}
2121

22-
version = "9.3.1"
23-
22+
version = "9.4.0"
2423

2524
repositories {
2625
mavenCentral()

config.yaml

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -170,3 +170,7 @@ core_config_version: 0
170170

171171
# (Optional | Default: null) string value. The encryption key used for saving OAuth client secret on the database.
172172
# oauth_client_secret_encryption_key:
173+
174+
# (DIFFERENT_ACROSS_APPS | OPTIONAL | Default: number of available processor cores) int value. If specified,
175+
# the supertokens core will use the specified number of threads to complete the migration of users.
176+
# bulk_migration_parallelism:

devConfig.yaml

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -170,3 +170,8 @@ disable_telemetry: true
170170

171171
# (Optional | Default: null) string value. The encryption key used for saving OAuth client secret on the database.
172172
# oauth_client_secret_encryption_key:
173+
174+
# (DIFFERENT_ACROSS_APPS | OPTIONAL | Default: number of available processor cores) int value. If specified,
175+
# the supertokens core will use the specified number of threads to complete the migration of users.
176+
# bulk_migration_parallelism:
177+

src/main/java/io/supertokens/Main.java

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,7 @@
2020
import io.supertokens.config.Config;
2121
import io.supertokens.config.CoreConfig;
2222
import io.supertokens.cronjobs.Cronjobs;
23+
import io.supertokens.cronjobs.bulkimport.ProcessBulkImportUsers;
2324
import io.supertokens.cronjobs.cleanupOAuthSessionsAndChallenges.CleanupOAuthSessionsAndChallenges;
2425
import io.supertokens.cronjobs.deleteExpiredAccessTokenSigningKeys.DeleteExpiredAccessTokenSigningKeys;
2526
import io.supertokens.cronjobs.deleteExpiredDashboardSessions.DeleteExpiredDashboardSessions;
@@ -61,6 +62,8 @@ public class Main {
6162

6263
// this is a special variable that will be set to true by TestingProcessManager
6364
public static boolean isTesting = false;
65+
// this flag is used in ProcessBulkImportUsersCronJobTest to skip the user validation
66+
public static boolean isTesting_skipBulkImportUserValidationInCronJob = false;
6467

6568
// this is a special variable that will be set to true by TestingProcessManager
6669
public static boolean makeConsolePrintSilent = false;
@@ -257,6 +260,9 @@ private void init() throws IOException, StorageQueryException {
257260
// starts DeleteExpiredAccessTokenSigningKeys cronjob if the access token signing keys can change
258261
Cronjobs.addCronjob(this, DeleteExpiredAccessTokenSigningKeys.init(this, uniqueUserPoolIdsTenants));
259262

263+
// initializes ProcessBulkImportUsers cronjob to process bulk import users
264+
Cronjobs.addCronjob(this, ProcessBulkImportUsers.init(this, uniqueUserPoolIdsTenants));
265+
260266
Cronjobs.addCronjob(this, CleanupOAuthSessionsAndChallenges.init(this, uniqueUserPoolIdsTenants));
261267

262268
// this is to ensure tenantInfos are in sync for the new cron job as well
Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,31 @@
1+
/*
2+
* Copyright (c) 2024, VRAI Labs and/or its affiliates. All rights reserved.
3+
*
4+
* This software is licensed under the Apache License, Version 2.0 (the
5+
* "License") as published by the Apache Software Foundation.
6+
*
7+
* You may not use this file except in compliance with the License. You may
8+
* obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
9+
*
10+
* Unless required by applicable law or agreed to in writing, software
11+
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
12+
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
13+
* License for the specific language governing permissions and limitations
14+
* under the License.
15+
*/
16+
17+
package io.supertokens;
18+
19+
import io.supertokens.pluginInterface.Storage;
20+
import io.supertokens.pluginInterface.useridmapping.UserIdMapping;
21+
22+
public class StorageAndUserIdMappingForBulkImport extends StorageAndUserIdMapping {
23+
24+
public String userIdInQuestion;
25+
26+
public StorageAndUserIdMappingForBulkImport(Storage storage,
27+
UserIdMapping userIdMapping, String userIdInQuestion) {
28+
super(storage, userIdMapping);
29+
this.userIdInQuestion = userIdInQuestion;
30+
}
31+
}

0 commit comments

Comments
 (0)