Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TIDB Setup and Connection #232

Closed
wants to merge 17 commits into from
Closed

TIDB Setup and Connection #232

wants to merge 17 commits into from

Conversation

amlatyrngom
Copy link
Contributor

Add TIDB setup and connection.

Left todo: Make running the benchmark more convenient. You currently have to use the mysql connection object.

@amlatyrngom amlatyrngom requested a review from geoffxy July 31, 2023 13:36
Copy link
Member

@geoffxy geoffxy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Can you also try out the MySQL ODBC driver to connect to TiDB? The current workload runners we have (workloads/IMDB_extended/...) rely on ODBC to connect.

@amlatyrngom
Copy link
Contributor Author

The full benchmarking code is here now. I've also updated some of the previous code to make it easier to run with TiDB.

Copy link
Member

@geoffxy geoffxy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there are some new merge conflicts - please fix before merging.

Regarding the Aurora/Redshift baselines we were discussing: please see if you can set up tooling to set up the clusters and retrieve pricing information. I think we can just set up the existing benchmark runner to manually hit Aurora and Redshift directly using the existing ODBC connections.

@@ -0,0 +1,266 @@
import argparse
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please incorporate the TiDB changes into run_repeating_analytics.py instead of having this separate file.

run_tidb.py Outdated
@@ -0,0 +1,94 @@
# See workloads/cross_db_benchmark/benchmark_tools/tidb/README.md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this file into workloads/IMDB_extended.

@@ -155,6 +156,9 @@ def test_extract_base_cardinality():
assert cards[0].width == 4


@pytest.mark.skip(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed now (there was a failing test committed to main when you had created these commits).

@@ -0,0 +1,40 @@
# TIDB Comparison
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this TiDB documentation into workloads/IMDB_extended (can make a subdir there if it makes sense).

The cross_db_benchmark code is mostly used for data collection for the GNN run time model - most of the benchmarking code there is not needed/used so we don't need to add TiDB to the DatabaseConnection.

@@ -0,0 +1,6 @@
host: fillme
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also move this file into workloads/IMDB_extended? (We can have a separate config subdir there if it makes sense too). We should keep this config directory for BRAD related configs.

@amlatyrngom amlatyrngom closed this Nov 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants