
optimize import into from select performance #60377

Open
crazycs520 opened this issue Apr 2, 2025 · 2 comments
Labels
type/enhancement The issue or PR belongs to an enhancement.

Comments

@crazycs520
Contributor

crazycs520 commented Apr 2, 2025

Enhancement

@crazycs520 crazycs520 added the type/enhancement The issue or PR belongs to an enhancement. label Apr 2, 2025
@D3Hunter
Contributor

D3Hunter commented Apr 2, 2025

There is another direction to optimize: we can split the "read from query, then send to chunk processor" routine into two parts:

  • one routine to read from the query, chunk by chunk (not the chunk of import), then send to the routines below;
  • one or more routines to dispatch to the chunk processors of import.
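A minimal Go sketch of this split, with one reader goroutine feeding dispatcher goroutines over a channel. All names here (`queryChunk`, `pipeline`) are illustrative stand-ins, not TiDB's actual types or API:

```go
package main

import (
	"fmt"
	"sync"
)

// queryChunk stands in for one batch of rows read from the query
// (illustrative type, not TiDB's chunk.Chunk).
type queryChunk struct{ rows []int }

// pipeline wires one reader routine to several dispatcher routines,
// mirroring the proposed split: read chunk by chunk, then dispatch
// rows onward to the chunk processors of import.
func pipeline() int {
	queryCh := make(chan queryChunk, 4)

	// One routine reads from the query, chunk by chunk.
	go func() {
		defer close(queryCh)
		for i := 0; i < 3; i++ {
			queryCh <- queryChunk{rows: []int{i * 2, i*2 + 1}}
		}
	}()

	// Two routines dispatch rows onward (stand-in for handing off
	// to import's chunk processors).
	processed := make(chan int, 16)
	var wg sync.WaitGroup
	for w := 0; w < 2; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for chk := range queryCh {
				for _, row := range chk.rows {
					processed <- row
				}
			}
		}()
	}
	wg.Wait()
	close(processed)

	// Sum the dispatched rows so the result is deterministic
	// regardless of goroutine interleaving.
	sum := 0
	for r := range processed {
		sum += r
	}
	return sum
}

func main() {
	fmt.Println("sum of dispatched rows:", pipeline()) // rows 0..5, sum = 15
}
```

The point of the split is that the reader never blocks on dispatch work: the buffered channel decouples the two stages, so slow dispatchers only apply backpressure once the buffer fills.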

@crazycs520
Contributor Author

crazycs520 commented Apr 2, 2025

> There is another direction to optimize: we can split the "read from query, then send to chunk processor" routine into two parts:
>
>   • one routine to read from the query, chunk by chunk (not the chunk of import), then send to the routines below;
>   • one or more routines to dispatch to the chunk processors of import.

I have tried reading and dispatching concurrently, and it also works, but it is a little slower than 610b588.

Same test; the TiDB log with read and dispatch running concurrently:

[2025/04/02 21:20:36.114 +08:00] [INFO] [chunk_process.go:334] ["process chunk completed"] [table=t] [import-id=04edd30b-1e4c-4d33-b6e4-f76a8654ed04] [key=import-from-select] [readDur=22.404524656s] [encodeDur=4.569496274s] [checksum="{cksum=15080023649979568948,size=280847967,kvs=1245846}"] [deliverDur=797.464732ms] [type=query] [takeTime=27.658396133s] []

The log of 610b588:

[2025/04/02 21:16:11.066 +08:00] [INFO] [chunk_process.go:334] ["process chunk completed"] [table=t] [import-id=8a1edb5e-51d3-41f4-9424-2ccfa6976e0a] [key=import-from-select] [readDur=20.401453275s] [encodeDur=7.015024856s] [checksum="{cksum=2927277725204806340,size=558542058,kvs=2477744}"] [deliverDur=1.626327185s] [type=query] [takeTime=28.630183974s] []

BTW, I'll try sending chunks instead of rows over the channel later. I think this is better, but the change is bigger, so I didn't try that approach first.
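For illustration, a hedged sketch of the chunk-over-channel idea: batching rows so the channel carries one batch per send instead of one row per send, which amortizes the per-send synchronization cost. `batchRows` is a hypothetical helper, not TiDB code:

```go
package main

import "fmt"

// batchRows groups individual rows into fixed-size batches so the
// channel performs one send per batch rather than one per row.
// Illustrative only; not TiDB's actual API.
func batchRows(rows []int, batchSize int) [][]int {
	var batches [][]int
	for len(rows) > 0 {
		n := batchSize
		if len(rows) < n {
			n = len(rows)
		}
		batches = append(batches, rows[:n])
		rows = rows[n:]
	}
	return batches
}

func main() {
	rows := []int{1, 2, 3, 4, 5, 6, 7}
	ch := make(chan []int, 4)
	go func() {
		defer close(ch)
		for _, b := range batchRows(rows, 3) {
			ch <- b // one channel operation per batch, not per row
		}
	}()

	total, sends := 0, 0
	for b := range ch {
		sends++
		for _, r := range b {
			total += r
		}
	}
	// 7 rows cross the channel in 3 sends instead of 7.
	fmt.Printf("sends=%d total=%d\n", sends, total)
}
```

With a batch size of 3, the 7 rows need only 3 channel operations; the trade-off is slightly higher latency for the last partial batch and a larger diff, which matches the comment's reasoning for deferring this change.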
