feat(bigquery/v2): query client based on bigquery/v2 package #12512

Draft
wants to merge 38 commits into
base: main
from

Conversation

alvarowolfx
Contributor

@alvarowolfx alvarowolfx commented Jun 27, 2025

Draft of a new query experience built on the new bigquery/v2 client. I will break it down into smaller PRs down the line; for now I'm just testing the interface across languages to settle on a good design.

@alvarowolfx alvarowolfx requested a review from shollyman June 27, 2025 18:17
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the BigQuery API. label Jun 27, 2025

// Client is a client for running queries in BigQuery.
type Client struct {
c *apiv2_client.Client
Contributor

nit: just JobClient so you can choose a single client or the monoclient.

Contributor Author

When using the Storage integration, the table service is needed. It's used for fetching the table schema, which in some cases is not cached, for example when the user calls the AttachJob method.

rc *storagepb.BigQueryReadClient
projectID string
billingProjectID string
defaultJobCreationMode bigquerypb.QueryRequest_JobCreationMode
Contributor

To consider: defaultPostRequest rather than a single field, and use proto merge semantics to pick up defaults.

Contributor Author

Good point, I still need to work on this one.
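A minimal sketch of the "defaults via merge semantics" idea from this thread, not the PR's implementation. To stay dependency-free it uses a hypothetical `queryRequest` struct as a stand-in for `bigquerypb.QueryRequest`, and a hand-written `mergeRequest` that only approximates how `proto.Merge` treats singular fields (set fields in the source overwrite the destination, unset fields leave it alone). All names here are assumptions for illustration.

```go
package main

import "fmt"

// queryRequest is a hypothetical stand-in for bigquerypb.QueryRequest.
// Pointer fields model field presence: nil means "unset".
type queryRequest struct {
	Query           string
	JobCreationMode *string
	UseLegacySQL    *bool
	Location        *string
}

// ptr returns a pointer to a copy of v.
func ptr[T any](v T) *T { return &v }

// mergeRequest copies every set field of src over dst. Unset fields in src
// leave dst untouched. This only approximates proto.Merge for singular fields.
func mergeRequest(dst, src *queryRequest) {
	if src == nil {
		return
	}
	if src.Query != "" {
		dst.Query = src.Query
	}
	if src.JobCreationMode != nil {
		dst.JobCreationMode = ptr(*src.JobCreationMode)
	}
	if src.UseLegacySQL != nil {
		dst.UseLegacySQL = ptr(*src.UseLegacySQL)
	}
	if src.Location != nil {
		dst.Location = ptr(*src.Location)
	}
}

// withDefaults builds a fresh request: client defaults first, then the
// per-call request on top, so explicit per-call values win.
func withDefaults(defaults, req *queryRequest) *queryRequest {
	out := &queryRequest{}
	mergeRequest(out, defaults)
	mergeRequest(out, req)
	return out
}

func main() {
	defaults := &queryRequest{
		JobCreationMode: ptr("JOB_CREATION_OPTIONAL"),
		UseLegacySQL:    ptr(false),
	}
	req := &queryRequest{Query: "SELECT 1", Location: ptr("EU")}

	merged := withDefaults(defaults, req)
	fmt.Println(merged.Query, *merged.JobCreationMode, *merged.UseLegacySQL, *merged.Location)
}
```

With a whole default request instead of a single `defaultJobCreationMode` field, adding a new default would not require a new struct field on the client.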

}

// state is one of a sequence of states that a Job progresses through as it is processed.
type state = string
Contributor

I really wish we had a v2 enum to use for this. Since we don't, do we need it at all?

Contributor Author

I removed that enum; it is not needed.

Done state = "DONE"
)

func (q *Query) checkStatus(ctx context.Context) error {
Contributor

This function name confuses me a bit. The implementation is a bit wonky: it explicitly asks for no rows, yet it appears to have some row-cache logic embedded via consumeQueryResponse.

Contributor Author

I was going to reuse the consumeQueryResponse method in the newQueryJobFromQueryResponse method.

}

// JobReference returns the job reference.
func (q *Query) JobReference() *bigquerypb.JobReference {
Contributor

Why isn't JobReference just a member of the object? What about stateless queries?

Contributor Author

Good point, I can make it a member, but what happens if users fiddle with it and somehow set it to nil? For stateless queries, the job reference is built using the QueryID. I need to double-check that this part works as intended, for example by calling AttachJob on a stateless query.
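One way to reconcile the two positions in this thread is to keep the reference as an unexported member and have the accessor hand out a copy, so callers cannot mutate or nil out the internal value. The sketch below assumes hypothetical `jobReference` and `query` types standing in for `bigquerypb.JobReference` and the PR's `Query`; it is not the PR's code and does not model how the reference is derived for stateless queries.

```go
package main

import "fmt"

// jobReference is a hypothetical stand-in for bigquerypb.JobReference.
type jobReference struct {
	ProjectID string
	JobID     string
	Location  string
}

// query is a hypothetical stand-in for the PR's Query type.
// The reference is unexported, so callers cannot reassign it.
type query struct {
	jobRef *jobReference
}

// JobReference returns a copy of the internal reference, or nil when the
// query has no associated job reference yet.
func (q *query) JobReference() *jobReference {
	if q.jobRef == nil {
		return nil
	}
	cp := *q.jobRef // shallow copy is enough: all fields are plain strings
	return &cp
}

func main() {
	q := &query{jobRef: &jobReference{ProjectID: "my-project", JobID: "job-123", Location: "US"}}

	ref := q.JobReference()
	ref.JobID = "tampered" // only the caller's copy changes

	fmt.Println(q.JobReference().JobID) // prints "job-123"
}
```

With a real protobuf message the copy would typically be made with `proto.Clone` rather than a struct assignment.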

"cloud.google.com/go/bigquery/v2/apiv2/bigquerypb"
)

// Reader is used to read the results of a query.
Contributor

We have a Query, a Reader, and a RowIterator? At a high level this feels like overlap, but we can dig in when we talk next.

Contributor Author

Only Query and RowIterator are going to be public; I pushed some changes to outline that. Reader is an interface for reading data either via jobs.query or via the Storage Read API.

Labels
api: bigquery Issues related to the BigQuery API.

2 participants