-
Notifications
You must be signed in to change notification settings - Fork 156
Feat[MQB]: Add Authenticator and InitialConnectionContext #1004
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
d2c4c40 to
ea2450d
Compare
dorjesinpo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interim review.
we have pairs of the following:
decodeInitialConnectionMessage,
scheduleRead,
readCallback
readBlob,
processBlob,
complete
... in both InitialConnectionHandler and InitialConnectionContext.
The point of FSM-like state management is that everything happens in handleEvent and there is only one place where we call complete
Otherwise, we can't rely on controlling actions by the state
Is it possible to engage handleEvent immediately in InitialConnectionHandler::handleInitialConnection and call InitialConnectionContext::complete from handleEvent only.
We seem to have InitialConnectionContext::handleInitialConnection but we call InitialConnectionHandler::handleInitialConnection instead which does not seem to support auth?
| context)); | ||
|
|
||
| // Register as observer of the channel to get the 'onClose' | ||
| bsl::weak_ptr<InitialConnectionContext> context_wp = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the motivation behind the use of weak_ptr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously we have InitialConnectionContext bind to onClose, which means it won't be destroyed until the channel closes. It's works but I was hoping to clean it up once the initial connection part finishes, to make the lifetime match with the semantic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since there is d_initialConnectionContextCache now, maybe we do not need to bind bsl::shared_ptr<InitialConnectionContext> anymore?
| bslmt::LockGuard<bslmt::Mutex> guard(&d_mutex); // LOCKED | ||
|
|
||
| if (d_state != AuthenticationState::e_AUTHENTICATING) { | ||
| errorDescription << "State not AUTHENTICATING (was " << d_state << ")"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was -> is
a009820 to
e4183da
Compare
emelialei88
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Response
| context)); | ||
|
|
||
| // Register as observer of the channel to get the 'onClose' | ||
| bsl::weak_ptr<InitialConnectionContext> context_wp = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously we have InitialConnectionContext bind to onClose, which means it won't be destroyed until the channel closes. It's works but I was hoping to clean it up once the initial connection part finishes, to make the lifetime match with the semantic.
dorjesinpo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interim review
| // BDE | ||
| #include <bdlbb_pooledblobbufferfactory.h> | ||
| #include <bsl_iostream.h> | ||
| #include <bsl_memory.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just double checking if this is necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indirectly included in mqbmock_cluster.h but maybe we shouldn't depend on indirect includes. Needed for shared_ptr.
| d_scheduler_p, | ||
| d_allocators.get("Authenticator")), | ||
| d_allocator_p); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not creating SessionNegotiator below in the same way (vs. using raw pointer and then bslma::ManagedPtr)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setAdminCommandEnqueueCallback is a function for SessionNegotiator, but not exposed for mqbnet::Negotiator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very minor (since we do not throw exceptions) but what I meant is
bslma::ManagedPtr<mqbnet::Negotiator> negotiatorMp(
new (*d_allocator_p)
SessionNegotiator(&d_bufferFactory,
d_dispatcher_mp.get(),
d_statController_mp->clientsStatContext(),
&d_blobSpPool,
d_scheduler_p,
d_allocators.get("SessionNegotiator"),
d_allocator_p);
negotiatorMp
->setAdminCommandEnqueueCallback(
bdlf::BindUtil::bind(&Application::enqueueCommand,
this,
bdlf::PlaceHolders::_1, // source
bdlf::PlaceHolders::_2, // cmd
bdlf::PlaceHolders::_3, // onProcessedCb
bdlf::PlaceHolders::_4)); // fromReroute
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I understand. But mqbnet::Negotiator doesn't expose an interface setAdminCommandEnqueueCallback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is not a problem (we could do bslma::ManagedPtr<mqbnet::Negotiator> negotiatorMp(impl); where impl is bslma::ManagedPtr<mqbnet::Negotiator>). The problem is setClusterCatalog below.
This is ok since we do not throw exceptions, do early returns
| } | ||
|
|
||
| void InitialConnectionContext::handleEvent( | ||
| int statusCode, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the meaning of statusCode is not clear. It seems that all errors are communicated as InitialConnectionEvent::e_ERROR and TCPSessionFactory::initialConnectionComplete just needs statusCode != 0 (which is not enforced by InitialConnectionEvent::e_ERROR case).
Unless we want to communicate exact value to bmqio::Channel::close for some reason, the code can be simplified by removing statusCode argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the errors that occur during initial connection don’t print the error code or logs until initialConnectionComplete is called. If we were to follow this convention, it still feels necessary to pass the error code through handleEvent to initialConnectionComplete so it can be properly logged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. But d_state is primary and statusCode is secondary. On the line 710, there is no check for statusCode, if it is 0, there will be no complete call.
Maybe, we should rely on InitialConnectionState::e_FAILED and InitialConnectionState::e_NEGOTIATED being the only final states and call complete as soon as we reach them. Assuming, any other state is transitory and awaits for more event(s).
| context)); | ||
|
|
||
| // Register as observer of the channel to get the 'onClose' | ||
| bsl::weak_ptr<InitialConnectionContext> context_wp = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since there is d_initialConnectionContextCache now, maybe we do not need to bind bsl::shared_ptr<InitialConnectionContext> anymore?
| } // close mutex lock guard // UNLOCK | ||
|
|
||
| // Keep track of active channels, for logging purposes | ||
| ++d_nbActiveChannels; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moving ++d_nbActiveChannels changes the semantics of it. It used to track all channels, including incomplete ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might be the race condition we discussed earlier, though I don’t recall the exact scenario. All the tests are passing now — possibly because the fuzz tests have sped up and no longer reproduce the issue. Definitely a good reminder for me to document these cases better next time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't recall having any problems with d_nbActiveChannels, we do not seem to check it, only log
a0c0411 to
9d7a99a
Compare
9d7a99a to
3d2c808
Compare
chrisbeard
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gave the authenticator interfaces a first pass, looking good overall, some comments about the implementation details and minor design bits.
| /// The authentication message received. | ||
| bmqp_ctrlmsg::AuthenticationMessage d_authenticationMessage; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the motivation for storing this for the lifetime of the connection? We should not store this material once authentication completes. Can that be avoided?
| /// The encoding type used for sending a message. It should match with the | ||
| /// encoding type of the received message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this comment feels like it can be made clearer, perhaps something like:
// The encoding type used for sending authentication responses. The value matches
// the encoding type set by the client during authentication and ensures that all
// responses are sent using the same encoding.
| const bsl::shared_ptr<mqbplug::AuthenticationResult>& value); | ||
|
|
||
| // NOTE: AuthenticationMessage and encodingType are set only when | ||
| // AuthenticationState is e_AUTHENTICATED. authenticationMessage() and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the state mentioned in this comment correct? e_AUTHENTICATED feels wrong here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two functions are called only during reauthentication. The state should be e_AUTHENTICATED at that time.
| /// Produce and send outbound authentication message with the specified | ||
| /// `context`. Return 0 on success, or a non-zero error code and populate | ||
| /// the specified `errorDescription` with a description of the error | ||
| /// otherwise. | ||
| virtual int authenticationOutbound( | ||
| bsl::ostream& errorDescription, | ||
| const bsl::shared_ptr<AuthenticationContext>& context) = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method feels a little out of place. Maybe that means design-wise we should rethink this, leave it out for now, or documentation on the method and interface should better define its purpose.
| const int k_MIN_THREADS = 1; // Minimum number of threads in the thread pool | ||
| const int k_MAX_THREADS = 3; // Maximum number of threads in the thread pool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be configuration-driven.
| context->setAuthenticationMessage(authenticationMessage); | ||
| context->setAuthenticationEncodingType( | ||
| event.authenticationEventEncodingType()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment here, we should prefer to not store the authentication message for the lifetime of the connection.
| int rc = rc_SUCCESS; | ||
| bsl::string error; | ||
|
|
||
| // Setup error handler based on whether this is initial authn or reauthn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Setup error handler based on whether this is initial authn or reauthn | |
| // Set up error handler based on whether this is initial authn or reauthn |
| bsl::optional<bdlb::ScopeExitAny> scopeGuard; | ||
|
|
||
| if (isReauthn) { | ||
| // For reauthentication: setup error guard to handle failures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // For reauthentication: setup error guard to handle failures | |
| // For reauthentication: set up error guard to handle failures |
| channel)); | ||
| } | ||
| else { | ||
| // For initial authentication: setup state machine transition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // For initial authentication: setup state machine transition | |
| // For initial authentication: set up state machine transition |
| } | ||
|
|
||
| bmqu::MemOutStream errorStream; | ||
| errorStream << "Authentication timeout after " << lifetime << " ms"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error could be made clearer. If I understand correctly, it's trying to convey that the client did not reauthenticate within before the lifetime expired?
dorjesinpo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few comments, mostly minor
| d_scheduler_p, | ||
| d_allocators.get("Authenticator")), | ||
| d_allocator_p); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very minor (since we do not throw exceptions) but what I meant is
bslma::ManagedPtr<mqbnet::Negotiator> negotiatorMp(
new (*d_allocator_p)
SessionNegotiator(&d_bufferFactory,
d_dispatcher_mp.get(),
d_statController_mp->clientsStatContext(),
&d_blobSpPool,
d_scheduler_p,
d_allocators.get("SessionNegotiator"),
d_allocator_p);
negotiatorMp
->setAdminCommandEnqueueCallback(
bdlf::BindUtil::bind(&Application::enqueueCommand,
this,
bdlf::PlaceHolders::_1, // source
bdlf::PlaceHolders::_2, // cmd
bdlf::PlaceHolders::_3, // onProcessedCb
bdlf::PlaceHolders::_4)); // fromReroute
|
|
||
| BSLS_ASSERT_SAFE( | ||
| d_initialConnectionContextCache.contains(initialConnectionContext_p)); | ||
| bsl::shared_ptr<InitialConnectionContext> initialConnectionContext_sp = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the purpose of this seems to be to keep the object alive until this method returns since there is no other reference left. Is that correct? If so, it is worth a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a comment for d_initialConnectionContextCache in mqbnet::tcpsessionfactory.h.
| if (scheduleRc != 0) { | ||
| rc = (scheduleRc * 10) + rc_SCHEDULE_REAUTHN_FAILED; | ||
| error = scheduleErrStream.str(); | ||
| return; // RETURN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is better still to respond (with an error)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like failing to schedule reauthentication is more an internal error than something to notify the client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should always respond to request, client logic may depend on it. In this case, it looks like an internal error and the response shoudl indicate an error.
| mqbnet::AuthenticationState::e_AUTHENTICATING // state | ||
| ); | ||
|
|
||
| context->setAuthenticationContext(authenticationContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant something like BSLS_ASSERT_SAFE(!d_authenticationCtxSp); in InitialConnectionContext::setAuthenticationContext
| } // close mutex lock guard // UNLOCK | ||
|
|
||
| // Keep track of active channels, for logging purposes | ||
| ++d_nbActiveChannels; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't recall having any problems with d_nbActiveChannels, we do not seem to check it, only log
Signed-off-by: Emelia Lei <[email protected]>
Signed-off-by: Emelia Lei <[email protected]> address feedback Signed-off-by: Emelia Lei <[email protected]>
Signed-off-by: Emelia Lei <[email protected]>
Signed-off-by: Emelia Lei <[email protected]>
Signed-off-by: Emelia Lei <[email protected]>
Signed-off-by: Emelia Lei <[email protected]>
Signed-off-by: Emelia Lei <[email protected]>
f1e6e18 to
d7bfeb3
Compare
dorjesinpo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still think we always need to respond to auth request even in the case of internal error.
A minor point - since d_initialConnectionContextCache, there no need for bsl::enable_shared_from_this<InitialConnectionContext>
This PR introduces several refactors and enables authentication:
Authenticatorclass to encapsulate authentication logic.InitialConnectionHandlertoInitialConnectionContext.