Enhanced DistributedObjectAdmin to compare for inconsistencies between instances #426

Open
wants to merge 31 commits into
base: develop
Changes from 23 commits
Commits
16e5127
Renamed HzObjectAdminController to DistributedObjectAdminController
jskupsik Nov 25, 2024
d189178
Standardized naming in getAdminStatsForObject()
jskupsik Nov 26, 2024
1849506
Improved ClusterService and ClusterConfig logging
jskupsik Nov 26, 2024
9d827a2
Moved distributed object code to DistributedObjectAdminService
jskupsik Nov 26, 2024
43f6886
ClusterHealthCheckService WIP
jskupsik Nov 27, 2024
787aa19
Merge remote-tracking branch 'refs/remotes/origin/develop' into multi…
jskupsik Nov 27, 2024
c1d848b
Renamed new feature to ClusterConsistencyCheck
jskupsik Nov 27, 2024
c5f4c67
WIP on admin tab
jskupsik Nov 28, 2024
0867cd9
WIP on admin tab
jskupsik Dec 5, 2024
f8dde4a
WIP on admin tab
jskupsik Dec 6, 2024
87d9656
WIP on admin tab
jskupsik Dec 9, 2024
3e0bad0
Merged old distributed objects tab with new changes
jskupsik Dec 9, 2024
423d2e7
WIP work
jskupsik Dec 10, 2024
58dabe2
CachedValue's initial entry (not entered/synced) no longer reports a …
jskupsik Dec 11, 2024
740261d
Bug fixes and name standardization
jskupsik Dec 11, 2024
4eb6e0e
WIP work
jskupsik Dec 12, 2024
14a390c
UI/filtering enhancements
jskupsik Dec 12, 2024
010ceec
Merge remote-tracking branch 'origin/develop' into multiInstanceCompare
jskupsik Dec 12, 2024
ae438e1
Updated CHANGELOG.md
jskupsik Dec 12, 2024
7be4e54
Added error handling, comparing objects
jskupsik Dec 17, 2024
961233f
Moved DistributedObjects to a 2nd level tab
jskupsik Dec 18, 2024
eaaed2f
Added non-comparison fields to detail grid
jskupsik Dec 18, 2024
3d54cb1
Removed the hashCode() check from Cache and CachedValue
jskupsik Dec 19, 2024
f930d0d
Merge branch 'develop' into multiInstanceCompare
amcclain Jan 2, 2025
46f864a
Update copyright date, fix package
amcclain Jan 2, 2025
78eca10
Code review changes
jskupsik Jan 2, 2025
4b8f2e3
Cleaned up getHzInfo/getHoistInfo symmetry
jskupsik Jan 2, 2025
8f1fe7d
Added comment to BaseService getComparisonFields() method
jskupsik Jan 2, 2025
d52a6fc
Merge branch 'develop' into multiInstanceCompare
lbwexler Jan 3, 2025
425c695
Fixed bug where server was not setting itself as primary on init
jskupsik Jan 3, 2025
139dcac
Checkpoint (#432)
lbwexler Jan 3, 2025
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,16 @@

## 27.0-SNAPSHOT - unreleased

### 💥 Breaking Changes (upgrade difficulty: 🟢 LOW - Hoist React update)

* Requires `hoist-react >= 71` to support enhanced Distributed Objects page.

### 🎁 New Features

* `DistributedObjectAdminService` now compares certain `adminState` fields of distributed objects
between instances. Implement `BaseService.getComparisonFields()` to enumerate custom fields to
compare.
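
As a rough illustration of the hook described above, the sketch below shows a service listing the `adminStats` keys it wants compared, plus a helper that flags keys whose values differ between two instances. `BaseService` here is a stripped-down stand-in, and `PriceService` and `mismatchedFields` are hypothetical names; the actual Hoist signature and comparison logic may differ.

```groovy
// Stand-in for io.xh.hoist.BaseService, reduced to the one hook this sketch needs.
// Assumption: the real method signature and default value may differ.
abstract class BaseService {
    List<String> getComparisonFields() { [] }
    abstract Map getAdminStats()
}

// Hypothetical app service that opts two of its adminStats fields into comparison.
class PriceService extends BaseService {
    Map prices = [:]
    Date lastLoaded

    Map getAdminStats() {
        [priceCount: prices.size(), lastLoaded: lastLoaded, note: 'not compared']
    }

    List<String> getComparisonFields() { ['priceCount', 'lastLoaded'] }
}

// Hypothetical helper: the comparison fields whose values differ between two instances' stats.
List<String> mismatchedFields(List<String> fields, Map statsA, Map statsB) {
    fields.findAll { statsA[it] != statsB[it] }
}

def loaded = new Date(0)
def a = new PriceService(prices: [AAPL: 1, MSFT: 2], lastLoaded: loaded)
def b = new PriceService(prices: [AAPL: 1], lastLoaded: loaded)

// Only priceCount differs; lastLoaded matches and 'note' is never compared.
def diffs = mismatchedFields(a.comparisonFields, a.adminStats, b.adminStats)
println diffs
```

Keeping the compared keys separate from the full `adminStats` map lets a service expose volatile stats (timestamps, hit counts) without them being reported as inconsistencies.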

## 26.0.0 - 2024-12-02

### 💥 Breaking Changes (upgrade difficulty: 🟢 TRIVIAL - change to runOnInstance signature.)
@@ -15,7 +15,6 @@ import io.xh.hoist.util.Utils
import static io.xh.hoist.util.DateTimeUtils.SECONDS

import static grails.async.Promises.task

import static java.lang.Thread.sleep

@Access(['HOIST_ADMIN_READER'])
@@ -0,0 +1,32 @@
/*
* This file belongs to Hoist, an application development toolkit
* developed by Extremely Heavy Industries (www.xh.io | info@xh.io)
*
* Copyright © 2022 Extremely Heavy Industries Inc.
*/
package io.xh.hoist.admin

import io.xh.hoist.BaseController
import io.xh.hoist.security.Access

@Access(['HOIST_ADMIN_READER'])
class DistributedObjectAdminController extends BaseController {
def distributedObjectAdminService

def getDistributedObjectsReport() {
renderJSON(distributedObjectAdminService.getDistributedObjectsReport())
}

@Access(['HOIST_ADMIN'])
def clearObjects() {
Member

see below re: naming of these two endpoints to be more specific.

def req = parseRequestJSON()
distributedObjectAdminService.clearObjects(req.names)
renderJSON([success: true])
}

@Access(['HOIST_ADMIN'])
def clearHibernateCaches() {
distributedObjectAdminService.clearHibernateCaches()
renderJSON([success: true])
}
}

This file was deleted.

2 changes: 2 additions & 0 deletions grails-app/init/io/xh/hoist/ClusterConfig.groovy
@@ -91,6 +91,8 @@ class ClusterConfig {
Config createConfig() {
def ret = new Config()

System.out.println("ClusterConfig [INFO] | ${multiInstanceEnabled ? 'Multi-instance is enabled - instances will attempt to cluster.' : 'Multi-instance is disabled - instances will avoid clustering.'}")
Member

I think trying to mimic the log output is confusing --- can we just log this later in cluster service, when our logging is set up?


ret.instanceName = instanceName
ret.clusterName = clusterName
ret.memberAttributeConfig.setAttribute('instanceName', instanceName)
@@ -0,0 +1,247 @@
/*
* This file belongs to Hoist, an application development toolkit
* developed by Extremely Heavy Industries (www.xh.io | info@xh.io)
*
* Copyright © 2024 Extremely Heavy Industries Inc.
*/
package io.xh.hoist.admin

import com.hazelcast.cache.impl.CacheProxy
import com.hazelcast.collection.ISet
import com.hazelcast.core.DistributedObject
import com.hazelcast.executor.impl.ExecutorServiceProxy
import com.hazelcast.map.IMap
import com.hazelcast.nearcache.NearCacheStats
import com.hazelcast.replicatedmap.ReplicatedMap
import com.hazelcast.ringbuffer.impl.RingbufferProxy
import com.hazelcast.topic.ITopic
import io.xh.hoist.BaseService
import io.xh.hoist.cluster.ClusterRequest
import io.xh.hoist.cluster.DistributedObjectInfo
import io.xh.hoist.cluster.DistributedObjectsReport

import javax.cache.expiry.Duration
import javax.cache.expiry.ExpiryPolicy

import static io.xh.hoist.util.Utils.appContext

class DistributedObjectAdminService extends BaseService {
def grailsApplication

DistributedObjectsReport getDistributedObjectsReport() {
def startTimestamp = System.currentTimeMillis(),
responsesByInstance = clusterService.submitToAllInstances(new ListDistributedObjects())
return new DistributedObjectsReport(
info: responsesByInstance.collectMany {it.value.value},
startTimestamp: startTimestamp,
endTimestamp: System.currentTimeMillis()
)
}

private List<DistributedObjectInfo> listDistributedObjects() {
// Services and their resources
Map<String, BaseService> svcs = grailsApplication.mainContext.getBeansOfType(BaseService.class, false, false)
def resourceObjs = svcs.collectMany { _, svc ->
[
// Services themselves
getInfo(obj: svc, name: svc.class.getName(), type: 'Service'),
Member

think we could clarify with two methods that each just take a single object, and do type switching as needed -- e.g. getHoistInfo(Object obj) and getHzInfo(DistributedObject obj) -- symmetry!

Contributor Author

The issue I had with this was the fact that resource objects don't always know their full name - they often use a relative name. So a fully-symmetrical getHoistInfo(Object obj) might not have the info necessary to create a resource object entry. However, I can make it more symmetrical by taking object (required parameter) outside of the args (optional params).

// Resources, excluding those that are also DistributedObject
*svc.resources.findAll { k, v -> !(v instanceof DistributedObject)}.collect { k, v ->
getInfo(obj: v, name: svc.hzName(k))
}
]
},
// Distributed objects
hzObjs = clusterService
.hzInstance
.distributedObjects
.findAll { !(it instanceof ExecutorServiceProxy) }
.collect { getInfoForObject(it) }

return [*hzObjs, *resourceObjs].findAll{ it } as List<DistributedObjectInfo>
Member

nit: can be findAll()
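
For reference, a small sketch of the nit above: in Groovy, calling `findAll()` with no closure keeps the elements that are truthy under Groovy truth, which matches `findAll { it }`. The sample list is made up for illustration.

```groovy
// Sample data standing in for a list of info objects, some of them null.
def infos = [[name: 'a'], null, [name: 'b'], null]

// Explicit closure form, as written in the diff.
def explicit = infos.findAll { it }

// No-arg form suggested in the review; it filters by Groovy truth as well.
def implicit = infos.findAll()

println explicit == implicit
```

Both forms also drop other falsy values such as empty maps and empty strings, so the swap does not change behavior.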

}
static class ListDistributedObjects extends ClusterRequest<List<DistributedObjectInfo>> {
List<DistributedObjectInfo> doCall() {
appContext.distributedObjectAdminService.listDistributedObjects()
}
}

void clearObjects(List<String> names) {
Member

@lbwexler Dec 24, 2024

Let's call this clearHibernateCaches and be specific! Could potentially combine with the method below, or rename the method below to clearAllHibernateCaches.

def all = clusterService.distributedObjects
names.each { name ->
def obj = all.find { it.getName() == name }
/** Keep in sync with frontend clear set - `DistributedObjectsModel.clearableTypes`. */
if (obj instanceof CacheProxy) {
obj.clear()
logInfo("Cleared " + name)
} else {
logWarn('Cannot clear object - unsupported type', name)
Member

@lbwexler Jan 3, 2025

obsolete log text -- should be hibernate cache not found.

}
}
}

void clearHibernateCaches() {
appContext.beanDefinitionNames
.findAll { it.startsWith('sessionFactory') }
.each { appContext.getBean(it)?.cache?.evictAllRegions() }
}

Map getAdminStatsForObject(DistributedObject obj) {
return getInfoForObject(obj)?.adminStats
}

DistributedObjectInfo getInfo(Map args) {
def obj = args.obj,
comparisonFields = null,
adminStats = null,
error = null

try {
comparisonFields = obj.hasProperty('comparisonFields') ? obj.comparisonFields : null
adminStats = obj.hasProperty('adminStats') ? obj.adminStats : null
} catch (Exception e) {
def msg = 'Error extracting admin stats'
logError(msg, e)
error = "$msg | $e.message"
Member

would prefer curly braces on $e.message.
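
A short sketch of the two interpolation styles discussed in this comment, using a made-up exception. Both produce the same string here; the braced form makes the end of the interpolated expression explicit.

```groovy
def msg = 'Error extracting admin stats'
def e = new RuntimeException('boom')

// Bare dotted form, as in the diff: Groovy interpolates the property path e.message.
String bare = "$msg | $e.message"

// Braced form requested in the review.
String braced = "${msg} | ${e.message}"

println braced
```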

}

return new DistributedObjectInfo(
comparisonFields: comparisonFields,
adminStats: adminStats,
error: error,
*: args
)
}

DistributedObjectInfo getInfoForObject(DistributedObject obj) {
switch (obj) {
case ReplicatedMap:
def stats = obj.getReplicatedMapStats()
return new DistributedObjectInfo(
comparisonFields: ['size'],
adminStats: [
name : obj.getName(),
type : 'ReplicatedMap',
size : obj.size(),
lastUpdateTime: stats.lastUpdateTime ?: null,
lastAccessTime: stats.lastAccessTime ?: null,

hits : stats.hits,
gets : stats.getOperationCount,
puts : stats.putOperationCount
]
)
case IMap:
def stats = obj.getLocalMapStats()
return new DistributedObjectInfo(
comparisonFields: ['size'],
adminStats: [
name : obj.getName(),
type : 'IMap',
size : obj.size(),
lastUpdateTime : stats.lastUpdateTime ?: null,
lastAccessTime : stats.lastAccessTime ?: null,

ownedEntryCount: stats.ownedEntryCount,
hits : stats.hits,
gets : stats.getOperationCount,
sets : stats.setOperationCount,
puts : stats.putOperationCount,
nearCache : getNearCacheStats(stats.nearCacheStats),
]
)
case ISet:
def stats = obj.getLocalSetStats()
return new DistributedObjectInfo(
comparisonFields: ['size'],
adminStats: [
name : obj.getName(),
type : 'ISet',
size : obj.size(),
lastUpdateTime: stats.lastUpdateTime ?: null,
lastAccessTime: stats.lastAccessTime ?: null,
]
)
case ITopic:
def stats = obj.getLocalTopicStats()
return new DistributedObjectInfo(
adminStats: [
name : obj.getName(),
type : 'Topic',
publishOperationCount: stats.publishOperationCount,
receiveOperationCount: stats.receiveOperationCount
]
)
case RingbufferProxy:
return new DistributedObjectInfo(
adminStats: [
name : obj.getName(),
type : 'Ringbuffer',
size : obj.size(),
capacity: obj.capacity()
]
)
case CacheProxy:
def evictionConfig = obj.cacheConfig.evictionConfig,
expiryPolicy = obj.cacheConfig.expiryPolicyFactory.create(),
stats = obj.localCacheStatistics
return new DistributedObjectInfo(
comparisonFields: ['size'],
adminStats: [
name : obj.getName(),
type : 'Hibernate Cache',
size : obj.size(),
lastUpdateTime : stats.lastUpdateTime ?: null,
lastAccessTime : stats.lastAccessTime ?: null,

ownedEntryCount : stats.ownedEntryCount,
cacheHits : stats.cacheHits,
cacheHitPercentage: stats.cacheHitPercentage?.round(0),
config : [
size : evictionConfig.size,
maxSizePolicy : evictionConfig.maxSizePolicy,
evictionPolicy: evictionConfig.evictionPolicy,
expiryPolicy : formatExpiryPolicy(expiryPolicy)
]
]
)
default:
return new DistributedObjectInfo(
adminStats: [
name: obj.getName(),
type: obj.class.toString()
]
)
}
}

//--------------------
// Implementation
//--------------------
private Map getNearCacheStats(NearCacheStats stats) {
if (!stats) return null
[
ownedEntryCount : stats.ownedEntryCount,
lastPersistenceTime: stats.lastPersistenceTime,
hits : stats.hits,
misses : stats.misses,
ratio : stats.ratio.round(2)
]
}

private Map formatExpiryPolicy(ExpiryPolicy policy) {
def ret = [:]
if (policy.expiryForCreation) ret.creation = formatDuration(policy.expiryForCreation)
if (policy.expiryForAccess) ret.access = formatDuration(policy.expiryForAccess)
if (policy.expiryForUpdate) ret.update = formatDuration(policy.expiryForUpdate)
return ret
}


private String formatDuration(Duration duration) {
if (duration.isZero()) return 0
if (duration.isEternal()) return 'eternal'
return duration.timeUnit.toSeconds(duration.durationAmount) + 's'
}

}
@@ -13,7 +13,7 @@ import io.xh.hoist.BaseService
class ServiceManagerService extends BaseService {

def grailsApplication,
- clusterAdminService
+ distributedObjectAdminService

Collection<Map> listServices() {
getServicesInternal().collect { name, svc ->
@@ -51,7 +51,7 @@ class ServiceManagerService extends BaseService {
.findAll { !it.key.startsWith('xh_') } // skip hoist implementation objects
.collect { k, v ->
Map stats = v instanceof DistributedObject ?
- clusterAdminService.getAdminStatsForObject(v) :
+ distributedObjectAdminService.getAdminStatsForObject(v) :
v.adminStats

// rely on the name (key) service knows, i.e avoid HZ prefix