Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(CommonPlayersStrategy): add index to reduce suggestions bottleneck #3132

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,6 @@
use App\Platform\Data\GameSuggestionContextData;
use App\Platform\Enums\GameSuggestionReason;
use Illuminate\Support\Collection;
use Illuminate\Support\Facades\DB;

class CommonPlayersStrategy implements GameSuggestionStrategy
{
Expand All @@ -39,7 +38,7 @@ public function select(): ?Game
->whereAllAchievementsUnlocked()
->where('achievements_total', '>', 0)
->where('game_id', '!=', $this->sourceGame->id)
->whereNull('deleted_at')
->withTrashed()
->groupBy('game_id')
->orderByRaw('COUNT(*) DESC')
->limit(10)
Expand Down Expand Up @@ -76,44 +75,14 @@ public function reasonContext(): ?GameSuggestionContextData
* @return Collection<int, int>
*/
private function getMasterUserIds(): Collection
{
$connection = DB::connection()->getDriverName();

return match ($connection) {
'sqlite' => $this->getMasterUserIdsSQLite(),
default => $this->getMasterUserIdsMariaDB(),
};
}

/**
* @return Collection<int, int>
*/
private function getMasterUserIdsMariaDB(): Collection
{
// Use the optimized index hint for MariaDB.
// This is unfortunately not supported by SQLite.
return PlayerGame::from(DB::raw('`player_games` FORCE INDEX (player_games_completion_sample_idx)'))
->where('game_id', $this->sourceGame->id)
->where('user_id', '!=', $this->user->id)
->whereAllAchievementsUnlocked()
->whereNull('deleted_at')
->whereRaw('id % 100 < 5')
->limit(5)
->pluck('user_id');
}

/**
* @return Collection<int, int>
*/
private function getMasterUserIdsSQLite(): Collection
{
// For SQLite, we'll use a different approach to get a stable random sample.
// We use the rowid (which SQLite automatically provides) for sampling.
return PlayerGame::where('game_id', $this->sourceGame->id)
->where('user_id', '!=', $this->user->id)
->whereAllAchievementsUnlocked()
->whereNull('deleted_at')
->orderByRaw('rowid % 100')
->withTrashed()
->whereRaw('id % 100 < 5')
->limit(5)
->pluck('user_id');
}
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
<?php

declare(strict_types=1);

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class() extends Migration {
public function up(): void
{
Schema::table('player_games', function (Blueprint $table) {
$table->dropIndex('player_games_completion_sample_idx');

$table->index([
'user_id',
'achievements_unlocked',
'achievements_total',
'game_id',
], 'player_games_suggestions_index'); // custom name needed because the auto-generated one is too long
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like the player_games_completion_sample_idx, but with the additional deleted_at column. Do we need both?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's currently 0 deleted player_games records.

Maybe you could remove the deleted_at check form the query and leverage the existing index? A little deleted data isn't going to matter for the outcome of random selection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's unfortunately still needed:

Screenshot 2025-01-26 at 10 48 44 AM

The screenshot above is with the index removing that column. Eloquent deciding to automagically include the column is what necessitates it being part of the compound index.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

->withTrashed()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works - fixed in latest.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so. As soon as I delete it, I see these monster queries appear:
Screenshot 2025-01-26 at 10 59 31 AM

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're not leveraging the old index, there's no reason to remove the deleted_at clause.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm happy to implement feedback, but I'm getting confused. What specifically would you like me to do?

My interpretation is you'd like me to add back deleted_at to the index as well as the queries, but we don't have any deleted records.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping you'd be able to use the existing player_games_completion_sample_idx index, but it sounds like that's not being picked up.

 KEY `player_games_completion_sample_idx` (`game_id`,`achievements_unlocked`,`achievements_total`,`user_id`,`id`),

I see now that that index has id in it - why would you ever want id in an index? Maybe we can drop it and replace it with yours.

If you're going to have to add a new index, do whatever you feel is necessary with it - like filtering out the deleted items.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

player_games_completion_sample_idx is a covering index, but id is probably redundant.

Based on what I'm seeing in my local, we don't even need that index anymore and a lot of the code in the strategy can be simplified.

I've updated the new migration and the strategy in latest.

}

public function down(): void
{
Schema::table('player_games', function (Blueprint $table) {
$table->dropIndex('player_games_suggestions_index');

$table->index(
[
'game_id',
'achievements_unlocked',
'achievements_total',
'user_id',
'id',
],
'player_games_completion_sample_idx' // custom name needed because the auto-generated one is too long
);
});
}
};