Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MBS-9253: List EP release groups above singles #3387

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion admin/sql/CreateFunctions.sql
Original file line number Diff line number Diff line change
Expand Up @@ -1928,7 +1928,12 @@ BEGIN
a_rg.artist,
-- Withdrawn releases were once official by definition
bool_and(r.status IS NOT NULL AND r.status != 1 AND r.status != 5),
rg.type::SMALLINT,
(CASE
WHEN rg.type = 3 THEN 2 -- Sort EPs above singles
WHEN rg.type = 2 THEN 3 -- Sort singles below EPs
ELSE rg.type
END
)::SMALLINT,
Comment on lines +1931 to +1936
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with your solution here as it's pretty straightforward, though I wondered if we should be using release_group_primary_type.child_order for this at first. Unless we want the general order of the types in a list to differ from the discography display order? (It would also require a new trigger on release_group_primary_type for when the child_order changes.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use that I guess, yes - it never even crossed my mind since I never think about it at all... Wouldn't that be more annoying though since it'd require new triggers and a join here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it'd also require one more join. The advantage would be that we could tweak the order again in the future without a schema change. (My only worry would be the trigger taking forever due to the number of rows needing to be updated after a child_order change. We could do it from psql with statement_timeout disabled, but it might time out from the UI.) Let's see what @yvanzo thinks?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh I think further order changes are unlikely enough that there might not be a point here - we rarely add primary types and I cannot think of any that would require special sorting.

For secondary types that might actually be useful though, if we decide to implement a specific order for those (MBS-13790).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered if we should be using release_group_primary_type.child_order for this at first.

Yes, that seems obvious: child_order is purposed for ordering, not type.

require new trigger(s) […] taking forever

I’m not sure what trigger(s) you are speaking about.

About search, neither type nor child_order are looked after, only gid and name, so there won’t be any issue.

Copy link
Member Author

@reosarevok reosarevok Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure what trigger(s) you are speaking about.

Triggers to update artist_release_group whenever release_group_primary_type changes, since that means we need that data in here for the sort (and that means we also need an extra join here, of course). I don't expect to change the child_order often so it might be okay though 🤷‍♂️

array_agg(
DISTINCT st.secondary_type ORDER BY st.secondary_type)
FILTER (WHERE st.secondary_type IS NOT NULL
Expand Down
93 changes: 93 additions & 0 deletions admin/sql/updates/20241017-mbs-9253.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,93 @@
\set ON_ERROR_STOP 1

BEGIN;

CREATE OR REPLACE FUNCTION get_artist_release_group_rows(
release_group_id INTEGER
) RETURNS SETOF artist_release_group AS $$
BEGIN
-- PostgreSQL 12 generates a vastly more efficient plan when only
-- one release group ID is passed. A condition like
-- `rg.id = any(...)` can be over 200x slower, even with only one
-- release group ID in the array.
RETURN QUERY EXECUTE $SQL$
SELECT DISTINCT ON (a_rg.artist, rg.id)
a_rg.is_track_artist,
a_rg.artist,
-- Withdrawn releases were once official by definition
bool_and(r.status IS NOT NULL AND r.status != 1 AND r.status != 5),
(CASE
WHEN rg.type = 3 THEN 2 -- Sort EPs above singles
WHEN rg.type = 2 THEN 3 -- Sort singles below EPs
ELSE rg.type
END
)::SMALLINT,
array_agg(
DISTINCT st.secondary_type ORDER BY st.secondary_type)
FILTER (WHERE st.secondary_type IS NOT NULL
)::SMALLINT[],
integer_date(
rgm.first_release_date_year,
rgm.first_release_date_month,
rgm.first_release_date_day
),
left(rg.name, 1)::CHAR(1),
rg.id
FROM (
SELECT FALSE AS is_track_artist, rgacn.artist, rg.id AS release_group
FROM release_group rg
JOIN artist_credit_name rgacn ON rgacn.artist_credit = rg.artist_credit
UNION ALL
SELECT TRUE AS is_track_artist, tacn.artist, r.release_group
FROM release r
JOIN medium m ON m.release = r.id
JOIN track t ON t.medium = m.id
JOIN artist_credit_name tacn ON tacn.artist_credit = t.artist_credit
) a_rg
JOIN release_group rg ON rg.id = a_rg.release_group
LEFT JOIN release r ON r.release_group = rg.id
JOIN release_group_meta rgm ON rgm.id = rg.id
LEFT JOIN release_group_secondary_type_join st ON st.release_group = rg.id
$SQL$ || (CASE WHEN release_group_id IS NULL THEN '' ELSE 'WHERE rg.id = $1' END) ||
$SQL$
GROUP BY a_rg.is_track_artist, a_rg.artist, rgm.id, rg.id
ORDER BY a_rg.artist, rg.id, a_rg.is_track_artist
$SQL$
USING release_group_id;
END;
$$ LANGUAGE plpgsql;

-- We update the table for any existing RGs of type Single or EP
-- with this one-off script; the updated function will keep it up after this.
DO $$
DECLARE
release_group_ids INTEGER[];
release_group_id INTEGER;
BEGIN
SELECT array_agg(DISTINCT rg.id)
INTO release_group_ids
FROM release_group rg
WHERE rg.type = 2 OR rg.type = 3; -- Single or EP

IF coalesce(array_length(release_group_ids, 1), 0) > 0 THEN
-- If the user hasn't generated `artist_release_group`, then we
-- shouldn't update or insert to it. MBS determines whether to
-- use this table based on it being non-empty, so a partial
-- table would manifest as partial data on the website and
-- webservice.
PERFORM 1 FROM artist_release_group LIMIT 1;
IF FOUND THEN
DELETE FROM artist_release_group WHERE release_group = any(release_group_ids);

FOREACH release_group_id IN ARRAY release_group_ids LOOP
-- We handle each release group ID separately because
-- the `get_artist_release_group_rows` query can be
-- planned much more efficiently that way.
INSERT INTO artist_release_group
SELECT * FROM get_artist_release_group_rows(release_group_id);
END LOOP;
END IF;
END IF;
END $$;

COMMIT;
99 changes: 99 additions & 0 deletions admin/sql/updates/schema-change/30.all.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
-- Generated by CompileSchemaScripts.pl from:
-- 20241017-mbs-9253.sql
\set ON_ERROR_STOP 1
BEGIN;
SET search_path = musicbrainz, public;
SET LOCAL statement_timeout = 0;
--------------------------------------------------------------------------------
SELECT '20241017-mbs-9253.sql';


CREATE OR REPLACE FUNCTION get_artist_release_group_rows(
release_group_id INTEGER
) RETURNS SETOF artist_release_group AS $$
BEGIN
-- PostgreSQL 12 generates a vastly more efficient plan when only
-- one release group ID is passed. A condition like
-- `rg.id = any(...)` can be over 200x slower, even with only one
-- release group ID in the array.
RETURN QUERY EXECUTE $SQL$
SELECT DISTINCT ON (a_rg.artist, rg.id)
a_rg.is_track_artist,
a_rg.artist,
-- Withdrawn releases were once official by definition
bool_and(r.status IS NOT NULL AND r.status != 1 AND r.status != 5),
(CASE
WHEN rg.type = 3 THEN 2 -- Sort EPs above singles
WHEN rg.type = 2 THEN 3 -- Sort singles below EPs
ELSE rg.type
END
)::SMALLINT,
array_agg(
DISTINCT st.secondary_type ORDER BY st.secondary_type)
FILTER (WHERE st.secondary_type IS NOT NULL
)::SMALLINT[],
integer_date(
rgm.first_release_date_year,
rgm.first_release_date_month,
rgm.first_release_date_day
),
left(rg.name, 1)::CHAR(1),
rg.id
FROM (
SELECT FALSE AS is_track_artist, rgacn.artist, rg.id AS release_group
FROM release_group rg
JOIN artist_credit_name rgacn ON rgacn.artist_credit = rg.artist_credit
UNION ALL
SELECT TRUE AS is_track_artist, tacn.artist, r.release_group
FROM release r
JOIN medium m ON m.release = r.id
JOIN track t ON t.medium = m.id
JOIN artist_credit_name tacn ON tacn.artist_credit = t.artist_credit
) a_rg
JOIN release_group rg ON rg.id = a_rg.release_group
LEFT JOIN release r ON r.release_group = rg.id
JOIN release_group_meta rgm ON rgm.id = rg.id
LEFT JOIN release_group_secondary_type_join st ON st.release_group = rg.id
$SQL$ || (CASE WHEN release_group_id IS NULL THEN '' ELSE 'WHERE rg.id = $1' END) ||
$SQL$
GROUP BY a_rg.is_track_artist, a_rg.artist, rgm.id, rg.id
ORDER BY a_rg.artist, rg.id, a_rg.is_track_artist
$SQL$
USING release_group_id;
END;
$$ LANGUAGE plpgsql;

-- We update the table for any existing RGs of type Single or EP
-- with this one-off script; the updated function will keep it up after this.
DO $$
DECLARE
release_group_ids INTEGER[];
release_group_id INTEGER;
BEGIN
SELECT array_agg(DISTINCT rg.id)
INTO release_group_ids
FROM release_group rg
WHERE rg.type = 2 OR rg.type = 3; -- Single or EP

IF coalesce(array_length(release_group_ids, 1), 0) > 0 THEN
-- If the user hasn't generated `artist_release_group`, then we
-- shouldn't update or insert to it. MBS determines whether to
-- use this table based on it being non-empty, so a partial
-- table would manifest as partial data on the website and
-- webservice.
PERFORM 1 FROM artist_release_group LIMIT 1;
IF FOUND THEN
DELETE FROM artist_release_group WHERE release_group = any(release_group_ids);

FOREACH release_group_id IN ARRAY release_group_ids LOOP
-- We handle each release group ID separately because
-- the `get_artist_release_group_rows` query can be
-- planned much more efficiently that way.
INSERT INTO artist_release_group
SELECT * FROM get_artist_release_group_rows(release_group_id);
END LOOP;
END IF;
END IF;
END $$;

COMMIT;
10 changes: 8 additions & 2 deletions lib/MusicBrainz/Server/Data/ReleaseGroup.pm
Original file line number Diff line number Diff line change
Expand Up @@ -349,14 +349,20 @@ sub _find_by_artist_slow
ON rgstj.secondary_type = rgst.id
WHERE rgstj.release_group = rg.id
ORDER BY name ASC
) secondary_types
) secondary_types,
(CASE
WHEN rg.type = 3 THEN 2 -- Sort EPs above singles
WHEN rg.type = 2 THEN 3 -- Sort singles below EPs
ELSE rg.type
END
) as sorted_type
FROM ' . $self->_table . '
JOIN artist_credit_name acn
ON acn.artist_credit = rg.artist_credit
' . join(' ', @$extra_joins) . '
WHERE ' . join(' AND ', @$conditions) . '
ORDER BY
rg.type, secondary_types,
sorted_type, secondary_types,
rgm.first_release_date_year,
rgm.first_release_date_month,
rgm.first_release_date_day,
Expand Down
5 changes: 5 additions & 0 deletions upgrade.json
Original file line number Diff line number Diff line change
Expand Up @@ -230,5 +230,10 @@
"20240223-mbs-13421-fks.sql",
"20240319-mbs-13514.sql"
]
},
"30": {
"all": [
"20241017-mbs-9253.sql"
]
}
}