Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: tags skip param isn't working as expected #394

Merged
merged 10 commits into from
Mar 31, 2025
Merged

bug: tags skip param isn't working as expected #394

merged 10 commits into from
Mar 31, 2025

Conversation

tipogi
Copy link
Collaborator

@tipogi tipogi commented Mar 26, 2025

Summary

This PR introduces two key improvements to reduce unnecessary database access and improve efficiency when interacting with sorted set indices and tag data.

1. Conditional Fallback to Graph on Sorted Set Miss

Previously, the system would fall back to the graph database whenever a sorted set query returned no results—regardless of whether the index actually existed. This caused unnecessary graph lookups even when the cache index was present but the skip/limit filters resulted in no remaining items.

The updated implementation now distinguishes between:

  • The index not existing — triggers a graph lookup to populate the cache
  • The index existing but exhausted due to filters — skips the graph lookup and simply returns an empty result

This change eliminates redundant graph queries and enforces a clear separation between cache absence and pagination exhaustion.

2. Deferred Tag Index Lookup Based on Tag Count

In the previous implementation, when retrieving user or post details, all related queries—including tags—were executed in parallel. This meant that even when a user or post had no tags, the system would still query the tag index. Since the index doesn't exist in those cases, these lookups were unnecessary and inefficient.

The updated implementation moves the tag-related lookup out of the parallel execution flow. We now compute tag counts first, and only query the tag index if the count is greater than zero. If there are no tags, the query is skipped entirely, and an empty array is returned.

Pre-submission Checklist

For tests to work you need a working neo4j and redis instance with the example dataset in docker/db-graph

  • Testing: Implement and pass new tests for the new features/fixes, cargo nextest run.
  • Performance: Ensure new code has relevant performance benchmarks, cargo bench -p nexus-api

@tipogi tipogi added bug Something isn't working 🌐 service labels Mar 26, 2025
@tipogi tipogi self-assigned this Mar 26, 2025
Copy link
Collaborator

@SHAcollision SHAcollision left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not test, but if it works as expected for you, let's go! Maybe one test case to cover the correctness of this fix could be added if anything.

Comment on lines 55 to 58
if tags.len() == 0 {
Err(Error::EmptyStream {
message: "No tags found for the given criteria.".to_string(),
})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"No tags provided" ??

@tipogi tipogi changed the title 204 response when filters do not match and index exists bug: tags skip param isn't working as expected Mar 27, 2025
@tipogi tipogi marked this pull request as ready for review March 27, 2025 18:12
@tipogi tipogi requested a review from SHAcollision March 27, 2025 18:12
Copy link
Collaborator

@SHAcollision SHAcollision left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @tipogi ! 🚀 The changes improve caching logic, reduce unnecessary queries (e.g., for tags when counts are zero), and add valuable documentation, the main area for improvement is the consistency in handling “empty” results across different layers of the application.

We should decide on a uniform pattern for empty responses and if possible update both the API and maybe the internal function signatures accordingly.

///
/// * `items` - The vector of items to be returned in the JSON response
/// * `model` - A string used in the error message to indicate which kind of item was expected
pub fn as_json_or_error<T>(items: Vec<T>, model: &str) -> Result<Json<Vec<T>>> {
Copy link
Collaborator

@SHAcollision SHAcollision Mar 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if I understand this well. It seems on some endpoints (like notifications, search/tags, and user/tags), empty results are now treated as an error via as_json_or_error ? whereas in other parts of the code (for example, in some internal functions in the post stream), an empty vector is returned directly. It might be clearer to decide on one consistent approach for empty responses across the API: either always returning an empty array (with a 200/204 status) or returning an error that maps to a specific status code. I guess we always had this inconsistency.

Copy link
Collaborator

@SHAcollision SHAcollision Mar 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTP Semantics: In REST, an empty list is often acceptable. Many APIs simply return a 200 with an empty array or use a 204 “No Content” status without a body (this is what we are doing).

The function name as_json_or_error is a bit vague. A more descriptive name (for example, json_arrat_or_no_content_response ) might help. I guess the bigger confusion is here, because we call this "error" and we raise an "error" however our Error::EmptyStream maps to StatusCode::NoContent (204) and that's not an error response, but a successful response.

The generic parameter T isn’t bounded by a trait (like Serialize), which might cause issues if used with types that aren’t serializable. Adding a constraint (T: Serialize) would be safer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! The fn xx_error name was misleading, since the response isn't treated as an actual error — it just returns an empty array case as 204, No Content. I believe we had already defined this behavior as our standard across the API, but it looks like some endpoints hadn't adopted it yet. Fixed that now!

pub fn as_json_or_error<T>(items: Vec<T>, model: &str) -> Result<Json<Vec<T>>> {
if items.is_empty() {
Err(Error::EmptyStream {
message: format!("No {:?} found for the given criteria", model),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

model is a &str so {:?} (debug display) could be just {} to avoid unnecessary quotes.

@@ -130,7 +130,9 @@ pub async fn hot_tags_handler(Query(query): Query<HotTagsQuery>) -> Result<Json<

match HotTags::get_hot_tags(query.user_id, query.reach, &input).await {
Ok(Some(hot_tags)) => Ok(Json(hot_tags)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need as_json_or_error() here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, there are some models that does already that internal work when the array is empty and it converts the result as a None type. In that line, 132, only will enter the results that have content

@@ -90,6 +90,7 @@ where
{
Some(tag_details) => Ok(Some(tag_details)),
None => {
println!("Get from graph");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover println!

@SHAcollision
Copy link
Collaborator

The updated implementation now distinguishes between:

  • The index not existing — triggers a graph lookup to populate the cache
  • The index existing but exhausted due to filters — skips the graph lookup and simply returns an empty result

Could the sorted set exist and be empty? If the sorted set is only created when it has at least 1 element we will always be querying the graph on every request for a post that has no tags. Do we need graph fallback for this at all?

@SHAcollision
Copy link
Collaborator

Looks good! Ready to go. We will look in detail into whether all API calls are optimized and harmonization.

@SHAcollision SHAcollision self-requested a review March 31, 2025 18:25
@tipogi tipogi merged commit c7bef40 into main Mar 31, 2025
3 checks passed
@tipogi tipogi deleted the bug/tag-skip branch March 31, 2025 18:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working 🌐 service
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants