-
Notifications
You must be signed in to change notification settings - Fork 2k
fix: run cohortpeople count query without joins #39925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -339,6 +339,29 @@ def test_insert_users_list_by_uuid(self): | |
assert cohort_person_uuids == set(uuids) | ||
assert cohort.is_calculating is False | ||
|
||
def test_static_cohort_size_validates_team(self): | ||
from posthog.models.cohort.util import get_static_cohort_size | ||
|
||
# Create another team in the same organization (different project) | ||
team2 = Team.objects.create(organization=self.organization) | ||
|
||
# Create people in both teams | ||
Person.objects.create(team=self.team, distinct_ids=["person1_team1"]) | ||
Person.objects.create(team=self.team, distinct_ids=["person2_team1"]) | ||
Person.objects.create(team=team2, distinct_ids=["person1_team2"]) | ||
|
||
# Create a static cohort in team1 | ||
cohort = Cohort.objects.create(team=self.team, is_static=True, name="test cohort") | ||
cohort.insert_users_by_list(["person1_team1", "person2_team1"]) | ||
|
||
# Count should work for the correct team | ||
count_correct = get_static_cohort_size(cohort_id=cohort.pk, team_id=self.team.pk) | ||
assert count_correct == 2 | ||
|
||
# Count should be 0 for a different team (validates team ownership) | ||
count_wrong_team = get_static_cohort_size(cohort_id=cohort.pk, team_id=team2.pk) | ||
assert count_wrong_team == 0 | ||
Comment on lines
+342
to
+363
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. logic: Test doesn't cover the multi-team-in-same-project scenario where the bug occurs. The current test validates team isolation across different organizations, but doesn't test the case where multiple teams in the same project share a cohort. In Add a test case like: # Create two teams in the same project
team2 = Team.objects.create(organization=self.organization, project_id=self.team.project_id)
# Add people to the cohort from both teams
cohort.insert_users_by_list(["person1_team1"], team_id=self.team.pk)
cohort.insert_users_by_list(["person1_team2"], team_id=team2.pk)
# Each team should see only their count
assert get_static_cohort_size(cohort_id=cohort.pk, team_id=self.team.pk) == 1
assert get_static_cohort_size(cohort_id=cohort.pk, team_id=team2.pk) == 1 Prompt To Fix With AIThis is a comment left during a code review.
Path: posthog/test/test_cohort_model.py
Line: 342:363
Comment:
**logic:** Test doesn't cover the multi-team-in-same-project scenario where the bug occurs.
The current test validates team isolation across different organizations, but doesn't test the case where multiple teams in the same project share a cohort. In `recalculate_cohortpeople`, all teams in a project get CohortPeople entries for the same `cohort_id` but with different `team_id` values.
Add a test case like:
```python
# Create two teams in the same project
team2 = Team.objects.create(organization=self.organization, project_id=self.team.project_id)
# Add people to the cohort from both teams
cohort.insert_users_by_list(["person1_team1"], team_id=self.team.pk)
cohort.insert_users_by_list(["person1_team2"], team_id=team2.pk)
# Each team should see only their count
assert get_static_cohort_size(cohort_id=cohort.pk, team_id=self.team.pk) == 1
assert get_static_cohort_size(cohort_id=cohort.pk, team_id=team2.pk) == 1
```
How can I resolve this? If you propose a fix, please make it concise. |
||
|
||
def test_insert_users_by_list_avoids_duplicates_with_batching(self): | ||
"""Test that batching with duplicates works correctly - people already in cohort are not re-inserted.""" | ||
# Create people with distinct IDs | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Missing
team_id
filter in count query causes incorrect counts in multi-team projects.When
recalculate_cohortpeople
runs for a project with multiple teams (environments), it calls this function for each team. The samecohort_id
can have CohortPeople entries with differentteam_id
values. Without filtering byteam_id
, all teams get the same total count instead of team-specific counts.The ClickHouse table has
team_id
but the Django model doesn't expose it. Use raw SQL to filter properly:Prompt To Fix With AI