Skip to content

INTPYTHON-624 Add PolymorphicEmbeddedModelField #327

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

timgraham
Copy link
Collaborator

@timgraham timgraham commented Jun 25, 2025

No description provided.

if transform:
return transform
field = None
for model in self.embedded_models:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if multiple submodes has the same field name?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I called this out in the design doc. I think we'll have to use the system check framework to enforce that common field names at least use the same type, otherwise, we won't know how to prepare the lookup value (e.g. the DatabaseOperations.adapt_<field>_value() methods).

Thinking about it some more, there is also the possibility that nested embedded documents share a field name. In that case, we won't know which field to traverse for the nested lookups that follow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is also the possibility that nested embedded documents share a field name

This would only matter in the case of nested polymorphic fields, right? I think we can maintain that polymorphic fields of the same name should also require the same embedded models to define them? This then keeps us in alignment with the requirement that all name conflicts "are of the same type".

class Colonial(EmbeddedModel):
    ...

class Townhome(EmbeddedModel):
    ...

class Condo(EmbeddedModel):
    ...
    
class AmericanAddress(EmbeddedModel):
    city = models.CharField(...)
    state = models.CharField(...)
    zip_code = models.CharField(...)
    home_specifications = PolymorphicEmbeddedModelField([Colonial, Townhome, Condo],...)
    

class CanadianAddress(EmbeddedModel):
    city = models.CharField(...)
    province = models.CharField(...)
    zip_code = models.CharField(...)
    home_specifications = PolymorphicEmbeddedModelField([Colonial, Townhome, Condo],...)
    
class Home(models.Model):
    address = PolymorphicEmbeddedModelField([AmericanAddress, CanadianAddress], ...)
    num_inhabitants = models.IntegerField(...)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a reasonable restriction. Whether it's worth adding a system check for that at this stage, I'm doubtful. There are many other cases where the current system check falls short. For example, the db_type() of EmbeddededModelField and the polymorphic version are both "embeddedDocuments", but there could be an incompatibility there. It just seems like a lot of various cases and combinations to consider, and they are technically okay, except for the querying issue which the user may or may not care about (thus a system check warning rather than an error).

@timgraham timgraham force-pushed the polymorphic branch 2 times, most recently from fba32d9 to 817d03a Compare July 1, 2025 00:49
@timgraham timgraham marked this pull request as ready for review July 1, 2025 01:13
)
field_name = field.name
if existing_field := embedded_fields.get(field.name):
connection = _get_mongodb_connection()
Copy link
Collaborator

@WaVEV WaVEV Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check should run anytime? or only when a migration is applied on the database? 🤔
Given that make migrations does not pass the database, because it should be agnostic to that but when a migration is applied it should not be agnostic. So when do we want to check this? when a migration is created, or anytime?

Edit: change the way that field types are compared is viable?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's intended to run at the makemigrations stage in order to flag the issue as early as possible. I think it's better for all database-specific checks to run during makemigrations but this requires changes to Django as discussed in https://code.djangoproject.com/ticket/31286#comment:7.

Copy link
Contributor

@Jibola Jibola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't finished looking at the tests, but I've given a look at the code. Overall, I'm in alignment with the approach and am even down to discuss "doubling-down" on embedded model usage. We can weigh the pros and cons outside of this PR.

if transform:
return transform
field = None
for model in self.embedded_models:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is also the possibility that nested embedded documents share a field name

This would only matter in the case of nested polymorphic fields, right? I think we can maintain that polymorphic fields of the same name should also require the same embedded models to define them? This then keeps us in alignment with the requirement that all name conflicts "are of the same type".

class Colonial(EmbeddedModel):
    ...

class Townhome(EmbeddedModel):
    ...

class Condo(EmbeddedModel):
    ...
    
class AmericanAddress(EmbeddedModel):
    city = models.CharField(...)
    state = models.CharField(...)
    zip_code = models.CharField(...)
    home_specifications = PolymorphicEmbeddedModelField([Colonial, Townhome, Condo],...)
    

class CanadianAddress(EmbeddedModel):
    city = models.CharField(...)
    province = models.CharField(...)
    zip_code = models.CharField(...)
    home_specifications = PolymorphicEmbeddedModelField([Colonial, Townhome, Condo],...)
    
class Home(models.Model):
    address = PolymorphicEmbeddedModelField([AmericanAddress, CanadianAddress], ...)
    num_inhabitants = models.IntegerField(...)

Comment on lines 290 to 292
Additionally, querying into nested embedded model fields with the same name
isn't well supported: the first model in ``embedded_models`` is the one that
will be used for lookups.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be where we add the warning to try and maintain consistency as best as possible.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A warning when querying? Probably a system check warning to flag it at model setup time similar to the check I already implemented would be more useful in case the user wants to redefine their models. As I mentioned in the other comment, I'm not sure it's worth doing right now.

@@ -12,6 +12,8 @@
class EmbeddedModelField(models.Field):
"""Field that stores a model instance."""

value_is_model_instance = True
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just passing by, could this be is_model_instance or is it important to know that the field is considered a "value" to something else ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not in love with the name either. It means "the field's value is a model instance." An alternative could be Field.stores_model_instance. It's not a big deal, since it's merely a private API in order to allow the check in django_mongodb_backend/compiler.py to be more generic.

@@ -13,6 +14,7 @@
"EmbeddedModelField",
"ObjectIdAutoField",
"ObjectIdField",
"PolymorphicEmbeddedModelField",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive by comment. I am not trying to convince anyone of anything, but I would love to use this feature if it were called GenericEmbeddedModelField. Apologies but I just cannot with the Polymorphic… for two reasons:

  • What is Polymorphic? I have to look it up every single time.
  • Generic is shorter than Polymorphic
  • Generic has some context in Django

Unlike my concern about using Embedded… terminology, there may be some justification that Generic could be used instead of Polymorphic and still be appropriate (whereas with Embedded terminology we're purposely steering away from SQL terminology.)

Copy link
Contributor

@Jibola Jibola Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR -- Polymorphic is a more holistically accurate representation of this field and avoids confusion.

I'm against the use of GenericEmbeddedModelField for several reasons, but the most pertinent ones are:

  1. It's the incorrect usage of the term Generic

    • Our polymorphism (what's commonly viewed as polymorphism) abides by subtype polymorphism which focuses more on fields sharing commonality, whereas Generics abide by parametric polymorphism which emphasize a set of "actions" available to any object/value passed. The contents of that object mean much less than what it can actually do. See the blurb below:

We have already seen that object-oriented languages support subtype polymorphism; generics give us a different kind of polymorphism, parametric polymorphism. Recall that polymorphism means that something can be used at multiple types. Subtype polymorphism allows client code to be written with respect to a single type (as specified by, say, an interface or abstract class) while interoperating with multiple implementations of that type. Parametric polymorphism, also known as generics, solves the opposite problem, in a sense: it allows a single implementation to interoperate with multiple clients who want to use that implementation for different types. This goal is achieved by writing the implementation with respect to a type parameter, a variable ranging over types that can be instantiated with different concrete types. Support for generics is not an essentially object-oriented idea, and was not originally part of Java, having been first introduced in Java 5.

Source

  1. It would further confuse folks

    • In both Django Generic Relations and the, now defunct, Django-MongoDB-Engine Generic Embedding "Generics" implement the properties defined in the point above. Rather than a list of explicitly defined items like in our polymorphic design, it takes quite literally anything and will make it work in its system. Were we to also call it a Generic* it would confuse developers because they could expect those qualities instead of realizing things still need to be explicitly typed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants