Would it be possible to default the type fields? #157

@bluenote10

Currently the construction of the geometry types is a bit verbose and non-DRY, because for each type it is necessary to specify the type discriminator in the code. I.e., the construction looks like this:

  • Point(type="Point", ...)
  • MultiPoint(type="MultiPoint", ...)
  • ...

Having to specify the type makes these expressions very long, and the type field just repeats information that is already expressed by the type (i.e., the class) itself.

Would you be fine with introducing defaults for these Literals? This would greatly simplify construction, and since the Geometry type declares the type field as the discriminator for the union, the field remains mandatory when parsing a Geometry, as desired. To demonstrate with a minimal example:

from typing import Literal

from pydantic import BaseModel, Field, TypeAdapter, ValidationError
from typing_extensions import Annotated

class Point(BaseModel):
    type: Literal["Point"] = "Point"

class MultiPoint(BaseModel):
    type: Literal["MultiPoint"] = "MultiPoint"

Geometry = Annotated[
    Point | MultiPoint,
    Field(discriminator="type"),
]

# Simplifies construction:
a = Point()
b = MultiPoint()

# Verify parsing
type_adapter = TypeAdapter[Geometry](Geometry)
assert isinstance(
    type_adapter.validate_json('{"type": "Point"}'),
    Point,
)
assert isinstance(
    type_adapter.validate_json('{"type": "MultiPoint"}'),
    MultiPoint,
)

# Parsing a `Geometry` still requires the `type` field, because it is the discriminator.
try:
    type_adapter.validate_json("{}")
except ValidationError as e:
    print(f"\nAs desired, trying to validate without a `type` field fails with:\n{e}")

# Note that it is also not possible to parse a specific type with the "wrong" `type` field:
try:
    Point.model_validate_json('{"type": "MultiPoint"}')
except ValidationError as e:
    print(f"\nAs desired, trying to validate with the wrong type fails with:\n{e}")

The only behavior that would change is that it would now in principle be possible to parse either a Point or MultiPoint from a data structure without a type field. But I think this is fine because:

  • If the input follows RFC 7946, it has to have a type field, i.e., this case cannot occur for proper GeoJSON.
  • In the rare case that the input does not follow RFC 7946 and the type field is missing, there is no clearly right or wrong way to handle it anyway. If a user tries to parse a Point from such "broken" GeoJSON, it may be acceptable for parsing to succeed as long as the rest of the payload fits. Note that this only allows parsing specific types like Point, MultiPoint, ... but not a general Geometry (because it needs the discriminator), so such cases seem rather pathological.
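
To make the changed behavior concrete, here is a minimal sketch that reuses the minified Point and MultiPoint models from above (not the library's real geometry classes, which carry coordinates and other fields) and assumes pydantic v2. It shows that a specific class would accept input without a type field and fill in the default, while the discriminated Geometry union would still reject the same input:

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError


# Minified stand-ins for the real geometry models (assumption: no other fields).
class Point(BaseModel):
    type: Literal["Point"] = "Point"


class MultiPoint(BaseModel):
    type: Literal["MultiPoint"] = "MultiPoint"


Geometry = Annotated[Union[Point, MultiPoint], Field(discriminator="type")]
geometry_adapter = TypeAdapter(Geometry)

# New behavior: a specific class parses input that lacks `type`
# and fills in the default value.
point = Point.model_validate_json("{}")
assert point.type == "Point"

# The defaulted field is still part of the serialized output.
assert point.model_dump() == {"type": "Point"}

# Unchanged behavior: the discriminated union rejects the same input,
# because it cannot pick a member without the discriminator.
try:
    geometry_adapter.validate_json("{}")
    union_rejected = False
except ValidationError:
    union_rejected = True
assert union_rejected
```

In other words, the relaxation would only be visible to callers who already know which concrete geometry class they expect.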
