Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Would it be possible to default the type fields? #157

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bluenote10 opened this issue Apr 10, 2024 · 2 comments
Closed

Would it be possible to default the type fields? #157

bluenote10 opened this issue Apr 10, 2024 · 2 comments

Comments

@bluenote10
Copy link

Currently the construction of the geometry types is a bit verbose and non-DRY, because for each type it is necessary to specify the type discriminator in the code. I.e., the construction looks like this:

  • Point(type="Point", ...)
  • MultiPoint(type="MultiPoint", ...)
  • ...

Having to specify the type make these expressions very long and the type field just repeats an information that is basically expressed by the type (i.e. class) already.

Would you be fine with introducing defaults for these Literals? This would greatly simplify the construction, and since the Geometry type declares the type field as a discriminator for the union, it is still mandatory for the parsing as desired. To demonstrate in a minified example:

from typing import Literal

from pydantic import BaseModel, Field, TypeAdapter, ValidationError
from typing_extensions import Annotated

class Point(BaseModel):
    type: Literal["Point"] = "Point"

class MultiPoint(BaseModel):
    type: Literal["MultiPoint"] = "MultiPoint"

Geometry = Annotated[
    Point | MultiPoint,
    Field(discriminator="type"),
]

# Simplifies construction:
a = Point()
b = MultiPoint()

# Verify parsing
type_adapter = TypeAdapter[Geometry](Geometry)
assert isinstance(
    type_adapter.validate_json('{"type": "Point"}'),
    Point,
)
assert isinstance(
    type_adapter.validate_json('{"type": "MultiPoint"}'),
    MultiPoint,
)

# Parsing a `Geometry` still requires the `type` field, because it is the discriminator.
try:
    type_adapter.validate_json("{}")
except ValidationError as e:
    print(f"\nAs desired, trying to validate without a `type` field fails with:\n{e}")

# Note that it is also not possible to parse a specific type with the "wrong" `type` field:
try:
    Point.model_validate_json('{"type": "MultiPoint"}')
except ValidationError as e:
    print(f"\nAs desired, trying to validate with wrong type fails with::\n{e}")

The only behavior that would change is that it would now in principle be possible to parse either a Point or MultiPoint from a data structure without a type field. But I think this is fine because:

  • If the input follows RFC 7946, it has to have a type field, i.e., this case cannot occur for proper GeoJSON.
  • For the rare case that the input does not follow RFC 7946 and the type field is missing, there is no clear right or wrong how to handle it anyway. If a user tries to parse a Point from such "broken" GeoJSON it may be fine if the parsing succeeds as long as the rest of the data payload fits. Note that this will also only allow parsing specific types like Point, MultiPoint, ... but not a general Geometry (because it needs the discriminator), so such cases seem rather pathological.
@vincentsarago
Copy link
Member

type is a required key from the specification POV, and geojson-pydantic is trying to match it as close a possible.

This topic has been discussed multiple times already:

I'm sorry it makes things verbose but at the same time this is the only way to make sure a geojson object is valid.

Taking your example 👇 will work, while this is not a valid Point Geometry object

Point.model_validate_json('{}')

@bluenote10
Copy link
Author

Taking your example 👇 will work, while this is not a valid Point Geometry object

I tried to cover this example in the last paragraph, especially the last bullet point.

From my experience the vast majority of use cases parses GeoJSON on a higher level like Feature, FeatureCollection, or GeometryCollection. In all these use cases an invalid geometry like that (missing the type field) would still be detected as such, because it is needed as a discriminator.

Note that the example we are talking about is very rare, because:

  • The generated GeoJSON has to be invalid to begin with, and in a very specific way: The only the type field is allowed to be missing, the rest of the payload must be valid.
  • It must be a use cases which parses the data on the lowest level like Point, which are rather rare from my experience.

On the other hand, the repetitiveness in the construction is rather pervasive, and is something that jumps into the face of every developer working with the library (triggering e.g. discussions in code reviews about the ugliness of the code). Therefore, it would be a case where I'd trade-off ergonomics in favor of catering for the needs of a rare use case, especially since the resulting set of accepted GeoJSON is basically a superset of the specs.

@developmentseed developmentseed locked and limited conversation to collaborators Apr 10, 2024
@vincentsarago vincentsarago converted this issue into discussion #158 Apr 10, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants