Improvements to SchemaGenerator #172

Closed
tomchristie opened this issue Nov 1, 2018 · 18 comments
Labels
clean up Refinement to existing functionality

Comments

@tomchristie
Member

tomchristie commented Nov 1, 2018

  • Include mounted paths.
  • Ensure that OpenAPI path parameters in docstrings match up with the router path parameters.
  • Automatically add any path parameters that are not already included?
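
A minimal sketch of the second and third points, assuming path parameters appear as `{name}` or `{name:convertor}` segments in the route path and that the docstring has already been parsed into an OpenAPI operation dict. The helpers `path_param_names` and `sync_path_parameters` are hypothetical names for illustration, not part of Starlette.

```python
import re
from typing import Dict, List

# Matches "{user_id}" as well as "{user_id:int}" style segments.
PARAM_RE = re.compile(r"\{([a-zA-Z_][a-zA-Z0-9_]*)(?::[a-zA-Z_][a-zA-Z0-9_]*)?\}")


def path_param_names(path: str) -> List[str]:
    """Return the parameter names found in a route path, in order."""
    return PARAM_RE.findall(path)


def sync_path_parameters(path: str, operation: Dict) -> List[str]:
    """Check documented path parameters against the route path.

    Raises ValueError for documented path parameters the route does not
    have, adds a default entry for each undocumented one, and returns the
    names that were added.
    """
    parameters = operation.setdefault("parameters", [])
    documented = [p["name"] for p in parameters if p.get("in") == "path"]
    expected = path_param_names(path)

    unknown = [name for name in documented if name not in expected]
    if unknown:
        raise ValueError(f"{path}: unknown path parameters {unknown}")

    added = []
    for name in expected:
        if name not in documented:
            # Assumption: undocumented parameters default to plain strings.
            parameters.append(
                {"name": name, "in": "path", "required": True,
                 "schema": {"type": "string"}}
            )
            added.append(name)
    return added


operation = {"parameters": [{"name": "user_id", "in": "path"}]}
added = sync_path_parameters("/users/{user_id}/pets/{pet_id:int}", operation)
print(added)
```

Raising on unknown names while silently filling in missing ones is only one possible policy; a generator could equally warn in both cases.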

There's also a stack of really nice third party work that we can do here, eg packaging up Swagger UI or Redoc or API Star into API docs apps.

Contributor work towards any of the above would be very much welcome! 😄

@tiangolo
Member

tiangolo commented Nov 4, 2018

First, Starlette source code is a pleasure to read.
It's been a long time since I enjoyed reading an already established project's source code like this one. Awesome 👏 🍰 🌮


Here are my plans.

I imagine you would prefer me to build it as a separate project to not bloat Starlette, to keep it as pure ASGI as possible and to avoid putting very interdependent stuff together here.

But if you think it would be acceptable/desirable to have it integrated into Starlette, my first option would be, of course, to create a huge PR / set of PRs implementing it as part of Starlette.


Here's what I want to build.

First, my main point/concern is to achieve a specific set of features, that overlap a lot with what you did in APIStar.

I'm actually starting from the development experience I want to have, and then finding, mixing and creating the tools that would provide it.

I want it to have this:

  • Support for declaring OpenAPI 3, with all/most of its attributes. Not necessarily implementing all the logic, but allowing the schema to be declared. Actually, my plan is to build it around OpenAPI, using its keywords, etc.
  • Very simple and intuitive code API.
  • Great editor support. Autocomplete and type checking everywhere. For example, I want the endpoint route decorator to have explicit default declaration of name=None, description=None, **kwargs, etc. instead of just **kwargs. So that when I'm creating a route, I can get autocompletion and suggestions of the parameters I need to put instead of having to go and find the docs for that function to see what I can pass (part of having simple and intuitive API).
  • Have code / class / type based declarations. Not strings with a subformat (as happens with YAML in docstrings). I want to have very much like what APIStar was doing (I think it was brilliant).
  • Have data type declarations for path parameters, query parameters, headers, and body. For that, I want to use Pydantic, as it is based on the same Python type hints. So, whenever I get a value passed, it has the correct type inference, completion, etc. And Pydantic already has validation and serialization.
  • Declare in the route function's parameters the types of what it expects, and as default value, any additional meta-information. This is because most of the editors give code completion, checking etc, based on Python type hints, not on default values (or at least type hints are the first option). So I can get the correct data type that I'm using inside the function (with completion, type checking, etc) while being able to declare any additional metadata as the default value. Instead of having to declare that additionally somewhere else, in a decorator or something (this is the best I got up to now, and that's what I'm currently using, e.g. in my Flask API project generators: https://github.com/tiangolo/full-stack).
  • From that simple declaration in the route function's parameters get these multiple features:
    • autocomplete, type checks and docs (based on the type hints, with standard Python syntax)
    • validation: the final client would receive an error if the data doesn't match and can't be converted to the declared types, for example if the client sent the string "asdf" in the path /api/users/{user_id} when the function was declared with user_id: int.
    • Automatic OpenAPI schema generation, from the same parameter declarations.
  • The JSON body would be declared as Pydantic classes, so it would be able to have validation even at sub-trees of a JSON document, like Marshmallow can do, but with Pydantic the final object properties have the correct type hints, so, great editor support. And of course, the function wouldn't receive a dict as a parameter, but the Pydantic instance itself, and could interact with it as an object, accessing its declared properties instead of having to remember which dict key to use, etc.
  • I don't really care that much about generating YAML. JSON is OK for me. I normally just start another container with Swagger UI.
  • It might integrate a dynamic Swagger UI, using an external script, if that's appealing to the project.

A "code colored fragment" is worth a thousand words. So, I want to be able to have something like this (part of my experiments):

from typing import List
from pydantic import BaseModel

class Pet(BaseModel):
    name: str
    age: int

class User(BaseModel):
    name: str
    pets: List[Pet]

class StatusMessage(BaseModel):
    message: str

# `app` and `get_user` are assumed to be defined elsewhere.
@app.put("/users/{user_id}/")
def route_put_user(user_id: int, user: User) -> StatusMessage:
    print(user_id)
    print(user.pets[0].age)
    return {"message": "created!"}

@app.get("/users/{user_id}/pets/")
def route_get_user_pets(user_id: int, skip: int, limit: int) -> List[Pet]:
    some_user: User = get_user(user_id)
    return some_user.pets

An example of how it could look, and how the completion could work. Note that the received user, taken from a JSON body, could have a list of pets, and even after indexing one of them, completion would still work (at least in some editors).

[Image: api]

Later I plan on adding an equivalent of (or something similar to) APIStar components, to put things like JWT token auth as more function parameters that return the current user only if all the auth validation passed.


How to implement it:

  • I need to subclass Starlette's routing module classes, as that is where the final routing functions get decorated and where I could inspect the function signature, etc. So I cannot make it just a Starlette middleware.
  • This additional router would be kind of tightly integrated with the definition of the final routes, would expect them to have some types of parameters and inspect them, so it wouldn't be a pure ASGI scope dict. (The first reason I think you won't want it here).
  • This would also require Pydantic for annotation, validation, declaration, etc. So another (possibly optional) dependency. (The second reason I think you won't want it here).
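
The first point depends on inspecting the signature of each route function. A minimal sketch using only the standard library, assuming type hints carry the types and default values carry any extra information; `describe_params` and `read_pets` are illustrative names, not part of Starlette.

```python
import inspect
from typing import Any, Dict, get_type_hints


def describe_params(func) -> Dict[str, Dict[str, Any]]:
    """Collect the annotation and default of every parameter of func."""
    hints = get_type_hints(func)
    described = {}
    for name, param in inspect.signature(func).parameters.items():
        has_default = param.default is not inspect.Parameter.empty
        described[name] = {
            "type": hints.get(name, Any),
            "required": not has_default,
            "default": param.default if has_default else None,
        }
    return described


def read_pets(user_id: int, skip: int = 0, limit: int = 10):
    ...


info = describe_params(read_pets)
print(info["user_id"])
print(info["limit"])
```

A route subclass could run this once at registration time and reuse the result for validation and for schema generation.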

I'm still thinking about the best way to declare returned values, as if I declare a Pydantic class but actually return a dict, mypy will complain. And also I'm still unsure about the best way to declare the HTTP status codes. I still haven't defined how to finally do that, but even the current idea works (again, all inspired by APIStar).


So, what do you think?

Do you think any of this would be a good fit for Starlette? Which of these features, if any, would you like to have here?

@tomchristie
Member Author

Sounds great! I think that most likely any integrations with Pydantic, Marshmallow, API Star, or anything like that ought to be as a third party package. It shouldn't really be in scope for Starlette to make opinionated choices at that kind of level. However I think we should aim to do a really good job of promoting third party packages from the main docs.

I wasn't clear how you'd indicate if a model is for query parameters or for the request body.

If there's extra information you want to put in there, then one way would be to use decorators, eg.

@app.get("/users/{user_id}/pets/")
@annotate(query_params=['skip', 'limit'])
def route_get_users(user_id: int, skip: int, limit: int) -> List[Pet]:
    ...

You'd want to use subclasses of Route and WebSocketRoute, to modify the dispatch, but you otherwise probably don't need to change the router implementation/interface.

We might want to think about how we can support that from add_route, too.

@tiangolo
Member

tiangolo commented Nov 6, 2018

[...] anything like that ought to be as a third party package [...]

Cool! I agree. I'll keep it separated and add a PR to the docs once I have something 😄

[...] how you'd indicate if a model is for query parameters or for the request body [...]

I'll add the extra info as the default value. I see now that in my example I didn't include that. The same way Pydantic uses a Schema class to add more info to each field.

But I'm planning on adding some defaults as follows:

  • If the same param name is part of the path variables, as in user_id, then this param goes in the path
  • If the param is not part of the path and is a scalar, it's by default a query param
    • If the default value is None, or a "Schema" with a default of None, the param is optional
  • If the param is not a scalar and the method is a PUT or POST, expect it in the body, as JSON
    • If there is more than one non-scalar param, or the method is not PUT or POST, raise an error
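
The defaults above can be sketched as a small classifier. This is only an illustration under stated assumptions: scalars are taken to be `int`, `float`, `str` and `bool`, path variables are written as `{name}`, the optional-parameter rule is left out, and `classify_params` and the stand-in `Pet` class are hypothetical.

```python
import inspect
import re
from typing import Dict

# Assumption: these are the types treated as scalars.
SCALARS = (int, float, str, bool)


def classify_params(path: str, method: str, func) -> Dict[str, str]:
    """Map each parameter name to "path", "query" or "body"."""
    path_names = set(re.findall(r"\{(\w+)\}", path))
    locations: Dict[str, str] = {}
    for name, param in inspect.signature(func).parameters.items():
        if name in path_names:
            locations[name] = "path"
        elif param.annotation in SCALARS:
            locations[name] = "query"
        else:
            # Non-scalar (or unannotated) parameters are treated as the body.
            if method.upper() not in ("PUT", "POST"):
                raise TypeError(f"{name}: body not allowed for {method}")
            if "body" in locations.values():
                raise TypeError(f"{name}: more than one body parameter")
            locations[name] = "body"
    return locations


class Pet:
    """Stand-in for a Pydantic model; any non-scalar type works here."""


def put_pet(user_id: int, pet: Pet, make_favorite: bool = False):
    ...


result = classify_params("/users/{user_id}/pets/", "PUT", put_pet)
print(result)
```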

For the rest of the locations (header, cookie, etc) and declarable metadata, or to override the defaults, or add more metadata, use default class instances, e.g.:

from some_package import QueryParameter

@app.get("/users/{user_id}/pets/")
def route_get_users(
    user_id: int,
    skip: int = QueryParameter(None, title="Items to skip"),
    limit: int = QueryParameter(
        None,
        title="Limit items to get",
        description="Limit the results to have only so many items, after skip.",
    ),
) -> List[Pet]:
    some_user: User = get_user(user_id)
    return some_user.pets

That way I can add everything related to one param together.
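
A rough sketch of how such a default instance could be read back, using only the standard library. The `QueryParameter` class here is a hypothetical stand-in written for this example (the `some_package` import above is not a real package), and treating `...` as "required" is an assumption borrowed from Pydantic's convention.

```python
import inspect
from typing import Any, Dict, List, Optional


class QueryParameter:
    """Hypothetical metadata holder passed as a parameter default."""

    def __init__(self, default: Any = None, *, title: Optional[str] = None,
                 description: Optional[str] = None) -> None:
        self.default = default
        self.title = title
        self.description = description


def query_parameter_docs(func) -> List[Dict[str, Any]]:
    """Build OpenAPI-style parameter entries from QueryParameter defaults."""
    docs = []
    for name, param in inspect.signature(func).parameters.items():
        meta = param.default
        if not isinstance(meta, QueryParameter):
            continue
        entry: Dict[str, Any] = {
            "name": name,
            "in": "query",
            # Assumption: a default of `...` marks the parameter as required.
            "required": meta.default is ...,
            "schema": {},
        }
        if meta.title:
            entry["schema"]["title"] = meta.title
        if meta.description:
            entry["description"] = meta.description
        docs.append(entry)
    return docs


def read_pets(user_id: int,
              skip: int = QueryParameter(None, title="Items to skip")):
    ...


docs = query_parameter_docs(read_pets)
print(docs)
```

The name, type hint, default and documentation of `skip` all live on one line of the signature, which is the point of the approach.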

I think that's the most common error I make when working with decorators and declaring OpenAPI schemas and validation with them: I have to put skip in the function parameters and also in the decorator, and I often forget to keep those two declarations in sync. To me, it ends up being a form of code duplication.

And Python could provide this funny way to avoid that and simplify the code, passing information via type hints (that are also used by the editor) and default values for meta information. And both can be inspected / introspected.


But also, having that default logic described above, many common cases could be easily coded and "just work", e.g.:

@app.put("/users/{user_id}/pets/")
def route_put_pet(user_id: int, pet: Pet, make_favorite: bool = False) -> Pet:
    created_pet = put_pet(user_id, pet)
    return created_pet

In this case:

  • user_id is by default a path param
  • pet is by default a JSON body
  • make_favorite is by default a query param, with a default value of False
  • The response is a JSON body with the form of the Pet

And with that simple code, and simple syntax, we get:

  • The parameters for the function
  • OpenAPI declarations for all of them, including sub-fields of the JSON body
  • Validation for all of them, including sub-fields of the JSON body
  • Serialization of the result, including sub-fields...
  • Type hints and editor support, with completion even for sub-fields in the JSON body

It's like "killing a bunch of birds with one stone". And those features, with that simplicity, have several advantages (in my point of view) over all the other similar options I know (Python Flask-apispec, TypeScript Nest for NodeJS, yaml in docstring based alternatives, etc).

[...] use subclasses of Route and WebSocketRoute [...]

Yes! that's exactly what I'm doing. I'm glad to see you agree, it means I got it right then 😂

I'm also subclassing starlette.applications.Starlette, to add decorators for post, get, etc.

@tomchristie
Member Author

Yes! that's exactly what I'm doing. I'm glad to see you agree, it means I got it right then

We'll probably want to add route_cls and websocket_cls to the Router implementation at some point to make it easy for you to hook in alternative implementations like this.
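
A toy illustration of that idea, assuming a router that takes the route class as a constructor argument. The `Route`, `TypedRoute` and `Router` classes below are written for this sketch only and do not reflect Starlette's actual classes or signatures.

```python
from typing import Callable, List, Type


class Route:
    """Toy route that passes raw string parameters straight through."""

    def __init__(self, path: str, endpoint: Callable) -> None:
        self.path = path
        self.endpoint = endpoint

    def handle(self, **raw_params: str):
        return self.endpoint(**raw_params)


class TypedRoute(Route):
    """Converts raw string parameters using the endpoint's annotations."""

    def handle(self, **raw_params: str):
        hints = self.endpoint.__annotations__
        converted = {k: hints.get(k, str)(v) for k, v in raw_params.items()}
        return self.endpoint(**converted)


class Router:
    """Toy router; route_cls lets callers swap in another Route subclass."""

    def __init__(self, route_cls: Type[Route] = Route) -> None:
        self.route_cls = route_cls
        self.routes: List[Route] = []

    def add_route(self, path: str, endpoint: Callable) -> Route:
        route = self.route_cls(path, endpoint)
        self.routes.append(route)
        return route


def get_user(user_id: int):
    return {"user_id": user_id}


router = Router(route_cls=TypedRoute)
route = router.add_route("/users/{user_id}", get_user)
print(route.handle(user_id="42"))  # {'user_id': 42}
```

With this shape a third party package only has to supply its own route class; the router interface itself stays unchanged.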

@tiangolo
Member

tiangolo commented Nov 7, 2018

[...] add route_cls and websocket_cls to the Router implementation [...]

Good idea 👍


I'll see how my experiments go and report back (as a PR to docs with "third party projects" ) 😄

@perdy
Contributor

perdy commented Nov 7, 2018

I'm doing some work on a type system and parameter validation based on API Star v0.5. As you mentioned about routing, I also needed to modify the Route class so it can inspect the function's parameters in order to validate them. Anyway, as far as I have tested, it seems to fit well with Starlette, allowing the use of API Star components, types and validators.

It's still at an early development stage, but you can find the code here: starlette-api. I'll update the readme with some docs and the examples that I've been using to test it.

I think we're aiming for the same goal, so we can discuss whether we could join forces to get there :)

@tomchristie
Member Author

@perdy Ace. Yup I'd also been thinking that Starlette's getting to the point that an API Star layer on top would work well. 😎

@woile
Contributor

woile commented Nov 7, 2018

Hey ppl, what do you think about molten validation (field api)?
I find it super interesting because the API is quite similar to dataclasses. I think the field subclasses to be used are inferred from the type hint. Example:

# molten
from molten import field, schema
from typing import Optional

@schema
class Todo:
    id: Optional[int] = field(response_only=True)
    description: str
    status: str = field(choices=["todo", "done"], default="todo")

# python dataclasses
from dataclasses import dataclass, field

@dataclass
class C:
    x: int
    y: int = field(repr=False)
    z: int = field(repr=False, default=10)
    t: int = 20

It would be really interesting to see an integration between this + starlette + openapi schema generation
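
As a rough idea of what the schema generation side could look like for plain dataclasses, here is a minimal sketch using only the standard library. The `dataclass_schema` helper and the type mapping are assumptions made for this example; it covers only flat classes with scalar fields.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, Dict

# Assumption: a deliberately small mapping from Python types to JSON Schema.
JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def dataclass_schema(cls) -> Dict[str, Any]:
    """Build a JSON Schema style dict from a dataclass definition."""
    properties: Dict[str, Any] = {}
    required = []
    for f in fields(cls):
        properties[f.name] = {"type": JSON_TYPES.get(f.type, "object")}
        if f.default is MISSING and f.default_factory is MISSING:
            # No default of any kind, so the field must be supplied.
            required.append(f.name)
        elif f.default is not MISSING:
            properties[f.name]["default"] = f.default
    return {
        "title": cls.__name__,
        "type": "object",
        "properties": properties,
        "required": required,
    }


@dataclass
class Todo:
    description: str
    status: str = "todo"


schema = dataclass_schema(Todo)
print(schema)
```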

What do you think?

@tomchristie
Member Author

@woile Yup to any and all of the above. The important thing for Starlette to do is to ensure that it's easy to adapt to various endpoint styles. Any third party stuff that can be implemented on top of that is much welcomed.

@tiangolo
Member

tiangolo commented Nov 8, 2018

Cool @perdy , I think we are thinking about similar solutions.

Although I'm more inclined to try using Pydantic, as its type declarations are based on the same "standard" Python types.

So a declaration in APIStar with:

class Product(types.Type):
    name = validators.String()
    rating = validators.Integer(minimum=1, maximum=5)

In Pydantic would be like:

class Product(BaseModel):
    name: str
    rating: int = Schema(..., gt=0, lt=6)

The difference is that editors will give support, checks, and completions for str in something like:

# Illustrative values; both fields are required by the model above.
prod = Product(name="Chair", rating=4)
prod.name

For example, the editor will give completions for lower(), capitalize(), etc.

I just checked, PyCharm also detects prod.rating as an int, VS Code didn't seem to detect it, at least yet.

Update: with the latest additions to Pydantic's Schema and declaring it as above, all editors will give full type support.


But I guess we are all trying to achieve something like what APIStar was achieving.


@woile Thanks for pointing it out! I think I hadn't seen Molten, it looks quite interesting, seems pretty close to what I want to have. I'll play with it.

I think we'll still benefit from having something like that but based on ASGI with Starlette, to get its massive performance (kudos to @tomchristie for these awesome tools and ideas).

But I'll definitely check how Molten works, the development experience and its source code 😁

BTW, Pydantic has a dataclass based system too, with a backport to make it work for Python 3.6 too:

from datetime import datetime
from pydantic.dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str = 'John Doe'
    signup_ts: datetime = None


user = User(id='42', signup_ts='2032-06-21T12:00')
print(user)
# > User(id=42, name='John Doe', signup_ts=datetime.datetime(2032, 6, 21, 12, 0))

@perdy
Contributor

perdy commented Nov 9, 2018

@tiangolo I'm not specifically interested in using API Star types and validators, I just started with it because it was familiar to me. Pydantic seems elegant, so if we can get the same functionality that API Star types and validators provide, I don't have any problem moving to it.

@tiangolo
Member

Sorry for the delay in coming back to this, guys.

I had been busy implementing updates to Pydantic to support the things I needed:

  • We can now declare validation and constraints using Schema. So, editor support works perfectly everywhere.
  • And it now has full support for JSON Schema generation (which is the core of OpenAPI).

I also took quite some time reading, re-reading and studying the specs for JSON Schema, OpenAPI, etc.

And re-studying similar tools, such as previous versions of APIStar, Molten, Flask-apispec (based on Marshmallow), NestJS (but that is not even Python), etc.

And then, using all that to create FastAPI.

It has the following:

  • Fully based on Starlette for all the web parts.
  • Fully based on Pydantic for all the data parts.
  • Inspired by APIStar's typed parameters in routes.
  • Based on standard Python 3.6+ types that provide:
    • Editor and tooling support (obviously)
    • Data conversion (serialization, parsing)
    • Data validation, with automatic, clear validation errors
  • OpenAPI schema with JSON Schema declarations
  • Interactive docs with Swagger UI and ReDoc
  • JSON body declarations as Pydantic models
  • Dependency Injection
    • With arbitrarily deep dependency graphs (trees), as dependencies can themselves declare more dependencies
  • Security utilities, to integrate authentication / authorization (docs still incomplete here)
    • Including OAuth2, with JWT tokens (docs still incomplete here)
  • Route merging, to support big multi-file applications (docs still incomplete here)

I still need to:

  • raise the coverage to 100%
  • finish some small incomplete sections of the docs (I have been working a lot on all the "tutorial - user guide" the last 2 days)
  • some other minor tasks

...but I thought that at this point I can start sharing it with you guys 😁 🎉


Now, question related to Starlette, before I stop hijacking this issue, should I make a PR to add it to Starlette's docs?

I don't see any "third party packages" section or similar. Should I propose one or where do you think it should go @tomchristie ?

I guess it could include:

@woile 's https://github.com/Woile/starlette-apispec
@perdy 's https://github.com/PeRDy/starlette-api
@kennethreitz 's https://github.com/kennethreitz/responder
https://github.com/tiangolo/fastapi
(I don't know what else)

@woile
Contributor

woile commented Dec 15, 2018

Wow tiangolo, that's a great addition to the Starlette community. Awesome work. I'll explore it more during the week.

I was wondering, are you thinking about adding a Pydantic plugin for APISpec?
Potentially you could delegate the schema generation to APISpec (if you are not already doing so)

@tiangolo
Member

Wow tiangolo, that's a great addition to the Starlette community. Awesome work. I'll explore it more during the week.

Awesome!

I was wondering, are you thinking about adding a Pydantic plugin for APISpec?
Potentially you could delegate the schema generation to APISpec (if you are not already doing so)

Actually, before getting to this point, I was using Flask-apispec, which is based on APISpec, Marshmallow and Webargs. All of them built by the same guys. And it is awesome for pre-types code (Flask). I actually built a couple of project generators around them, which is the main stack I was using up to now.

But now having types in Python 3.6+, I ended up preferring Pydantic that uses them directly, instead of Marshmallow (used by APISpec).

Also APISpec doesn't support OpenAPI 3.0 yet, it was initially built for Swagger 2.0 (OpenAPI 2.0). I'm starting FastAPI from OpenAPI 3.0.2 (which is the current version).

And also, I'm integrating all the type declarations everywhere, to do the validations and generate the OpenAPI schema.

So, for example, only when there are parameters declared in the functions I include the schemas for validation errors.

And only when there are security dependencies I include the schemas and OpenAPI "security schemes".


If you want to test a quick, ready-made project generator (with Docker, Couchbase, Vue, etc.), I made this one (based on the previous ones I had for Flask-apispec): https://github.com/tiangolo/full-stack-fastapi-couchbase

@tomchristie
Member Author

I don't see any "third party packages" section or similar. Should I propose one

Yes please!

@woile
Contributor

woile commented Dec 16, 2018 via email

@tomchristie
Member Author

Yup agreed. We can have a “frameworks” section as one part of that.

@tomchristie tomchristie added the clean up Refinement to existing functionality label Dec 18, 2018
JoseKilo added a commit to JoseKilo/starlette that referenced this issue Feb 19, 2019
tomchristie pushed a commit that referenced this issue Feb 19, 2019
* Include mounted paths in schemas (part of #172)

* Remove unnecessary indirection

* Refactor: cleaner interface, return a dict always

@encode encode locked as resolved and limited conversation to collaborators Jul 13, 2024

5 participants