Export some functions as FFI #3932

Open
vadim2404 opened this issue Mar 5, 2025 · 16 comments
Labels
idea Needs of discussion to become an enhancement, not ready for implementation

Comments

@vadim2404

Problem

Hey, I'm Vadim from Neon, and I have a tricky idea that I want to test: embedding PostgREST into our proxy component. What I need to know is whether it is possible to export some functions from your codebase via FFI and compile a library that I can link into my Rust code.

I've already tried some simple Haskell functions, which I can call from Rust.

I personally need the thing that converts HTTP requests into SQL. I can build a request object in Rust and pass it to PostgREST, and I would love to receive a response.

@steve-chavez wdyt about it?

@steve-chavez
Member

Hey @vadim2404,

> I personally need the thing that converts HTTP requests into SQL.

The need for exporting the HTTP -> SQL module has also come up before in the context of WASM (#2452, essentially to use PostgREST with pglite), so I think it's a good idea.

I believe we would need to export our Plan.hs (internal AST) and Query.hs (generates SQL) modules as Haskell libraries to support this cleanly.

@steve-chavez steve-chavez added the idea Needs of discussion to become an enhancement, not ready for implementation label Mar 5, 2025
@vadim2404
Author

The problem is that I'm not a Haskell developer, and it would take me a while to get up to speed, but I would love to contribute (at least by preparing some tests that exercise such FFI exports).

@taimoorzaeem
Collaborator

@vadim2404 As a Haskell developer who hasn't worked much with FFI yet, I would love to collaborate on this once we have a workable design for this feature.

@vadim2404
Author

vadim2404 commented Mar 5, 2025

Nice, can you just email me so we can discuss the next steps?

I removed my email from that message because we agreed to continue the discussion async.

@taimoorzaeem
Collaborator

@vadim2404 I think a public discussion would be much better for the PostgREST community.

@vadim2404
Author

It's a valid point.

Here is how I see it. What I expect from my app: I can pass you the HTTP method, query params, body, and headers (if it makes any sense), and expect to receive a valid SQL. I don't want to let you know about real database connection or schema, I want to keep it generic.

In terms of FFI, we can define a struct Request and function that accepts a Request object and returns a C-string. Something like this:

#include <stdint.h>  // uint8_t, uint32_t

// Define the Request struct
typedef struct {
    char *method;  // HTTP method (GET, POST, etc.)

    char **query_params;  // Array of key-value pairs (each entry is array[2])
    uint32_t query_count;  // Number of query params

    uint8_t *body;  // Raw body data
    uint32_t body_length;  // Length of the body

    char **headers;  // Array of key-value pairs (each entry is array[2])
    uint32_t header_count;  // Number of headers
} Request;

char* request_to_sql(Request *req);

About the Request struct, especially query_params and body: I think you could also export your own structures to make the Request object cleaner.
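To make the proposed calling convention concrete, here is a minimal sketch of how a caller might fill the `Request` struct and receive an owned C string back. Everything beyond the struct fields is an assumption made for illustration: `request_to_sql_stub` is a hypothetical stand-in that only echoes the method and query string (it does not generate SQL), the key/value arrays are assumed to be flattened (key at `[2*i]`, value at `[2*i+1]`), and the result is assumed to be `malloc`-allocated and freed by the caller.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Same fields as the struct proposed above. */
typedef struct {
    char *method;
    char **query_params;  /* ASSUMPTION: flat array, key at [2*i], value at [2*i+1] */
    uint32_t query_count; /* number of key/value pairs, not array slots */
    uint8_t *body;
    uint32_t body_length;
    char **headers;       /* ASSUMPTION: same flat key/value layout */
    uint32_t header_count;
} Request;

/* HYPOTHETICAL stand-in for request_to_sql. It does not generate SQL; it only
 * echoes the method and query string so the calling convention can be tried.
 * Returns a malloc'd, NUL-terminated string owned by the caller, or NULL. */
char *request_to_sql_stub(const Request *req) {
    if (req == NULL || req->method == NULL) return NULL;
    if (req->query_count > 0 && req->query_params == NULL) return NULL;

    /* First pass: compute the exact buffer size, including the final NUL. */
    size_t cap = strlen(req->method) + 1;
    for (size_t i = 0; i < req->query_count; i++) {
        const char *k = req->query_params[2 * i];
        const char *v = req->query_params[2 * i + 1];
        if (k == NULL || v == NULL) return NULL;
        cap += strlen(k) + strlen(v) + 2; /* separator ('?' or '&') and '=' */
    }

    char *out = malloc(cap);
    if (out == NULL) return NULL;

    /* Second pass: write "METHOD?k1=v1&k2=v2". */
    size_t len = (size_t)snprintf(out, cap, "%s", req->method);
    for (size_t i = 0; i < req->query_count; i++) {
        len += (size_t)snprintf(out + len, cap - len, "%c%s=%s",
                                i == 0 ? '?' : '&',
                                req->query_params[2 * i],
                                req->query_params[2 * i + 1]);
    }
    return out;
}
```

With method `GET` and the pairs `id=in.(1,2,3)` and `name=eq.IOS`, the stub returns `GET?id=in.(1,2,3)&name=eq.IOS`, which the caller then releases with `free()`. Whoever allocates the string on the library side would have to agree on that ownership rule.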

@steve-chavez
Member

steve-chavez commented Mar 6, 2025

> and expect to receive a valid SQL

Given a request GET /projects?id=in.(1,2,3)&name=eq.IOS, PostgREST will generate a prepared statement like:

SELECT "test"."projects".* FROM "test"."projects" 
WHERE  "test"."projects"."id" = ANY ($1)  AND  "test"."projects"."name" = $2

I assume you will want the positional parameter values too? (Otherwise you would not reuse the parsing) If so, we would need to return a struct that carries the generated SQL plus the values with their positions.
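One possible shape for such a return value, sketched in C under stated assumptions: `SqlResult`, `sql_result_new`, and `sql_result_free` are hypothetical names, parameters are assumed to travel as text, and `params[i]` is assumed to bind to placeholder `$(i+1)`. Nothing here reflects an existing PostgREST API.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* HYPOTHETICAL result type: statement text plus its positional values. */
typedef struct {
    char *sql;            /* statement with $1..$n placeholders */
    char **params;        /* ASSUMPTION: text values, params[i] binds to $(i+1) */
    uint32_t param_count;
} SqlResult;

/* Copy a NUL-terminated string into freshly malloc'd storage. */
static char *dup_str(const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL) memcpy(copy, s, n);
    return copy;
}

/* Release a result and everything it owns; safe to call with NULL. */
void sql_result_free(SqlResult *r) {
    if (r == NULL) return;
    if (r->params != NULL) {
        for (uint32_t i = 0; i < r->param_count; i++) free(r->params[i]);
        free(r->params);
    }
    free(r->sql);
    free(r);
}

/* Build a result that owns deep copies of the SQL text and the values.
 * Returns NULL on allocation failure or NULL input. */
SqlResult *sql_result_new(const char *sql, const char *const *params, uint32_t n) {
    if (sql == NULL || (n > 0 && params == NULL)) return NULL;
    SqlResult *r = calloc(1, sizeof *r);
    if (r == NULL) return NULL;
    r->sql = dup_str(sql);
    r->params = calloc(n > 0 ? n : 1, sizeof *r->params);
    if (r->sql == NULL || r->params == NULL) {
        sql_result_free(r);
        return NULL;
    }
    for (uint32_t i = 0; i < n; i++) {
        r->params[i] = params[i] != NULL ? dup_str(params[i]) : NULL;
        if (r->params[i] == NULL) {
            sql_result_free(r);
            return NULL;
        }
        r->param_count = i + 1; /* only count slots that are fully owned */
    }
    return r;
}
```

For the request above, a caller might receive the statement text together with two values, for example `{1,2,3}` for `$1` and `IOS` for `$2`. Those text encodings are a guess for illustration, not confirmed output.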

> and returns a C-string

A problem with this is that our SQL generation is not independent from our PostgreSQL driver (Hasql), which doesn't expose a ByteString that can be converted to CString, at least not cleanly. To lift this restriction, I think we need to solve #3934 first.

> I don't want to let you know about real database connection or schema, I want to keep it generic.

PostgREST does need knowledge of the schema, see schema cache in architecture. So you'd need to get this schema cache before reaching the HTTP -> SQL translation.

This is the list of inputs to start the process:

-- | Examines HTTP request and translates it into user intent.
userApiRequest :: AppConfig -> Request -> RequestBody -> SchemaCache -> Either ApiRequestError ApiRequest

> I can pass you the HTTP method, query params, body, and headers (if it makes any sense),

> In terms of FFI, we can define a struct Request and function that accepts a Request object

The above Haskell Request object has a similar structure. The AppConfig and SchemaCache would still need to be filled.
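To illustrate why the config and schema cache end up as inputs on the C side as well, here is a toy sketch that mirrors the shape of the signature above: the `Either` becomes a status code plus an out-parameter. All names (`SketchConfig`, `SketchSchemaCache`, `SketchApiRequest`, `sketch_user_api_request`) are hypothetical, and the lookup is deliberately simplistic; the real `AppConfig` and `SchemaCache` carry far more than a schema name and a list of relations.

```c
#include <stddef.h>
#include <string.h>

/* Status code standing in for the Left/Right of the Haskell Either. */
typedef enum {
    SKETCH_OK = 0,
    SKETCH_ERR_BAD_INPUT,
    SKETCH_ERR_UNKNOWN_RELATION
} SketchStatus;

/* HYPOTHETICAL, heavily reduced stand-ins for AppConfig and SchemaCache. */
typedef struct {
    const char *db_schema;            /* schema exposed through the API */
} SketchConfig;

typedef struct {
    const char *const *relations;     /* relation names known to the cache */
    size_t relation_count;
} SketchSchemaCache;

/* HYPOTHETICAL, reduced stand-in for ApiRequest. */
typedef struct {
    const char *schema;
    const char *relation;             /* points into the schema cache */
} SketchApiRequest;

/* Resolve "/<relation>" against the cache. On SKETCH_OK, *out is filled;
 * on any error, *out is left untouched. */
SketchStatus sketch_user_api_request(const SketchConfig *cfg,
                                     const char *method,
                                     const char *path,
                                     const SketchSchemaCache *cache,
                                     SketchApiRequest *out) {
    if (cfg == NULL || cfg->db_schema == NULL || method == NULL ||
        path == NULL || cache == NULL || out == NULL) {
        return SKETCH_ERR_BAD_INPUT;
    }
    if (cache->relation_count > 0 && cache->relations == NULL) {
        return SKETCH_ERR_BAD_INPUT;
    }
    if (path[0] != '/' || path[1] == '\0') {
        return SKETCH_ERR_BAD_INPUT;
    }
    const char *name = path + 1;      /* skip the leading '/' */
    for (size_t i = 0; i < cache->relation_count; i++) {
        if (strcmp(cache->relations[i], name) == 0) {
            out->schema = cfg->db_schema;
            out->relation = cache->relations[i];
            return SKETCH_OK;
        }
    }
    return SKETCH_ERR_UNKNOWN_RELATION;
}
```

Even in this reduced form, the caller has to supply both handles on every call, so they would need to be built and kept somewhere before any translation can happen.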

@vadim2404
Author

> I assume you will want the positional parameter values too? (Otherwise you would not reuse the parsing) If so, we would need to return a struct that carries the generated SQL plus the values with their positions.

I guess it's fine to have a prepared statement; I can handle that on my side.

> A problem with this is that our SQL generation is not independent from our PostgreSQL driver (Hasql), which doesn't expose a ByteString that can be converted to CString, at least not cleanly. To lift this restriction, I think we need to solve #3934 first.

Can it be exported as some other structure that I can use from C code?

> The above Haskell Request object has a similar structure. The AppConfig and SchemaCache would still need to be filled.

can we somehow make it more generic and remove the dependency on AppConfig and SchemaCache? I mean just generate SQL based on inputs w/o validation


I also noticed that I need to send you the table name.

@wolfgangwalther
Member

> > The above Haskell Request object has a similar structure. The AppConfig and SchemaCache would still need to be filled.
>
> can we somehow make it more generic and remove the dependency on AppConfig and SchemaCache? I mean just generate SQL based on inputs w/o validation

We cannot create SQL queries without our AppConfig or SchemaCache. If you think it should be possible in your case, then you might be trying to do something that we don't really see yet.

I think you might need to expand on your use-case. What are you actually trying to achieve?

@vadim2404
Author

I have something like this:

Proxy -> Postgres

I want to expose a Postgrest-compatible API on this proxy level. It means that I want to convert the REST payload into SQL and send this query to Postgres, which stays behind this proxy.

In our architecture, we have one proxy that can be connected to multiple Postgres instances (literally, thousands).

And if I could somehow call part of Postgrest w/o spinning it up for each Postgres instance, it'd be an awesome thing!

@wolfgangwalther
Member

> I want to expose a Postgrest-compatible API

PostgREST depends on knowing the target schema. That's why we have the Schema Cache. Without this information, we can not do the translation between REST and SQL.

> multiple Postgres instances (literally, thousands).

This means you need to cache metadata of 1000s of Schemas and keep it in sync when the schema changes ...

> Postgrest w/o spinning it up for each Postgres instance

... at which point the question is whether you are really going to save as much by not spinning up separate instances?


It sounds like you're basically interested in making PostgREST scale better. Maybe rather work towards that goal for PostgREST itself?

@vadim2404
Author

Ideally, I want to avoid any state in the proxy service because I want to keep it stateless (in order to continue serving thousands of Postgres instances behind it). For this reason, I don't want to cache any metadata there.

> PostgREST depends on knowing the target schema. That's why we have the Schema Cache. Without this information, we can not do the translation between REST and SQL.

I do understand the current state, but is it possible to unwind it? Because I'm ok to fail with error that table does not exist, for instance.

> It sounds like you're basically interested in making PostgREST scale better. Maybe rather work towards that goal for PostgREST itself?

I didn't think about scaling PostgREST, my initial thought was to grab some part of it to cover my use case.

@wolfgangwalther
Member

> I do understand the current state, but is it possible to unwind it? Because I'm ok to fail with error that table does not exist, for instance.

I don't think we're anywhere close to being able to do that, no. We depend on the schema cache for all kinds of stuff. Embedding, RPCs, etc.

@vadim2404
Author

Got it. Nonetheless, thanks for looking into this.

@vadim2404
Author

By the way, maybe you have a specification for how you parse things?

@taimoorzaeem
Collaborator

taimoorzaeem commented Mar 11, 2025

@vadim2404 Yes, having something like a BNF grammar for PostgREST queries would be very helpful. For now, I think you'd have to dig into PostgREST internals to see how we are doing it. As the docs say, it is a work in progress.

4 participants