-
Notifications
You must be signed in to change notification settings - Fork 2
Allow to manipulate request body, headers and method before a request is proxied #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Allow to manipulate request body, headers and method before a request is proxied #2
Conversation
I have trouble to understand the terminology: what is a "normal" and a "not normal" request? By their nature, the ns_connchan requests are streaming requests, as they are not fully receiving a request but following an i/o driven model. The ns_http based approach (as we have it now) has to receive the full request. From the description, I do not understand what you are trying to achieve. I got a rough understanding of the intentions by looking at the ollama "streaming", but the details are also missing there. Is the "streaming" referring here to HTML-streaming (i.e. long-running requests, getting piecewise information). This is already handled to a certain degree in revproxy, causing e.g. a change in delivery method from ns_http to ns_connchan. What do you mean by "build the request"? You refer to the request header? Do you perform request splitting? There are several changes in the PR that have nothing to do with the topic. Can you produce multiple PR independent PRs from this, or do i have to do some cherry-picking, before we get to the essentials? |
Yes, in the case of streaming requests, ollama will keep the request opened untill all chunks of the reply have been sent, each as separate JSON objects.
I want to open an HTTP request to the ollama backend with a "streaming" reply. I do not want to expose the backend "as-is" via regular proxy, but via a wrapper class that will take care of building the request (body, headers and so on). The request should afterwards be handled exactly as it it was a proxied request, e.g. the frontent and backend channels, callbacks, error handling and all of the things revproxy does. Currently, reproxy will always assume it is the current request, headers and body that is being sent to the backend. My changes try to make so that one can craft the request to be proxied instead. IMO the changes are on topic, in the sense that they only introduce intended feature in the api. I have moved some logics from the specific backend code to the generic api, because they were doing a lot of the same stuff. Hope it is clearer what I am trying to achieve. One way to not need any of this would be that ns_http would support streaming responses, with e.g. a callback to be called whenenver data is there, similar to what the tcl http module does (see https://www.tcl.tk/man/tcl8.6/TclCmd/http.htm#M23) |
In fact this is only partially true: even with a callback like this, I would still need to set up my own streaming delivery of the response on the frontend. Proxying the reply has the advantage that it is the backend doing the work. |
Why have you dropped silently the binary handling? |
I took it out of the specific implementations and put it in the revproxy-procs.tcl file. The intention was to factor the behavior out, as I am introducing that one can specify their own content. |
I still don't understand of what you want to achieve in detail. HTTP streaming was already supported before your changes - and it is used daily, e.g. on openacs.org. My guess is that you were fighting with timeouts. I have made the HTML streaming implementation more robust against non-working configurations by changing the timeouts on the fly when HTTP streaming was detected. I have tested the implementation also again ollama Without reverse proxy:
Starting the reverse proxy with the configuration file from the revproxy README on github
Running the curl request via the reverse proxy:
Is there something else to test? |
In this PR I implement some extra flag for the upstream api that allow to send a request to a backend where the request body and headers are not those from the current connection, but completely arbitrary.
This allows to create wrappers to web apis that can return a streaming response, such as https://github.com/ollama/ollama/blob/main/docs/api.md
In my wrapper I can then differentiate between:
The previous behavior should be unchanged by this new feature.
Many thanks for any feedback on this.