
Allow manipulating request body, headers and method before a request is proxied #2


Open · wants to merge 3 commits into base: main

Conversation

Elettrotecnica

In this PR I implement some extra flags for the upstream API that allow sending a request to a backend where the request body and headers are not those of the current connection, but completely arbitrary.

This makes it possible to create wrappers around web APIs that can return a streaming response, such as https://github.com/ollama/ollama/blob/main/docs/api.md

In my wrapper I can then differentiate between:

  • "normal" calls -> build a request, send it to the endpoint via e.g. ns_http, return the reply
  • streaming calls -> build the request, send it to the backend, let the backend serve the reply through us as proxy
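The two call paths above can be sketched as follows. This is a minimal Python illustration of the dispatch idea only, not revproxy's actual API; `call_backend` and `chunks_from_backend` are hypothetical names, and `chunks_from_backend` stands in for the real transport (ns_http in the buffered case, the proxied backend channel in the streaming case).

```python
def chunks_from_backend(prompt):
    """Hypothetical stand-in for the backend connection; yields reply chunks."""
    for piece in ("Hello", ", ", "world"):
        yield piece

def call_backend(prompt, stream=False):
    """Dispatch between a buffered ("normal") call and a streaming call."""
    if not stream:
        # "normal" call: collect the full reply, then return it at once
        return "".join(chunks_from_backend(prompt))
    # streaming call: hand the chunks through as they arrive,
    # letting the caller (or the proxy machinery) deliver them piecewise
    return chunks_from_backend(prompt)
```

In the buffered case the wrapper owns the whole reply before answering; in the streaming case it only relays an open channel.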

The previous behavior should be unchanged by this new feature.

Many thanks for any feedback on this.

@gustafn
Contributor

gustafn commented Nov 29, 2024

I have trouble understanding the terminology: what is a "normal" and a "not normal" request? By their nature, ns_connchan requests are streaming requests, since they do not receive a request in full but follow an I/O-driven model. The ns_http-based approach (as we have it now) has to receive the full request.

From the description, I do not understand what you are trying to achieve. I got a rough understanding of the intentions by looking at the ollama "streaming", but the details are missing there as well. Does "streaming" here refer to HTML streaming (i.e. long-running requests, delivering information piecewise)? This is already handled to a certain degree in revproxy, causing e.g. a change in delivery method from ns_http to ns_connchan.

What do you mean by "build the request"? Do you refer to the request header? Do you perform request splitting?

There are several changes in the PR that have nothing to do with the topic. Can you produce multiple independent PRs from this, or do I have to do some cherry-picking before we get to the essentials?

@Elettrotecnica
Author

Does "streaming" here refer to HTML streaming (i.e. long-running requests, delivering information piecewise)?

Yes: in the case of streaming requests, ollama will keep the connection open until all chunks of the reply have been sent, each as a separate JSON object.
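Consuming such a reply looks roughly like the following Python sketch. The two sample payloads are invented for illustration; the `response` and `done` field names follow the ollama `/api/generate` API documentation linked above.

```python
import json

# Each streamed chunk is its own JSON object on one line (NDJSON),
# as ollama emits while the reply is being generated.
raw_chunks = [
    '{"response": "The sky ", "done": false}',
    '{"response": "is blue.", "done": true}',
]

text = ""
for line in raw_chunks:
    obj = json.loads(line)
    text += obj["response"]       # accumulate the partial answer
    if obj["done"]:               # final chunk closes the stream
        break

print(text)  # → The sky is blue.
```

A plain (buffered) client only sees the result once the last chunk arrives; a streaming consumer can forward each chunk as it comes in.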

From the description, I do not understand what you are trying to achieve.

I want to open an HTTP request to the ollama backend with a "streaming" reply. I do not want to expose the backend "as-is" via a regular proxy, but via a wrapper class that takes care of building the request (body, headers and so on). The request should afterwards be handled exactly as if it were a proxied request, i.e. with the frontend and backend channels, callbacks, error handling, and everything else revproxy does.

Currently, revproxy always assumes that the current request's headers and body are what is sent to the backend. My changes make it possible to craft the request to be proxied instead.

IMO the changes are on topic, in the sense that they only introduce the intended feature in the API. I have moved some logic from the specific backend code into the generic API, because the backends were doing a lot of the same things.

I hope it is clearer now what I am trying to achieve. One way to avoid needing any of this would be for ns_http to support streaming responses, e.g. with a callback invoked whenever data arrives, similar to what the Tcl http package does (see https://www.tcl.tk/man/tcl8.6/TclCmd/http.htm#M23)

@Elettrotecnica
Author

One way to avoid needing any of this would be for ns_http to support streaming responses, e.g. with a callback invoked whenever data arrives, similar to what the Tcl http package does (see https://www.tcl.tk/man/tcl8.6/TclCmd/http.htm#M23)

In fact this is only partially true: even with a callback like this, I would still need to set up my own streaming delivery of the response on the frontend. Proxying the reply has the advantage that it is the backend doing the work.
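The callback model under discussion can be sketched like this. The function names are hypothetical (this is neither ns_http nor the Tcl http API); the point is that the callback only solves the receiving side, while delivery to the frontend client remains the wrapper's job.

```python
def fetch_streaming(chunks, on_data):
    """Callback-driven client: on_data fires once per received chunk,
    similar in spirit to the -handler option of the Tcl http package."""
    for chunk in chunks:
        on_data(chunk)

# Even with such a callback, the wrapper must still push each chunk
# onward to its own client; proxying lets the backend drive that delivery.
delivered = []
fetch_streaming(["a", "b", "c"], delivered.append)
```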

@gustafn
Contributor

gustafn commented Dec 16, 2024

Why have you silently dropped the binary handling?

@Elettrotecnica
Author

I took it out of the specific implementations and put it into the revproxy-procs.tcl file. The intention was to factor the behavior out, since I am introducing the possibility to specify one's own content.

@gustafn
Contributor

gustafn commented Dec 18, 2024

I still don't understand in detail what you want to achieve. HTTP streaming was already supported before your changes, and it is used daily, e.g. on openacs.org. My guess is that you were fighting with timeouts. I have made the HTML streaming implementation more robust against non-working configurations by adjusting the timeouts on the fly when HTTP streaming is detected.

I have also tested the implementation against ollama.

Without reverse proxy:

% curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?"
}'

Starting the reverse proxy with the configuration file from the revproxy README on GitHub:

nsd_revproxy_target=http://localhost:11434 /usr/local/ns/bin/nsd -f -u nsadmin -g nsadmin -t /usr/local/ns/conf/nsd-config-revproxy.tcl

Running the curl request via the reverse proxy:

% curl http://localhost:48000/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?"
}'

Is there something else to test?
