Byok rework #611

humbertoyusta · 2025-03-17T11:02:44Z

Let's don't merge it yet

mitya52 · 2025-03-26T09:55:53Z

"Refact/deepseek-reasoner": {
      "n_ctx": 64000,
      "name": "deepseek-reasoner",
      "id": "Refact/deepseek-reasoner",

name: is it resolve_as/inference_id? actual model's inference name
id: looks like duplicate, we can use the key as id

mitya52 · 2025-03-26T09:57:39Z

  "embedding_model": {
    "n_ctx": 0,
    "name": "thenlper/gte-base",
    "id": "Refact/thenlper/gte-base",
    "support_metadata": true,
    "embedding_size": 768,
    "rejection_threshold": 0.25,
    "embedding_batch": 64
  },

n_ctx is 0 by some reason (should be 512 as I know)
why only one model? what if you have multiple?

mitya52 · 2025-03-26T09:59:54Z

  "code_completion_default_model": "qwen2.5/coder/1.5b/base",
  "multiline_code_completion_default_model": "",
  "code_chat_default_model": "gpt-4o",

looks like the name in the terms of ModelRecord, but should be id (or key)

humbertoyusta · 2025-03-26T10:11:48Z

  "embedding_model": {
    "n_ctx": 0,
    "name": "thenlper/gte-base",
    "id": "Refact/thenlper/gte-base",
    "support_metadata": true,
    "embedding_size": 768,
    "rejection_threshold": 0.25,
    "embedding_batch": 64
  },

n_ctx is 0 by some reason (should be 512 as I know)
why only one model? what if you have multiple?

Why would we need more than one embedding model? if we change it we'll need to create a new vecdb, and this is the only use for embedding model right now, but if you see any way we could use different embedding models for some reason maybe it's better to make it indexmap like others for future-proofing

I did not found where we used n_ctx for embedding models everywhere, but didn't want to move it outside of base model rec to make some things easier in the code, still, I'll add it to those two to not have 0 there for the known models, but it's not that important to fill that for all models

mitya52 · 2025-03-26T10:13:06Z

in caps.rs

#[serde(default, skip_serializing)]
    pub similar_models: Vec<String>,

lets remove it, we want flat list without any weird stuff
similar_models should is for backward capability with KNOWN_MODELS and for old caps style (without direct known models specification)

#[derive(Debug, Serialize, Clone, Deserialize, Default)]
pub struct ScratchpadSupport {
    #[serde(default, rename = "supports_scratchpads")]
    pub supported: HashMap<String, Value>,
    #[serde(default, rename = "default_scratchpad")]
    pub default: String,
}

actually we dont have models with multiple scratchpads (or nobody use it)
it will be better to leave only scratchpad record itself

    #[serde(skip_deserializing)]
    pub embedding_model: EmbeddingModelRecord,

in server we can have 2 embedding models (pgu and cpu)
it's not the problem, we can get first listed
but I don't see the reason why we can't make a list of embedding models and default one (like for completion and chat)

Looks like pub struct CapsProvider is the old caps style (cloud etc.)?
Let's rename it to CapsLegacy or some

humbertoyusta · 2025-03-26T10:14:39Z

code_completion_default_model

  "code_completion_default_model": "qwen2.5/coder/1.5b/base",
  "multiline_code_completion_default_model": "",
  "code_chat_default_model": "gpt-4o",
looks like the name in the terms of ModelRecord, but should be id (or key)

Inside one provider (server caps, cloud caps, or yaml files in providers.d, it's name, then lsp makes it "provider/model_name" when it loads the caps of all providers, if you found this in the merged caps that lsp makes, then it's a bug

humbertoyusta · 2025-03-26T10:23:11Z

You're right about one scratchpad per model, I'll change that to make model record simpler, just scratchpad_name (string) and scratchpad_value (json value) looks better.

For embedding models, it's better to use same as other I think from what you talked, and we may use more than one in the future, better to make same logic as for others I guess, I'll change that.

If similar_models is not needed for compatibility in self-host/enterprise then I'll get rid of it, it will make things simpler, we don't need it for cloud or byok

mitya52 · 2025-03-26T10:59:55Z

one more

113718.674 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
113818.676 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
113918.678 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114018.679 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114118.681 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114218.683 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114318.686 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114418.688 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114518.690 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114618.692 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114718.695 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114818.697 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114912.602 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)

annoying errors (which is not error btw)

mitya52 · 2025-03-26T11:08:14Z

if you found this in the merged caps that lsp makes, then it's a bug

in the http://localhost:8001/v1/caps

similar_models ... we don't need it for cloud or byok

we need it for backward compat with old caps / byok style, but of course not for new internal caps format

Why would we need more than one embedding model?

for example I want to use openai embedder instead of gte (byok)

about embedding model: it has n_ctx as max input tokens, we should have this limitation inside the code (or we'll have the issues, I'm sure)

* un-disable input when limit reached. * chore: add `compression_strength` to tool messages * add paused state to thread * add hook to pause auto send based on compression. * ui: let the user know that their chat is being compressed. * fix: linter issues after removing `limitReached` information call out. * fix: also use `/links` to decided if a new chat should be suggested. * refactor: remove `useTotalTokenUsage` hook. * add comments about `newChatSuggested`. * pause and unpaused using newChatSuggested. * fix(NewChatSuggested): use a hook to get the compression strength. * feat: add second condition for pausing the chat. * case: it might be posable for many messages with out compression.

…ting them to last assistant message

it was only used in telemetry, and it was replaced by IntoResponse

* wip: remove attach file checkbox. * feat: attach files button. * test: active file is no longer attached by default. * add an event for the ide to attach a file to chat. * fix: remove attached files after submit.

olegklimov · 2025-04-25T09:56:54Z

refact-agent/engine/src/yaml_configs/default_providers/anthropic.yaml

+  - claude-3-5-sonnet-latest
+  - claude-3-5-haiku-latest
+
+model_defaults:


olegklimov · 2025-04-25T09:59:42Z

refact-agent/engine/src/caps/mod.rs

+#[derive(Debug, Serialize, Deserialize, Clone, Default)]
+pub struct CodeAssistantCaps {
+    #[serde(deserialize_with = "normalize_string")]
+    pub cloud_name: String,


// "refact", "refact_self_hosted"

…eople don't confuse it with instruct

…se if stream was DONE

…middleware.ts

…lf hosted caps to lsp internal caps

This reverts commit 860578a.

* add metadata in caps * wip: add coin count to history item * add cost info to caps select * wip: replace message with coins in menu. * wip: handle out of coins error. * Byok rework (#611) * ui: add better ui for caps cost. * wip: history item, calculate cost. * remove usage editing in lsp * wip: calculate total price in history. * refactor: display coin usage * chore: update caps stubs. * chore(coins): remove old agent usage logic * refactor(startPollingForUser): rename useAgentUsage. * feat: add warning when balance starts to get low. * fix: add metadata to caps after moving caps * fix: re-add metadata to caps. * fix: caps type guard. * fix: <p> being a descendant of <p> * fix typo: ballance -> balance * fix: typo `ballance` * small change to trigger actions. * fix: caps pricing maybe undefined when using byok * fix type check for pricing being undefined when using byok. * fix: cost * usage counter should be two decimal points. * update calculate total cost. * missed toFixed on total cost. * round total cost. --------- Co-authored-by: Kirill Starkov <[email protected]> Co-authored-by: Humberto Yusta Gómez <[email protected]>

humbertoyusta force-pushed the byok-rework branch 2 times, most recently from 7de8829 to c15ac59 Compare March 20, 2025 22:46

MarcMcIntosh added 5 commits March 25, 2025 19:47

fix: limit the number of tokens a chat can use.

4f390cf

fix: linter errors.

29982f3

fix: dispatch information callout and disable input.

7d7364d

fix: missing dep issue.

d56dc0b

refactor: usage total.

247dced

humbertoyusta force-pushed the byok-rework branch from f7197b3 to e4bb5f8 Compare March 25, 2025 21:23

MarcMcIntosh and others added 3 commits March 26, 2025 17:05

feat: showing a placeholder message to the user if history is empty

637b446

fix: intercepting currentUsage when choices are empty array and upser…

eb40ed6

…ting them to last assistant message

humbertoyusta force-pushed the byok-rework branch from ebd5da4 to fb86e2d Compare March 27, 2025 21:38

MDario123 and others added 7 commits March 28, 2025 19:11

refactor: change telemetry from macro to middleware

a3d6c6b

refactor: /ping and /caps handlers

80ebb3b

refactor: code_completion_post_validate don't consume CodeCompletionPost

d48ab5b

chore: remove ScratchError.to_response unused function

b1f3c53

it was only used in telemetry, and it was replaced by IntoResponse

fix: add back "client will see" message

ce12f6d

doc: clarification about extension in scratch error into_response

691ba39

wip: remove attach file checkbox. (#644)

72d8090

* wip: remove attach file checkbox. * feat: attach files button. * test: active file is no longer attached by default. * add an event for the ide to attach a file to chat. * fix: remove attached files after submit.

humbertoyusta force-pushed the byok-rework branch from 8afcda3 to 3f65277 Compare March 31, 2025 09:24

Add timeout to aiohttp session and handle JSON parsing errors

c52bbf1

olegklimov reviewed Apr 25, 2025

View reviewed changes

olegklimov previously approved these changes Apr 25, 2025

View reviewed changes

humbertoyusta added 2 commits April 25, 2025 13:36

rename to model_default_settings_ui, add comments

6fd0ef1

remove trailing whitespaces

8dbd041

humbertoyusta dismissed olegklimov’s stale review via 8dbd041 April 25, 2025 11:47

humbertoyusta and others added 9 commits April 25, 2025 16:37

rename model family to qwen2.5-base, and added default to ollama so p…

2b99933

…eople don't confuse it with instruct

remove completion default model from provider dto

ddc0787

fix: removed completion_default_model from the Provider type

5d653cc

fix: usage retrieval adjustment & setting waiting_for_response to fal…

dbb390f

…se if stream was DONE

fix: reworked how errors are being sent and parsed in actions.ts and …

68b9f79

…middleware.ts

chore: remove nested ifs in utils.ts

565ccae

fix: fixed missed key issue in ModelCard

677d79c

chore: resolved issue with missing key of iteratable element

271e387

chore: removed Fragment, early returning null if option is not visible

e8ea705

alashchev17 previously approved these changes Apr 25, 2025

View reviewed changes

fix: non passthrough models shouldn't use /v1/chat/completions

467a3aa

humbertoyusta dismissed alashchev17’s stale review via 467a3aa April 25, 2025 17:15

humbertoyusta added 3 commits April 25, 2025 23:07

self hosted models are visible in UI, refactor logic of converting se…

3bf339a

…lf hosted caps to lsp internal caps

fix: chat default model as chat light model for self hosted

7674dff

refactor: move caps to caps.rs inside caps module

8b4b6ad

alashchev17 approved these changes Apr 25, 2025

View reviewed changes

alashchev17 merged commit 860578a into dev Apr 25, 2025
9 checks passed

alashchev17 deleted the byok-rework branch April 25, 2025 22:17

alashchev17 restored the byok-rework branch April 29, 2025 13:30

alashchev17 added a commit that referenced this pull request Apr 29, 2025

Revert "Byok rework (#611)"

5b9ee14

This reverts commit 860578a.

alashchev17 mentioned this pull request Apr 29, 2025

Revert "Byok rework" #716

Closed

alashchev17 added a commit that referenced this pull request Apr 29, 2025

Revert "Byok rework (#611)"

447ffee

This reverts commit 860578a.

alashchev17 deleted the byok-rework branch April 29, 2025 15:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Byok rework #611

Byok rework #611

Uh oh!

humbertoyusta commented Mar 17, 2025

Uh oh!

mitya52 commented Mar 26, 2025

Uh oh!

mitya52 commented Mar 26, 2025

Uh oh!

mitya52 commented Mar 26, 2025

Uh oh!

humbertoyusta commented Mar 26, 2025

Uh oh!

mitya52 commented Mar 26, 2025

Uh oh!

humbertoyusta commented Mar 26, 2025

Uh oh!

humbertoyusta commented Mar 26, 2025

Uh oh!

mitya52 commented Mar 26, 2025

Uh oh!

mitya52 commented Mar 26, 2025 •

edited

Loading

Uh oh!

olegklimov Apr 25, 2025

Uh oh!

olegklimov Apr 25, 2025

Uh oh!

Uh oh!

Uh oh!

Byok rework #611

Byok rework #611

Uh oh!

Conversation

humbertoyusta commented Mar 17, 2025

Uh oh!

mitya52 commented Mar 26, 2025

Uh oh!

mitya52 commented Mar 26, 2025

Uh oh!

mitya52 commented Mar 26, 2025

Uh oh!

humbertoyusta commented Mar 26, 2025

Uh oh!

mitya52 commented Mar 26, 2025

in caps.rs

Uh oh!

humbertoyusta commented Mar 26, 2025

Uh oh!

humbertoyusta commented Mar 26, 2025

Uh oh!

mitya52 commented Mar 26, 2025

Uh oh!

mitya52 commented Mar 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

olegklimov Apr 25, 2025

Choose a reason for hiding this comment

Uh oh!

olegklimov Apr 25, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

mitya52 commented Mar 26, 2025 •

edited

Loading