Skip to content

Byok rework #611

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 210 commits into from
Apr 25, 2025
Merged

Byok rework #611

merged 210 commits into from
Apr 25, 2025

Conversation

humbertoyusta
Copy link
Contributor

Let's don't merge it yet

@humbertoyusta humbertoyusta force-pushed the byok-rework branch 2 times, most recently from 7de8829 to c15ac59 Compare March 20, 2025 22:46
@mitya52
Copy link
Member

mitya52 commented Mar 26, 2025

"Refact/deepseek-reasoner": {
      "n_ctx": 64000,
      "name": "deepseek-reasoner",
      "id": "Refact/deepseek-reasoner",
  • name: is it resolve_as/inference_id? actual model's inference name
  • id: looks like duplicate, we can use the key as id

@mitya52
Copy link
Member

mitya52 commented Mar 26, 2025

  "embedding_model": {
    "n_ctx": 0,
    "name": "thenlper/gte-base",
    "id": "Refact/thenlper/gte-base",
    "support_metadata": true,
    "embedding_size": 768,
    "rejection_threshold": 0.25,
    "embedding_batch": 64
  },
  • n_ctx is 0 by some reason (should be 512 as I know)
  • why only one model? what if you have multiple?

@mitya52
Copy link
Member

mitya52 commented Mar 26, 2025

  "code_completion_default_model": "qwen2.5/coder/1.5b/base",
  "multiline_code_completion_default_model": "",
  "code_chat_default_model": "gpt-4o",

looks like the name in the terms of ModelRecord, but should be id (or key)

@humbertoyusta
Copy link
Contributor Author

  "embedding_model": {
    "n_ctx": 0,
    "name": "thenlper/gte-base",
    "id": "Refact/thenlper/gte-base",
    "support_metadata": true,
    "embedding_size": 768,
    "rejection_threshold": 0.25,
    "embedding_batch": 64
  },
  • n_ctx is 0 by some reason (should be 512 as I know)
  • why only one model? what if you have multiple?

Why would we need more than one embedding model? if we change it we'll need to create a new vecdb, and this is the only use for embedding model right now, but if you see any way we could use different embedding models for some reason maybe it's better to make it indexmap like others for future-proofing

I did not found where we used n_ctx for embedding models everywhere, but didn't want to move it outside of base model rec to make some things easier in the code, still, I'll add it to those two to not have 0 there for the known models, but it's not that important to fill that for all models

@mitya52
Copy link
Member

mitya52 commented Mar 26, 2025

in caps.rs

#[serde(default, skip_serializing)]
    pub similar_models: Vec<String>,

lets remove it, we want flat list without any weird stuff
similar_models should is for backward capability with KNOWN_MODELS and for old caps style (without direct known models specification)


#[derive(Debug, Serialize, Clone, Deserialize, Default)]
pub struct ScratchpadSupport {
    #[serde(default, rename = "supports_scratchpads")]
    pub supported: HashMap<String, Value>,
    #[serde(default, rename = "default_scratchpad")]
    pub default: String,
}

actually we dont have models with multiple scratchpads (or nobody use it)
it will be better to leave only scratchpad record itself


    #[serde(skip_deserializing)]
    pub embedding_model: EmbeddingModelRecord,

in server we can have 2 embedding models (pgu and cpu)
it's not the problem, we can get first listed
but I don't see the reason why we can't make a list of embedding models and default one (like for completion and chat)


Looks like pub struct CapsProvider is the old caps style (cloud etc.)?
Let's rename it to CapsLegacy or some

@humbertoyusta
Copy link
Contributor Author

code_completion_default_model

  "code_completion_default_model": "qwen2.5/coder/1.5b/base",
  "multiline_code_completion_default_model": "",
  "code_chat_default_model": "gpt-4o",

looks like the name in the terms of ModelRecord, but should be id (or key)

Inside one provider (server caps, cloud caps, or yaml files in providers.d, it's name, then lsp makes it "provider/model_name" when it loads the caps of all providers, if you found this in the merged caps that lsp makes, then it's a bug

@humbertoyusta
Copy link
Contributor Author

You're right about one scratchpad per model, I'll change that to make model record simpler, just scratchpad_name (string) and scratchpad_value (json value) looks better.

For embedding models, it's better to use same as other I think from what you talked, and we may use more than one in the future, better to make same logic as for others I guess, I'll change that.

If similar_models is not needed for compatibility in self-host/enterprise then I'll get rid of it, it will make things simpler, we don't need it for cloud or byok

@mitya52
Copy link
Member

mitya52 commented Mar 26, 2025

one more

113718.674 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
113818.676 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
113918.678 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114018.679 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114118.681 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114218.683 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114318.686 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114418.688 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114518.690 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114618.692 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114718.695 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114818.697 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)
114912.602 ERROR src/caps.rs:347 Failed to read providers directory: No such file or directory (os error 2)

annoying errors (which is not error btw)

@mitya52
Copy link
Member

mitya52 commented Mar 26, 2025

if you found this in the merged caps that lsp makes, then it's a bug

in the http://localhost:8001/v1/caps

similar_models ... we don't need it for cloud or byok

we need it for backward compat with old caps / byok style, but of course not for new internal caps format

Why would we need more than one embedding model?

for example I want to use openai embedder instead of gte (byok)

about embedding model: it has n_ctx as max input tokens, we should have this limitation inside the code (or we'll have the issues, I'm sure)

MarcMcIntosh and others added 3 commits March 26, 2025 17:05
* un-disable input when limit reached.

* chore: add `compression_strength` to tool messages

* add paused state to thread

* add hook to pause auto send based on compression.

* ui: let the user know that their chat is being compressed.

* fix: linter issues after removing `limitReached` information call out.

* fix: also use `/links` to decided if a new chat should be suggested.

* refactor: remove `useTotalTokenUsage` hook.

* add comments about `newChatSuggested`.

* pause and unpaused using newChatSuggested.

* fix(NewChatSuggested): use a hook to get the compression strength.

* feat: add second condition for pausing the chat.

* case: it might be posable for many messages with out compression.
MDario123 and others added 7 commits March 28, 2025 19:11
it was only used in telemetry, and it was replaced by IntoResponse
* wip: remove attach file checkbox.

* feat: attach files button.

* test: active file is no longer attached by default.

* add an event for the ide to attach a file to chat.

* fix: remove attached files after submit.
- claude-3-5-sonnet-latest
- claude-3-5-haiku-latest

model_defaults:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ui

#[derive(Debug, Serialize, Deserialize, Clone, Default)]
pub struct CodeAssistantCaps {
#[serde(deserialize_with = "normalize_string")]
pub cloud_name: String,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// "refact", "refact_self_hosted"

olegklimov
olegklimov previously approved these changes Apr 25, 2025
alashchev17
alashchev17 previously approved these changes Apr 25, 2025
@alashchev17 alashchev17 merged commit 860578a into dev Apr 25, 2025
9 checks passed
@alashchev17 alashchev17 deleted the byok-rework branch April 25, 2025 22:17
@alashchev17 alashchev17 restored the byok-rework branch April 29, 2025 13:30
alashchev17 added a commit that referenced this pull request Apr 29, 2025
This reverts commit 860578a.
@alashchev17 alashchev17 mentioned this pull request Apr 29, 2025
alashchev17 added a commit that referenced this pull request Apr 29, 2025
This reverts commit 860578a.
@alashchev17 alashchev17 deleted the byok-rework branch April 29, 2025 15:58
MarcMcIntosh added a commit that referenced this pull request Apr 30, 2025
* add metadata in caps

* wip: add coin count to history item

* add cost info to caps select

* wip: replace message with coins in menu.

* wip: handle out of coins error.

* Byok rework (#611)

* ui: add better ui for caps cost.

* wip: history item, calculate cost.

* remove usage editing in lsp

* wip: calculate total price in history.

* refactor: display coin usage

* chore: update caps stubs.

* chore(coins): remove old agent usage logic

* refactor(startPollingForUser): rename useAgentUsage.

* feat: add warning when balance starts to get low.

* fix: add metadata to caps after moving caps

* fix: re-add metadata to caps.

* fix: caps type guard.

* fix: <p> being a descendant of <p>

* fix typo: ballance -> balance

* fix: typo `ballance`

* small change to trigger actions.

* fix: caps pricing maybe undefined when using byok

* fix type check for pricing being undefined when using byok.

* fix: cost

* usage counter should be two decimal points.

* update calculate total cost.

* missed toFixed on total cost.

* round total cost.

---------

Co-authored-by: Kirill Starkov <[email protected]>
Co-authored-by: Humberto Yusta Gómez <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants