Skip to content

Conversation

Hoblovski
Copy link
Collaborator

What type of PR is this?

Add regression testing scripts and incorporate CI integration.
Save you from all those nights wondering "looks like it works but does it break anything"?
Sleep well.

Check the PR title.

  • This PR title match the format: <type>(optional scope): <description>
  • The description of this PR title is user-oriented and clear enough for others to understand.
  • Attach the PR updating the user documentation if the current PR requires user awareness at the usage level. User docs repo

(Optional) Translate the PR title into Chinese.

实现系统级回归测试,集成进 CI。

Basically 就是验证 pr 生成的 uniast 和 main 生成的 uniast 是一致的。
逻辑很简单,编译一个 main 分支的可执行文件,编译一个 pr 分支的可执行文件,
两个都在 testdata 跑跑,然后判断一下 uniast json 是否相同。

这个是系统级测试,完全把内部实现当成黑盒,所以相对 robust。

上周还画了个饼做 property based testing,正好作为 uniast 的一个可执行规范。
但是现有代码实在比较难做 PBT…… 所以先做能用的 system regression

跳过回归测试

把 regression.yml 里面被注释的那行加进来就行
这样做完以后,只需要 PR title 加 [NO-REGRESSION-TEST] 就可以跳过。
但是这样太粗了,按理说应该是 pr 作者声明哪些通过哪些不通过

手动回归测试

你可以手动回归测试,从一个干净的仓库开始:

OLD=731ba8d                                             &&\
NEW=feat/regression                                     &&\
echo "--------------------"                             &&\
rm -rf old new abcoder_{old,new}                        &&\
git checkout $OLD && go build -o abcoder_old            &&\
git checkout $NEW && go build -o abcoder_new            &&\
( OUTDIR=old ABCEXE=./abcoder_old ./script/run_all_testdata.sh &
  OUTDIR=new ABCEXE=./abcoder_new ./script/run_all_testdata.sh ) &&\
./script/diffjson.py old new

应该输出除了 go 所有别的都没问题。
go 有问题是因为 Yi 加了 Go 的类型参数。

(Optional) More detailed description for this PR(en: English/zh: Chinese).

en:
zh(optional):

(Optional) Which issue(s) this PR fixes:

(optional) The PR that updates user documentation:

@Hoblovski
Copy link
Collaborator Author

哦对了,如果报错可以去 log 里面找到 Artifact download URL 然后检视

return c.Conn.Close()
}

// Extra wrapper around json rpc to
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里包了一层,本来是想做 retry on empty response (就不用 sleep 等 lsp 了)的,但是发现实在做不了,只能 sleep。
不过包一层也方便以后做 unified transparent LSP request caching

@Hoblovski Hoblovski requested a review from Copilot September 1, 2025 12:42
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements regression testing infrastructure to validate that code changes don't introduce behavioral regressions in the uniast generation. The system compares uniast JSON outputs between the main branch and PR branches by building separate executables and running them against all testdata.

Key changes:

  • Adds automated regression testing workflow that compares uniast outputs between main and PR branches
  • Implements JSON comparison tooling with detailed diff reporting
  • Includes code refactoring to separate LSP method implementations into dedicated files

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
.github/workflows/regression.yml CI workflow for automated regression testing
script/run_all_testdata.sh Shell script to generate uniast for all testdata
script/diffjson.py Python tool for comparing JSON outputs with detailed diff reporting
script/requirements.txt Python dependencies for JSON comparison tool
lang/lsp/lsp_methods.go Extracted LSP method implementations from main file
lang/lsp/lsp.go Refactored to remove method implementations and improve structure
lang/lsp/client.go Added generic Call wrapper method
lang/lsp/handler.go Added diagnostic notification handling
lang/lsp/testutils.go Increased LSP server wait time
lang/lsp/clients_test.go Updated test expectations and removed workspace symbol test
testdata/rust/0_rust2/src/entity/inter.rs Updated test data with uncommented code

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@AsterDY
Copy link
Collaborator

AsterDY commented Sep 2, 2025

lsp 包改动没必要这么多吧

@Hoblovski
Copy link
Collaborator Author

lsp 包改动没必要这么多吧

主要就是把 lsp.go 里面所有 textDocument/*** 择出来到 lsp_method.go。
lsp.go 里只放了基础类型定义。

方法实现完全没变,就是挪个地方。

原因是我在优化 parser,以后直接在 jsonrpc 层做透明 cache。
如果再在 lsp.go 加内容,lsp.go 就太长了,不太方便定位。

没改行为,在测例上功能是完全一致的。

@Hoblovski
Copy link
Collaborator Author

差不多我把 py 的优化弄到头了(瓶颈已经在单线程的 pylsp 了),有

  1. 透明的 LSP 请求缓存
  2. 可选的 profiling
  3. 命令行参数:和 -exclude 相对的 -include
  4. 命令行参数:其他语言支持 -no-need-test 的接口
  5. (一系列 collect/export 优化)

它们会改动 parser 代码,但是我可以测试保证代码行为一定是不变的。

@AsterDY
Copy link
Collaborator

AsterDY commented Sep 5, 2025

所以优化有效果么? @Hoblovski

@Hoblovski
Copy link
Collaborator Author

所以优化有效果么? @Hoblovski

  1. 把 O(N^2) 的循环优化,按 uri 分类: -14% time (pytest N=2390)

循环:https://github.com/cloudwego/abcoder/blob/main/lang/collect/export.go#L124

  1. 透明缓存(主要是 semanticToken 加了缓存会很好): -80% (requests N=924)
  • 从 >210s 到 30s
  • 透明缓存的好处还有就是我们不用手动缓存了,更简单,更不容易出错。
  • 透明缓存对只有 semanticTokens/full 的语言效果尤其明显

我才发现 CI 里老和新的应该分两个 step,这样还能对比效率(虽然不能并行)

@Hoblovski Hoblovski merged commit 70c2e1c into main Sep 8, 2025
4 checks passed
@Hoblovski Hoblovski mentioned this pull request Sep 8, 2025
3 tasks
@Hoblovski Hoblovski deleted the feat/regression branch September 8, 2025 12:04
@SamYuan1990 SamYuan1990 mentioned this pull request Sep 20, 2025
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants