Update recently released papers

NL2Code · web-flow · commit 4c713f5c6a57 · 2024-06-17T15:22:14.000+08:00
diff --git a/_posts/2000-01-01-Related Surveys.md b/_posts/2000-01-01-Related Surveys.md
@@ -4,7 +4,8 @@ author: coder
 date: 2000-01-01 00:00:00 +0800
 categories: [other]
 tags: [surveys]
-pin: false
+pin: true
 ---
 ## [Deep Learning Based Code Generation Methods | A Literature Review](https://arxiv.org/ftp/arxiv/papers/2303/2303.01056.pdf)
 ## [A Survey on Language Models for Code](https://arxiv.org/pdf/2311.07989.pdf)
+## [A Survey on Large Language Models for Code Generation](https://arxiv.org/abs/2406.00515)
diff --git a/_posts/2023-10-10-SWE-bench.md b/_posts/2023-10-10-SWE-bench.md
@@ -5,10 +5,11 @@ date: 2023-10-10 00:00:00 +0800
 categories: [arxiv]
 tags: [benchmarks]
 math: true
+pin: true
 ---
 
 - 🗂️Benchmark Name: [SWE-bench](https://arxiv.org/pdf/2310.06770.pdf)
 - 📚Publisher: `Arxiv`
 - 🏠Author Affiliation: `Princeton University`; `Princeton Language and Intelligence`; `University of Chicago`
-- 🔗URL: [https://www.swebench.com/](https://www.swebench.com/)
+- 🔗Leaderboard: [https://www.swebench.com/](https://www.swebench.com/)
 - Scenario: `Real-World GitHub Issues`
diff --git a/_posts/2024-04-05-SWE-Agent.md b/_posts/2024-04-05-SWE-Agent.md
@@ -8,7 +8,7 @@ math: true
 pin: false
 ---
 
-- 📙Paper: [SWE-agent](https://github.com/princeton-nlp/SWE-agent)
+- 📙Paper: 🔥🔥🔥[SWE-agent](https://github.com/princeton-nlp/SWE-agent)
 - 📚Publisher: `arxiv`
 - 🏠Author Affiliation: `Princeton University`
 - Contribution: SWE-agent turns LMs (e.g. GPT-4) into software engineering agents that can fix bugs and issues in real GitHub repositories. On SWE-bench, SWE-agent resolves 12.29% of issues, achieving the state-of-the-art performance on the full test set.
diff --git a/_posts/2024-04-08-AutoCodeRover.md b/_posts/2024-04-08-AutoCodeRover.md
@@ -0,0 +1,14 @@
+---
+title: AutoCodeRover
+author: coder
+date: 2024-04-08 00:00:00 +0800
+categories: [arxiv]
+tags: [models]
+math: true
+---
+
+- 📙Paper: [AutoCodeRover: Autonomous Program Improvement](https://arxiv.org/pdf/2404.05427)
+- 📚Publisher: `arxiv`
+- 🏠Author Affiliation: `National University of Singapore, Singapore`
+- Leaderboard: [https://www.swebench.com](https://www.swebench.com)
+- Abstract: Researchers have made significant progress in automating the software development process in the past decades. Recent progress in Large Language Models (LLMs) has significantly impacted the development process, where developers can use LLM-based programming assistants to achieve automated coding. Nevertheless, software engineering involves the process of program improvement apart from coding, specifically to enable software maintenance (e.g. bug fixing) and software evolution (e.g. feature additions). In this paper, we propose an automated approach for solving GitHub issues to autonomously achieve program improvement. In our approach, called AutoCodeRover, LLMs are combined with sophisticated code search capabilities, ultimately leading to a program modification or patch. In contrast to recent LLM agent approaches from AI researchers and practitioners, our outlook is more software engineering oriented. We work on a program representation (abstract syntax tree) as opposed to viewing a software project as a mere collection of files. Our code search exploits the program structure in the form of classes/methods to enhance the LLM's understanding of the issue's root cause, and effectively retrieve a context via iterative search. The use of spectrum-based fault localization using tests further sharpens the context, as long as a test suite is available. Experiments on SWE-bench-lite, which consists of 300 real-life GitHub issues, show increased efficacy in solving GitHub issues (22-23% on SWE-bench-lite). On the full SWE-bench consisting of 2294 GitHub issues, AutoCodeRover solved around 16% of issues, which is higher than the efficacy of the recently reported AI software engineer Devin from Cognition Labs, while taking time comparable to Devin. We posit that our workflow enables autonomous software engineering, where, in the future, auto-generated code from LLMs can be autonomously improved.
diff --git a/_posts/2024-06-04-CodeR.md b/_posts/2024-06-04-CodeR.md
@@ -0,0 +1,18 @@
+---
+title: CodeR
+author: coder
+date: 2024-06-04 00:00:00 +0800
+categories: [arxiv]
+tags: [models]
+math: true
+pin: true
+---
+
+- 📙Paper: [CodeR: Issue Resolving with Multi-Agent and Task Graphs](https://arxiv.org/pdf/2406.01304)
+- 📚Publisher: `arxiv`
+- 🏠Author Affiliation: `Huawei`, `Chinese Academy of Sciences`, `Singapore Management University`, `Peking University`
+- Leaderboard: SOTA on [https://www.swebench.com](https://www.swebench.com) (as of 2024-06-04)
+- Contribution:
+    + We propose CodeR, a multi-agent framework with task graphs for issue resolving. We design the roles and actions after the way humans resolve issues in the real world. For plans, we design a graph data structure that can be parsed and strictly executed. It ensures that the plan is executed exactly and also provides an easy-to-plug interface for plan injection from humans.
+    + We leverage LLM-generated code for reproducing the issue and the tests in the repository (excluding the verification tests) to get code coverage information. Coverage information improves contextual retrieval based on the keywords in the issue text and is combined with BM25 for fault localization.
+    + We raise the state of the art on SWE-bench lite to 28.33% (85/300) with only one submission per issue.
diff --git a/_posts/2024-06-11-GraphCoder.md b/_posts/2024-06-11-GraphCoder.md
@@ -0,0 +1,17 @@
+---
+title: GraphCoder
+author: coder
+date: 2024-06-11 00:00:00 +0800
+categories: [arxiv]
+tags: [models]
+math: true
+---
+
+- 📙Paper: [GraphCoder: Enhancing Repository-Level Code Completion via Code Context Graph-based Retrieval and Language Model](https://arxiv.org/abs/2406.07003)
+- 📚Publisher: `arxiv`
+- 🏠Author Affiliation: `Peking University`, `Chinese Academy of Sciences`, `Huawei`
+- GitHub: [https://github.com/oceaneLIU/GraphCoder](https://github.com/oceaneLIU/GraphCoder)
+- Contribution:
+    + GraphCoder, an approach that improves retrieval effectiveness through a coarse-to-fine process that considers both structural and lexical context, as well as the dependence distance between the completion target and the context;
+    + CCG (code context graph), a graph-based representation of source code that captures relevant long-distance context for predicting the semantics of the code completion target, in place of the widely adopted sequence-based representation;
+    + Extensive experiments on 5 LLMs and 8000 code completion tasks from 20 repositories, demonstrating that GraphCoder achieves higher exact match values with reduced retrieval time and database storage overhead.