Turns large language models into software engineering agents that fix bugs and issues in GitHub repositories.
Documentation | Discord | Preprint | EnIGMA preprint
SWE-agent turns LMs (e.g. GPT-4) into software engineering agents that can resolve issues in real GitHub repositories and more.
On SWE-bench, SWE-agent resolves 12.47% of the issues in the full test set and 23% of the issues in SWE-bench Lite. SWE-agent EnIGMA solves more than three times as many challenges of the offensive cybersecurity NYU CTF benchmark as the previous SOTA agent.
We achieve these results by designing simple LM-centric commands and feedback formats that make it easier for the LM to browse the repository and to view, edit, and execute code files. We call this an Agent-Computer Interface (ACI). Read more about it in our paper!
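To make the idea of LM-centric commands and feedback more concrete, here is a minimal sketch of what such an interface could look like. It is not SWE-agent's actual implementation: the class `FileWindow`, its methods, and the feedback wording are all hypothetical names chosen for illustration. The sketch shows a file viewer that only exposes a small window of lines at a time and answers every command with short, explicit feedback.

```python
from dataclasses import dataclass


@dataclass
class FileWindow:
    """Hypothetical windowed file viewer (illustration only, not SWE-agent's API)."""

    lines: list[str]
    window: int = 5  # number of lines shown per view
    top: int = 0     # 0-based index of the first visible line

    def view(self) -> str:
        """Return the visible lines with 1-based numbers and a short context summary."""
        end = min(self.top + self.window, len(self.lines))
        body = "\n".join(f"{i + 1}: {self.lines[i]}" for i in range(self.top, end))
        below = len(self.lines) - end
        return f"({self.top} lines above)\n{body}\n({below} lines below)"

    def scroll_down(self) -> str:
        """Move the window down by one window length, clamped to the end of the file."""
        max_top = max(len(self.lines) - self.window, 0)
        self.top = min(self.top + self.window, max_top)
        return self.view()

    def edit(self, start: int, end: int, replacement: list[str]) -> str:
        """Replace the 1-based inclusive range start..end and report the result."""
        if not (1 <= start <= end <= len(self.lines)):
            # Explicit feedback instead of a silent failure or a raw traceback.
            return (
                f"Error: invalid range {start}:{end}; "
                f"file has {len(self.lines)} lines. No changes made."
            )
        self.lines[start - 1:end] = replacement
        # Re-center the window on the edited region so the result is visible.
        self.top = max(min(start - 1, len(self.lines) - self.window), 0)
        return "Edit applied.\n" + self.view()
```

In this sketch, every command returns a compact, self-describing string, so the model always sees where it is in the file and whether its last action succeeded.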
SWE-agent is built and maintained by researchers from Princeton University.
👉 Try SWE-agent in your browser: (more information)
Read our documentation to learn more:
Our most recent lecture touches on the project's motivation, showcases our research findings and provides a hands-on tutorial on how to install, use, and configure SWE-agent:
SWE-agent: EnIGMA (Enhanced Interactive Generative Model Agent) is a mode for solving offensive cybersecurity challenges. EnIGMA achieves SOTA on multiple cybersecurity benchmarks (see leaderboard). The EnIGMA project introduced multiple novelties that are available to all use cases of SWE-agent, such as Interactive Agent Tools and a Summarizer to handle long outputs.
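As a rough illustration of why long outputs need special handling, the sketch below condenses a long command output before it is shown to the model. It uses plain head-and-tail truncation, which is a deliberately simple stand-in and not EnIGMA's actual Summarizer; the function name `condense_output` and its `max_lines` parameter are hypothetical.

```python
def condense_output(output: str, max_lines: int = 20) -> str:
    """Keep the first and last lines of a long output and note how many were dropped.

    Hypothetical helper for illustration; not part of SWE-agent.
    """
    if max_lines < 2:
        raise ValueError("max_lines must be at least 2")
    lines = output.splitlines()
    if len(lines) <= max_lines:
        return output  # short outputs pass through unchanged
    head = max_lines // 2
    tail = max_lines - head
    omitted = len(lines) - max_lines
    marker = f"[... {omitted} lines omitted ...]"
    return "\n".join(lines[:head] + [marker] + lines[-tail:])
```

The marker line tells the model that content was omitted and how much, so it can decide whether to request the missing part in a follow-up command.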
Contact persons: John Yang and Carlos E. Jimenez (Email: johnby@stanford.edu, carlosej@princeton.edu).
If you found this work helpful, please consider citing it using the following:
@misc{yang2024sweagent,
title={SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering},
author={John Yang and Carlos E. Jimenez and Alexander Wettig and Kilian Lieret and Shunyu Yao and Karthik Narasimhan and Ofir Press},
year={2024},
eprint={2405.15793},
archivePrefix={arXiv},
primaryClass={cs.SE}
}
MIT. Check LICENSE.