Label Studio is a multi-type data labeling and annotation tool with standardized output format.

What is Label Studio?

Label Studio is an open source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. It can be used to prepare raw data or improve existing training data to get more accurate ML models.

Gif of Label Studio annotating different types of data

Have a custom dataset? You can customize Label Studio to fit your needs. Read an introductory blog post to learn more.

Try out Label Studio

Install Label Studio locally, or deploy it in a cloud instance. Or, sign up for a free trial of our Enterprise edition.

Install locally with Docker

The official Label Studio Docker image is available on Docker Hub and can be downloaded with docker pull. Run Label Studio in a Docker container and access it at http://localhost:8080.

docker pull heartexlabs/label-studio:latest
docker run -it -p 8080:8080 -v $(pwd)/mydata:/label-studio/data heartexlabs/label-studio:latest

You can find all the generated assets, including SQLite3 database storage label_studio.sqlite3 and uploaded files, in the ./mydata directory.
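
Because the default install keeps its state in a plain SQLite file, you can inspect it with standard tooling. The following is a minimal sketch, assuming the container was started with the volume mount shown above so that the database sits at ./mydata/label_studio.sqlite3. The helper name list_tables is hypothetical and not part of Label Studio, and the sketch makes no claim about which tables you will find.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file, sorted."""
    # Open read-only via a URI so that inspection cannot modify the database.
    uri = f"{Path(db_path).resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Path assumed from the docker run command above.
    db_file = Path("mydata") / "label_studio.sqlite3"
    if db_file.exists():
        for table in list_tables(db_file):
            print(table)
    else:
        print(f"{db_file} not found; start Label Studio once to create it")
```

Opening the file in read-only mode is a deliberate choice here: it avoids accidental writes to a database that a running container may also be using.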

Override default Docker install

You can override the default launch command by appending the new arguments:

docker run -it -p 8080:8080 -v $(pwd)/mydata:/label-studio/data heartexlabs/label-studio:latest label-studio --log-level DEBUG

Build a local image with Docker

If you want to build a local image, run:

docker build -t heartexlabs/label-studio:latest .

Run with Docker Compose

The Docker Compose script provides a production-ready stack consisting of the following components:

  • Label Studio
  • Nginx - proxy web server used to load various static data, including uploaded audio, images, etc.
  • PostgreSQL - production-ready database that replaces less performant SQLite3.

To start using the app at http://localhost, run this command:

docker-compose up

Run with Docker Compose + MinIO

You can also run it with an additional MinIO server for local S3 storage. This is particularly useful when you want to test the behavior with S3 storage on your local system. To start Label Studio in this way, you need to run the following command:

# Add sudo on Linux if you are not a member of the docker group
docker compose -f docker-compose.yml -f docker-compose.minio.yml up -d

If you do not have a static IP address, you must create an entry in your hosts file so that both Label Studio and your browser can access the MinIO server. For more detailed instructions, please refer to our guide on storing data.

Install locally with pip

# Requires Python >=3.8
pip install label-studio

# Start the server at http://localhost:8080
label-studio

Install locally with poetry

# Install poetry
pip install poetry

# Set up the poetry environment
poetry new my-label-studio
cd my-label-studio
poetry add label-studio

# Activate the poetry environment
poetry shell

# Start the server at http://localhost:8080
label-studio

Install locally with Anaconda

conda create --name label-studio
conda activate label-studio
conda install psycopg2
pip install label-studio

Install for local development

You can run the latest Label Studio version locally without installing the package from PyPI.

# Install all package dependencies
pip install poetry
poetry install
# Run database migrations
python label_studio/manage.py migrate
python label_studio/manage.py collectstatic
# Start the server in development mode at http://localhost:8080
python label_studio/manage.py runserver

Deploy in a cloud instance

You can deploy Label Studio with one click in Heroku, Microsoft Azure, or Google Cloud Platform.

Apply frontend changes

For information about updating the frontend, see label-studio/web/README.md.

Install dependencies on Windows

To run Label Studio on Windows, download and install the following wheel packages from Gohlke builds, making sure that each wheel matches your version of Python:

# Upgrade pip
pip install -U pip

# If you're running Win64 with Python 3.8, install the packages downloaded from Gohlke:
pip install lxml-4.5.0-cp38-cp38-win_amd64.whl

# Install Label Studio
pip install label-studio

Run test suite

To add the tests' dependencies to your local install:

poetry install --with test

Alternatively, it is possible to run the unit tests from a Docker container in which the test dependencies are installed:

make build-testing-image
make docker-testing-shell

In either case, to run the unit tests:

cd label_studio

# sqlite3
DJANGO_DB=sqlite DJANGO_SETTINGS_MODULE=core.settings.label_studio pytest -vv

# postgres (assumes the default postgres user, db, and password; will not work
# in the Docker testing container without additional configuration)
DJANGO_DB=default DJANGO_SETTINGS_MODULE=core.settings.label_studio pytest -vv

What you get from Label Studio

Screenshot of Label Studio data manager grid view with images

  • Multi-user labeling with sign-up and login; when you create an annotation, it's tied to your account.
  • Multiple projects to work on all your datasets in one instance.
  • Streamlined design helps you focus on your task, not how to use the software.
  • Configurable label formats let you customize the visual interface to meet your specific labeling needs.
  • Support for multiple data types including images, audio, text, HTML, time-series, and video.
  • Import from files (JSON, CSV, TSV, RAR, and ZIP archives) or from cloud storage in Amazon AWS S3 or Google Cloud Storage.
  • Integration with machine learning models so that you can visualize and compare predictions from different models and perform pre-labeling.
  • Embed it in your data pipeline: the REST API makes it easy to make Label Studio a part of your pipeline.
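
As an illustration of the import and REST API items in the list above, here is a minimal sketch that prepares tasks in JSON form and builds (without sending) an import request. It rests on assumptions you should check against the API documentation for your version: the server runs at http://localhost:8080, tasks are imported through /api/projects/<id>/import, and the access token is passed as "Authorization: Token <token>". The helper names and the "image" data key are hypothetical.

```python
import json
import urllib.request

# Assumption: local server address used elsewhere in this README.
BASE_URL = "http://localhost:8080"


def make_tasks(image_urls):
    """Wrap raw image URLs in task JSON: one dict with a "data" entry per task."""
    # The "image" key is illustrative; it has to match the variable that
    # the project's labeling configuration reads.
    return [{"data": {"image": url}} for url in image_urls]


def build_import_request(project_id, tasks, token):
    """Build (but do not send) a POST request that imports tasks into a project."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/projects/{project_id}/import",
        data=json.dumps(tasks).encode("utf-8"),
        headers={
            # Assumed token scheme; verify it for your Label Studio version.
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To actually send the request, pass the returned object to urllib.request.urlopen once a server is running and you have a valid token from your account settings.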

Included templates for labeling data in Label Studio

Label Studio includes a variety of templates to help you label your data, or you can create your own using a purpose-built configuration language. The included templates cover the most common labeling use cases.
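
To give a feel for the configuration language, here is a small sketch of an image classification config, together with a hypothetical helper that reads it back with Python's standard XML parser. The tag and attribute names (View, Image, Choices, Choice, name, toName, value) follow the Label Studio tag documentation; the label values "Cat" and "Dog" and the function summarize_config are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A small image-classification config; label values are illustrative.
LABEL_CONFIG = """
<View>
  <Image name="image" value="$image"/>
  <Choices name="choice" toName="image">
    <Choice value="Cat"/>
    <Choice value="Dog"/>
  </Choices>
</View>
"""


def summarize_config(config_xml):
    """Return the task data keys the config reads and the choice labels it offers."""
    root = ET.fromstring(config_xml)
    # Attributes such as value="$image" point at a key in each task's data.
    data_keys = sorted(
        el.get("value")[1:]
        for el in root.iter()
        if el.get("value", "").startswith("$")
    )
    labels = [el.get("value") for el in root.iter("Choice")]
    return data_keys, labels


if __name__ == "__main__":
    keys, labels = summarize_config(LABEL_CONFIG)
    print("data keys:", keys)
    print("labels:", labels)
```

A check like this can be handy in a pipeline to confirm that the tasks you import carry every data key the config expects.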

Set up machine learning models with Label Studio

Connect your favorite machine learning model using the Label Studio Machine Learning SDK. Follow these steps:

  1. Start your own machine learning backend server. See more detailed instructions.
  2. Connect Label Studio to the server on the model page found in project settings.

This lets you:

  • Pre-label your data using model predictions.
  • Do online learning and retrain your model while new annotations are being created.
  • Do active learning by labeling only the most complex examples in your data.
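
The pre-labeling and active learning ideas above can be sketched in a few lines. This is an illustration under stated assumptions, not Label Studio's own implementation: the prediction dictionary follows the documented predictions format for a Choices control named "choice" attached to an object named "image", and "most complex examples" is approximated by the simple least-confidence heuristic (lowest prediction score first). The function names and the default model version string are hypothetical.

```python
def make_prediction(label, score, model_version="example-model"):
    """Build one prediction for a Choices control "choice" on an object "image"."""
    return {
        "model_version": model_version,
        "score": score,
        "result": [
            {
                "from_name": "choice",  # name of the control tag (assumed)
                "to_name": "image",     # name of the object tag (assumed)
                "type": "choices",
                "value": {"choices": [label]},
            }
        ],
    }


def pick_most_uncertain(scored_tasks, k):
    """Return the ids of the k tasks whose predictions have the lowest score.

    scored_tasks is a list of (task_id, prediction) pairs.
    """
    ranked = sorted(scored_tasks, key=lambda item: item[1]["score"])
    return [task_id for task_id, _ in ranked[:k]]


if __name__ == "__main__":
    scored = [
        (1, make_prediction("Cat", 0.95)),
        (2, make_prediction("Dog", 0.51)),
        (3, make_prediction("Cat", 0.70)),
    ]
    print(pick_most_uncertain(scored, 2))
```

Other selection strategies (for example entropy or margin sampling) fit the same shape; only the sort key changes.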

Integrate Label Studio with your existing tools

You can use Label Studio as an independent part of your machine learning workflow or integrate the frontend or backend into your existing tools.

Ecosystem

| Project | Description |
|-|-|
| label-studio | Server, distributed as a pip package |
| Frontend library | The Label Studio frontend library. This uses React to build the UI and mobx-state-tree for state management. |
| Data Manager library | A library for the Data Manager, our data exploration tool. |
| label-studio-converter | Encode labels in the format of your favorite machine learning library |
| label-studio-transformers | Transformers library connected and configured for use with Label Studio |

Roadmap

Want to use The Coolest Feature X but Label Studio doesn't support it? Check out our public roadmap!

Citation

@misc{Label Studio,
  title={{Label Studio}: Data labeling software},
  url={https://github.com/heartexlabs/label-studio},
  note={Open source software available from https://github.com/heartexlabs/label-studio},
  author={
    Maxim Tkachenko and
    Mikhail Malyuk and
    Andrey Holmanyuk and
    Nikolai Liubimov},
  year={2020-2022},
}

License

This software is licensed under the Apache 2.0 License © Heartex, 2020-2022.