To get LLM responses for the API calls, use the following command:
```bash
python get_llm_responses.py --model gpt-3.5-turbo --api_key $API_KEY --output_file gpt-3.5-turbo_torchhub_0_shot.jsonl --question_data eval-data/questions/torchhub/questions_torchhub_0_shot.jsonl --api_name torchhub
```
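The `--output_file` above is a `.jsonl` file: one JSON object per line, one per question. A minimal sketch of reading and writing that layout (the field names in the usage example are illustrative, not the script's exact schema):

```python
import json

def read_jsonl(path):
    """Load a .jsonl file: one JSON object per non-blank line."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines defensively
                records.append(json.loads(line))
    return records

def write_jsonl(path, records):
    """Write records out, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
```

For example, `write_jsonl("out.jsonl", [{"question_id": 1, "response": "..."}])` followed by `read_jsonl("out.jsonl")` round-trips the records.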
To use a retriever (`bm25` or `gpt`):

```bash
python get_llm_responses_retriever.py --retriever bm25 --model gpt-3.5-turbo --api_key $API_KEY --output_file gpt-3.5-turbo_torchhub_0_shot.jsonl --question_data eval-data/questions/torchhub/questions_torchhub_0_shot.jsonl --api_name torchhub --api_dataset ../data/api/torchhub_api.jsonl
```
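The `--retriever bm25` option ranks API documentation against each question with BM25 lexical scoring. A self-contained sketch of the scoring idea (not the repository's implementation, which the retriever module provides):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each whitespace-tokenized doc against the query with BM25."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(d) for d in tokenized) / len(tokenized)
    n = len(tokenized)
    # document frequency of each term
    df = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

The highest-scoring API docs are then prepended to the prompt before querying the model.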
After the LLM responses are generated, we can evaluate them against our dataset:
```bash
cd eval-scripts
python ast_eval_th.py --api_dataset ../../data/api/torchhub_api.jsonl --apibench ../../data/apibench/torchhub_eval.json --llm_responses ../eval-data/responses/torchhub/response_torchhub_Gorilla_FT_0_shot.jsonl
```
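The AST evaluation compares generated TorchHub calls to the reference API dataset structurally rather than by string equality, so cosmetic differences (whitespace, argument spelling) don't cause false mismatches. A rough illustration of the idea using Python's `ast` module — hypothetical, not `ast_eval_th.py`'s actual logic, and it assumes the response has already been reduced to a bare call expression:

```python
import ast

def call_signature(code):
    """Extract (function name, positional args, keyword args) from one call."""
    try:
        tree = ast.parse(code.strip(), mode="eval")
    except SyntaxError:
        return None
    call = tree.body
    if not isinstance(call, ast.Call):
        return None
    name = ast.unparse(call.func)
    args = tuple(ast.unparse(a) for a in call.args)
    kwargs = {kw.arg: ast.unparse(kw.value) for kw in call.keywords}
    return name, args, kwargs

def calls_match(generated, reference):
    """Match if names and positional args agree and every reference kwarg appears."""
    gen, ref = call_signature(generated), call_signature(reference)
    if gen is None or ref is None:
        return False
    gname, gargs, gkwargs = gen
    rname, rargs, rkwargs = ref
    return (gname == rname and gargs == rargs
            and all(gkwargs.get(k) == v for k, v in rkwargs.items()))
```

Under this scheme, `torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)` matches the identical reference call but not one loading a different model.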