To get LLM responses for the API calls, use the following command:
```bash
python get_llm_responses.py --model gpt-3.5-turbo --api_key $API_KEY --output_file gpt-3.5-turbo_torchhub_0_shot.jsonl --question_data eval-data/questions/torchhub/questions_torchhub_0_shot.jsonl --api_name torchhub
```
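The `--output_file` above is a `.jsonl` file: one JSON object per line, one per question. A minimal sketch of reading and writing that layout (the field names in the usage example are illustrative, not the script's exact schema):

```python
import json

def read_jsonl(path):
    """Load a .jsonl file: one JSON object per non-blank line."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines defensively
                records.append(json.loads(line))
    return records

def write_jsonl(path, records):
    """Write records out, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
```

For example, `write_jsonl("out.jsonl", [{"question_id": 1, "response": "..."}])` followed by `read_jsonl("out.jsonl")` round-trips the records.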
To use a retriever (`bm25` or `gpt`):

```bash
python get_llm_responses_retriever.py --retriever bm25 --model gpt-3.5-turbo --api_key $API_KEY --output_file gpt-3.5-turbo_torchhub_0_shot.jsonl --question_data eval-data/questions/torchhub/questions_torchhub_0_shot.jsonl --api_name torchhub --api_dataset ../data/api/torchhub_api.jsonl
```
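The `--retriever bm25` option ranks API documentation against each question with BM25 lexical scoring. A self-contained sketch of the scoring idea (not the repository's implementation, which the retriever module provides):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each whitespace-tokenized doc against the query with BM25."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(d) for d in tokenized) / len(tokenized)
    n = len(tokenized)
    # document frequency of each term
    df = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

The highest-scoring API docs are then prepended to the prompt before querying the model.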
After the LLM responses are generated, we can evaluate them against our dataset:
```bash
cd eval-scripts
python ast_eval_th.py --api_dataset ../../data/api/torchhub_api.jsonl --apibench ../../data/apibench/torchhub_eval.json --llm_responses ../eval-data/responses/torchhub/response_torchhub_Gorilla_FT_0_shot.jsonl
```
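The AST evaluation compares generated TorchHub calls to the reference API dataset structurally rather than by string equality, so cosmetic differences (whitespace, argument spelling) don't cause false mismatches. A rough illustration of the idea using Python's `ast` module — hypothetical, not `ast_eval_th.py`'s actual logic, and it assumes the response has already been reduced to a bare call expression:

```python
import ast

def call_signature(code):
    """Extract (function name, positional args, keyword args) from one call."""
    try:
        tree = ast.parse(code.strip(), mode="eval")
    except SyntaxError:
        return None
    call = tree.body
    if not isinstance(call, ast.Call):
        return None
    name = ast.unparse(call.func)
    args = tuple(ast.unparse(a) for a in call.args)
    kwargs = {kw.arg: ast.unparse(kw.value) for kw in call.keywords}
    return name, args, kwargs

def calls_match(generated, reference):
    """Match if names and positional args agree and every reference kwarg appears."""
    gen, ref = call_signature(generated), call_signature(reference)
    if gen is None or ref is None:
        return False
    gname, gargs, gkwargs = gen
    rname, rargs, rkwargs = ref
    return (gname == rname and gargs == rargs
            and all(gkwargs.get(k) == v for k, v in rkwargs.items()))
```

Under this scheme, `torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)` matches the identical reference call but not one loading a different model.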