GitHub - cursor/eval
HumanEval for GPT-3.5/GPT-4
Results are here. Forked from OpenAI's repo.
To generate the completions (after pip installing the requirements), run:
mkdir results
python run.py
Then, to evaluate the completion results, run:
pip3 install -e .
evaluate_functional_correctness results/name_of_results_file
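Before running the evaluation, it can help to sanity-check the results file. The sketch below assumes this fork keeps the upstream HumanEval sample format: a JSON Lines file where each line is an object with a `task_id` and a `completion` field. That format is an assumption carried over from OpenAI's repo, not something confirmed for the output of run.py here, and `check_results_file` is a hypothetical helper, not part of this repo.

```python
import json

# Assumed from the upstream HumanEval sample format; not confirmed for this fork.
REQUIRED_KEYS = {"task_id", "completion"}


def check_results_file(path):
    """Count the records in a JSONL file, raising ValueError on a malformed one."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            missing = REQUIRED_KEYS - set(record)
            if missing:
                raise ValueError(f"line {line_no}: missing {sorted(missing)}")
            count += 1
    return count
```

Calling `check_results_file("results/name_of_results_file")` returns the number of records, or raises if a line lacks one of the assumed fields.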