Polymath is an agent that leverages auxiliary tools to improve performance in selected problem domains. This repository is the reproduction package for our research paper *Logic.py: Bridging the Gap between LLMs and Constraint Solvers*.
Currently, there is no default LLM inference provider available. We use an internal provider at Meta, which is not part of the open source release. To get started, create an implementation of `chat_completion.py` for your inference back end in `inference/your_inference_provider.py`. Then replace the `DummyChatCompletion` in `inference/chat_completion_factory.py` with your new provider. If your provider requires secrets, we suggest using the `dotenv` library and adding them to a `.env` file. You can use `.env-example` as a starting point.
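As a rough illustration of the steps above (the actual interface is defined in `inference/chat_completion.py`; the class and method names, and the environment variable, below are assumptions for the sketch, not the repository's real API), a custom provider might look like:

```python
# Hypothetical sketch of a custom provider. The real abstract interface in
# inference/chat_completion.py may differ; names here are illustrative.
import os
from abc import ABC, abstractmethod


class ChatCompletion(ABC):
    """Assumed abstract base class for inference providers."""

    @abstractmethod
    def complete(self, messages: list[dict[str, str]]) -> str: ...


class YourChatCompletion(ChatCompletion):
    """Example provider that reads its secret from the environment.

    With python-dotenv, calling load_dotenv() at startup would populate
    os.environ from a local .env file before this class is instantiated.
    """

    def __init__(self) -> None:
        # e.g. YOUR_PROVIDER_API_KEY=... in your .env file (hypothetical name)
        self.api_key = os.environ.get("YOUR_PROVIDER_API_KEY", "")

    def complete(self, messages: list[dict[str, str]]) -> str:
        # Call your inference back end here; this stub just echoes the
        # last user message for illustration.
        return f"stub reply to: {messages[-1]['content']}"
```

You would then construct `YourChatCompletion` wherever `DummyChatCompletion` is created in `inference/chat_completion_factory.py`.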
Set up the Conda environment:

```shell
conda env create --file environment.yml
conda activate polymath
```
Log into your Hugging Face account to download datasets:

```shell
huggingface-cli login
```
On Hugging Face, you need to request access to the following two datasets. Access is granted immediately upon filling out a form:
- https://huggingface.co/datasets/yale-nlp/FOLIO
- https://huggingface.co/datasets/allenai/ZebraLogicBench-private
Finally, install the datasets and remaining dependencies:

```shell
./scripts/setup.sh
conda env update --file environment.yml
```
Note: Some unit tests expect a working LLM inference setup.
To run all tests, use:

```shell
python -m unittest discover
```
To run only specific tests, you can run:

```shell
python -m unittest agent.symex.tests.test_module_with_type_info_factory -k test_single
```
To run the benchmark set using our logic agent, use:

```shell
python -m agent.logic.zebra_benchmark
```
This will produce an output JSON file that we evaluate using the original ZeroEval environment.
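Before handing the output file to ZeroEval, you can sanity-check it with a few lines of Python. The record schema of the output JSON is not documented here, so this sketch only inspects the file's top-level shape (the file name passed in is whatever the benchmark run produced):

```python
import json


def summarize_results(path: str) -> str:
    """Return a one-line summary of a benchmark output JSON file
    without assuming a specific record schema."""
    with open(path) as f:
        results = json.load(f)
    if isinstance(results, list):
        keys = sorted(results[0]) if results else []
        return f"{len(results)} records; first record keys: {keys}"
    return f"top-level keys: {sorted(results)}"
```

For example, `summarize_results("your_output.json")` prints how many records were produced and which fields the first record carries, which is a quick way to confirm the run completed before launching the ZeroEval evaluation.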
To set up a ZeroEval Conda environment, follow these instructions, adapted from their `README.md`:
```shell
cd lib/ZeroEval
conda create -n zeroeval python=3.10
conda activate zeroeval
pip install vllm -U
pip install -r requirements.txt
```
Afterwards, you can run their evaluation using:

```shell
python src/evaluation/zebra_grid_eval.py
```
This will update `result_dirs/zebra-grid.summary.md` to include the output JSON generated by our logic agent.
Support for FOLIO and τ-bench is a work in progress; this document will be updated once those integrations are complete.
This repository uses open source repositories, or public forks thereof, to make it easy for users to build the respective libraries and tools. This is purely for the purpose of convenience, and users are free to download these same tools and libraries from their original repositories.
Polymath is licensed under CC BY-NC 4.0, as found in the LICENSE file.