Hosting the Database locally

Hosting a local database has several advantages:

  • Latency is lower, so querying and constructing entry graphs will be faster.

  • If ChemRecon has write access to the database, procedural relations are cached in the database once they have been calculated, which speeds up subsequent queries.

Hosting the database requires:

  • An installation of, and working familiarity with, Docker.

  • The PostgreSQL client binaries, which include pg_restore (used in step 4).

  • Python 3.12.
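
The requirements above can be checked quickly before starting. The following is a minimal sketch, not part of ChemRecon: the function name check_prerequisites is hypothetical, it assumes "Python 3.12" means 3.12 or newer, and it only tests whether the docker and pg_restore executables are on the PATH (it does not verify that the Docker daemon is running).

```python
import shutil
import sys


def check_prerequisites() -> dict[str, bool]:
    """Report which of the listed requirements appear to be available."""
    return {
        # Docker CLI found on PATH (the daemon itself is not checked).
        "docker": shutil.which("docker") is not None,
        # pg_restore ships with the PostgreSQL client binaries.
        "pg_restore": shutil.which("pg_restore") is not None,
        # Assumption: any interpreter from 3.12 onwards is acceptable.
        "python>=3.12": sys.version_info >= (3, 12),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Any entry reported as MISSING should be installed before continuing with the steps below.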

  1. Clone the Git repository:

    git clone https://gitlab.com/casbjorn/chemrecon && cd chemrecon
    

    Use pip, uv, or a similar tool to set up a Python environment based on the provided pyproject.toml file.

  2. Use docker compose to deploy the database container:

    docker compose -f ./local_db/compose.yaml up
    

    The compose.yaml file contains parameters that can be customized, such as the username and password of the default Postgres user account. By default, the container's database port 5432 is forwarded to port 54320 on the host machine; this mapping can also be changed.

  3. Use the provided initialization script to set up database users and apply the schema. The script needs the access credentials defined when deploying the container, as well as parameter files for the dev (write access) user and the public user. By default, it uses the files located in ./src/chemrecon/database/connection_params, which can be customized if desired. If these files are changed, ChemRecon will use the new parameters when connecting via connect_local() and connect_local_dev().

    python ./src/chemrecon/scripts/initialize_database.py \
        --host=localhost \
        --port=54320 \
        --database=chemrecon_db \
        --username=postgres \
        --password=testpassword \
        --params=./src/chemrecon/database/connection_params/local_docker_pub.dbinfo \
        --devparams=./src/chemrecon/database/connection_params/local_docker_dev.dbinfo
    
  4. Finally, populate the database with data.

    Download the dump .sql file from TODO.

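    Use pg_restore to populate the database with this data. Note that pg_restore reads only archive-format dumps (custom, directory, or tar); if the downloaded file turns out to be a plain-text SQL script, load it with psql instead.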

    pg_restore \
        -h localhost \
        -p 54320 \
        -d chemrecon_db \
        -U postgres \
        -W \
        --data-only \
        "PATH_TO_DOWNLOADED_DUMP_FILE"
    

    This step may take some time to complete.
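
Steps 3 and 4 can also be driven from Python, which is convenient when the setup has to be repeated. The sketch below is illustrative only: the helper names (build_init_command, build_restore_command, initialize_and_populate) are hypothetical, and the host, port, database, user, and file paths are the defaults shown in the steps above. It passes the password to pg_restore through the standard PGPASSWORD environment variable and uses --no-password instead of the interactive -W prompt.

```python
import os
import subprocess
import sys

# Defaults taken from the steps above; adjust if compose.yaml was customized.
HOST = "localhost"
PORT = 54320
DATABASE = "chemrecon_db"
ADMIN_USER = "postgres"
PARAMS_DIR = "./src/chemrecon/database/connection_params"


def build_init_command(password: str) -> list[str]:
    """Argument list for the initialization script (step 3)."""
    return [
        sys.executable,
        "./src/chemrecon/scripts/initialize_database.py",
        f"--host={HOST}",
        f"--port={PORT}",
        f"--database={DATABASE}",
        f"--username={ADMIN_USER}",
        f"--password={password}",
        f"--params={PARAMS_DIR}/local_docker_pub.dbinfo",
        f"--devparams={PARAMS_DIR}/local_docker_dev.dbinfo",
    ]


def build_restore_command(dump_path: str) -> list[str]:
    """Argument list for a data-only pg_restore (step 4)."""
    return [
        "pg_restore",
        "--host", HOST,
        "--port", str(PORT),
        "--dbname", DATABASE,
        "--username", ADMIN_USER,
        "--no-password",  # never prompt; the password comes from PGPASSWORD
        "--data-only",
        dump_path,
    ]


def initialize_and_populate(password: str, dump_path: str) -> None:
    """Run both steps in order, stopping at the first failure."""
    subprocess.run(build_init_command(password), check=True)
    env = {**os.environ, "PGPASSWORD": password}
    subprocess.run(build_restore_command(dump_path), check=True, env=env)
```

Calling initialize_and_populate("testpassword", "PATH_TO_DOWNLOADED_DUMP_FILE") from the repository root would run both commands; because of check=True, a non-zero exit status from either one raises subprocess.CalledProcessError.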

Once the database is running, use connect_local() or connect_local_dev() instead of connect_public() to connect ChemRecon to it.
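
Before connecting from ChemRecon, it can help to confirm that the forwarded port is actually reachable. The following is a minimal sketch using only the Python standard library; port_is_open is a hypothetical helper, not a ChemRecon function, and it assumes the default host port 54320 described above. It only checks that something accepts TCP connections on that port, not that the database is initialized or populated.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 54320,
                 timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Port 54320 is reachable.")
    else:
        print("Nothing is listening on port 54320; is the container running?")
```

If the check fails, verify that the container is up and that the port mapping in compose.yaml matches the port being tested.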