The chembl-orm#
version: 34.0.0a0
The chembl-orm enables interaction with the ChEMBL database via SQLAlchemy and Python. It contains an object relational mapper (ORM) module and a separate query module with some preset queries against ChEMBL.
There is online documentation for chembl-orm.
This is a third-party package and I have no affiliation with the EBI or the ChEMBL team. Please note that I have only tested this against the ChEMBL SQLite version.
Installation instructions#
At present, chembl-orm is undergoing development and no packages exist yet on PyPI. Therefore, it is recommended that you install it in one of the two ways below.
Installation using conda#
I maintain a conda package in my personal conda channel. To install from it, please run:
conda install -c cfin -c bioconda -c conda-forge chembl-orm
There are currently builds for Python v3.8, v3.9 and v3.10 for Linux-64 and Mac OSX.
Please keep in mind that all development is carried out on Linux-64 and Python v3.8/v3.9. I do not own a Mac so can’t test on one; the conda build does run some import tests, but that is all.
Installation using pip#
You can install using pip from the root of the cloned repository. First clone and cd into the repository root:
git clone git@gitlab.com:cfinan/chembl-orm.git
cd chembl-orm
Install the dependencies:
python -m pip install --upgrade -r requirements.txt
Then install using pip:
python -m pip install .
Or, for an editable (developer) install, run the command below from the root of the repository. The difference is that you can just do a git pull to update, or switch branches, without re-installing:
python -m pip install -e .
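After either install, a quick way to confirm that the package is visible to the current interpreter is to look the module up without importing it. The sketch below assumes the importable module is named chembl_orm (the name used in the orm-test-connect example); is_installed is a hypothetical helper for illustration, not part of the package.

```python
import importlib.util


def is_installed(module_name: str = "chembl_orm") -> bool:
    """Return True if a top-level module can be found by the import system."""
    # find_spec returns None for a missing top-level module and does not
    # import the module itself.
    return importlib.util.find_spec(module_name) is not None


# Prints True only when chembl_orm is installed in the active environment.
print(is_installed())
```

If this prints False after an editable install, the most likely cause is that pip was run from a different environment than the one running the check.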
Conda dependencies#
There are also conda yaml environment files in ./resources/conda/envs that have the same contents as requirements.txt but for conda packages, so they cover all the prerequisites. I use these to install all the requirements via conda and then install the package as an editable pip install. If you find them useful then please use them. There are conda environments for Python v3.8, v3.9 and v3.10.
Run the tests#
After installation you may want to run the tests. If you have cloned the repository, you can run the command below from the root of the repository:
pytest ./tests
These are only import tests. If you want to test the actual ORM against a working copy of the database, then you can use orm-test-connect in the SQLAlchemy-config package. This should be installed when you install the ChEMBL ORM. It can be run as follows (using the connection parameters defined in a ~/.db.ini file, as described in the following section):
$ orm-test-connect -vv "chembl_33_mysql" "chembl_orm.orm"
=== orm-test-connect (sqlalchemy_config v0.2.0a3) ===
[info] 10:52:17 > config value: None
[info] 10:52:17 > db value: chembl_mysql
[info] 10:52:17 > module value: chembl_orm.orm
[info] 10:52:17 > verbose value: 2
[info] running queries: 100%|------------| 79/79 [00:00<00:00, 207.43 queries/s]
[info] 10:52:18 > *** END ***
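Before pointing orm-test-connect at a SQLite copy of ChEMBL, it can be worth checking that the file opens and actually contains tables. The sketch below uses only Python's standard sqlite3 module and opens the file read-only; list_tables is a hypothetical helper for illustration and is not part of chembl-orm or SQLAlchemy-config.

```python
import pathlib
import sqlite3
from contextlib import closing
from typing import List


def list_tables(db_path: str) -> List[str]:
    """Return the sorted table names found in a SQLite database file."""
    # Open through a read-only URI so the check can never modify the file
    # and fails if the file does not exist.
    uri = pathlib.Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Called on a ChEMBL SQLite file, this should return the schema's table names (drug_indications, for example); an empty list or an sqlite3.OperationalError suggests the path is wrong or the download is incomplete.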
Database configuration#
The package uses SQLAlchemy to handle database interaction. This means that in theory you are not restricted to a particular database backend. In practice most testing/development will be performed against SQLite and MySQL. So, if you use something else and run into issues, please submit an issue or get in contact.
Any database connection parameters can be supplied on the command line or via a configuration file. You can supply a full connection URL directly on the command line. However, this is not a good idea if your database is password protected. In this case you should supply the connection parameters in an .ini configuration file. They should be set out as below:
[chembl_33_sqlite]
# An SQLAlchemy connection URL, see:
# https://docs.sqlalchemy.org/en/13/core/engines.html#sqlalchemy.engine_from_config
# All connection options here should be prefixed with db.
# Make sure passwords are URL escaped:
# import urllib.parse
# PW = urllib.parse.quote_plus(PW)
# Also, don't forget to escape any % that are in the URL (with a second %)
db.url = sqlite:////data/chembl_33.db
[chembl_33_mysql]
# Connection to localhost
db.url = mysql+pymysql://user:password@127.0.0.1/chembl_33
[chembl_33_postgres]
# Connection to localhost
db.url = postgresql+psycopg2://user:password@127.0.0.1/chembl_33
Then, to use these from the command line, you can supply the section header from the config file to the script, so chembl_33_mysql for the connection to the MySQL database.
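The two escaping steps mentioned in the config comments (URL-escape the password, then double any %) can be sketched as below. This assumes the .ini file is read with Python's standard configparser, whose default interpolation turns %% back into %; the package's own config loader is not shown in this document, so treat that as an assumption. The credentials and the helper names escape_password and read_db_url are hypothetical.

```python
import configparser
import urllib.parse


def escape_password(password: str) -> str:
    """URL-escape a password, then double each % for use in an .ini file."""
    quoted = urllib.parse.quote_plus(password)
    return quoted.replace("%", "%%")


def read_db_url(ini_text: str, section: str) -> str:
    """Return the db.url value of a section, as configparser would see it."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Default interpolation collapses %% back to a single %.
    return parser[section]["db.url"]


# Hypothetical credentials, for illustration only.
password = "p@ss/w%rd"
ini_text = (
    "[chembl_33_mysql]\n"
    "db.url = mysql+pymysql://user:"
    + escape_password(password)
    + "@127.0.0.1/chembl_33\n"
)

url = read_db_url(ini_text, "chembl_33_mysql")
print(url)
```

The value read back is a normal URL-escaped connection string, which is the form SQLAlchemy expects; without the doubled %, configparser would reject the line with an interpolation error.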
Versioning#
The major version of the chembl-orm package matches the ChEMBL release it targets. I will endeavor to keep it current. If you are using an old version of ChEMBL then you should switch to the branch that matches the version of the database you have.
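As a small illustration of this scheme, the ChEMBL release can be read straight off the package version string. The helpers below are hypothetical and only parse strings of the form shown in this document (for example 34.0.0a0); they do not query the package or the database.

```python
import re


def chembl_release(package_version: str) -> int:
    """Extract the ChEMBL release (the major version) from a version string."""
    match = re.match(r"(\d+)\.", package_version)
    if match is None:
        raise ValueError(f"unrecognised version string: {package_version!r}")
    return int(match.group(1))


def matches_database(package_version: str, database_release: int) -> bool:
    """Return True if the package targets the given ChEMBL release."""
    return chembl_release(package_version) == database_release


print(chembl_release("34.0.0a0"))
print(matches_database("34.0.0a0", 33))
```

A False result here is the cue to switch to the branch for the release of the database you have.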
Change log#
version 33.0.0a0#
Initial build and push
version 33.1.0a0#
API - Added a dictionary of drug salts to the example data.
API - Added index tables to the ORM schema: chembl_orm.orm.TermIndexLookup and chembl_orm.orm.TermIndexMap.
API - Added function (chembl_orm.index.build_chembl_index) to index the drug_indications table into a separate index table.
API - Added instance method chembl_orm.queries.ChemblQuery.get_drugs_for_indication, which performs a search of the index for drugs matching a supplied indication string.
API - Added instance method chembl_orm.queries.ChemblQuery.map_drug_name, which attempts to map a free-text drug name to a ChEMBL ID.
SCRIPTS - Added a command line program chembl-index to build the ChEMBL index tables.
version 33.2.0a0#
BUILD - Updated to use SQLAlchemy>=2.
API - Fixed SQLAlchemy 2 incompatibilities in the ORM.
version 34.0.0a0#
API - Updated ORM for ChEMBL 34, two columns added.