Background



ANPDB

The African Natural Products Database (ANPDB) is an open-access, well-curated digital repository that compiles detailed information on natural compounds derived from African medicinal plants, fungi, and other biodiversity. Designed to support drug discovery and medicinal chemistry research, the ANPDB includes structural, physicochemical, and pharmacological data, making it a valuable resource for virtual screening, cheminformatics, and ethnopharmacological studies. By capturing the rich and largely untapped chemical diversity of the African continent, the ANPDB promotes the scientific exploration of Africa’s traditional medicinal knowledge and its pharmaceutical potential.

Building Process

The building process of the African Natural Products Database (ANPDB) involves several systematic steps to ensure the accurate and comprehensive capture of natural product data from African sources:

  1. Literature Mining and Data Collection: The process begins with extensive mining of scientific literature, ethnobotanical records, theses, and databases to extract information on natural products isolated from African flora, fauna, and microorganisms. Sources include journal articles, conference presentations, and MSc and PhD theses from university libraries across the continent.
  2. Compound Structure Elucidation: Retrieved compounds are curated with accurate chemical structures, which are either drawn manually using cheminformatics tools or obtained from trusted sources. Each structure is validated for correctness and consistency.
  3. Data Annotation and Standardization: Chemical compounds are annotated with standardized metadata, including IUPAC names, SMILES, InChI keys, molecular weight, and other physicochemical properties. Biological source information such as plant species, collection location, and traditional uses is also documented.
  4. Cheminformatics Analysis: Descriptors and drug-likeness parameters (e.g., Lipinski’s Rule of Five, ADMET properties) are calculated using cheminformatics software to make the data suitable for virtual screening and computational drug discovery.
  5. Database Design and Integration: The curated data is organized into a searchable relational database with a user-friendly interface. The system allows for compound browsing, structural searches, and data downloads.
  6. Validation and Quality Control: The database undergoes rigorous validation to ensure data accuracy, completeness, and consistency. Duplicate entries are removed, and each entry is cross-verified with primary sources.
  7. Public Deployment and Updates: Once complete, the ANPDB is deployed online for open access by researchers worldwide. It is periodically updated with new compounds and features to reflect ongoing research and discoveries.
This process integrates ethnopharmacology, analytical chemistry, and informatics to create a robust resource that supports drug discovery from Africa’s rich natural heritage.
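
Steps 4 and 6 above can be illustrated with a minimal sketch. It assumes each record already carries an InChIKey and precomputed descriptors; the `Compound` class, its field names, the placeholder keys, and the descriptor values are hypothetical and are not taken from the ANPDB schema. The thresholds are the standard Lipinski criteria (molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    # Hypothetical record layout; not the actual ANPDB schema.
    inchikey: str
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int


def deduplicate(compounds):
    """Keep the first record seen for each InChIKey, preserving input order."""
    seen = set()
    unique = []
    for compound in compounds:
        if compound.inchikey not in seen:
            seen.add(compound.inchikey)
            unique.append(compound)
    return unique


def ro5_violations(compound):
    """Count how many of the four Lipinski criteria a compound violates."""
    return sum([
        compound.mol_weight > 500,
        compound.logp > 5,
        compound.h_donors > 5,
        compound.h_acceptors > 10,
    ])


# Illustrative values only; the keys are placeholders, not real InChIKeys.
records = [
    Compound("KEY-A", 302.2, 1.5, 5, 7),
    Compound("KEY-A", 302.2, 1.5, 5, 7),   # duplicate entry
    Compound("KEY-B", 853.9, 3.0, 4, 14),  # exceeds weight and acceptor limits
]

unique = deduplicate(records)
violations = {c.inchikey: ro5_violations(c) for c in unique}
print(violations)
```

How many violations are tolerated before a compound is flagged as non-drug-like is a policy choice; a common convention allows at most one.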

Data Sources

The database includes natural products isolated from African flora, fauna, and microorganisms, with data drawn from journal articles, conference presentations, and MSc and PhD theses from university libraries across the continent (for details, see the Statistics Page).

Similarity and Substructure Searches

Similarity and substructure searches are powerful tools used in the African Natural Products Database (ANPDB) to explore chemical relationships between compounds. Similarity search identifies compounds with overall structural resemblance to a query molecule, typically using fingerprint-based algorithms and similarity metrics like the Tanimoto coefficient. In contrast, substructure search locates compounds that contain a specific structural motif or fragment within their molecular framework. These search capabilities enhance virtual screening, scaffold hopping, and lead optimization in natural product-based drug discovery.
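
As a rough illustration of the similarity metric named above, the Tanimoto coefficient between two fingerprints A and B is |A ∩ B| / |A ∪ B|. The sketch below represents each fingerprint as a set of "on" bit positions. The fingerprint contents and compound names are invented for the example, and the actual fingerprint type used by ANPDB is not specified here.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / len(fp_a | fp_b)


# Hypothetical query and library fingerprints (sets of bit positions).
query = {1, 4, 7, 9}
library = {
    "cpd_1": {1, 4, 7, 9},
    "cpd_2": {1, 4, 8},
    "cpd_3": {2, 3},
}

# Rank library compounds by decreasing similarity to the query.
ranked = sorted(
    ((name, tanimoto(query, fp)) for name, fp in library.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

A substructure search, by contrast, requires graph matching on the molecular structure and cannot be reduced to a set comparison like this one.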

Scaffold Browser

The scaffolds shown were extracted from the molecules in ANPDB with the program Canvas (Schrödinger LLC, NY). Level 0 scaffolds are defined here as root monocyclic or polycyclic independent structures. Two level 0 entities connected by an aliphatic or functionalized linker define a level 1 scaffold, and so on. The scaffolds are sorted by their frequency of occurrence across all available structures in ANPDB.
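
The frequency-based ordering can be sketched as follows, assuming scaffold assignment has already been done by an external tool. The compound and scaffold identifiers are hypothetical placeholders; the sketch does not reproduce the Canvas extraction itself.

```python
from collections import Counter

# Hypothetical mapping from compound ID to its level 0 scaffold ID.
compound_scaffolds = {
    "cpd_1": "scaffold_A",
    "cpd_2": "scaffold_B",
    "cpd_3": "scaffold_A",
    "cpd_4": "scaffold_C",
    "cpd_5": "scaffold_A",
    "cpd_6": "scaffold_B",
}

# Count how many compounds share each scaffold, most frequent first.
scaffold_counts = Counter(compound_scaffolds.values())
scaffold_ranking = scaffold_counts.most_common()

for scaffold, count in scaffold_ranking:
    print(scaffold, count)
```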

Phylogenetic Tree

The phylogenetic tree has been built from the hypothetical most common 16S rRNA sequence for each available strain. This consensus sequence considers all available sequences, including possibly slightly different sequencing attempts of the same high quality, as well as the usually six or more sequences that are encoded in a strain genome. The tree shows 340 strains, which are parent strains for nearly 1,300 substrains, and therefore covers nearly two thirds of the data in ANPDB. Strains that have a common ancestor node within 0.07 were clustered with DendroPy and marked by background colors. The alignment was performed with ClustalW, the phylogenetic analysis with the MEGA software package, and the final editing with the ETE 2 toolkit.

Compound-Protein Interactions

Literature-mined NP–protein relationships were introduced through hyperlinks to the Compound–Protein Relationships in Literature (CPRiL) web server. An NP–protein relationship refers to a functional association between an NP and a protein involving direct interaction, regulation, or being part of each other.

In addition, we used the in-house ePharmaLib dataset, which contains 15,148 therapeutically relevant e-pharmacophores, to predict potential target proteins for each NP in ANPDB. The predicted interactions are ranked by a Tversky score (0 ≤ Tversky score ≤ 1) that indicates their likelihood of occurrence. Only statistically significant interactions with a likelihood of at least 70% (i.e., Tversky score ≥ 0.7) were retained.
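
To make the cutoff concrete, the sketch below computes the generic set-based Tversky index, |A ∩ B| / (|A ∩ B| + α|A − B| + β|B − A|), and keeps only scores of at least 0.7. It is an illustration of the index and the filtering step, not of the e-pharmacophore screening itself: the feature sets, target names, and the weights α = β = 0.5 are assumptions made for the example (with α = β = 1 the index reduces to the Tanimoto coefficient, and with α = β = 0.5 to the Dice coefficient).

```python
def tversky(a, b, alpha, beta):
    """Set-based Tversky index; lies in [0, 1] for non-negative alpha, beta."""
    common = len(a & b)
    denominator = common + alpha * len(a - b) + beta * len(b - a)
    return common / denominator if denominator else 0.0


THRESHOLD = 0.7  # cutoff stated in the text

# Hypothetical feature sets for one NP and two candidate targets.
np_features = {"a", "b", "c", "d"}
targets = {
    "target_1": {"a", "b", "c", "d", "e"},
    "target_2": {"a", "b", "x", "y"},
}

# Score every target, then keep only those at or above the cutoff.
scores = {
    name: tversky(np_features, features, alpha=0.5, beta=0.5)
    for name, features in targets.items()
}
retained = {name: s for name, s in scores.items() if s >= THRESHOLD}
print(retained)
```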

NMR/MS Data

All 1H and 13C NMR spectra were generated with the command-line tool cxcalc (Marvin, 2023, ChemAxon). The fragmentation patterns in MS and the resulting chemical structures were predicted by the CFM-ID software in positive ESI-MS/MS mode at 10, 20, and 40 V. To facilitate NP structural dereplication, interactive visualization tools have now been implemented: namely, the JSpecView applet and the plotly.js charting library for predicted NMR JCAMP-DX files and MS TXT files, respectively.

Related Databases

NANPDB

NANPDB in Brief

NANPDB: Northern African Natural Products Database

NANPDB is an online library of natural products from Northern African sources (plants, animals, fungi and bacteria). It was built by collecting information on 4,928 NPs and 751 source species from 831 publications, covering data available in the literature for the period from 1962 to 2019. These were assembled from the main natural product journals and local African journals, as well as from PhD theses in university libraries. The data covers compounds isolated mainly from plants, with contributions from some endophyte, animal (e.g. coral), fungal and bacterial sources. The compounds were identified from source species belonging to families from 4 kingdoms. Computed physicochemical properties, often used to predict drug metabolism and pharmacokinetics (DMPK) properties, have been included for each compound in the dataset.

To cite NANPDB, please reference:

NANPDB: A Resource for Natural Products from Northern African Sources. Fidele Ntie-Kang, Kiran K. Telukunta, Kersten Döring, Conrad V. Simoben, Aurélien F. A. Moumbock, Yvette I. Malange, Leonel E. Njume, Joseph N. Yong, Wolfgang Sippl, and Stefan Günther. Journal of Natural Products. DOI: 10.1021/acs.jnatprod.7b00283

EANPDB

EANPDB in Brief

EANPDB: East African Natural Products Database

EANPDB is an online library of natural products from Eastern African sources (plants, animals, fungi and bacteria). It was built by collecting information on 1,871 NPs and 302 source species from 315 publications, covering data available in the literature for the period from 1962 to 2019. These were assembled from the main natural product journals and local African journals, as well as from PhD theses in university libraries. The data covers compounds isolated mainly from plants, with contributions from some endophyte, animal (e.g. coral), fungal and bacterial sources. The compounds were identified from source species belonging to families from 4 kingdoms. Computed physicochemical properties, often used to predict drug metabolism and pharmacokinetics (DMPK) properties, have been included for each compound in the dataset.

Pharmacoinformatic investigation of medicinal plants from East Africa (manuscript in preparation). Conrad V. Simoben, Ammar Qaseem, Aurélien F. A. Moumbock, Kiran K. Telukunta, Stefan Günther, Wolfgang Sippl, and Fidele Ntie-Kang

AfroDb

AfroDb in Brief

AfroDb: a collection of natural products from African medicinal plants with known bioactivities

AfroDb represents the largest “drug-like” and diverse collection of 3D structures of NPs covering the geographical region of the entire African continent, which is readily available for download and use in virtual screening campaigns.

Computer-aided drug design (CADD) often involves virtual screening (VS) of large compound datasets, and the availability of such datasets is vital for drug discovery protocols. We assess the bioactivity and “drug-likeness” of a relatively small but structurally diverse dataset (containing >1,000 compounds) from African medicinal plants, which have been tested and shown to exhibit a wide range of biological activities. The geographical regions of collection of the medicinal plants cover the entire continent of Africa, based on data from literature sources and information from traditional healers. For each isolated compound, the three-dimensional (3D) structure has been used to calculate physicochemical properties used in the prediction of oral bioavailability on the basis of Lipinski’s “Rule of Five”. A comparative analysis has been carried out with the “drug-like”, “lead-like”, and “fragment-like” subsets, as well as with the Dictionary of Natural Products. A diversity analysis has been carried out in comparison with the ChemBridge diverse database. Furthermore, descriptors related to absorption, distribution, metabolism, excretion and toxicity (ADMET) have been used to predict the pharmacokinetic profile of the compounds within the dataset. Our results suggest that drug discovery beginning with natural products from the African flora could be highly promising. The 3D structures are available and could be useful for virtual screening and natural product lead generation programs.

Literature information shows that the AfroDb compounds have exhibited a broad range of tested biological activities, and they have been included as a subset of the ZINC database. The AfroDb library has been published by Ntie-Kang et al., PLoS ONE, 2013, 8(10): e78085, doi:10.1371/journal.pone.0078085. It can be downloaded from this link and used for virtual screening purposes, and should be cited appropriately.

CamMedNP

CamMedNP in Brief

CamMedNP: Cameroonian Medicinal Plant and Natural Products Database

Background: Computer-aided drug design (CADD) often involves virtual screening (VS) of large compound datasets, and the availability of such datasets is vital for drug discovery protocols. We present CamMedNP, a new database beginning with more than 2,500 compounds of natural origin, along with some of their derivatives which were obtained through hemisynthesis. These are pure compounds which have been previously isolated and characterized using modern spectroscopic methods and published by several research teams spread across Cameroon.

Description: In the present study, 224 distinct medicinal plant species belonging to 55 plant families from the Cameroonian flora have been considered. About 80% of these have been previously published and/or referenced in internationally recognized journals. For each compound, the optimized 3D structure, drug-like properties, plant source, collection site and currently known biological activities are given, as well as literature references. We have evaluated the “drug-likeness” of this database using Lipinski’s “Rule of Five”. A diversity analysis has been carried out in comparison with the ChemBridge diverse database.

Conclusion: CamMedNP could be highly useful for database screening and natural product lead generation programs.

The CamMedNP library has been published by Ntie-Kang et al., BMC Complementary and Alternative Medicine, 2013, 13: 88, doi:10.1186/1472-6882-13-88. It can be downloaded from this link and used for virtual screening purposes, and should be cited appropriately.

ConMedNP

ConMedNP in Brief

ConMedNP: Congo Basin Medicinal Plant and Natural Products Database

We assessed the medicinal value and “drug-likeness” of ∼3,200 compounds of natural origin, along with some of their derivatives which were obtained through hemisynthesis. In the present study, 376 distinct medicinal plant species belonging to 79 plant families from the Central African flora have been considered, based on data retrieved from literature sources. For each compound, the optimised 3D structure has been used to calculate physicochemical properties which determine oral availability on the basis of Lipinski's “Rule of Five”. A comparative analysis has been carried out with the “drug-like”, “lead-like”, and “fragment-like” subsets, containing respectively 1,726, 738 and 155 compounds, as well as with our smaller previously published CamMedNP library and the Dictionary of Natural Products. A diversity analysis has been carried out in comparison with the DIVERSet™ Database (containing 48,651 compounds) from ChemBridge. Our results suggest that drug discovery beginning with natural products from the Central African flora could be promising. The 3D structures are available and could be useful for virtual screening and natural product lead generation programs.

The ConMedNP library has been published by Ntie-Kang et al., RSC Advances, 2014, 4: 409-419, doi:10.1039/C3RA43754J. It can be downloaded from this link and used for virtual screening purposes, and should be cited appropriately.

WANPDB

WANPDB in Brief

WANPDB: West African Natural Products Database

The WANPDB is currently under construction.

ANAPL

p-ANAPL in Brief

p-ANAPL: Pan-African Natural Products Library

Background: Natural products play a key role in drug discovery programs, serving both as drugs and as templates for the synthesis of drugs, even though the quantities and availability of samples for screening are often limited.

Experimental approach: A collection of physical samples of >500 compounds derived from African medicinal plants has been assembled through donations from several researchers across the continent, so that the samples are directly available for drug discovery programs. A virtual library of 3D structures of the compounds has been generated, and Lipinski’s “Rule of Five” has been used to evaluate the likely oral availability of the samples.

Results: A majority of the compound samples are flavonoids, and about two thirds (2/3) are compliant with the “Rule of Five”. The pharmacological profiles of thirty-six (36) selected compounds in the collection have been discussed.

Conclusions and implications: The p-ANAPL library is the largest physical collection of natural products derived from African medicinal plants directly available for screening purposes. The virtual library is also available and could be employed in virtual screening campaigns.

The p-ANAPL library has been published by Ntie-Kang et al., PLoS ONE, 2014, 9(3): e90655, doi:10.1371/journal.pone.0090655. It can be downloaded from this link and used for virtual screening purposes, and should be cited appropriately.

AfroCancer

AfroCancer in Brief

AfroCancer: African Anticancer Natural Products Library

Naturally occurring anticancer compounds represent about half of the chemotherapeutic drugs against cancer that have been brought to market to date. Computer-based or in silico virtual screening methods are often used in lead/hit discovery protocols. In this study, the “drug-likeness” of ∼400 compounds from African medicinal plants that have shown in vitro and/or in vivo anticancer, cytotoxic, and antiproliferative activities has been explored. To verify potential binding to anticancer drug targets, the interactions between the compounds and 14 selected targets have been analyzed by in silico modeling. Docking and binding affinity calculations were carried out, in comparison with known anticancer agents comprising ∼1,500 published naturally occurring plant-based compounds from around the world. The results reveal that African medicinal plants could represent a good starting point for the discovery of anticancer drugs. The small data set generated (named AfroCancer) has been made available for research groups working on virtual screening.

The AfroCancer library has been published by Ntie-Kang et al., J. Chem. Inf. Model., 2014, 54 (9): 2433–2450, doi:10.1021/ci5003697. It can be downloaded from this link and used for virtual screening purposes, and should be cited appropriately.

AfroMalariaDB

AfroMalariaDB in Brief

AfroMalariaDB: African Antimalarial Natural Products Library

Background: Malaria is an endemic disease affecting many countries in tropical regions. In the search for compound hits for the design and/or development of new drugs against the disease, many research teams have resorted to African medicinal plants in order to identify lead compounds. Three-dimensional molecular models were generated for anti-malarial compounds of African origin (from 'weakly' active to 'highly' active), which were identified from literature sources. Selected computed molecular descriptors related to absorption, distribution, metabolism, excretion and toxicity (ADMET) of the phytochemicals have been analysed and compared with those of known drugs in order to assess the 'drug-likeness' of these compounds.

Results: In the present study, more than 500 anti-malarial compounds identified from 131 distinct medicinal plant species belonging to 44 plant families from the African flora have been considered. On the basis of Lipinski's 'Rule of Five', about 70% of the compounds were predicted to be orally bioavailable, while on the basis of Jorgensen's 'Rule of Three', a corresponding >80% were compliant. An overall drug-likeness parameter indicated that approximately 55% of the compounds could be potential leads for the development of drugs.

Conclusions: From the above analyses, it could be estimated that >50% of the compounds exhibiting anti-plasmodial/anti-malarial activities, derived from the African flora, could be starting points for drug discovery against malaria. The 3D models of the compounds have been included as an accompanying file and could be employed in virtual screening.

The AfroMalariaDB library has been published by Onguéné et al., Org. Med. Chem. Lett., 2014, 4: 6, doi:10.1186/s13588-014-0006-x. It can be downloaded from this link and used for virtual screening purposes, and should be cited appropriately.

SANCDB

SANCDB in Brief

SANCDB: a South African natural compound database

Background: Natural products (NPs) are important to the drug discovery process. NP research efforts are expanding worldwide and South Africa is no exception to this. While freely accessible small molecule databases, containing compounds isolated from indigenous sources, have been established in a number of other countries, there is currently no such online database in South Africa.

Description: The current research presents a South African natural compound database, named SANCDB. This is a curated and fully referenced database containing compound information for 600 natural products extracted directly from journal articles, book chapters and theses. There is a web interface to the database, which is simple and easy to use, while allowing for compounds to be searched by a number of different criteria. Being fully referenced, each compound page contains links to the original referenced work from which the information was obtained. Further, the website provides a submission pipeline, allowing researchers to deposit compounds from their own research into the database.

Conclusions: SANCDB is currently the only web-based NP database in Africa. It aims to provide a useful resource for the in silico screening of South African NPs for drug discovery purposes. The database is supported by a submission pipeline to allow growth by entries from researchers. As such, we currently present SANCDB as the starting point of a platform for a community-driven, curated database to further natural products research in South Africa. SANCDB is freely available at https://sancdb.rubi.ru.ac.za/.

The SANCDB library has been published by Hatherley et al., J. Cheminform., 2015, 7: 29, doi:10.1186/s13321-015-0080-8. It can be downloaded from this link and used for virtual screening purposes, and should be cited appropriately.

NPAS

NPAS in Brief

NPAS: Natural Products with Available Samples

The NPAS is currently under construction.

Afrotryp

Afrotryp in Brief

Afrotryp: African Antitrypanosoma Natural Products Library

Recent publications have suggested that African medicinal plants and their derived products are a viable source of potential trypanocidal drugs. In silico methods are now used in drug discovery processes to identify new potential drug lead compounds. In this study, we have developed a small library named Afrotryp, comprising three-dimensional chemical structures of tested trypanocidal compounds derived from medicinal plants in Africa (a total of 321 unique chemical structures). We have predicted the pharmacokinetic properties of the library using the QikProp software and employed the three docking/scoring methods implemented in the Molecular Operating Environment Dock Tool to assess the affinity of the library dataset towards the binding site of six selected validated anti-Trypanosoma drug targets. About 42% of the compounds contained in the Afrotryp dataset were predicted to show a good overall performance in terms of predicted parameters for absorption, distribution, metabolism, elimination (ADME) and toxicity properties. Docking calculations identified 15 compounds with the lowest theoretical binding energies toward the studied proteins, nine of which could be suitable for the treatment of stage 2 human African trypanosomiasis (HAT) due to their low polar surface area. The analysis of their binding modes provided a basis for the unique molecular interactions observed between the Afrotryp dataset and the studied drug targets. The results lay the foundations for the rational development of novel trypanocidal drugs with improved potency.

The Afrotryp library is being considered for publication in the journal Medicinal Chemistry Research.

Mitishamba

Mitishamba in Brief

Mitishamba: A natural products library from medicinal plants in Kenya

The Mitishamba database is being developed by scientists from the Chemistry Department, University of Nairobi, Kenya. It currently includes natural products from Kenyan plant species that have important uses in traditional medicine, and can be accessed at http://mitishamba.uonbi.ac.ke.

ANPDB Team

Jude Y. Betow, Ammar Qaseem, Boris D. Bekono, Kiran K. Telukunta, Aurélien F. A. Moumbock, Conrad V. Simoben, Smith B. Babiaka, Solange A. Tanyi, Vanessa A. Suh, Arianne T. Ndi, Pascal Amoa Onguene, Clovis S. Metuge, Simeon Akame, Cyril T. Namba-Nzanguim, Akachukwu Ibezim, Idris F. Tabi, Yvette I. Malange, Bakoh Ndingkokhar, Leonel E. Njume, Yue Feng, Said Amrani, Oyere T. Ebob, Wolfgang Sippl, Stefan Günther, and Fidele Ntie-Kang.