Jan 17 2025
ICM User's Guide
18.5 Efficient Storage for Ultra-Large Virtual Chemical Databases

For virtual chemical databases of more than one billion (1B+) molecules, efficient storage is critical for fast processing and access. MolSoft has developed a highly compressed file format that uses frequency-based adaptive encoding in internal coordinates. The format averages approximately 400 bytes per conformation stack per molecule, which is roughly 15 times smaller than the same conformations stored as single-precision XYZ coordinates. A database of 1 billion molecules with conformations can be stored in about 800 GB using this format. The files use the .molt extension and are designed for GPU-based algorithms such as RIDE, RIDGE, and GigaScreen. To make these files, use:

  • Chemistry/Generate Conformers. If you have a GINGER GPU license, select the "Compressed Database using GINGER" tab. If you have a regular CPU license, select the "Compressed Database" tab instead.
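
The storage figures quoted in this section can be checked with a little arithmetic. The Python sketch below is not part of ICM and does not read or write .molt files; it only works through the numbers. The 25 atoms and 20 conformers are hypothetical values chosen to illustrate what a 15-fold ratio could correspond to, and the function names are made up for this example.

```python
BYTES_PER_FLOAT32 = 4  # single precision


def xyz_stack_bytes(n_atoms: int, n_conformers: int) -> int:
    """Raw size of a conformer stack stored as single-precision XYZ."""
    return n_atoms * 3 * BYTES_PER_FLOAT32 * n_conformers


def total_gb(bytes_per_molecule: int, n_molecules: int) -> float:
    """Total size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return bytes_per_molecule * n_molecules / 1e9


# Figures quoted in this section.
MOLT_BYTES_PER_STACK = 400          # average per conformation stack
XYZ_TO_MOLT_RATIO = 15              # "roughly 15 times"
N_MOLECULES = 1_000_000_000         # 1 billion molecules
QUOTED_TOTAL_BYTES = 800 * 10**9    # "about 800 GB"

# Raw XYZ size implied by the 15x ratio.
implied_xyz_bytes = MOLT_BYTES_PER_STACK * XYZ_TO_MOLT_RATIO

# Hypothetical molecule (assumption): 25 atoms, 20 conformers.
hypothetical_stack = xyz_stack_bytes(n_atoms=25, n_conformers=20)

# Size of the conformation stacks alone at the quoted average.
stacks_only_gb = total_gb(MOLT_BYTES_PER_STACK, N_MOLECULES)

# Bytes per molecule implied by the quoted database total.
implied_bytes_per_molecule = QUOTED_TOTAL_BYTES // N_MOLECULES

print(implied_xyz_bytes)           # 6000
print(hypothetical_stack)          # 6000
print(stacks_only_gb)              # 400.0
print(implied_bytes_per_molecule)  # 800
```

At the quoted average, the conformation stacks alone come to about 400 GB per billion molecules, while the quoted 800 GB total works out to about 800 bytes per molecule. The text does not say what accounts for the difference, so plan disk space against the larger figure.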



Copyright© 1989-2020, Molsoft,LLC - All Rights Reserved.
This document contains proprietary and confidential information of Molsoft, LLC.
The content of this document may not be disclosed to third parties, copied or duplicated in any form,
in whole or in part, without the prior written permission from Molsoft, LLC.