Jan 17 2025
For virtual chemical databases containing more than one billion (1B+) molecules, efficient storage is critical for fast processing and accessibility. MolSoft has developed a highly compressed file format that uses frequency-based adaptive encoding in internal coordinates to reduce storage requirements. The format averages approximately 400 bytes per conformation stack per molecule, which is roughly 15 times more compact than traditional XYZ storage in single precision. A database of 1 billion molecules with conformations can be stored in about 800 GB using this format. The files use the .molt extension and are designed for compatibility with GPU-based algorithms such as RIDE, RIDGE, and GigaScreen. To make these files you need to use:
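Separately from the tooling, the 15-fold comparison with single-precision XYZ storage can be sanity-checked with simple arithmetic. The sketch below is not part of any MolSoft tool and does not read or write .molt files. It only computes the size of raw float32 coordinates for one conformation stack and divides by the quoted 400-byte average. The function names and the 50-atom, 10-conformer molecule are illustrative assumptions, not figures from MolSoft.

```python
BYTES_PER_FLOAT32 = 4  # single precision


def raw_xyz_bytes(n_atoms: int, n_conformers: int) -> int:
    """Bytes needed for uncompressed float32 x, y, z coordinates of one conformation stack."""
    return n_atoms * n_conformers * 3 * BYTES_PER_FLOAT32


def compression_ratio(raw_bytes: int, compressed_bytes: int) -> float:
    """How many times smaller the compressed form is than the raw form."""
    return raw_bytes / compressed_bytes


# Hypothetical molecule: 50 atoms, 10 conformers (illustrative values only).
raw = raw_xyz_bytes(n_atoms=50, n_conformers=10)

# 400 bytes is the average per conformation stack quoted in the text.
ratio = compression_ratio(raw, compressed_bytes=400)

print(raw, ratio)  # 6000 15.0
```

Under these assumed sizes the raw coordinates take 6000 bytes, so a 400-byte stack corresponds to the quoted factor of about 15. Molecules with more atoms or more conformers would give a different ratio.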
Copyright © 1989-2020, Molsoft, LLC - All Rights Reserved.
This document contains proprietary and confidential information of Molsoft, LLC. The content of this document may not be disclosed to third parties, copied or duplicated in any form, in whole or in part, without the prior written permission from Molsoft, LLC.