MarkLogic Database Size per collection
I am currently researching MarkLogic database sizing. In the development database, we have indexed around 78,000 documents, and the current size of the database is 424 MB. In the future we will have at least 2 million documents, so I applied the formula below to estimate the database size for 2 million indexed documents:
Future storage: (424 MB / 78,000) * 2,000,000 = 11 GB (approx).
So, going by the above formula, a maximum of 25 GB of storage is needed.
I would like to know whether the above formula is the correct way to approximate the database size.
I would also like to know whether I have to take "collection size" into consideration. Are there any size constraints on collections?
It depends heavily on the documents. If the next 2 million or so documents are very similar to your current 78k documents, then your estimate is probably close. Keep in mind, however, that it is recommended to maintain on-disk free space of 1.5X your database size to account for merging overhead.
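As a rough illustration of the arithmetic, here is a minimal Python sketch of the linear estimate plus the merge headroom. The function names are made up for this example, and it assumes two things that may not hold for your data: that size scales linearly with document count, and that the 1.5X free space is needed in addition to the space the database itself occupies.

```python
def estimate_database_size_mb(current_size_mb, current_docs, future_docs):
    """Linear extrapolation: average size per document times future count.

    Assumes the future documents resemble the current ones in size
    and in how they are indexed.
    """
    size_per_doc_mb = current_size_mb / current_docs
    return size_per_doc_mb * future_docs


def estimate_total_disk_mb(database_size_mb, free_space_factor=1.5):
    """Database size plus free space reserved for merges.

    Assumption: the recommended free space (1.5X the database size)
    is counted on top of the database itself.
    """
    return database_size_mb * (1 + free_space_factor)


if __name__ == "__main__":
    # Figures from the question: 424 MB for 78,000 documents, 2 million planned.
    db_mb = estimate_database_size_mb(424, 78_000, 2_000_000)
    total_mb = estimate_total_disk_mb(db_mb)
    print(f"Estimated database size: {db_mb / 1000:.1f} GB")
    print(f"Disk to provision (with merge headroom): {total_mb / 1000:.1f} GB")
```

With the numbers from the question this gives roughly 10.9 GB for the database and about 27 GB of disk once the merge headroom is included. Treat both as ballpark figures and re-measure as the real documents are loaded.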
Collections are like metadata "tags". There is negligible storage overhead in applying collections to documents, and there are no size constraints specifically associated with collections.