Reclaim Disk Space
After data is deleted from a MongoDB instance, the storage space it occupied is marked as free. New data written to the same collection is usually stored directly in this free space, but other collections cannot reuse it. This unused free storage space is referred to as disk fragments. The more disk fragments there are, the lower the disk utilization.
There are two methods for reclaiming disk space: using the compact command and rebuilding data files.
- compact command: This is a collection-level operation, so collections must be compacted one at a time.
- Rebuild data files: This is an instance-level operation that is performed on the entire database and is generally more comprehensive.
compact
Precautions
- Make sure you have a complete backup of the database first.
- For versions prior to MongoDB 4.4, executing the compact command may lock the database that contains the collection, which blocks read and write operations on that database. It is recommended to perform this operation during off-peak business hours or after upgrading the version. For more details on the blocking issue, refer to the MongoDB official documentation.
- The time required to reclaim disk fragments using the compact command depends on the data volume of the collection, system load, disk performance, and so on. During execution, CPU and memory usage will also increase to some extent.
- For versions below MongoDB 4.4.9, nodes currently executing the compact command will be forced into RECOVERING state. If this state persists for an extended period, the node may no longer be able to synchronize with the PRIMARY node's data.
- For versions between MongoDB 4.4.9 and 4.4.17, nodes executing the compact command will remain in SECONDARY state but will still be unable to synchronize with the PRIMARY node's data.
- For versions above MongoDB 4.4.17, SECONDARY nodes continue to replicate data from the PRIMARY node while executing the compact command. (It is recommended to execute the compact command on versions above MongoDB 4.4.17.)
- The following conditions may cause the compact command to be ineffective. For more details, refer to the open-source code.
  - The size of the physical collection is less than 1 MB.
  - In the first 80% of the storage space in a file, the amount of free storage space is less than 20%; in the first 90% of the storage space in a file, the amount of free storage space is less than 10%.
- When executing the compact command, the released storage space may be less than the free storage space. If this occurs, you can run the compact command again to release more disk fragments, but running it frequently is not recommended.
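The version-dependent behaviour described in the precautions can be summarised in a small helper. This is only a sketch: the function names and the returned labels are hypothetical, it mirrors only the 4.4.x thresholds listed above, and treating 4.4.17 itself as part of the last group is an assumption. Other release series may behave differently, so check the official documentation for the exact version in use.

```javascript
// Parse a dotted version string such as "4.4.17" into numeric parts.
function parseVersion(version) {
  return version.split(".").slice(0, 3).map((part) => parseInt(part, 10) || 0);
}

// Negative when a < b, zero when equal, positive when a > b.
function compareVersions(a, b) {
  const left = parseVersion(a);
  const right = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    const diff = (left[i] || 0) - (right[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Before 4.4, compact may lock the whole database (see the precautions above).
function compactMayLockDatabase(version) {
  return compareVersions(version, "4.4") < 0;
}

// Hypothetical labels for how a node running compact behaves in a replica set.
function compactNodeBehavior(version) {
  if (compareVersions(version, "4.4.9") < 0) return "RECOVERING";
  // Assumption: 4.4.17 itself belongs to the "keeps replicating" group.
  if (compareVersions(version, "4.4.17") < 0) return "SECONDARY_NOT_SYNCING";
  return "SECONDARY_REPLICATING";
}
```

In the Mongo Shell, the server version string can be obtained with db.version() and passed to these helpers, for example compactNodeBehavior(db.version()).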
Estimated Reclaimed Disk Fragment Space
- Switch to the database where the collection is located.
    use database_name
  database_name is the name of the database where the collection is located.
- View the disk fragment space to be reclaimed for the collection.
    db.collection_name.stats().wiredTiger["block-manager"]["file bytes available for reuse"]
  collection_name is the name of the collection.
  The returned result is as follows:
    1485426688
  This result indicates that the estimated disk fragment space to be reclaimed is 1485426688 bytes.
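To compare several collections at once, the same statistic can be read from each collection's stats() output and sorted. The sketch below is an illustration, not part of the original procedure: the helper names are hypothetical, it assumes the statistics are plain numbers, and it uses the storageSize field of the stats() output as the denominator for a rough fragmentation ratio.

```javascript
// Read "file bytes available for reuse" from one collection's stats() document.
function reusableBytes(stats) {
  const blockManager = (stats.wiredTiger || {})["block-manager"] || {};
  return blockManager["file bytes available for reuse"] || 0;
}

// Build a report sorted by reclaimable bytes, largest first.
// statsByCollection maps a collection name to its stats() document.
function fragmentationReport(statsByCollection) {
  return Object.entries(statsByCollection)
    .map(([name, stats]) => {
      const reusable = reusableBytes(stats);
      const storageSize = stats.storageSize || 0;
      // Share of the allocated storage that is currently free for reuse.
      const ratio = storageSize > 0 ? reusable / storageSize : 0;
      return { name, reusable, ratio };
    })
    .sort((a, b) => b.reusable - a.reusable);
}

// Possible usage in the Mongo Shell (views would need to be skipped,
// because stats() does not apply to them):
//   const statsByCollection = {};
//   db.getCollectionNames().forEach((name) => {
//     statsByCollection[name] = db.getCollection(name).stats();
//   });
//   printjson(fragmentationReport(statsByCollection));
```

Collections at the top of the report are the ones where running compact is most likely to be worthwhile.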
Reclaim Disk Fragments for Single Node or Replica Set Instances
Single Node
A single node instance has only one node, so you only need to execute the compact command for this instance.
Replica Set
Replica set instances have multiple nodes. Follow these steps:
- Execute the compact command on one of the SECONDARY nodes. After the command completes, repeat this operation on each remaining SECONDARY node in sequence.
- Reassign the primary node. Use the rs.stepDown() method on the PRIMARY node to trigger the election of a new PRIMARY node. Once the original PRIMARY node has changed to SECONDARY status and a new PRIMARY node has been elected, execute the compact command on the original PRIMARY node.
  - If you need to force the execution of the compact command on the PRIMARY node, add the force parameter, for example:
      db.runCommand({compact:"collection_name",force:true})
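The order described in these steps (all SECONDARY nodes first, the PRIMARY node last) can be derived from the output of rs.status(). The following is a minimal sketch; the function name compactionOrder is hypothetical, and it relies only on the name and stateStr fields of each entry in the members array.

```javascript
// Return member host names in the order they should be compacted:
// every SECONDARY first, then the current PRIMARY (after it steps down).
function compactionOrder(replSetStatus) {
  const members = replSetStatus.members || [];
  const secondaries = members
    .filter((member) => member.stateStr === "SECONDARY")
    .map((member) => member.name);
  const primaries = members
    .filter((member) => member.stateStr === "PRIMARY")
    .map((member) => member.name);
  // Members in other states (for example ARBITER) are left out.
  return secondaries.concat(primaries);
}

// Possible usage in the Mongo Shell:
//   compactionOrder(rs.status()).forEach((host) => print(host));
```

The list is only a plan. Each node still has to be connected to and compacted individually, and the PRIMARY node should step down before its turn.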
compact Operation
- Connect to the database node using the Mongo Shell.
- Switch to the database where the collection is located.
    use database_name
  database_name is the name of the database where the collection is located.
- Specify the collection and execute the compact command to reclaim disk fragments.
    db.runCommand({compact:"collection_name"})
  collection_name is the name of the collection.
  If successful, the return result is as follows:
    { "ok" : 1 }
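Because compact works on one collection at a time, compacting a whole database means repeating the command for each collection. The sketch below shows one way this could be scripted. The function name compactAllCollections and its options argument are hypothetical, skipping collections whose names start with "system." is an assumption, and the database argument is expected to behave like the shell's db object (getCollectionNames() and runCommand()).

```javascript
// Run compact on every collection of a database and record the outcome.
// Returns an object mapping collection name to true (ok) or an error message.
function compactAllCollections(database, options = {}) {
  const results = {};
  for (const name of database.getCollectionNames()) {
    // Assumption: leave system collections untouched.
    if (name.startsWith("system.")) continue;
    const command = { compact: name };
    // Only needed when compact must run on the PRIMARY node.
    if (options.force) command.force = true;
    try {
      const reply = database.runCommand(command);
      results[name] = reply.ok === 1 ? true : "unexpected reply";
    } catch (error) {
      // A failure on one collection should not stop the others.
      results[name] = String(error.message || error);
    }
  }
  return results;
}

// Possible usage in the Mongo Shell, after "use database_name":
//   printjson(compactAllCollections(db));
```

Collections are processed sequentially, which keeps the extra CPU and memory load mentioned in the precautions limited to one compact operation at a time.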
Rebuild Data Files
Precautions
- Before proceeding, make sure you have a complete backup of the database.
- The time required to rebuild data files depends on the data volume of the collection, system load, disk performance, etc.
Single Node
- Stop the application service.
- Stop the MongoDB database.
- Use the --repair parameter of mongod to rebuild data files and reclaim disk space. Example:
    mongod --repair --dbpath /data/mongodb/
  - /data/mongodb/ is the MongoDB data storage directory.
  - Do not interrupt the operation during execution, as it may affect data integrity and prevent the database from starting.
- Start the MongoDB database.
Replica Set
Reclaim disk space by deleting data on SECONDARY nodes and leveraging MongoDB replica set's internal resynchronization mechanism to rebuild data files.
- Execute the following command on any SECONDARY node to delete the data on the current node (excluding the keyfile):
    find /data/mongodb/ -mindepth 1 ! -name 'keyfile' -exec rm -rf {} +
  - This command excludes the keyfile file in the /data/mongodb/ directory and deletes all other files and subdirectories.
- Restart the current MongoDB node.
- Use the rs.status() command to check the node status. During the synchronization process, the node status will display as STARTUP2, and once synchronization is complete, it will change to SECONDARY.
- After the previous node has completed synchronization and its status has changed to SECONDARY, repeat the same operation on the remaining SECONDARY nodes in sequence.
- Finally, on the PRIMARY node, use the rs.stepDown() method to trigger a re-election. Once the PRIMARY node has changed to SECONDARY status and a new PRIMARY node has been elected, perform the same operation on that node.
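The status check between nodes can be expressed as a small helper that reads the rs.status() output. This is a sketch only: the function names and the returned labels ("done", "syncing", "check") are hypothetical, and the host name in the usage comment is a placeholder.

```javascript
// Look up the state string of one member in an rs.status() document.
function memberState(replSetStatus, memberName) {
  const member = (replSetStatus.members || []).find(
    (candidate) => candidate.name === memberName
  );
  return member ? member.stateStr : "UNKNOWN";
}

// Classify the resynchronization progress of a wiped node.
function resyncPhase(replSetStatus, memberName) {
  const state = memberState(replSetStatus, memberName);
  if (state === "SECONDARY") return "done"; // safe to move on to the next node
  if (state === "STARTUP2") return "syncing"; // initial sync still in progress
  return "check"; // any other state needs a manual look
}

// Possible usage in the Mongo Shell (the host name is a placeholder):
//   resyncPhase(rs.status(), "mongodb-node2:27017");
```

Moving on to the next node only when the result is "done" keeps at most one member out of service at a time.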