Log Data Cleanup
Overview
In the HAP private deployment system, some log-type data is retained in MongoDB for a long time. In certain usage scenarios this produces large volumes of data that occupy significant database storage space.
You can run the show dbs command in MongoDB to check the size of each database, and then check the size of the individual tables in a database to find the ones that occupy the most storage space.
We provide a log data cleanup solution that allows for physical deletion of data from relevant tables according to specified rules.
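As a rough illustration of the size check described above, the following sketch ranks the tables of one database by storage size. The function name largestCollections and its topN parameter are hypothetical and not part of HAP. The sketch assumes the object passed in behaves like the mongosh db object (getCollectionNames() and getCollection(name).stats()) and that the database contains no views, since stats() fails on a view.

```javascript
// Sketch: rank the collections of one database by on-disk size.
// `database` is assumed to behave like the mongosh `db` object.
function largestCollections(database, topN = 10) {
  const toMB = (bytes) =>
    // Number() also covers values returned as 64-bit Long objects.
    Number((Number(bytes) / 1024 / 1024).toFixed(2));

  const rows = database.getCollectionNames().map((name) => {
    const stats = database.getCollection(name).stats();
    return {
      name,
      documents: Number(stats.count),
      // storageSize and totalIndexSize are reported in bytes by default.
      storageMB: toMB(stats.storageSize),
      indexMB: toMB(stats.totalIndexSize),
    };
  });

  // Largest first, then keep only the top N entries.
  return rows.sort((a, b) => b.storageMB - a.storageMB).slice(0, topN);
}

// Hypothetical mongosh session:
//   use mdworkflow
//   largestCollections(db, 5)
```

The function only reads statistics and does not modify any data, so it can be run before deciding which tables to clean.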
Important Notice Before Operation
The cleanup operation is a physical deletion, meaning the corresponding data is permanently lost. Once completed, the log data for the corresponding time period cannot be recovered or viewed on the system interface.
Impact Scope:
- Tables in the mdworkflow database that can be cleaned mainly affect:
  - Workflow execution history
  - Approval flow history
  - Interruptions in running processes: If the data being cleaned includes approval flows or workflows that are not yet completed, these processes will be interrupted due to data loss and cannot continue execution.
- Tables in the mdworksheetlog database mainly affect:
  - Worksheet row record logs
- Tables in the mdintegration database mainly affect:
  - Integration center history request logs
- Tables in the mdservicedata database mainly affect:
  - Application behavior logs
  - Usage analysis logs
Data Cleanup Whitelist
| Database | Table Name | Table Usage Description |
|---|---|---|
| mdworkflow | wf_instance | Main workflow execution history associated data |
| mdworkflow | wf_subInstanceActivity | Subprocess execution history associated data |
| mdworkflow | wf_subInstanceCallback | Subprocess execution history associated data |
| mdworkflow | wf_instanceExtends | Workflow execution history associated data |
| mdworkflow | wf_instanceHistory | Workflow node logs; by default, only data from the last 90 days is retained |
| mdworkflow | code_catch | Stores temporary data generated during runtime by code block nodes |
| mdworkflow | hooks_catch | Stores received Webhook data |
| mdworkflow | webhooks_catch | Stores data obtained by "send API request" workflow nodes |
| mdworkflow | app_multiple_catch | Stores data obtained by selecting "direct access" in multi-data nodes |
| mdworkflow | custom_apipackageapi_catch | Stores response data returned by API integration calls |
| mdworksheetlog | wslog* | Stores worksheet row record logs for the corresponding month. Table naming format is wslog + date (e.g., wslog202409) |
| mdintegration | wf_instance | Integration center - request logs |
| mdintegration | wf_instance_relation | Integration center - request log associated data |
| mdintegration | webhooks_catch | Integration center - log data corresponding to "view details" in request logs |
| mdintegration | code_catch | Integration center - log data corresponding to "view details" in request logs |
| mdintegration | json_catch | Integration center - log data corresponding to "view details" in request logs |
| mdintegration | custom_parameter_catch | Integration center - log data corresponding to "view details" in request logs |
| mdservicedata | al_actionlog* | Stores application behavior logs for the corresponding month. Table naming format is al_actionlog + date (e.g., al_actionlog202409) |
| mdservicedata | al_uselog | Stores log data for the "usage analysis" feature |
Data Cleanup Suggestions
Below are specific cleanup methods and considerations for different types of data.
Log Tables Archived Monthly
This method is suitable for log tables that are automatically created monthly.
- Applicable Tables:
  - All tables in the mdworksheetlog database starting with wslog
  - All tables in the mdservicedata database starting with al_actionlog
- Operation Method: The most direct and efficient way to clean these monthly archived tables is to use the drop command to delete the entire table. This operation is extremely fast and immediately frees up all disk space occupied by the table.
- Operation Example: The following command will delete the worksheet log for January 2024 in the mdworksheetlog database.
use mdworksheetlog;
db.wslog202401.drop();
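When several months need to be cleaned at once, the same drop operation can be repeated per table. The sketch below is one possible way to select the monthly tables older than a chosen cutoff. The helper name monthlyTablesBefore and the cutoff value are hypothetical, and the sketch assumes table names follow the prefix + YYYYMM format listed in the whitelist and that the prefix contains no regular expression special characters.

```javascript
// Sketch: pick monthly log tables that are older than a cutoff month.
// prefix is "wslog" or "al_actionlog"; cutoff is "YYYYMM" and is exclusive.
function monthlyTablesBefore(collectionNames, prefix, cutoff) {
  const pattern = new RegExp("^" + prefix + "(\\d{6})$");
  return collectionNames
    .filter((name) => {
      const match = pattern.exec(name);
      // Fixed-width YYYYMM strings compare correctly as plain strings.
      return match !== null && match[1] < cutoff;
    })
    .sort();
}

// Hypothetical mongosh session: review the list first, then drop.
//   use mdworksheetlog
//   const targets = monthlyTablesBefore(db.getCollectionNames(), "wslog", "202401");
//   printjson(targets);
//   targets.forEach((name) => db.getCollection(name).drop());
```

Printing the list before dropping is deliberate: the deletion is permanent, so the selected table names should be checked by hand first.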