59141: Acronis Disaster Recovery Service: why it is important to monitor free space in your vaults

When you protect a server, you usually set a retention policy for the archive. If the policy is set, the recovery console sends cleanup requests to the vaults according to the selected cleanup schedule, separately for each protected server.

Slices due for deletion are not deleted immediately but are marked as "to be deleted". A backup slice marked for deletion is shown with a cross mark when you open the archive in the Acronis management console. The actual deletion happens during the consolidation process, which starts once the number of slices marked for deletion reaches 24.
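The marking-then-consolidation behavior described above can be illustrated with a toy model. This is only a sketch: the `Archive` class, its fields, and the assumption that each retention run marks exactly one slice are hypothetical and do not reflect the product's internals; only the threshold of 24 comes from this article.

```python
from dataclasses import dataclass

CONSOLIDATION_TRIGGER = 24  # default threshold stated in this article


@dataclass
class Archive:
    """Toy archive that only tracks slice counts (hypothetical model)."""
    total_slices: int
    marked: int = 0  # slices marked "to be deleted" but still on disk

    def mark_for_deletion(self, count: int = 1) -> bool:
        """Mark `count` more slices; return True if consolidation should start."""
        self.marked = min(self.total_slices, self.marked + count)
        return self.marked >= CONSOLIDATION_TRIGGER


# Assumption for illustration: each retention run marks one slice.
archive = Archive(total_slices=30, marked=20)
for run in range(1, 6):
    if archive.mark_for_deletion():
        print(f"retention run {run}: {archive.marked} marked, consolidation starts")
        break
    print(f"retention run {run}: {archive.marked} marked, nothing deleted yet")
```

In this model, no disk space is released during the first three runs; space is reclaimed only after the consolidation triggered by the fourth run completes.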

Consolidation is a resource-consuming operation similar to rebackup. Acronis software creates a new consolidated backup from the selected backups by calculating changes through the whole chain, and then deletes the selected backups. The consolidation process is also space-consuming, since it removes the initial backups only after the process completes.

This means that the vault must have at least as much free space as its biggest archive stream (the full backup plus all dependent incrementals), plus some extra. The extra is needed because backups to other archives in the same vault may be added before the number of slices marked for deletion reaches 24 and while the consolidation process starts, runs, and finishes.

For example, Vault01L has 1 TB of free space; the biggest archive is 1.3 TB and contains 30 slices, 20 of which are marked for deletion. Once the number of slices marked for deletion reaches 24, the next retention operation initiates the consolidation process, which will most probably consume all space on the storage node. In that case, all further backups to this vault fail due to insufficient free space. You will have to restart services, but the root cause of the issue remains unresolved.
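The free-space rule above can be written as a simple check. This is a minimal sketch: the function name, the decimal definition of a terabyte, and the 10% safety margin standing in for "some extra" are assumptions made for illustration, not values documented for the product.

```python
TB = 10**12  # decimal terabyte (assumption; the article does not define the unit)


def has_room_for_consolidation(free_bytes: int,
                               largest_archive_bytes: int,
                               margin_ratio: float = 0.10) -> bool:
    """Return True if free space covers the largest archive plus a margin.

    margin_ratio is a hypothetical allowance for other backups written to the
    same vault while consolidation is pending or running.
    """
    required = largest_archive_bytes + int(largest_archive_bytes * margin_ratio)
    return free_bytes >= required


# Numbers from the Vault01L example: 1 TB free, biggest archive 1.3 TB.
print(has_room_for_consolidation(1 * TB, 13 * TB // 10))  # False
```

With the Vault01L numbers the check fails, which matches the outcome described above: consolidation would run out of space before it could remove the old slices.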

The resulting archive size cannot be calculated by summing the sizes of the full backup and all incremental backups that are not marked for deletion. The explanation is below.

We use the “always incremental” scheme, meaning that each incremental backup depends on the entire stream, including all previous incrementals and the base full backup. The product works on the block level. Let’s simplify the backup workflow by assuming that there is no compression or any other optimization.

1. We back up a drive which has 1 file, 10 bits in size. The base full backup size is 10 bits. The entire archive is 10 bits:

1 1 1 1 1 1 1 1 1 1

2. We add one 10-bit file. The next incremental backup size is 10 bits. The slice contains:

0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1

3. We add another 10-bit file. The next incremental backup size is 10 bits. The slice contains:

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1

4. We add 1 more 10-bit file. The next incremental backup size is 10 bits. The entire archive contains 4 slices: one full plus 3 incrementals. The 4th slice contains:

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1

5. We mark the 2nd slice (the first incremental) for deletion and perform consolidation (in production, consolidation starts automatically once the number of backups marked for deletion reaches 24). In our example, the 3rd slice depends on the 2nd. When the 2nd slice is removed, the 3rd slice becomes dependent on the base full backup, so it has to take over the data of the removed slice. The resulting archive contains 3 slices. The new 2nd slice (incremental) is 20 bits and contains:

0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

6. The entire archive’s size is still 40 bits: 10 bits for the 1st slice (full) + 20 bits for the 2nd slice (incremental) + 10 bits for the 3rd slice (incremental).
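The six steps above can be reproduced with a small simulation. This is a sketch under the same simplifications as the example (no compression, one unit of data per block); modeling a slice as a dictionary of changed blocks and the `consolidate` function itself are illustrative assumptions, not the product's actual algorithm.

```python
def consolidate(slices, index):
    """Remove slices[index] by folding its blocks into the next slice.

    Each slice is a dict {block_number: data} holding only changed blocks.
    Blocks present in the newer slice win, since they are later changes.
    Only middle slices are supported in this simplified model.
    """
    if not 0 < index < len(slices) - 1:
        raise ValueError("only a middle incremental slice can be consolidated here")
    merged = {**slices[index], **slices[index + 1]}
    return slices[:index] + [merged] + slices[index + 2:]


# Steps 1-4: one full backup plus three incrementals, 10 new blocks each.
slices = [{b: 1 for b in range(start, start + 10)} for start in (0, 10, 20, 30)]

# Step 5: remove the 2nd slice (index 1); the next slice absorbs its data.
result = consolidate(slices, 1)
print([len(s) for s in result])     # [10, 20, 10]
print(sum(len(s) for s in result))  # 40

# If the newer slice rewrote some of the same blocks, the merge shrinks.
overlap = [
    {b: 1 for b in range(0, 10)},   # full
    {b: 1 for b in range(10, 20)},  # incremental, blocks 10-19
    {b: 2 for b in range(15, 25)},  # incremental, rewrites blocks 15-19
    {b: 1 for b in range(25, 35)},  # incremental
]
print([len(s) for s in consolidate(overlap, 1)])  # [10, 15, 10]
```

The first case matches the example: the archive stays at 40 units even though a slice was removed. The second case shows why a real archive usually shrinks somewhat, but not down to the sum of the slices that were kept.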

In a production environment, data changes are much more significant, and the product works on the block level with compression and other optimization mechanisms, so the resulting archive shrinks in size. The exact value depends on the block-level changes, file types, file compression, and the nature of the changes (adding and removing similar or dissimilar files). We cannot calculate the exact size of the resulting archive in advance. Generally, it should be smaller than before the consolidation, but larger than a simple sum of the full backup and all remaining slices.

We use Microsoft Storage Spaces technology and cannot make a vault larger than 15 TB. In some cases, there is also no free space on your LCA for vault expansion. The archive size should not exceed the remaining free space in the local vault.

Possible solutions
If you need assistance, create a support ticket.

  1. We can remove the latest slices of the archive without running the consolidation process. Consider this option if it frees up enough space for the consolidation.
  2. We can move (export) some archives from the vault to another vault if there is one with enough free space, then run the consolidation, and then move those archives back, if needed.
  3. We can remove the whole archive and start a new backup stream from scratch.
  4. If you need to keep your data for some time, you need to allocate additional storage (NAS, NetApp, etc.). We can create a new managed or unmanaged vault, export the archive from the vault to the new storage, validate it to ensure that the resulting archive is not corrupted, then clean up your vault and start a new backup stream from scratch with suitable retention rules. The consolidation runs during the export process, so the exported archive will not contain slices marked for deletion and will be smaller in size.

What to do next

  1. We can change the consolidation trigger from 24 to any other number, even to 1, so that consolidation runs after every "delete backup" command issued by the recovery console according to the retention policy. A suitable value can be estimated by checking the underlying storage performance and other activities taking place on the storage. If you have a very big file server, we recommend setting up a separate vault for it.
  2. We recommend monitoring free space and archive sizes in your vaults. This can be done with scripts that run the command line utility.
  3. Consider purchasing an additional LCA to protect more servers, reducing the retention for the troublesome server, or removing other servers from your backup plan to leave enough space for the rest.
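The free-space monitoring recommended in item 2 could look roughly like the sketch below. It does not use the Acronis command line utility; it relies only on the Python standard library and assumes, hypothetically, that the vault is reachable as a file system path and that each archive is stored in its own subfolder. The function names and the 10% margin are likewise illustrative.

```python
import os
import shutil


def directory_size(path):
    """Sum the sizes of all regular files under `path`, in bytes."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.isfile(file_path):
                total += os.path.getsize(file_path)
    return total


def vault_report(vault_path, margin_ratio=0.10):
    """Compare free space in a vault with the size of its largest archive.

    Assumes one subfolder per archive (hypothetical layout). margin_ratio is
    an illustrative allowance for backups written during consolidation.
    """
    free = shutil.disk_usage(vault_path).free
    sizes = {
        entry.name: directory_size(entry.path)
        for entry in os.scandir(vault_path)
        if entry.is_dir()
    }
    largest = max(sizes.values(), default=0)
    required = largest + int(largest * margin_ratio)
    return {
        "free_bytes": free,
        "archive_sizes": sizes,
        "largest_archive_bytes": largest,
        "enough_space": free >= required,
    }
```

A scheduled job could call `vault_report` with the path of each vault and raise an alert whenever `enough_space` is false, well before the number of slices marked for deletion reaches the consolidation trigger.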