67951: Effective validation and repair of TIBX/Archive3/Format12 backup archives using CLI tool archive_ctl

    Last update: 09-02-2023

    Scenario

    You have a backup archive (TIBX), and operations on it (either continuing to back up to it, or restoring/extracting data from it) fail with errors stating that the archive is corrupted.

    Goal: Diagnose (validate) the archive to see what is damaged and to what extent and, if possible, repair it (at least partially).

    Note: The archive_ctl CLI tool, together with several other CLI tools that are useful in many situations, is available for download as astor-1.14.zip.

    When it comes to validation, exploration, and repair attempts of TIBX archives, archive_ctl is often superior to other approaches such as:

    - acrocmd

    - validating and auto-repairing the archive on Windows by setting an environment variable:

    1. Press Win+R, type sysdm.cpl into the input field, and click OK. The System Properties window opens. Go to the Advanced tab and click Environment Variables.
    2. In the System variables section, click New.
    3. Create a new variable with the following parameters:
      Variable name: ARCHIVE3_AUTO_REPAIR
      Variable value: 1
    4. Save the changes and close the System Properties window.
    5. In the Protection Console, go to the Backup Storage tab, select the archive that causes the error, and click Show backups. While accessing the backup, the Agent attempts to find the last good state and fix the archive.

    archive_ctl is preferable to these approaches because its output shows much more directly what is wrong with the archive and how severe the damage is.

    Dealing with corrupted archives using archive_ctl can be done in several stages.

    Please note that validating, inspecting, and repairing an archive requires reading the archive end-to-end several times. Therefore, if the archive is stored on a network share (e.g. an SMB share), run the following operations from a machine on the LAN that has a high-speed connection to the share. If the share is hosted on a Windows machine rather than on a NAS appliance, run the validation locally on that machine to avoid going over the SMB protocol/network altogether.

    If running from the machine where the agent is, or from any other machine on the LAN (i.e. NOT locally on the machine where the SMB share is), you'll need to specify the values for these parameters:

    Network login options:
      --network-login                        network login name
      --network-password                     network password
      
    Alternatively, map the network share as a network drive (e.g. Z:) and use that path.

    If the archive is in cloud storage (whether hosted by a partner or in an Acronis data center), you'll need to specify the values for these parameters:

      --astor <host:port>                    connect to an astorage gw at host:port
      --cert <fname>                         use a client certificate in file fname

    The certificate is the cloud access certificate described in https://kb.acronis.com/content/60082
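
    The three storage cases above (local path or mapped drive, SMB share with credentials, cloud storage) differ only in the extra switches passed to archive_ctl. The following is a minimal sketch of a helper that prints those switches. The function name location_args is hypothetical, and the sketch assumes that switch values are passed as separate, space-delimited words, which the help text above suggests but does not confirm for the network login options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the location-specific archive_ctl arguments,
# one per line.
# Usage: location_args local
#        location_args smb   <login> <password>
#        location_args cloud <host:port> <cert_file>
location_args() {
    case "$1" in
        local)
            # Local path or mapped network drive: no extra arguments.
            ;;
        smb)
            # Assumption: values follow the switch as separate words.
            printf '%s\n' --network-login "$2" --network-password "$3"
            ;;
        cloud)
            printf '%s\n' --astor "$2" --cert "$3"
            ;;
        *)
            echo "unknown location type: $1" >&2
            return 1
            ;;
    esac
}
```

    In bash, the printed lines can be collected into an array (for example with mapfile -t) and appended to any of the archive_ctl commands in the steps below.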

    Step 1A: Get basic details about the archive:

    archive_ctl --info --password <your_pass_for_the_archive> -f <full_absolute_path_to_archive>/<archive_name>.tibx > archive_info_log.txt

    Step 1B: List the slices in the archive:

    Relevant options/flags:

     -s, --slice-list                       list slices in archive
         --show-deleted                         show deleted slices
         --show-complete-only                   show only complete slices
         --show-hidden                          show hidden differential slices
         --reversed-order                       list slices from latest to oldest
         
    Command:
    archive_ctl --slice-list --show-deleted --show-hidden --reversed-order --password <your_pass_for_the_archive> -f <full_absolute_path_to_archive>/<archive_name>.tibx > slice_list_all_log.txt
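
    Steps 1A and 1B are usually run together, so they can be wrapped in one small function. This is only a sketch: it assumes archive_ctl is on the PATH, and the function name collect_archive_details and the ARCHIVE_PASSWORD environment variable are illustrative names, not something the tool itself defines or reads.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper for Steps 1A and 1B.
# Usage: collect_archive_details <path_to_archive.tibx> [output_dir]
# The archive password is taken from the ARCHIVE_PASSWORD variable
# (an illustrative name chosen for this sketch).
collect_archive_details() {
    local archive="$1" outdir="${2:-.}"

    # Step 1A: basic archive details.
    archive_ctl --info --password "$ARCHIVE_PASSWORD" -f "$archive" \
        > "$outdir/archive_info_log.txt"

    # Step 1B: all slices, including deleted and hidden ones, newest first.
    # This runs even if Step 1A reported a problem, since either log
    # may be informative on its own.
    archive_ctl --slice-list --show-deleted --show-hidden --reversed-order \
        --password "$ARCHIVE_PASSWORD" -f "$archive" \
        > "$outdir/slice_list_all_log.txt"
}
```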

    Step 2: Run FULL validation (metadata + user data + sizes/refs/linkage of segments/objects and pointers):

    The --validate-all operation downloads (streams) all of the archive data while validating; take this into account when validating a cloud archive.

    Local/network storage: 

    archive_ctl --validate-all --password <your_pass_for_the_archive> -f <full_absolute_path_to_archive>/<archive_name>.tibx > validate_all_log.txt

    Cloud storage:

    archive_ctl --validate-all --astor <host:port> --cert <certificate.crt> --password <pass_for_the_archive> -f /1/<archive_name>.tibx > validate_all_log.txt
     

    Alternatively, you can write all archive_ctl output to a log file by combining the -d and --log switches (-d sets the log level; 7 is the maximum available level). A log file is useful for troubleshooting because it captures all of the output.

    archive_ctl --validate-all --astor <host:port> --cert <certificate.crt> --password <pass_for_the_archive> -f /1/<archive_name>.tibx -d 7 --log validate.log
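
    Because validation is often repeated (before and after a repair attempt), it can help to keep one log file per run. The sketch below wraps the -d 7 --log variant with a timestamped log name. The function name validate_archive and the ARCHIVE_PASSWORD variable are hypothetical, and the meaning of archive_ctl's exit status is not documented in this article, so the sketch only reports it rather than interpreting it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run full validation with one log file per run.
# Usage: validate_archive <path_to_archive.tibx> [extra archive_ctl switches]
# Extra switches (e.g. --astor/--cert for cloud archives) are passed through.
validate_archive() {
    local archive="$1"; shift
    local log
    log="validate_$(date +%Y%m%d_%H%M%S).log"

    # -d 7 selects the maximum log level; --log names the log file.
    archive_ctl --validate-all --password "$ARCHIVE_PASSWORD" \
        -f "$archive" -d 7 --log "$log" "$@"
    local status=$?

    # Report only; what a non-zero status means is not covered here.
    echo "log file: $log (archive_ctl exit status: $status)"
    return "$status"
}
```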

    Step 3: Attempt to repair the archive (as far as it can be fixed):

    Options for the repair attempt:

      --last-good                            find last good state in corrupted archive
         --last-good=fix                        if last good state was found perform metadata validation and fix archive
         --validate-all                         in addition to meta validate all data at last good archive state
         
    Recommended switches for the repair attempt:
    --last-good=fix --validate-all

    Command:
    archive_ctl --last-good --last-good=fix --validate-all --password <your_pass_for_the_archive> -f <full_absolute_path_to_archive>/<archive_name>.tibx > try_fix_all_log.txt

     

    NOTE: The syntax for the "last-good" family of switches is a bit special. You MUST start with --last-good as the first-level (leftmost) switch, which puts the tool into search-for-consistent-points mode. The second-level --last-good=fix switch then tells the tool to go into rewrite mode. Running "archive_ctl --last-good ..." or "archive_ctl --last-good --validate-all ..." without --last-good=fix performs a dry-run (read-only) search of the archive for valid points (valid slices).
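
    Since the dry-run and rewrite modes differ only by the --last-good=fix switch, one cautious workflow is to run the read-only search first and start the rewrite only after reviewing its output. The following sketch illustrates that ordering. The function names, the ARCHIVE_PASSWORD variable, and the dry-run log file name are hypothetical; only the switches themselves come from the note above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the "last-good" switches for a given mode.
#   dry-run : read-only search for valid slices
#   fix     : search, then rewrite the archive to the last good state
last_good_switches() {
    case "$1" in
        dry-run) echo "--last-good --validate-all" ;;
        fix)     echo "--last-good --last-good=fix --validate-all" ;;
        *)       echo "usage: last_good_switches dry-run|fix" >&2; return 1 ;;
    esac
}

# Run the read-only search first; rewrite only after explicit confirmation.
# Usage: repair_archive <path_to_archive.tibx>
repair_archive() {
    local archive="$1" answer

    # Word splitting of the switch list is intentional here.
    # shellcheck disable=SC2046
    archive_ctl $(last_good_switches dry-run) --password "$ARCHIVE_PASSWORD" \
        -f "$archive" > last_good_dry_run_log.txt

    printf 'Review last_good_dry_run_log.txt. Proceed with fix? [y/N] '
    read -r answer
    [ "$answer" = "y" ] || { echo "fix skipped"; return 0; }

    # shellcheck disable=SC2046
    archive_ctl $(last_good_switches fix) --password "$ARCHIVE_PASSWORD" \
        -f "$archive" > try_fix_all_log.txt
}
```

    Keeping the switch lists in one function makes it harder to start a rewrite by accident, because the fix variant is only ever requested after the confirmation prompt.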

    Step 4: Run fsck / chkdsk to try to repair the underlying filesystem on which the network share resides

    Preferably run it with full-check/surface-check and free-space checks (as far as the tool allows).

    NOTE: This is not recommended as the first step, because it may further damage or wipe the archive file, depending on many factors (the underlying OS and filesystem, the version and behavior of the fsck/chkdsk tool, the options used for it...).

    Step 5: REFRESH

    After all repairs, REFRESH the storage location and REFRESH the list of slices (recovery points) in the archive that are shown in the Backup Console > Backup Storage web GUI.

    Then try using the archive/restore points.

    Further notes:

    The archive_ctl CLI tool, as well as some other CLI tools which are very useful in many situations can be downloaded from: https://dl.acronis.com/u/kb/622.zip

    NOTE: You may need to install the Microsoft Visual C++ Runtime Library, or copy its MSVCR120.dll file into the directory where the tool was unzipped.

    MSVCR120.dll Missing - Microsoft Community
    https://answers.microsoft.com/en-us/windows/forum/windows_10-windows_ins...

    You can get the list of all options and flags that archive_ctl supports by running: archive_ctl --help

     

    The options / arguments / flags, as of 2021-01-25, are:

     

    ==============

    Example read-only/dry-run:
