While backing up my ESXi 4.1 cluster, the BE media server will occasionally freeze up completely: the console doesn't respond past pressing Ctrl+Alt+Del, I can't RDP in or access services/file shares, and there's nothing useful in the Windows event logs. The freeze occurs approximately once every two to three weeks, usually while the full backups are running (i.e. over the weekend).
I am backing up our production VMs using NBD with GRT to local deduplication storage, on a rotation of weekday incrementals and a weekly full with a 52-week retention period; the full is then duplicated to tape.
I've narrowed down the probable cause to a particular VM (the file server that hosts our roaming profiles): I can see this VM's snapshot created in VMware, but it is never removed because BE has stalled. The media server is running Server 2008 R2 with BE 2012 SP3 plus any updates published to LU. I've tried repairing the media server's installation, which seems to be necessary after installing every other Symantec update. I've also run chkdsk within the VM and deleted the snapshot left behind by the failed job.
I could try backing it up using the agent within the VM, but I'd rather have it included in the job with the rest of the cluster. Any suggestions on how I can troubleshoot further? Thanks.