Hello community, I have been troubleshooting a client issue for about 4 days and wanted to see if anyone has additional ideas for resolving it. I'll do my best to explain and give some background information:
Client: Windows 2003 x64 Enterprise with NBU v7.1.0.4 client installed (physical server). Policy is a Windows filesystem policy with ALL_LOCAL_DRIVES (no multistreaming enabled)
Servers: 3 media servers and 1 master server, Windows 2008 Standard x64 SP2 with NBU client v7.1.0.4 (media and master servers are all separate physical servers, 4 in total)
The error occurs when writing to either tape or a Data Domain device
Brief timeline of events
- Friday full and Monday differential ran successfully
- Tuesday differential failed (server owner confirmed no changes were made). Since then, both the differential and the full have consistently failed with Status 24: Socket Write Failed.
- This appears in the bpbkar log:
1:25:59.101 PM: [19572.32028] <2> TransporterRemote::write[2](): DBG - | An Exception of type [SocketWriteException] has occured at: | Module: @(#) $Source: src/ncf/tfi/lib/TransporterRemote.cpp,v $ $Revision: 1.54 $ , Function: TransporterRemote::write[2](), Line: 321 | Local Address: [0.0.0.0]:0 | Remote Address: [0.0.0.0]:0 | OS Error: 10053 (An established connection was aborted by the software in your host machine.
) | Expected bytes: 16384 | (../TransporterRemote.cpp:321)
1:25:59.101 PM: [19572.32028] <16> tar_tfi::processException:
An Exception of type [SocketWriteException] has occured at:
Module: @(#) $Source: src/ncf/tfi/lib/TransporterRemote.cpp,v $ $Revision: 1.54 $ , Function: TransporterRemote::write[2](), Line: 321
Module: @(#) $Source: src/ncf/tfi/lib/Packer.cpp,v $ $Revision: 1.89 $ , Function: Packer::getBuffer(), Line: 656
Module: tar_tfi::getBuffer, Function: H:\7104\src\cl\clientpc\util\tar_tfi.cpp, Line: 312
Local Address: [0.0.0.0]:0
Remote Address: [0.0.0.0]:0
OS Error: 10053 (An established connection was aborted by the software in your host machine.
)
Expected bytes: 16384
and I also see this in bpbkar:
:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - FS_DleBEAO::DeInit - exiting.
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedssql2.dll
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsshadow.dll
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsss.dll
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsadgran.dll
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsnt5.dll
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsev.dll
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsxese.dll
1:26:00.194 PM: [19572.32028] <16> dtcp_read: TCP - failure: recv socket (592) (TCP 10053: Software caused connection abort)
1:26:01.194 PM: [19572.32028] <16> dtcp_read: TCP - failure: recv socket (592) (TCP 10053: Software caused connection abort)
1:26:02.194 PM: [19572.32028] <16> dtcp_read: TCP - failure: recv socket (592) (TCP 10053: Software caused connection abort)
1:26:03.194 PM: [19572.32028] <16> dtcp_read: TCP - failure: recv socket (592) (TCP 10053: Software caused connection abort)
1:26:03.194 PM: [19572.32028] <4> OVShutdown: INF - Shutdown wait finished
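For anyone who wants to pull just the error entries out of a long bpbkar log, here is a minimal Python sketch. It assumes each entry follows the layout in the excerpts above (timestamp, `[pid.tid]`, then a `<N>` severity tag) and that `<16>` and higher mark errors. The function name `error_lines` is hypothetical and not part of NetBackup.

```python
import re

# Assumed layout, taken from the excerpts above:
#   1:26:00.194 PM: [19572.32028] <16> dtcp_read: TCP - failure: ...
LINE_RE = re.compile(
    r"^(?P<time>\S+ [AP]M): \[(?P<pid>\d+)\.(?P<tid>\d+)\] <(?P<sev>\d+)> (?P<msg>.*)$"
)


def error_lines(log_text, min_severity=16):
    """Return (time, message) pairs for entries at or above min_severity.

    Continuation lines that do not start with a timestamp are skipped.
    """
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and int(match.group("sev")) >= min_severity:
            hits.append((match.group("time"), match.group("msg")))
    return hits
```

Reading the log file into a string and passing it to `error_lines` should list only entries such as the `dtcp_read` failures, which makes it easier to line up their timestamps with the media server logs.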
- The following troubleshooting has been done so far:
- NBU client restarted
- Server was rebooted
- Added TcpTimedWaitDelay set to 30 seconds, then rebooted: http://www.symantec.com/business/support/index?page=content&id=TECH150369
- Opened a case with Symantec support; tried increasing the client connect timeout and client read timeout from 600 sec to 1800 sec.
- Ran an AppsCritical report and found ~21% packet reordering; following up with the network team to determine whether anything changed on the network
- TCP Offload and Chimney on client were already disabled
- Confirmed NIC offload settings were disabled as well
- No NIC Teaming enabled
- Confirmed netstat -a does not show a large number of connections in TIME_WAIT
- Forward/Reverse DNS resolution fine
- Ping fine
- bpclntcmd checks all working fine:
bpclntcmd -ip --> from both client and server
bpclntcmd -hn, bpclntcmd -pn, and bpcoverage -c clientname
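The TIME_WAIT check above can be made repeatable with a small script. This is only a rough sketch: it assumes the usual `netstat -an` column layout (protocol, local address, remote address, state), and the function name `tcp_state_counts` is made up for illustration.

```python
from collections import Counter


def tcp_state_counts(netstat_output):
    """Count TCP connections per state from `netstat -an` style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows look like: TCP  local:port  remote:port  STATE
        # UDP rows have no state column and are ignored.
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            counts[fields[3]] += 1
    return counts


# Example usage on the client (assumes netstat is on the PATH):
#   import subprocess
#   out = subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout
#   print(tcp_state_counts(out).most_common())
```

Running it a few times during a backup window would show whether TIME_WAIT or any other state grows before the job fails.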
Does anyone have any other suggestions to help troubleshoot? Or is there anything I am missing?
Thank you for any help.