BizTalk (2020 CU3) SFTP GET (with WinSCP) fails

Advertisement

jonasgr
Joined:
Posts:
4
Location:
Sweden

BizTalk (2020 CU3) SFTP GET (with WinSCP) fails

This error is intermittent but given time it happens frequently. The cause could be BizTalk or Windows Server 2019, but I guess the cause is WinSCP. We use version 5.19.2.0 of WinSCP.

When the problem occurs no error/warning/information is written in Windows Event log or anywhere else. The only way I see the error is that the WinSCP processing stops. Here's an extract from the log file. The problem is identified by rows missing at the end:
. 2022-02-02 06:56:05.793 Listing directory "/outgoing/DIRECTORY".
> 2022-02-02 06:56:05.793 Type: SSH_FXP_OPENDIR, Size: 26, Number: 48651
< 2022-02-02 06:56:05.840 Type: SSH_FXP_HANDLE, Size: 10, Number: 48651
> 2022-02-02 06:56:05.840 Type: SSH_FXP_READDIR, Size: 10, Number: 48908
< 2022-02-02 06:56:05.871 Type: SSH_FXP_NAME, Size: 158, Number: 48908
> 2022-02-02 06:56:05.871 Type: SSH_FXP_READDIR, Size: 10, Number: 49164
< 2022-02-02 06:56:05.918 Type: SSH_FXP_STATUS, Size: 22, Number: 49164
< 2022-02-02 06:56:05.918 Status code: 1
> 2022-02-02 06:56:05.918 Type: SSH_FXP_CLOSE, Size: 10, Number: 49412
. 2022-02-02 06:56:05.918 Example_file_220202001000+0100.csv;-;317155;2022-02-01T23:10:09.000Z;3;"user" [0]
   ;"group" [0];rw-rw-rw-;1
. 2022-02-02 06:56:05.918 ..;D;0;1899-12-30T01:00:00.000Z;0;"" [0];"" [0];---------;0
< 2022-02-02 06:56:05.918 Script: -rw-rw-rw-   1 user     group       317155 Feb  2  0:10:09 2022 
   Example_file_220202001000+0100.csv
< 2022-02-02 06:56:05.918 Script: D---------   0                           0              ..
This is written when it fails and the found file is not fetched.
This output is identical to when WinSCP does not find any file, but as you can see above the file Example_file_220202001000+0100.csv was found.

During a successful session (when the file is found and fetched), 63 additional rows are written to the log file. I'll submit these additional log rows from a successful GET-statement, this is how the log file ought to continue:
> 2022-01-31 15:15:30.641 Script: get  -nopermissions -preservetime -transfer="binary"  -- "/outgoing/DIRECTORY/Example_file_220131143718+0100.csv" "C:\Users\BizTalkUser\AppData\Local\Temp\acfc2aaf-967c-4bae-a6b0-e45eddc85049\b152d774-2987-4c17-bd98-981fde9cc9e3"
. 2022-01-31 15:15:30.641 Listing file "/outgoing/DIRECTORY/Example_file_220131143718+0100.csv".
> 2022-01-31 15:15:30.641 Type: SSH_FXP_LSTAT, Size: 58, Number: 1287
< 2022-01-31 15:15:30.703 Type: SSH_FXP_STATUS, Size: 21, Number: 1028
. 2022-01-31 15:15:30.703 Discarding reserved response
< 2022-01-31 15:15:30.750 Type: SSH_FXP_ATTRS, Size: 29, Number: 1287
. 2022-01-31 15:15:30.750 Example_file_220131143718+0100.csv;-;316085;2022-01-31T13:37:24.000Z;3;"" [0];"" [0];rw-rw-rw-;1
. 2022-01-31 15:15:30.750 Copying 1 files/directories to local directory "C:\Users\BizTalkUser\AppData\Local\Temp\acfc2aaf-967c-4bae-a6b0-e45eddc85049\" - total size: 316 085
. 2022-01-31 15:15:30.750   PrTime: Yes; PrRO: No; Rght: rw-r--r--; PrR: No (No); FnCs: N; RIC: 0100; Resume: S (102400); CalcS: No; Mask: b152d774-2987-4c17-bd98-981fde9cc9e3
. 2022-01-31 15:15:30.750   TM: B; ClAr: No; RemEOF: No; RemBOM: No; CPS: 0; NewerOnly: No; EncryptNewFiles: Yes; ExcludeHiddenFiles: No; ExcludeEmptyDirectories: No; InclM: ; ResumeL: 0
. 2022-01-31 15:15:30.750   AscM: *.*html; *.htm; *.txt; *.php; *.php3; *.cgi; *.c; *.cpp; *.h; *.pas; *.bas; *.tex; *.pl; *.js; .htaccess; *.xtml; *.css; *.cfg; *.ini; *.sh; *.xml
. 2022-01-31 15:15:30.750 File: '/outgoing/DIRECTORY/Example_file_220131143718+0100.csv' [2022-01-31T13:37:24.000Z] [316085]
. 2022-01-31 15:15:30.750 Copying "/outgoing/DIRECTORY/Example_file_220131143718+0100.csv" to local directory started.
. 2022-01-31 15:15:30.750 Binary transfer mode selected.
. 2022-01-31 15:15:30.766 Checking existence of partially transferred file.
. 2022-01-31 15:15:30.766 Opening remote file.
> 2022-01-31 15:15:30.766 Type: SSH_FXP_OPEN, Size: 66, Number: 1539
< 2022-01-31 15:15:30.953 Type: SSH_FXP_HANDLE, Size: 10, Number: 1539
> 2022-01-31 15:15:30.953 Type: SSH_FXP_FSTAT, Size: 10, Number: 1800
< 2022-01-31 15:15:31.047 Type: SSH_FXP_ATTRS, Size: 29, Number: 1800
> 2022-01-31 15:15:31.047 Type: SSH_FXP_READ, Size: 22, Number: 2053
< 2022-01-31 15:15:31.266 Status code: 1
. 2022-01-31 15:15:31.266 30 skipped SSH_FXP_WRITE, SSH_FXP_READ, SSH_FXP_DATA and SSH_FXP_STATUS packets.
> 2022-01-31 15:15:31.266 Type: SSH_FXP_CLOSE, Size: 10, Number: 7172
< 2022-01-31 15:15:31.266 Type: SSH_FXP_STATUS, Size: 22, Number: 4869
< 2022-01-31 15:15:31.266 Type: SSH_FXP_STATUS, Size: 22, Number: 5125
< 2022-01-31 15:15:31.266 Type: SSH_FXP_STATUS, Size: 22, Number: 5381
< 2022-01-31 15:15:31.266 Type: SSH_FXP_STATUS, Size: 22, Number: 5637
< 2022-01-31 15:15:31.266 Type: SSH_FXP_STATUS, Size: 22, Number: 5893
< 2022-01-31 15:15:31.266 Type: SSH_FXP_STATUS, Size: 22, Number: 6149
< 2022-01-31 15:15:31.281 Type: SSH_FXP_STATUS, Size: 22, Number: 6405
< 2022-01-31 15:15:31.297 Type: SSH_FXP_STATUS, Size: 22, Number: 6661
< 2022-01-31 15:15:31.422 Type: SSH_FXP_STATUS, Size: 22, Number: 6917
. 2022-01-31 15:15:31.422 Preserving timestamp [2022-01-31T13:37:24.000Z]
. 2022-01-31 15:15:31.422 Transfer done: '/outgoing/DIRECTORY/Example_file_220131143718+0100.csv' => 'C:\Users\BizTalkUser\AppData\Local\Temp\acfc2aaf-967c-4bae-a6b0-e45eddc85049\b152d774-2987-4c17-bd98-981fde9cc9e3' [316085]
. 2022-01-31 15:15:31.422 Copying finished: Transferred: 316 085, Elapsed: 0:00:00, CPS: 905 166/s
> 2022-01-31 15:15:33.609 Script: stat -- "/outgoing/DIRECTORY/Example_file_220131143718+0100.csv"
. 2022-01-31 15:15:33.609 Listing file "/outgoing/DIRECTORY/Example_file_220131143718+0100.csv".
> 2022-01-31 15:15:33.609 Type: SSH_FXP_LSTAT, Size: 58, Number: 7431
< 2022-01-31 15:15:33.609 Type: SSH_FXP_STATUS, Size: 21, Number: 7172
. 2022-01-31 15:15:33.609 Discarding reserved response
< 2022-01-31 15:15:33.703 Type: SSH_FXP_ATTRS, Size: 29, Number: 7431
. 2022-01-31 15:15:33.703 Example_file_220131143718+0100.csv;-;316085;2022-01-31T13:37:24.000Z;3;"" [0];"" [0];rw-rw-rw-;1
< 2022-01-31 15:15:33.703 Script: -rw-rw-rw-   0                      316085 Jan 31 14:37:24 2022 Example_file_220131143718+0100.csv
> 2022-01-31 15:15:33.734 Script: rm  -- "/outgoing/DIRECTORY/Example_file_220131143718+0100.csv"
. 2022-01-31 15:15:33.734 Listing file "/outgoing/DIRECTORY/Example_file_220131143718+0100.csv".
> 2022-01-31 15:15:33.734 Type: SSH_FXP_LSTAT, Size: 58, Number: 7687
< 2022-01-31 15:15:33.813 Type: SSH_FXP_ATTRS, Size: 29, Number: 7687
. 2022-01-31 15:15:33.813 Example_file_220131143718+0100.csv;-;316085;2022-01-31T13:37:24.000Z;3;"" [0];"" [0];rw-rw-rw-;1
. 2022-01-31 15:15:33.813 Deleting file "/outgoing/DIRECTORY/Example_file_220131143718+0100.csv".
> 2022-01-31 15:15:33.813 Type: SSH_FXP_REMOVE, Size: 58, Number: 7949
< 2022-01-31 15:15:33.969 Type: SSH_FXP_STATUS, Size: 21, Number: 7949
< 2022-01-31 15:15:33.969 Status code: 0
< 2022-01-31 15:15:33.969 Script: /outgoing/DIRECTORY/Example_file_220131143718+0100.csv
All configuration is identical between the failed and the successful fetch.
When I restart the process that runs WinSCP (the BizTalk host instance) the file is fetched.

Could the cause be a full pool? Problems with multiple WinSCP sessions being executed in parallel (we have >10)
A maximum number of sessions? The host server should not pose a problem since the SFTP session logs on and can list the content, and SFTP transfers occur on the same channel; but does something local limits WinSCP execution?

What in WinSCP happens between the log statements:
< 2022-02-02 06:56:05.918 Script: D---------   0                           0              .. 
and
> 2022-01-31 15:15:30.641 Script: get  -nopermissions -preservetime -transfer="binary"  -- "/outgoing/DIRECTORY/Example_file_220131143718+0100.csv" "C:\Users\...\Temp\acfc2aaf-967c-4bae-a6b0-e45eddc85049\b152d774-2987-4c17-bd98-981fde9cc9e3" 
Whatever happens in the code here can hopefully pinpoint the problem.

Reply with quote

Advertisement

martin
Site Admin
martin avatar
Joined:
Posts:
41,518
Location:
Prague, Czechia

Re: BizTalk (2020 CU3) SFTP GET (with WinSCP) fails

Please attach complete logs for both scenarios.
Did you check in the Task Manager, if the WinSCP process got terminated or if it did hang?
How exactly is BizTalk running WinSCP? Are you able to use a different version of WinSCP, if needed for debugging the issue?

Reply with quote

jonasgr
Joined:
Posts:
4
Location:
Sweden

Re: BizTalk (2020 CU3) SFTP GET (with WinSCP) fails

Please attach complete logs for both scenarios.
Both logs are attached! Complete for one failed and one successful fetch.

Did you check in the Task Manager, if the WinSCP process got terminated or if it did hang?
WinSCP isn’t shown as a running process since it’s loaded by the BizTalk service named BTSNTSvc64.exe (when running as 64bit).

I'm pretty sure it hangs since there is no error in the Event Log, and we specify a poll interval in BizTalk. The process keeps polling, just never "understands" that it has found a file that should be fetched.

How exactly is BizTalk running WinSCP?
The BizTalk process uses WinSCP.exe and WinSCPnet.dll. Exactly how, I don’t know.

We use version 5.19.2 as listed at the bottom of https://support.microsoft.com/en-us/topic/kb5007969-cumulative-update-3-for-microsoft-biztalk-server-2020-e0d31b4d-460f-4b2c-b53b-e72218401897

Are you able to use a different version of WinSCP, if needed for debugging the issue?
Yes, I believe we can use a different version of WinSCP. If the versions and file names of the EXE and DLL are the same, the effort to replace them is quick but this is possible with different versions as well by adding configuration as described under the heading “Use a newer WinSCP version” on https://learn.microsoft.com/en-us/biztalk/core/sftp-adapter

More information
We initially ran all SFTP retrieve-procedures under 1 BizTalk process (10 to 15 retrieve procedures with WinSCP). We now created five more BizTalk processes, so 1 BizTalk process does not host more than 3 WinSCP processes, and now the problem seems to be fixed. This is however not a long-term fix but might give clues about the problem.
Description: Unsuccessful fetch! This can easily be identified since the log file stops at row 119, while the successful fetch continues to row 180. This log file starts to differ with the successful fetch on row 111. Row 112 ought to log:
SSH_FXP_READDIR
Description: Successful fetch! This can easily be identified since we have the log statements:
Transfer done (row 154)
Deleting file (row 169)

Reply with quote

jonasgr
Joined:
Posts:
4
Location:
Sweden

Re: BizTalk (2020 CU3) SFTP GET (with WinSCP) fails

The problem seems to be fixed when we're running a maximum of three WinSCP jobs per BizTalk process. Some more tests seem to point towards a maximum of ten WinSCP jobs per BizTalk process; the 11:th will cause the error.

Decompiling the BizTalk process does not show an obvious limit of 10 threads, so I don't know if this is a code issue in WinSCP, BizTalk, or a system configuration question.

I hope help from you can help me post a bug report to Microsoft if you/we come to the conclusion that WinSCP isn't the cause, and if you can supply a version of the DLL and EXE with more debug information I would be happy to run it.

Reply with quote

Advertisement

You can post new topics in this forum