paused filepart is deleted and restarted from 0 on resume

Advertisement

dmitry_20220518
Joined:
Posts:
8

paused filepart is deleted and restarted from 0 on resume

Using WinSCP 5.21.2 (and earlier)

This doesn't happen always, and I can't quite figure out what causes it -- hopefully you can.

I start gzipping a large file 200GB on the remote Linux box, which takes multiple hours to gzip down to 80-100GB

While it's zipping, I start downloading it -- and on most days, it's completed before my 09:30AM cutoff time (before I have to pause it) -- slow VPN

But on days like today, I kicked the download off a bit late the evening before, so I had to pause it at 09:25 by hitting the || pause button

The filepart file was 53GB and sat around all day being that size.

At 16:30, I went to resume the download -- but when I did, the filepart file was reset back to 0 byte and it started re-downloading the entire 100GB file again.

This happens quite often, but not always -- I've had days when resuming it would continue where it left off.

Funny enough if I pause it and then kill WinSCP, and bring it up later -- it almost always picks up the filepart and continues -- it's primarily when I pause/unpause it without killing.

I thought that maybe my VPN disconnects and reconnects midday, or the pausing causes something to be lost or a checksum to be failed, or some other issue

A

Reply with quote

Advertisement

martin
Site Admin
martin avatar
Joined:
Posts:
41,276
Location:
Prague, Czechia

Re: paused filepart is deleted and restarted from 0 on resume

Please attach a full session log file showing the problem (using the latest version of WinSCP).

To generate the session log file, enable logging, log in to your server and do the operation and only the operation that causes the error. Submit the log with your post as an attachment. Note that passwords and passphrases not stored in the log. You may want to remove other data you consider sensitive though, such as host names, IP addresses, account names or file names (unless they are relevant to the problem). If you do not want to post the log publicly, you can mark the attachment as private.

Reply with quote

dmitry_20220518
Joined:
Posts:
8

absolutely -- I turned on "Debug 1" logging for next time -- not sure if/when it'll happen again, but if it does, I'll send you the log

It's almost like the issue is this scenario, as it seems to not exist (IIRC) when the program is restarted after the pause.

1. Starts downloading a file with filesize X
2. Continues downloading it as filesize keeps growing to it's final size of, say, 100X
3. Manually Paused when the file is complete but not fully downloaded (I have a slow VPN)
4. Manually Resumed after a sufficiently long time (in my case, about 7 hours, but have not tried shorter times)

Reply with quote

dmitry_20220518
Joined:
Posts:
8

Able to reproduce

Was able to repro it:

Attached is a zip at Debug=1 with about 8 rotated log files.

By the time I zipped it, the interesting lines moved to MD5.log.6,so look at that fie -- I had hit pause originally around 08:13, and then hit resume around 16:52. I believe the interesting lines are around "2022-09-01 16:52:26.433" on line 74557, and my file I'm downloading is on line 74858.

Had a 63GB local partfile when I clicked resume, then it went to 0 -- the original file is 100GB

I don't have all log files from when I brought WinSCP up for last night's download, but I go up to MD5.log.30 from about an hour earlier -- if you need them later and if you think they'll help
  • 113299.zip (2.08 MB, Private file)

Reply with quote

martin
Site Admin
martin avatar
Joined:
Posts:
41,276
Location:
Prague, Czechia

Re: Able to reproduce

Important is this entry:
. 2022-09-01 16:52:37.586 [Background 1] Partially transferred file is bigger that original file.
(I'm aware of the typo, will fix that)
Does it make any sense? Is it true, what the message says?

Reply with quote

Advertisement

dmitry_20220518
Joined:
Posts:
8

That would only be true if WinSCP had cached the original size somewhere (like if it cached the growing file's size when it was smaller and tried to compare it much later instead of re-reading the new value)

By the time I hit resume, the local file was 6xGB and the remote file was 9xGB

So there are only a few ways I can see the local 6xGB file being larger than the remote:
  1. WinSCP caching an old, smaller value (most likely to me, if only this never happens if I restart WinSCP, but only when I resume a long-running WinSCP without restarting)
  2. Somehow the Linux SFTP server giving a wrong file size and causing this
If you want to give me a binary with specific extra logging for my #1 case, where you will dump out what file size you think the local and remote files are, and maybe where you grabbed the remote file size, etc – I'll be happy to test it all week until we catch it

Reply with quote

martin
Site Admin
martin avatar
Joined:
Posts:
41,276
Location:
Prague, Czechia

At the time of the resume, the server claimed the file is 2.4 GB:
. 2022-09-01 16:52:37.586 [Background 1] File: '/mdata/hobo/20220831_tape.tape.gz' [2022-09-01T02:48:40.000Z] [2439249920]

Reply with quote

dmitry_20220518
Joined:
Posts:
8

Is it possible there's something not passed to the "[Background]" thread between these two lines from my file ?

So at this time, the response shows correct timestamp and file size:
2022-09-01 16:52:29.569 20220831_tape.tape.gz;-;96797290190;2022-09-01T07:23:53.000Z;4;"" [1001];"" [1001];rw-rw-r--;0
But 8 seconds later, on the Background line, it shows the stale data
2022-09-01 16:52:37.586 [Background 1] File: '/.../.../20220831_tape.tape.gz' [2022-09-01T02:48:40.000Z] [2439249920]

Reply with quote

martin
Site Admin
martin avatar

Both are direct responses from the server. Just using different API. The first is directory listing (SSH_FXP_OPENDIR/SSH_FXP_READDIR SFTP requests), while the latter is file attribute request (SSH_FXP_LSTAT). If you post Debug 1 log file, I can double check that, but I do not think it will bring anything new.

Reply with quote

Advertisement

dmitry_20220518
Joined:
Posts:
8

Those lines were from the log file I submitted – if they are both responses from the SFTP server, then maybe the SFTP server does have a bug.

I'll be away for a few weeks, I can pick this up again in October – as a work-around, I'll probably just restart WinSCP if I need to pause/resume on growable files.

Thanks for looking into it!!!

Reply with quote

martin
Site Admin
martin avatar

It might not be a bug of the SFTP server as such, it can just be the underlying file system reporting inconsistent sizes for opened files.

Reply with quote

Advertisement

You can post new topics in this forum