Topic review

martin

I do not disagree. But there's little demand for such an option. Most servers either speak UTF-8 or use an Ansi encoding of their users' language.
dma_k


Again, WinSCP does not read the information from anywhere.

I see. I have read about OPTS UTF8 ON in this draft, and it says that the command only tunes the control connection (e.g. the CWD argument or the PWD output). I am not an expert on the FTP protocol, but it seems there is no agreement on how the encoding of data sent over the data connection should be treated (at least for the MLSD command output); perhaps there is no way for a server to tell a client what encoding it speaks.

But when it asks system to do the conversion, system uses the settings from Control panel→Region and Language→Administrative→Change system locale. Why do you think it's not good?

The description of this setting reads "Language for non-Unicode programs", hence it controls (perhaps among other things) the set of glyphs to use in code positions 128-255 for console applications. WinSCP is a Unicode program, hence it should not rely on this setting, or, let's say, should fall back to it only as a last resort. For example, NetBox (which is built on top of WinSCP, I believe) has a dedicated option for that.
martin

dma_k wrote:

I would appreciate it if the message read
Server does not send proper UTF-8, falling back to local charset CP1251

or whatever is configured for the system.

OK, will consider that. But please make sure you understand that WinSCP does not actually know or care about that. It lets the system do the conversion. WinSCP does not need to know what the Ansi encoding really is.

Actually I am not sure where WinSCP takes this setting from... Is this a setting for console applications which don't support Unicode (Control panel→Region and Language→Administrative→Change system locale), or location (Control panel→Region and Language→Location)? Actually IMHO both of them are not good to use...

Again, WinSCP does not read the information from anywhere. But when it asks system to do the conversion, system uses the settings from Control panel→Region and Language→Administrative→Change system locale. Why do you think it's not good?

Does this message appear in the log file even when, e.g., the client and server have agreed to send data in UTF-8? Like below:

WinSCP always assumes the server uses UTF-8, until the server sends something that is not proper UTF-8.
WinSCP sends OPTS UTF8 ON, only when it believes the server needs it to actually use UTF-8. If it is confident that the server uses UTF-8 even without that command, it won't send it.
dma_k

WinSCP actually logs both raw listing and parsed listing.

Thanks, I got the idea.
Server does not send proper UTF-8, falling back to local charset

I would appreciate it if the message read
Server does not send proper UTF-8, falling back to local charset CP1251

or whatever is configured for the system. Actually I am not sure where WinSCP takes this setting from... Is this a setting for console applications which don't support Unicode (Control panel→Region and Language→Administrative→Change system locale), or location (Control panel→Region and Language→Location)? Actually IMHO both of them are not good to use...

Does this message appear in the log file even when, e.g., the client and server have agreed to send data in UTF-8? Like below:
< 2018-09-03 23:25:40.387 211-Features:
< 2018-09-03 23:25:40.398  UTF8
...
< 2018-09-03 23:25:40.409 211 End
> 2018-09-03 23:25:40.409 OPTS UTF8 ON
< 2018-09-03 23:25:40.412 200 UTF8 set to on
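For anyone who wants to check the same thing from a script, here is a minimal Python sketch that looks for the UTF8 feature in a FEAT reply like the one logged above. The function name `server_advertises_utf8` and the sample reply are made up for illustration; this is not how WinSCP decides whether to send OPTS UTF8 ON, which the thread does not spell out.

```python
def server_advertises_utf8(feat_reply: str) -> bool:
    """Return True if a multi-line FEAT reply lists the UTF8 feature.

    Hypothetical helper: it only inspects the reply text and sends nothing.
    """
    for line in feat_reply.splitlines():
        # In a FEAT reply, each feature line starts with a single space;
        # the "211-..." and "211 End" lines do not.
        if line.startswith(" "):
            feature = line.strip().split(" ")[0].upper()
            if feature == "UTF8":
                return True
    return False


# Sample reply modelled on the log excerpt above (timestamps removed).
reply = "211-Features:\n UTF8\n MLSD\n211 End"
print(server_advertises_utf8(reply))
print(server_advertises_utf8("211-Features:\n MLSD\n211 End"))
```

As the discussion notes, a positive answer here only concerns the control connection; it says nothing certain about the encoding of the MLSD listing sent over the data connection.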
martin

Re: Log files (MLSD output) in UTF-8 encoding

WinSCP actually logs both raw listing and parsed listing.

WinSCP does not detect the code page. It supports only UTF-8 and the legacy Ansi encoding configured in your system (which is CP1251 for your [Russian?] system). If you have the encoding set to the default "Auto" in the session settings, you should see a message like:
Server does not send proper UTF-8, falling back to local charset
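The "Auto" behaviour described here can be pictured with a short Python sketch: try strict UTF-8 first and, if the bytes are not proper UTF-8, fall back to a legacy code page. This is only an illustration under assumptions. The function `decode_listing` is hypothetical, and CP1251 is hard-coded as the fallback, whereas WinSCP leaves the conversion to the system and does not need to know which Ansi code page is in effect.

```python
from typing import Tuple


def decode_listing(raw: bytes, ansi_encoding: str = "cp1251") -> Tuple[str, str]:
    """Decode raw listing bytes, returning (text, encoding actually used).

    Hypothetical helper: strict UTF-8 is tried first; on failure the
    legacy code page passed in `ansi_encoding` is used instead.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Not proper UTF-8: fall back to the local (legacy) charset.
        return raw.decode(ansi_encoding, errors="replace"), ansi_encoding


name = "Белоснежка.jpg"
print(decode_listing(name.encode("utf-8")))
print(decode_listing(name.encode("cp1251")))
```

Both calls recover the same file name; only the reported encoding differs, which is the kind of detail the proposed log message would surface.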
dma_k

Log files (MLSD output) in UTF-8 encoding

I am using WinSCP 5.13.4. Unicode support works fine for directory listing and file transfer. However, what is written to the log file looks like this:
< 2018-09-03 23:12:28.394 150 Opening ASCII mode data connection for MLSD
. 2018-09-03 23:12:28.397 Session ID reused
. 2018-09-03 23:12:28.425 Data connection closed
. 2018-09-03 23:12:28.426 modify=20180903211226;perm=adfrw;size=1521103;type=file;unique=3AU1934D;UNIX.group=100;UNIX.mode=0644;UNIX.owner=1000; Белоснежка и семь гномов.jpg
< 2018-09-03 23:12:28.426 226 Transfer complete
. 2018-09-03 23:12:28.426 Directory listing successful

One can see that this is a CP1251-encoded filename that was then encoded as UTF-8. I think it would be more practical to write the already interpreted/decoded information to the log, not the raw server output. For example, the PWD output is written in UTF-8, as I would expect:
> 2018-09-03 23:12:58.644 PWD
< 2018-09-03 23:12:59.159 257 "/ftp/images/Белоснежкa" is the current directory
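To show what "CP1251-encoded, then UTF-8 encoded" means in practice, here is a small Python sketch of that double encoding and how to undo it. It assumes the raw bytes were treated as Latin-1 (one character per byte) before being written out as UTF-8; which single-byte interpretation the logger really applies is an assumption on my part.

```python
name = "Белоснежка"

# The server sends the file name as CP1251 bytes.
raw = name.encode("cp1251")

# Assumption: each byte is taken as one Latin-1 character...
misread = raw.decode("latin-1")

# ...and that text is then written to the log file as UTF-8.
logged = misread.encode("utf-8")

# Undoing it: read the log as UTF-8, map the characters back to the
# original bytes, then decode those bytes as CP1251.
recovered = logged.decode("utf-8").encode("latin-1").decode("cp1251")

print(misread)
print(recovered == name)
```

The round trip is lossless under this assumption, so a raw listing logged this way can still be decoded by hand when the server's code page is known.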

I also wonder at which debug level the auto-detected charset of the FTP server is logged.