UTF-8 directory listings broken


honzo
Guest

UTF-8 directory listings broken

Hi,

I have several directories on a Debian 'testing' system [openssl 0.9.7g-2, ssh 3.8.1p1-8.sarge.4, server runs in en_US.UTF-8 locale] containing several directory names with Czech national characters in UTF-8 encoding. Every time I try to list those directories, I get complaints about an error when decoding UTF-8 text, and only part of the directory contents is displayed.

I have not checked the directory for improper UTF-8 characters, but I think that even if there are some inconsistencies, WinSCP should be able to handle such cases without messing up the directory listing. Moreover, I can see these listings properly in FileZilla, PuTTY (Unix shell 'ls' output), and when accessing these directories as Samba 3 shares. From what I have seen, there are no suspicious characters that would spoil the listing.

I first encountered this bug with the Servant Salamander beta 10 WinSCP plugin. I then installed WinSCP 3.7.6 and it does the same thing. I have just downgraded to WinSCP 3.6.8 and it works like a charm ...

-- jan


martin
Site Admin

Re: UTF-8 directory listings broken

Did you enable UTF-8 support? WinSCP should not use UTF-8 with OpenSSH by default. OpenSSH does not support it.


honzo
Guest

Re: UTF-8 directory listings broken

Yep. No chance to get it working without forcing UTF-8 as OpenSSH supports only some archaic version of SFTP, AFAIK.

Meanwhile I tried WinSCP 3.8.2 build 330, it performs slightly better: It displays the Unicode characters without complaints, but it still goes ballistic in case it encounters something that is not UTF-8 (like, for example, filename containing ä in CP-1250, uploaded by some ignorant moron). I know that this is more a question of design, but I would still consider it more appropriate to just issue a warning and mask the illegal character with underscore or whatever. This way users will be able to download the file, albeit with crippled filename, even if they forced UTF-8.
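
The masking behaviour suggested above can be sketched roughly as follows. This is only an illustration in Python, not how WinSCP works; the function name `display_name` and the `mask` parameter are hypothetical. It tries a strict UTF-8 decode first and, if that fails, replaces each undecodable sequence with an underscore and reports that the name was altered, so the caller can issue a warning.

```python
from typing import Tuple


def display_name(raw: bytes, mask: str = "_") -> Tuple[str, bool]:
    """Decode a raw filename as UTF-8 (hypothetical helper).

    Returns the text to display and a flag that is True when the name
    was valid UTF-8 and False when some bytes had to be masked.
    """
    try:
        return raw.decode("utf-8"), True
    except UnicodeDecodeError:
        # errors="replace" substitutes U+FFFD for each invalid sequence;
        # swap that for the mask character so the name stays readable.
        text = raw.decode("utf-8", errors="replace")
        return text.replace("\ufffd", mask), False


# A valid UTF-8 name passes through unchanged.
ok_name, ok_valid = display_name("Příliš.txt".encode("utf-8"))

# 0xE4 is "ä" in CP-1250 and is not valid UTF-8 on its own.
bad_name, bad_valid = display_name(b"b\xe4r.txt")
```

Note that the masked name no longer identifies the file on the server, so a client using this approach would still need to keep the original bytes around to request the download.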

Actually, this is another important note: If I prevent WinSCP from using UTF-8, it will show the directory name containing (quite naturally) gibberish instead of Unicode letters, but it downloads the file, containing non-ASCII and non-UTF-8 characters (my favourite ä in CP-1250) without complaints (again, quite naturally, as no character translation takes place).

-- jan


martin
Site Admin

Re: UTF-8 directory listings broken

honzo wrote:

I know that this is more a question of design, but I would still consider it more appropriate to just issue a warning and mask the illegal character with underscore or whatever. This way users will be able to download the file, albeit with crippled filename, even if they forced UTF-8.

Unfortunately, WinSCP does not work in Unicode internally, so it cannot keep the original filename. Implementing your suggestion would therefore require a major change. It would, however, be possible to simply skip the files with non-UTF-8 names.
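
As a rough illustration of the "skip" alternative, the following Python sketch filters a listing down to the names that decode cleanly as UTF-8. It is an assumption about what such a filter could look like, not WinSCP code, and the function name `utf8_only` is hypothetical.

```python
from typing import Iterable, List, Tuple


def utf8_only(raw_names: Iterable[bytes]) -> Tuple[List[str], List[bytes]]:
    """Split raw filenames into decodable names and skipped raw names."""
    kept: List[str] = []
    skipped: List[bytes] = []
    for raw in raw_names:
        try:
            kept.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            # Not valid UTF-8: leave it out of the listing, but remember
            # it so the user can be told that something was omitted.
            skipped.append(raw)
    return kept, skipped


listing = ["čaj.txt".encode("utf-8"), b"b\xe4r.txt"]
kept, skipped = utf8_only(listing)
```

Returning the skipped names alongside the kept ones lets the rest of the directory be shown while still reporting that some entries were left out.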

Edit 2021: WinSCP has been a Unicode application for a long time now, so the above is no longer an issue.
https://winscp.net/tracker/586
For other similar problems, see:
https://winscp.net/eng/docs/faq_utf8
