Internal editor changing text encoding type - bug

JustWebGuy

I have been editing HTML UTF-8 files from the remote server for a long time and had no issue until recently, like v6.5.0/2 time frame. Now when I open the file, make a modification, save, reload in the browser, I get ?-diamonds around the edit. So I suspect there is a regression in properly setting encoding type in the editor.

The editor shows under Encoding "1252". Since I didn't have a problem before, I never checked this until now, so I can't say what it said before.

Windows 10, files created via Save As HTML from LibreOffice Writer.

JustWebGuy

Also, if you try to change the encoding in the editor after saving, you get:
---------------------------
Error
---------------------------
Error loading file 'C:\Users\justin\AppData\Local\Temp\scp51482\home3\bereanb2\public_html\1Corinthians\notes\609-610_1Corinthians_7-27-25.html' using 'UTF-8' encoding.
---------------------------
OK Help
---------------------------

martin
Site Admin

The file is UTF-8 encoded, so you need to use that encoding when editing it.
Once you have modified and saved the file using a different encoding (Ansi), it becomes corrupted, and trying to reload it as UTF-8 won't help.
If all your files are UTF-8, please configure UTF-8 as the Default encoding in preferences.
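
The corruption described here can be reproduced outside the editor. The following is only a minimal Python sketch, not WinSCP's code. It assumes that "Ansi" means Windows-1252 (the "1252" the editor reports); the sample strings and the helper name `simulate_ansi_edit` are made up for illustration.

```python
def simulate_ansi_edit(utf8_bytes: bytes, typed_text: str) -> bytes:
    """Mimic opening a UTF-8 file as Windows-1252, appending text, and saving."""
    # The editor misreads the UTF-8 bytes as Windows-1252 characters.
    shown_in_editor = utf8_bytes.decode("cp1252")
    # The user types new text; on save everything is written as Windows-1252.
    return (shown_in_editor + typed_text).encode("cp1252")


original = "café".encode("utf-8")  # b'caf\xc3\xa9'
saved = simulate_ansi_edit(original, " naïve")

# The old content round-trips unchanged, but the newly typed "ï" is stored
# as a lone 0xEF byte, which is not valid UTF-8.
try:
    saved.decode("utf-8")
    reload_ok = True
except UnicodeDecodeError:
    reload_ok = False

# A reader that assumes UTF-8 (such as a browser) substitutes U+FFFD
# for the invalid byte.
as_browser_sees_it = saved.decode("utf-8", errors="replace")
```

In this sketch only the text typed during the edit is damaged, which would match seeing the ?-diamonds just around the edit, and the saved bytes can no longer be loaded as strict UTF-8.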

JustWebGuy

Yes, I already stated the files were UTF-8. Behavior seems to have recently changed; that is why I reported this issue.

I don't know which encoding each of the files I edit uses. How can I know before editing? Why doesn't the editor detect the encoding and set it appropriately?

martin
Site Admin

After some tests, I believe the encoding autodetection works fine when the default encoding is configured as UTF-8. So all I've done for now is to make UTF-8 the default, starting with the next major release.
For you, it should be enough to configure UTF-8 as the default manually in the current version of WinSCP. Let me know.

Guest

I'm not clear on "encoding autodetection works fine, when the default encoding is configured as UTF-8." Do you mean that when UTF-8 is the default, it accurately detects non-UTF-8 encoded files?

I may or may not be opening a UTF-8 file (and there is no way to tell before opening), but I went ahead and set UTF-8 as the default, as I believe that will be the most common case for me.

martin
Site Admin

Yes. In general, WinSCP first tries to load the file using the default encoding and, if that fails, falls back to the other one. The problem is that any file can be loaded with Ansi encoding, so if you have Ansi set as the default, the fallback to UTF-8 can never happen. But if you have UTF-8 set as the default, WinSCP can fall back to Ansi if the file is not a valid UTF-8 file.
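
The fallback order described above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not WinSCP's actual implementation: the function name `load_with_fallback` is hypothetical, and `latin-1` is used as a stand-in for an Ansi code page because it decodes every possible byte sequence.

```python
ANSI = "latin-1"  # stand-in for an Ansi code page: decodes any byte sequence
UTF8 = "utf-8"


def load_with_fallback(data: bytes, default: str) -> tuple[str, str]:
    """Try the default encoding first, then the other one.

    Returns the decoded text and the name of the encoding that succeeded.
    """
    fallback = ANSI if default == UTF8 else UTF8
    for encoding in (default, fallback):
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("file could not be decoded with either encoding")


utf8_file = "café".encode(UTF8)  # valid UTF-8
ansi_file = "café".encode(ANSI)  # lone 0xE9 byte, not valid UTF-8

# Default Ansi: decoding never fails, so UTF-8 is never tried.
text_a, used_a = load_with_fallback(utf8_file, default=ANSI)
# Default UTF-8: valid UTF-8 loads as is, anything else falls back to Ansi.
text_b, used_b = load_with_fallback(utf8_file, default=UTF8)
text_c, used_c = load_with_fallback(ansi_file, default=UTF8)
```

With Ansi as the default, the UTF-8 file comes back garbled (`text_a` is "cafÃ©"); with UTF-8 as the default, both files come back as "café". The asymmetry exists because strict UTF-8 decoding can reject a file, while the Ansi stand-in accepts anything.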
