Why are text file line breaks wrong after the file is transferred or edited?

After a file is transferred or edited, its line breaks may be wrong, which can manifest as:

Line breaks are lost. The whole file content appears to be on a single line.
Line breaks are duplicated. There appears to be an additional empty line after every line.
There’s a strange symbol/character at the end of every line.

This article explains possible causes of the problem and their solutions.

Text File Formats
Known Issues with Transfer Mode
Debugging Text File Conversion
- Requesting Support

Text File Formats

Different platforms (operating systems) use different text file formats. The most common are the Unix and Windows formats. The primary difference is the character, or sequence of characters, that signifies the end of a line. On Unix, it’s the LF character (\n, 0A in hexadecimal or 10 in decimal). On Windows, it’s a sequence of two characters, CR and LF (\r + \n, 0D + 0A in hexadecimal or 13 + 10 in decimal).

While many applications and systems nowadays can work with both formats, some require a specific format. When presented with a file in another format, they fail to display it correctly, as described above.
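The difference between the two formats can be illustrated with a short Python sketch. The function names to_unix and to_windows are hypothetical helpers made up for this illustration; they are not part of WinSCP or of any server, and they only show what a conversion between the formats amounts to at the byte level.

```python
LF = b"\n"      # 0x0A, Unix line ending
CRLF = b"\r\n"  # 0x0D 0x0A, Windows line ending


def to_unix(data: bytes) -> bytes:
    """Convert Windows (CR+LF) line endings to Unix (LF)."""
    return data.replace(CRLF, LF)


def to_windows(data: bytes) -> bytes:
    """Convert line endings to Windows (CR+LF)."""
    # Normalize to LF first, so an existing CR+LF does not become CR+CR+LF.
    return to_unix(data).replace(LF, CRLF)


unix_text = b"One\nTwo\n"
windows_text = to_windows(unix_text)

print(unix_text.hex(" "))     # 4f 6e 65 0a 54 77 6f 0a
print(windows_text.hex(" "))  # 4f 6e 65 0d 0a 54 77 6f 0d 0a
```

Working on bytes rather than on decoded text keeps the sketch independent of the file encoding. It assumes an encoding in which CR and LF are single bytes (such as ASCII or UTF-8).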


Known Issues with Transfer Mode

Pure-FTPd FTP server: When downloading a file with Windows line endings (CR+LF) in text/ASCII mode, the server replaces each LF with CR+LF, resulting in an incorrect CR+CR+LF sequence. When such a file is opened in the internal editor of WinSCP, the editor interprets the sequence as two line endings (CR and CR+LF), resulting in a blank line after every content line. When the file is saved, the internal editor writes two Windows line endings (CR+LF and CR+LF). On upload, they get converted to two LFs. A workaround is to use an external editor and make sure WinSCP does not force text mode for edited files.
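The chain of conversions described above can be reproduced with a small Python simulation. This is only a model of the behavior as the paragraph describes it, not the actual code of Pure-FTPd or WinSCP. The function naive_lf_to_crlf is a hypothetical name, and Python’s bytes.splitlines() stands in for the editor, on the assumption that the editor treats CR, LF and CR+LF each as a line ending.

```python
def naive_lf_to_crlf(data: bytes) -> bytes:
    """Replace every LF with CR+LF, even if a CR already precedes it."""
    return data.replace(b"\n", b"\r\n")


original = b"One\r\nTwo\r\n"             # file already in Windows format
downloaded = naive_lf_to_crlf(original)  # the faulty text-mode conversion
print(downloaded.hex(" "))               # 4f 6e 65 0d 0d 0a 54 77 6f 0d 0d 0a

# Stand-in for the editor: CR and CR+LF are each read as a line ending,
# so an empty line appears after every content line.
lines = downloaded.splitlines()
print(lines)                             # [b'One', b'', b'Two', b'']

# The editor saves every line ending as CR+LF ...
saved = b"".join(line + b"\r\n" for line in lines)
# ... and a text-mode upload to a Unix server turns each CR+LF into LF.
uploaded = saved.replace(b"\r\n", b"\n")
print(uploaded)                          # b'One\n\nTwo\n\n'
```

The uploaded result has two LFs after every content line, which matches the duplicated line breaks listed among the symptoms.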

Debugging Text File Conversion

If enabling (or disabling) text/ASCII transfer mode does not help and your transferred/edited file is still not interpreted correctly by the target system, you need to find out in which step the file was converted incorrectly (or was not converted at all).

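To detect the line endings used by a file on Windows, use the following command in a PowerShell console to display a hex dump of the first 100 bytes of a given file (example.txt). The command is written for Windows PowerShell; in PowerShell 6 and later, use -AsByteStream instead of -Encoding Byte: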

Get-Content -Encoding Byte -TotalCount 100 example.txt |% {Write-Host ("{0:x2} " -f $_) -NoNewline}; Write-Host

For a file with the following contents in Windows format

One
Two

it displays:

4f 6e 65 0d 0a 54 77 6f 0d 0a

Note the two sequences 0d 0a (CR + LF) indicating Windows format.

To detect the line endings used by a file on a Unix/Linux system, use the command:

xxd example.txt | head

For the same file as above, just in Unix format, it displays:

0000000: 4f6e 650a 5477 6f0a                      One.Two.

Note the character 0a (LF) indicating Unix format.

If you do not have shell access to the remote system, download the file in binary transfer mode and use the PowerShell command on the local binary-identical copy.
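Where neither PowerShell nor xxd is at hand, the same check can be sketched in Python, which runs on both Windows and Unix. The names hex_dump, count_line_endings and inspect_file are hypothetical helpers written for this illustration. The file has to be opened in binary mode ("rb"), otherwise Python may itself translate the line endings while reading.

```python
def hex_dump(data: bytes, limit: int = 100) -> str:
    """Return the first `limit` bytes as space-separated hex pairs."""
    return data[:limit].hex(" ")


def count_line_endings(data: bytes) -> dict:
    """Count CR+LF pairs, lone LFs and lone CRs in the data."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,  # LFs not preceded by a CR
        "CR": data.count(b"\r") - crlf,  # CRs not followed by an LF
    }


def inspect_file(path: str) -> None:
    """Print a hex dump and line-ending counts for the file at `path`."""
    with open(path, "rb") as f:  # binary mode, so nothing gets converted
        data = f.read()
    print(hex_dump(data))
    print(count_line_endings(data))


# In-memory demonstration with the two-line example file in Windows format.
sample = b"One\r\nTwo\r\n"
print(hex_dump(sample))            # 4f 6e 65 0d 0a 54 77 6f 0d 0a
print(count_line_endings(sample))  # {'CRLF': 2, 'LF': 0, 'CR': 0}
```

To check a real file, call inspect_file("example.txt"). A non-zero count in more than one category suggests mixed or doubly converted line endings, such as the CR+CR+LF sequence, which counts as one lone CR plus one CR+LF.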

Use these techniques to detect what format both the source and the destination file have. When editing a file, also detect the format of the local temporary copy of the edited file as saved by the editor. See the preferences for the location of the temporary copies.

Requesting Support

If the above does not help you understand the problem and you decide to seek support, include all your findings, including copies of both the source and the destination file. When editing a file, also include the local temporary copy as saved by the editor. Ideally, compress (ZIP) the files before attaching them to the support request, so that your browser does not alter the file format.