The danger from computers is not that they will eventually get as smart as men, but that we will meanwhile agree to meet them halfway -- Bernard Avishai

Web Engineering Unix FAQ: DOS and Unix Textfiles

DOS and Unix Textfiles

As mentioned in the Not Found FAQ page, one of the very subtle problems many students encounter comes from moving textfiles created on a Windoze box to the Unix environment without doing a textfile translation: for example, forgetting to use "text" mode in FTP, or simply copying the file from a USB memory stick or other removable media. For many purposes the untranslated file will work perfectly well, but as noted there are situations where it won't. The underlying difference is that DOS ends each line with a carriage-return followed by a line-feed (\r\n), whereas Unix uses a line-feed alone (\n).

You can use the Unix file command-line utility to discover which newline convention a textfile uses. The following illustrates the output for a DOS (Windoze) textfile. Note that the $ indicates your command prompt:

$ file textfilename.txt
textfilename.txt: ASCII text, with CRLF line terminators
The following Perl "one-liner" will reveal whether a file contains any \r (or ^M, which is the same thing) characters, printing each offending line with the prefix "bad: ". Try it.
perl -ne 'print "bad: $_" if /\r/' < file.pl
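
If Perl is not to hand, the same check can be approximated with standard shell tools. The following is only a minimal sketch: the sample file name and the variable names are made up for illustration, and the sample is built with printf so that the example does not depend on any file of yours.

```shell
# Build a throwaway DOS-style sample (hypothetical file name) in a temp directory.
workdir=$(mktemp -d)
sample="$workdir/sample_dos.txt"
printf 'line one\r\nline two\r\nline three\r\n' > "$sample"

# tr -cd '\r' deletes everything EXCEPT carriage-returns; wc -c counts what is left.
# The trailing tr strips the padding that some versions of wc add.
cr_count=$(tr -cd '\r' < "$sample" | wc -c | tr -d ' ')

if [ "$cr_count" -gt 0 ]; then
    verdict="DOS line endings suspected ($cr_count carriage-returns)"
else
    verdict="no carriage-returns found"
fi
echo "$verdict"

# Tidy up the temporary directory.
rm -rf "$workdir"
```

To check a real file, point the tr command at that file instead of the generated sample.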

Solutions

Using Nedit
The nedit text editor won't complain about DOS line-endings. It'll happily work with either format. If you have a DOS textfile open in nedit, you can convert it to Unix simply by selecting the Save as... option and choosing the Unix option for the file format. If you use the same file name, the Unix version overwrites the DOS format textfile.

Using gedit
The Gnome text editor gedit doesn't even seem to notice the difference between Unix and DOS textfiles, and there's no Save as... option or "Preferences" setting that will allow you to change a file's format. Use a better editor.

Classic vi versions
Versions of vi shipped with commercial Unix systems will (usually by default) display the ^M -- control-M, or carriage-return -- on the end of each line. If your version of vi shows this at the end of each line of a DOS textfile you can use the last-line substitution command: :%s/^V^M$//g, where ^V means control-V and ^M means control-M, typed as actual control characters rather than as a caret followed by a letter. Explanation: this is a straightforward regex substitution, just like in Perl. The ^V "escapes" a subsequent command-character so that it can be interpreted literally (note: this also works at the Unix command prompt), and the $ matches the end-of-line.

"Modern" vi versions
Most commonly this is the vi-compatible editor vim, aliased to vi on your system. It will report on the bottom status line that you have a DOS textfile -- look for a string something like [dos]. To convert to Unix textfile format, simply type the following last-line command: :se ff=unix, which will force the file to be saved in Unix textfile format the next time you write it (for example with :w).

dos2unix
It's possible that the dos2unix (and unix2dos) commands may be installed on your system, which makes the whole thing trivial.
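
Since you can't count on dos2unix being there, a script can test for it first and fall back to something else. The sketch below is one hedged way to do that. The helper name to_unix and the sample file are hypothetical, and it assumes the common Linux dos2unix, which converts the named file in place; older commercial Unix versions may expect a separate output file instead.

```shell
# Hypothetical helper: convert one file to Unix format, preferring dos2unix when present.
to_unix() {
    if command -v dos2unix >/dev/null 2>&1; then
        # Assumption: this dos2unix rewrites the named file in place.
        dos2unix "$1"
    else
        # Fallback: strip carriage-returns with tr, going via a temporary file.
        tr -d '\r' < "$1" > "$1.tmp" && mv "$1.tmp" "$1"
    fi
}

# Try it on a throwaway DOS-style sample (hypothetical file name).
workdir=$(mktemp -d)
printf 'alpha\r\nbeta\r\n' > "$workdir/notes.txt"
to_unix "$workdir/notes.txt"

# Count the carriage-returns that survived the conversion.
left=$(tr -cd '\r' < "$workdir/notes.txt" | wc -c | tr -d ' ')
echo "carriage-returns left: $left"

rm -rf "$workdir"
```

The command -v test is the portable way to ask the shell whether a command exists without actually running it.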

At the command line
The standard Unix shell approach to converting a DOS textfile to Unix format is to use tr (the general-purpose "translator"), thus:
tr -d '\r' < inputfile > outputfile
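Note the use of single quotes around the '\r' -- they stop the shell from stripping the backslash. With no quotes at all, tr would receive a plain r and delete every letter r in the file. Double quotes happen to work too in Bourne-style shells, but single quotes are the safer habit. If you look around, you may also find other versions of a tr command to do the same thing, such as tr -d '\015', which names the carriage-return by its octal code.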
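
A common follow-up is to replace the original file with the converted one. The following is a rough sketch of that, using made-up file names in a temporary directory. The one real trap is redirecting tr's output straight back onto its input file.

```shell
# Hypothetical file names in a throwaway directory.
workdir=$(mktemp -d)
input="$workdir/inputfile"
output="$workdir/outputfile"
printf 'print "hello";\r\nprint "world";\r\n' > "$input"

# Never write to the same file you are reading from: the shell truncates the
# output file before tr starts, which would leave you with an empty file.
tr -d '\r' < "$input" > "$output"

# Compare sizes: the Unix copy should be smaller by one byte per line.
before=$(wc -c < "$input" | tr -d ' ')
after=$(wc -c < "$output" | tr -d ' ')
echo "bytes before: $before, after: $after"

# Replace the original only if the converted copy is non-empty.
if [ -s "$output" ]; then
    mv "$output" "$input"
fi
remaining=$(tr -cd '\r' < "$input" | wc -c | tr -d ' ')

rm -rf "$workdir"
```

Going via a second file and then using mv costs one extra step, but it means a failed conversion never destroys the only copy.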
Copyright © 2003-2006 Phil Scott