3 January 2010

Remove / Replace Carriage Returns, Line Breaks (CRLF) in a Text File


Welcome to Stone Studio!

How can I find and replace all the carriage returns and line feeds (line breaks) in a document? Of course, you can always go to each one and hit the Delete or Backspace key to remove them one by one. But the goal was to do it more efficiently.


Notepad++

I definitely recommend Notepad++ for this job; it is a must-have general-purpose text editor anyway. So, the answer was:

Steps:
1) Using your mouse, highlight a line break by starting at the end of one line and dragging to the beginning of the next line.
2) Press Control-C to copy.
3) Press Control-H to open the Replace dialog box.
4) Click in the Find What box.
5) Press Control-V to paste the line break.
6) Leave the Replace With box empty.
7) Click the Replace All button.
That gets them all.

Tips: If you look closely at the Replace dialog, you want to set the search mode to "Extended". Normal mode won't work, and neither did Regular Expression mode in the version I used. Then just find "\r\n" (or just "\n" for Unix files, or just "\r" for old Mac-format files), and set the replacement to whatever you want.
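
If you would rather script it than click through a dialog, the same find-and-replace can be sketched in a few lines of Python. This is only a minimal sketch, not something Notepad++ itself does: the names LINE_BREAK, count_line_breaks and replace_line_breaks are my own, made up for illustration. It counts which style of line break a piece of text uses, then replaces all of them.

```python
import re

# Matches Windows (\r\n), old Mac (\r), and Unix (\n) line breaks.
# \r\n is listed first so a Windows break counts as one unit, not two.
LINE_BREAK = re.compile(r"\r\n|\r|\n")


def count_line_breaks(text):
    """Return how many line breaks of each style appear in text."""
    counts = {"\r\n": 0, "\r": 0, "\n": 0}
    for match in LINE_BREAK.finditer(text):
        counts[match.group()] += 1
    return counts


def replace_line_breaks(text, replacement=""):
    """Replace every line break in text with replacement (hypothetical helper)."""
    return LINE_BREAK.sub(replacement, text)


sample = "first\r\nsecond\nthird\rfourth"
print(count_line_breaks(sample))
print(replace_line_breaks(sample, " "))
```

When reading a real file, open it with open(path, newline="") so that Python does not translate the line endings before the function sees them.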

However, that got me thinking about doing the same thing in other editors like Notepad, Wordpad, Microsoft Word and OpenOffice Writer.

Notepad

With Notepad, the plain text editor that comes as part of Windows, you’re out of luck: you can’t do this. The "Replace" dialog box does not work with non-printing characters, so the only option is to take them out one by one.

Wordpad

With Wordpad, the simple word processor that is also included in Windows (as C:\Windows\wordpad.exe or C:\Windows\write.exe), you’re also out of luck.

Microsoft Word and OpenOffice Writer

At least Microsoft Word and OpenOffice Writer are consistent here: the copy-and-paste trick doesn’t work in the Replace dialog box of either of them.

Microsoft Word has a better solution than OpenOffice for this function. Word has an Advanced button on its Replace dialog box, as shown in the following images. The image on the left is the dialog box and the top of the Special (characters) list. The image on the right is the full Special list.

replace_dialogbox_word_1

replace_dialogbox_word_2

This is one thing that, for some reason, OpenOffice Writer has addressed in a very strange way… Perhaps the best way to describe it is by quoting the help file, where I found the following procedure:

Removing Line Breaks
Use the AutoFormat feature to remove line breaks that occur within sentences. Unwanted line breaks can occur when you copy text from another source and paste it into a text document.
This AutoFormat feature only works on text that is formatted with the "Default" paragraph style.
1. Choose Tools - AutoCorrect.
2. On the Options tab, ensure that "Combine single line paragraphs if length greater than 50%" is selected. To change the minimum percentage for the line length, double-click the option in the list, and then enter a new percentage.
3. Click OK.
4. Select the text containing the line breaks that you want to remove.
5. In the Apply Style box on the Formatting bar, choose Default.
6. Choose Format - AutoFormat - Apply.

It actually worked! I assumed that my default installation of OpenOffice 2.2 was set up properly (for this procedure), so I started at the 4th step. It worked just fine. But this was a very unusual, non-intuitive process.
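
To get a feel for what the "Combine single line paragraphs if length greater than 50%" option might be doing, here is a rough Python sketch of the idea. It is my own guess at the heuristic, not OpenOffice's actual algorithm: I am assuming the percentage is measured against the longest line in the text, which the help file does not say, and the function name combine_short_breaks and its min_fraction parameter are hypothetical.

```python
def combine_short_breaks(text, min_fraction=0.5):
    """Join a line onto the next one when it is longer than
    min_fraction of the longest line.

    Assumption: a long line was wrapped mid-sentence, while a short
    line is the real end of a paragraph. Not OpenOffice's real code.
    """
    lines = text.splitlines()
    if not lines:
        return ""
    threshold = max(len(line) for line in lines) * min_fraction

    paragraphs = []
    current = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            # A blank line always ends the current paragraph.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(stripped)
        if len(line) <= threshold:
            # Short line: treat it as the true end of the paragraph.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "\n".join(paragraphs)


pasted = "aaaaaaaaaa\nbbbbbbbbbb\ncc\ndd"
print(combine_short_breaks(pasted))
```

Here the two ten-character lines are joined with the short "cc" that ends their paragraph, while "dd" stays on a line of its own.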
