[buug] recode vs. iconv

Rick Moen rick at linuxmafia.com
Tue Dec 11 16:48:09 PST 2012


Quoting Ian Zimmerman (itz at buug.org):

> Today I needed to convert some text files from latin1 to plain ascii,
> preferably with transcriptions (e.g. ß into ss, ä into ae, and so on).
> They were large enough that doing it manually was out of the question,
> even in an excellent editor like emacs ;-)  I remembered that recode and
> iconv were the two programs potentially suitable for the task.
> 
> I tried recode first, but it was a disaster; I couldn't make it work
> despite reading the full fine manual (info version) in detail.  It would
> simply error out on any non-ascii character unless I gave the --force
> option (even when I called it not in-place but as a filter), and then it
> would succeed but silently drop some of them without substituting
> anything.

Huh!  {glyph of mild puzzlement}

Back when I was chief copyeditor for _Linux Gazette_ magazine, we had
one regular columnist who could not be cured of writing submissions in
Microsoft text editors and _not_ bothering to clean up his (alleged)
ASCII.  At the time, I did this to compensate for the man's inability to
write real plaintext, and it worked every time:

$ recode windows-1257..ASCII lg_bytes.html
$ tidy -cim lg_bytes.html
$ aspell check lg_bytes.html

Then, the inevitable manual line-editing work in vim.

Anyway, the recode step always did The Right Thing, and never errored
out.
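
For the German-style transcriptions asked about above (ß into ss, ä into
ae), here is a minimal sketch of an explicit filter, in case recode
misbehaves. It is not something I ran at the Gazette. The function name
latin1_to_ascii is hypothetical, and the sketch assumes the script itself is
saved as UTF-8 and that the input really is ISO-8859-1. It converts to UTF-8
first, applies a hand-written substitution table with sed, then converts to
ASCII. With GNU iconv, that last step should stop with an error on anything
the table missed, rather than dropping it silently.

```shell
# Hypothetical helper: read Latin-1 on stdin, write transcribed ASCII on stdout.
latin1_to_ascii() {
    # 1. Decode Latin-1 into UTF-8 so the sed patterns below can match.
    iconv -f ISO-8859-1 -t UTF-8 |
    # 2. Explicit transcription table; LC_ALL=C makes sed match raw bytes.
    LC_ALL=C sed \
        -e 's/ä/ae/g' -e 's/ö/oe/g' -e 's/ü/ue/g' \
        -e 's/Ä/Ae/g' -e 's/Ö/Oe/g' -e 's/Ü/Ue/g' \
        -e 's/ß/ss/g' |
    # 3. Anything still non-ASCII is not representable in the target charset.
    iconv -f UTF-8 -t ASCII
}

# Usage: \337 and \344 are the octal Latin-1 codes for ß and ä.
printf 'Stra\337e, M\344dchen\n' | latin1_to_ascii
# prints: Strasse, Maedchen
```

Extending the table is a matter of adding more -e expressions. The point of
doing it by hand is that every substitution is visible, which neither
recode nor iconv guarantees.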

