deusvulture replied to your post
“nightpool replied to your photo “As someone who keeps running into…”
fwiw I feel like I pretty frequently see those kinds of errors in rtf and doc parsing (a profusion of Âs is the classic example)
although maybe that’s a different thing
The Â thing in particular is a symptom of reading text in one character encoding as if it’s in another character encoding, see here.
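To make that concrete, here is a minimal Python sketch of one common way a stray Â shows up. It assumes the text was saved as UTF-8 and then read back as Windows-1252; the sample string is made up for illustration, and other encoding pairs produce different garbage.

```python
# A non-breaking space (U+00A0) is stored as the two bytes C2 A0 in UTF-8.
original = "10\u00a0km"  # made-up sample text
utf8_bytes = original.encode("utf-8")

# Reading those same bytes as Windows-1252 maps C2 to "Â" and A0 back to a
# non-breaking space, so an extra "Â" appears in front of the space.
garbled = utf8_bytes.decode("cp1252")
print(repr(garbled))

# In this case the damage is reversible: undo the wrong decoding step,
# then decode the bytes with the encoding they were actually written in.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)
```

Nothing here depends on what the characters mean once they are decoded, which is why I think of it as a separate layer.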
Character encoding problems are common, but they’re at a lower level than what I’m talking about – the level of “what sequence of characters do these bytes even represent” rather than “given this sequence of characters, which ones are special control sequences.” So they have the potential to come up in any kind of text processing, even in the hypothetical utopia where I have my ideal tabular data file format.
The direct analogue to the problems I have with CSVs would be something like interpreting an RTF as having lots of bold text because I typed “\b” somewhere in it, or conversely interpreting actual bold text in an RTF as regular text surrounded with “\b” and related sequences. Or the same kind of mix-up with margin sizes, other layout information, etc.
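Here is a rough sketch of what keeping those two things apart looks like when writing RTF. The helper names `rtf_escape` and `rtf_bold` are hypothetical, not from any library, and this only covers the three characters RTF treats as special (backslash and the two braces); a real writer would also need to handle non-ASCII text.

```python
# Hypothetical helpers for illustration; not part of any RTF library.
# In RTF, a backslash starts a control word and braces delimit groups,
# so all three must be escaped when they are meant as literal text.
_RTF_ESCAPES = str.maketrans({"\\": "\\\\", "{": "\\{", "}": "\\}"})


def rtf_escape(text: str) -> str:
    """Return text with RTF's special characters escaped as literals."""
    return text.translate(_RTF_ESCAPES)


def rtf_bold(text: str) -> str:
    """Wrap (escaped) text in a group that turns bold on for that group."""
    return "{\\b " + rtf_escape(text) + "}"


# A literal backslash-b typed by the user stays ordinary text...
literal = rtf_escape(r"I typed \b here")
# ...while actual bold formatting uses the unescaped control word.
bold = rtf_bold("actually bold")

print(literal)
print(bold)
```

The mistakes I’m describing amount to skipping the escaping step on the way in, or failing to undo it on the way out.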
