I've recently gotten into using tools like grep, wc, cat, etc., because I have to deal with some very large CSV files (>10GB) which aren't quite delimited correctly (for instance, they have occurrences of the delimiter character inside some of the fields).

While working with one of these files, I ran the following command as part of trying to figure out a way to correctly identify which instances of the character are actual delimiters and replace them with some other character:

    grep -v -n --text '-.*-.*-.*-' < Transactions.csv

The regex can probably be done much better, but anyway, what is surprising is that, among others, the above command outputs the following line:

    12345678: 12345678912345 gobbledegook �IDNR: 69 12345.67
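
For comparison, the same check (listing lines that contain fewer than four delimiters) can be sketched by counting fields with awk instead of matching a regex. This is only a rough sketch under assumptions: the function name `lines_with_few_delims` is hypothetical, `-` stands in for the real delimiter as it does in the regex above, and the delimiter is assumed to be a single byte. Running awk under `LC_ALL=C` makes it work on raw bytes, which may help if stray non-UTF-8 bytes such as the one shown as � are involved, though that is not confirmed here.

```shell
# Hypothetical helper (not from the original post): reads lines on stdin and
# prints "lineno: line" for every line with fewer than MIN delimiter bytes.
# LC_ALL=C makes awk process raw bytes rather than multibyte characters.
lines_with_few_delims() {
    delim=$1   # single-byte delimiter, e.g. '-' (assumption: one byte)
    min=$2     # minimum number of delimiters expected per line, e.g. 4
    # NF is the field count, so NF - 1 is the number of delimiters on the line.
    LC_ALL=C awk -F"$delim" -v min="$min" 'NF - 1 < min { print NR ": " $0 }'
}
```

It could then be run as `lines_with_few_delims '-' 4 < Transactions.csv`. Since awk streams the input line by line, this should cope with files larger than memory.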