by Allen Wyatt
Bruce often has to save his Excel worksheets in CSV format for use with other programs. When performing the Save As operation, he noted that there are several different CSV formats listed as possibilities. Bruce is curious about the differences between these CSV formats.
For those unfamiliar with the acronym, CSV is short for “comma-separated values” and refers to a way that data can be saved in a non-Excel format. When you click the down-arrow next to the Save As Type drop-down list in the Save As dialog box, what you see depends on the version of Excel you are using. The version of Excel provided with Office 365 has the largest number of format options, including the largest number of CSV options. (See Figure 1.)
Figure 1. Excel lets you save workbook data in a plethora of formats.
You’ll note that you have four CSV-related formats available, as follows:
- CSV UTF-8 (Comma delimited)
- CSV (Comma delimited)
- CSV (Macintosh)
- CSV (MS-DOS)
There are different CSV formats available because there are different ways of creating CSV files. (Makes sense, huh?) Actually, there are many, many ways of creating CSV files, but Excel supports only these four.
Each format produces slightly different output. For example, the Macintosh format uses a CR (carriage return) as the terminating character for a record or a line, while the Windows-based formats (in essence, the other three) use CR/LF (carriage return/line feed).
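If you are curious which terminator a particular export actually used, you can look at the raw bytes of the file. Here is a minimal Python sketch of that idea. The function name and the sample data are my own inventions for illustration, and the count is naive: it does not try to tell real record terminators apart from line breaks embedded inside quoted cells.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR/LF pairs, lone CRs, and lone LFs in raw file bytes."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf  # CRs that are not part of a CR/LF pair
    lf = data.count(b"\n") - crlf  # LFs that are not part of a CR/LF pair
    return {"CRLF": crlf, "CR": cr, "LF": lf}


# Hypothetical sample data standing in for two exported files.
mac_style = b"name,city\rAna,Lima\r"
windows_style = b"name,city\r\nAna,Lima\r\n"

print(count_line_endings(mac_style))
print(count_line_endings(windows_style))
```

To check a real file, you would read it in binary mode (so nothing translates the line endings for you) and pass the bytes to the function.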
Beyond line endings, the formats differ mainly in which code page each one uses. Code pages have to do with the way in which individual characters are encoded, and they typically come into play if you use extended characters (such as non-English or accented characters) in your data. The code pages used by each format can vary, depending on (1) the version of Excel you are using, (2) which language version of Excel you are using, and (3) how your regional settings are configured. In other words, there is no hard-and-fast rule about what code pages will be used with which CSV format you choose for your export.
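To see why the code page matters, it helps to look at how a single accented character is stored under different encodings. The short Python sketch below does that. The four encodings listed are only common examples of a Unicode, a Windows, a Macintosh, and an MS-DOS encoding; I am not claiming these are the exact code pages your copy of Excel will use, since that depends on the factors just described.

```python
SAMPLE = "é"

# Example encodings only; the code page Excel actually uses may differ.
CANDIDATE_ENCODINGS = ["utf-8", "cp1252", "mac_roman", "cp437"]


def encode_all(text, encodings=CANDIDATE_ENCODINGS):
    """Return a dict mapping each encoding name to the encoded bytes."""
    return {name: text.encode(name) for name in encodings}


for name, raw in encode_all(SAMPLE).items():
    # Show the bytes in hexadecimal so the differences are easy to compare.
    print(f"{name:10} {raw.hex(' ')}")
```

The same letter ends up as a different byte (or pair of bytes) in each case, which is why a program that assumes the wrong code page shows garbage in place of your accented characters.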
Rather than get into the technical weeds about the differences in how the code pages are used, you might want to take a look at this web page, which I found quite helpful. (Warning: The web page gets quite technical in places, and you’ll see programmer frustration with Excel on full display.)
The bottom line is that different formats are provided by Microsoft for different ways of communicating with other, non-Excel programs. If you want to communicate with a different program, you’ll need to have a firm understanding of what that other program expects in the way of CSV formatting, and then choose the format in Excel that best matches what is expected. You may also need to do some testing (making sure your workbook contains a wide variety of data, both regular and extended) to ensure your data export and import work as expected.
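One way to approach that testing is to check whether an exported file, read with the encoding the other program expects, gives back exactly the data you started with. The following Python sketch shows the idea. The function name, the sample rows, and the choice of encodings are all assumptions made for illustration; in practice you would read the bytes from your real exported file and compare against your real data.

```python
import csv
import io


def check_export(raw: bytes, encoding: str, expected_rows) -> bool:
    """Return True if raw decodes with encoding and parses to expected_rows."""
    try:
        text = raw.decode(encoding)
    except UnicodeDecodeError:
        return False  # The bytes are not valid in this encoding at all.
    # newline="" leaves line endings alone so the csv module can handle them.
    rows = list(csv.reader(io.StringIO(text, newline="")))
    return rows == expected_rows


# Hypothetical test data that includes extended (accented) characters.
expected = [["name", "city"], ["José", "Málaga"]]
raw = "name,city\r\nJosé,Málaga\r\n".encode("cp1252")

for candidate in ["cp1252", "utf-8", "cp437"]:
    print(candidate, check_export(raw, candidate, expected))
```

Note that a wrong encoding does not always raise an error; sometimes the file decodes "successfully" into the wrong characters. That is why the sketch compares the parsed rows against the expected data rather than relying on the decode step alone.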
There is also one other tidbit that I’ve found helpful—don’t store your workbook ONLY in CSV format. Instead, save your “master copy” in Excel’s native format, and only use Save As to put it into your desired CSV format as you are preparing the file for the non-Excel program.