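Notes on CSV encodings and the BOM
CSV formats
Each file was converted to the target encoding in Notepad++ on Windows and saved; the dumps below show the first bytes of each result as printed by hexdump.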
Field names in double quotes
Encoding of d: 64
00000000 22 64 61 74 61 49 44 22 2c 22 65 76 65 6e 74 49 |"dataID","eventI|
00000010 44 22 2c 22 65 76 65 6e 74 44 61 74 65 22 2c 22 |D","eventDate","|
00000020 73 65 61 73 6f 6e 22 2c 22 79 65 61 72 22 2c 22 |season","year","|
Field names without double quotes, no BOM
Encoding of d: 64
00000000 64 61 74 61 49 44 2c 65 76 65 6e 74 49 44 2c 65 |dataID,eventID,e|
00000010 76 65 6e 74 44 61 74 65 2c 73 65 61 73 6f 6e 2c |ventDate,season,|
00000020 79 65 61 72 2c 72 65 67 69 6f 6e 2c 6c 6f 63 61 |year,region,loca|
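The quoted and unquoted header rows in the two dumps above can be reproduced with Python's csv module. This is only a minimal sketch: the column list is limited to the five names visible at the start of the dumps, and the helper name header_bytes is made up for illustration.

```python
import csv
import io

# Column names taken from the start of the dumps above.
FIELDS = ["dataID", "eventID", "eventDate", "season", "year"]

def header_bytes(quoting):
    """Return the header row as ASCII bytes, written with the given csv quoting mode."""
    buf = io.StringIO(newline="")
    csv.writer(buf, quoting=quoting, lineterminator="\n").writerow(FIELDS)
    return buf.getvalue().encode("ascii")

quoted = header_bytes(csv.QUOTE_ALL)        # every field wrapped in double quotes
unquoted = header_bytes(csv.QUOTE_MINIMAL)  # quotes only where a field needs them

print(quoted[:16].hex(" "))    # 22 64 61 74 61 49 44 22 2c 22 65 76 65 6e 74 49
print(unquoted[:16].hex(" "))  # 64 61 74 61 49 44 2c 65 76 65 6e 74 49 44 2c 65
```

The two printed lines match the first row of each dump, so the only difference between the files is the 22 (double quote) bytes around each field.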
Converted to UTF-8 with BOM
Starts with: EF BB BF
Encoding of d: 64 (same as ASCII)
00000000 ef bb bf 22 64 61 74 61 49 44 22 2c 22 65 76 65 |..."dataID","eve|
00000010 6e 74 49 44 22 2c 22 65 76 65 6e 74 44 61 74 65 |ntID","eventDate|
00000020 22 2c 22 73 65 61 73 6f 6e 22 2c 22 79 65 61 72 |","season","year|
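When reading such a file, the three BOM bytes matter. The sketch below decodes the same bytes once as plain utf-8 and once as utf-8-sig. It assumes a header shortened to two columns, and first_header is a hypothetical helper, not part of any library.

```python
import codecs
import csv
import io

# First bytes of the UTF-8 BOM dump above: BOM, then a quoted header
# (shortened to two columns for this example).
raw = codecs.BOM_UTF8 + b'"dataID","eventID"\n'

def first_header(data, encoding):
    """Decode the bytes and return the first column name of the header row."""
    text = data.decode(encoding)
    return next(csv.reader(io.StringIO(text, newline="")))[0]

plain = first_header(raw, "utf-8")         # BOM is kept as the character U+FEFF
stripped = first_header(raw, "utf-8-sig")  # BOM is dropped while decoding

print(repr(plain))
print(repr(stripped))
```

With plain utf-8 the first column name starts with U+FEFF, so a lookup by "dataID" would fail; utf-8-sig returns the clean name and also works on files that have no BOM.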
Converted to UTF-16 LE with BOM (little endian)
Starts with: FF FE (11111111 11111110)
Encoding of d: 64 00
00000000 ff fe 22 00 64 00 61 00 74 00 61 00 49 00 44 00 |..".d.a.t.a.I.D.|
00000010 22 00 2c 00 22 00 65 00 76 00 65 00 6e 00 74 00 |".,.".e.v.e.n.t.|
00000020 49 00 44 00 22 00 2c 00 22 00 65 00 76 00 65 00 |I.D.".,.".e.v.e.|
Converted to UTF-16 BE with BOM (big endian)
Starts with: FE FF (11111110 11111111)
Encoding of d: 00 64
00000000 fe ff 00 22 00 64 00 61 00 74 00 61 00 49 00 44 |...".d.a.t.a.I.D|
00000010 00 22 00 2c 00 22 00 65 00 76 00 65 00 6e 00 74 |.".,.".e.v.e.n.t|
00000020 00 49 00 44 00 22 00 2c 00 22 00 65 00 76 00 65 |.I.D.".,.".e.v.e|
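The byte patterns in the dumps above are enough to guess a file's encoding from its first bytes. The following is a rough sketch, not a complete detector: sniff_encoding and the BOMS table are names invented here, only the three BOMs shown above are handled, and UTF-32 (whose little-endian BOM also begins with FF FE) is deliberately left out.

```python
import codecs

# BOM byte sequences paired with a Python codec that consumes the BOM.
# The generic "utf-16" codec reads the BOM to pick the byte order.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),   # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16"),  # FE FF
]

def sniff_encoding(head, default="utf-8"):
    """Guess a codec name from the leading bytes; fall back to default if no BOM."""
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return default

# How the letter d is stored in each encoding (BOM not included).
for codec in ("ascii", "utf-8", "utf-16-le", "utf-16-be"):
    print(codec, "d".encode(codec).hex(" "))

# First bytes of the UTF-16 LE and BE dumps above.
le_head = bytes.fromhex("ff fe 22 00 64 00")
be_head = bytes.fromhex("fe ff 00 22 00 64")
print(le_head.decode(sniff_encoding(le_head)))  # "d
print(be_head.decode(sniff_encoding(be_head)))  # "d
```

The loop prints 64 for ascii and utf-8, 64 00 for utf-16-le, and 00 64 for utf-16-be, matching the values noted in each section.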
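Other notes:
- CSV files exported from LibreOffice or Google Spreadsheet have no double quotes around fields by default, and no BOM.
(Exports from the various Excel versions will be added when time permits.)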