Blob charset for a CSV file

Time: 2019-05-11 16:34:36

Tags: javascript character-encoding blob

I want to create a CSV file using a Blob. The file should be ANSI-encoded, but this does not work:

var blob = new Blob(["\ufeff", csvFile], { type: 'text/csv;charset=windows-1252;' });

The file is always created with UTF-8 encoding.

1 answer:

Answer 0: (score: 0)

Yes, a USVString passed to the Blob constructor is automatically converted to UTF-8.

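The type parameter is only used by resource fetchers as the Content-Type that corresponds to the passed data. Its charset parameter is just a hint about how the content is encoded; it does not change the content at all.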

One method that does use this charset information is FileReader.readAsText(), when no encoding argument is passed to it.

const data = "é";
// Same string, two different charset hints in the MIME type.
const blob1 = new Blob([data], {type: 'text/plain;charset=windows-1252'});
const blob2 = new Blob([data], {type: 'text/plain'});

// readAsText() honors the charset hint, so only the decoding differs.
readAsText(blob1).then(res => console.log('read as CP-1252:', res));
readAsText(blob2).then(res => console.log('read as UTF-8:', res));

// The stored bytes are identical: the UTF-8 encoding of "é".
readAsArrayBuffer(blob1).then(res => console.log("actual bytes content CP-1252:", ...new Uint8Array(res)));
readAsArrayBuffer(blob2).then(res => console.log("actual bytes content UTF-8:", ...new Uint8Array(res)));


function readAsText(blob) {
  return new Promise(res => {
    const reader = new FileReader();
    reader.onload = e => res(reader.result);
    reader.readAsText(blob);
  });
}
function readAsArrayBuffer(blob) {
  return new Promise(res => {
    const reader = new FileReader();
    reader.onload = e => res(reader.result);
    reader.readAsArrayBuffer(blob);
  });
}

As you can see, both Blobs actually hold exactly the same bytes: the UTF-8 encoding of é (195 169).
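To see why the two text reads differ even though the bytes are the same, the decoding step can be reproduced without a Blob. This is only a minimal sketch, assuming a runtime whose TextDecoder supports the windows-1252 label (browsers do; Node.js needs its default full-ICU build). The variable names are illustrative and not part of the original answer.

```javascript
// The two bytes that UTF-8 uses for "é" (U+00E9).
const utf8Bytes = new TextEncoder().encode('é');
console.log(...utf8Bytes); // 195 169

// Decode the very same bytes under each charset assumption.
const decodedAsUtf8 = new TextDecoder('utf-8').decode(utf8Bytes);
const decodedAsCp1252 = new TextDecoder('windows-1252').decode(utf8Bytes);

console.log(decodedAsUtf8);   // é
console.log(decodedAsCp1252); // Ã©
```

The charset hint therefore only changes how a reader interprets the bytes; nothing is re-encoded when the Blob is created.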


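So, depending on your needs, you have to build the Blob from a TypedArray that already contains the data encoded as CP-1252. The TextEncoder API used to offer an option to encode USVStrings to arbitrary encodings, but it has since been removed from the specification and from browsers.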
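To make the idea concrete: because CP-1252 is a single-byte code page, the encoding step can in principle be written as a table lookup. The following is a rough hand-written sketch and not the method used in this answer; encodeWindows1252 and makeAnsiCsvBlob are hypothetical helper names, and replacing unmappable characters with "?" is an assumption of this sketch. A maintained library remains the more robust choice.

```javascript
// Unicode code points for Windows-1252 bytes 0x80..0x9F (index 0 is byte 0x80).
// The five bytes the code page leaves unassigned map to the matching C1 control.
const CP1252_HIGH = [
  0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
  0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
  0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
  0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178,
];

// Reverse lookup: code point -> byte in the 0x80..0x9F range.
const CP1252_BYTE = new Map(CP1252_HIGH.map((cp, i) => [cp, 0x80 + i]));

// Hypothetical helper: encode a string to Windows-1252 bytes.
// Characters outside the code page become `fallback` ("?" by default).
function encodeWindows1252(str, fallback = 0x3F) {
  const out = [];
  for (const ch of str) { // iterates by code point, not by UTF-16 unit
    const cp = ch.codePointAt(0);
    if (cp < 0x80 || (cp >= 0xA0 && cp <= 0xFF)) {
      out.push(cp); // ASCII and Latin-1 map to the same byte value
    } else {
      out.push(CP1252_BYTE.has(cp) ? CP1252_BYTE.get(cp) : fallback);
    }
  }
  return Uint8Array.from(out);
}

// Hypothetical helper: wrap already-encoded bytes in a Blob.
function makeAnsiCsvBlob(csvText) {
  return new Blob([encodeWindows1252(csvText)], {
    type: 'text/csv;charset=windows-1252',
  });
}

console.log(...encodeWindows1252('é;€')); // 233 59 128
```

Note that "\ufeff" has no CP-1252 representation, so a byte order mark like the one prepended in the question should be left out when producing an ANSI file; with this sketch it would turn into a literal "?".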

You will need a library to perform the conversion. Here I use inexorabletash/text-encoding:

// Encode to CP-1252 with the polyfill; the non-standard flag enables legacy encodings.
const data = new TextEncoder('windows-1252', {
  NONSTANDARD_allowLegacyEncoding: true
}).encode("é"); // `data` is now a Uint8Array
const blob1 = new Blob([data], {type: 'text/plain;charset=windows-1252'});
const blob2 = new Blob([data], {type: 'text/plain'});

readAsText(blob1).then(res => console.log('read as CP-1252:', res));
readAsText(blob2).then(res => console.log('read as UTF-8:', res));

readAsArrayBuffer(blob1).then(res => console.log("actual bytes content CP-1252:", ...new Uint8Array(res)));
readAsArrayBuffer(blob2).then(res => console.log("actual bytes content UTF-8:", ...new Uint8Array(res)));


function readAsText(blob) {
  return new Promise(res => {
    const reader = new FileReader();
    reader.onload = e => res(reader.result);
    reader.readAsText(blob);
  });
}
function readAsArrayBuffer(blob) {
  return new Promise(res => {
    const reader = new FileReader();
    reader.onload = e => res(reader.result);
    reader.readAsArrayBuffer(blob);
  });
}
<script>
  // force installation of the polyfill...
  window.TextEncoder = null;
</script>
<script src="https://cdn.jsdelivr.net/gh/inexorabletash/text-encoding/lib/encoding-indexes.js"></script>
<script src="https://cdn.jsdelivr.net/gh/inexorabletash/text-encoding/lib/encoding.js"></script>