javascript - Converting between strings and ArrayBuffers

ID : 8438

viewed : 230

Tags : javascript, serialization, arraybuffer, typed-arrays





Top 5 Answers for javascript - Converting between strings and ArrayBuffers


92

Update 2016 - five years on there are now new methods in the specs (see support below) to convert between strings and typed arrays using proper encoding.

TextEncoder

TextEncoder is described as follows:

The TextEncoder interface represents an encoder for a specific method, that is a specific character encoding, like utf-8, iso-8859-2, koi8, cp1261, gbk, ... An encoder takes a stream of code points as input and emits a stream of bytes.

Change note since the above was written: (ibid.)

Note: Firefox, Chrome and Opera used to have support for encoding types other than utf-8 (such as utf-16, iso-8859-2, koi8, cp1261, and gbk). As of Firefox 48 [...], Chrome 54 [...] and Opera 41, no other encoding types are available other than utf-8, in order to match the spec.*

*) Updated specs (W3) and here (whatwg).

After creating an instance of TextEncoder, its encode() method takes a string and returns a Uint8Array holding the UTF-8 encoded bytes:

    if (!("TextEncoder" in window))
        alert("Sorry, this browser does not support TextEncoder...");

    var enc = new TextEncoder(); // always utf-8
    console.log(enc.encode("This is a string converted to a Uint8Array"));

You can then use the .buffer property of the resulting Uint8Array to wrap the underlying ArrayBuffer in a different view if needed.
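As a minimal sketch of re-viewing the encoded bytes, a DataView can be placed over the same buffer. The sample string and the variable names here are only illustrative, not part of the original answer:

```javascript
// Encode a string, then look at the same bytes through a different view.
const encodedBytes = new TextEncoder().encode("ABCD"); // bytes 65, 66, 67, 68

// Pass byteOffset/byteLength so the new view covers exactly the encoded
// bytes, even if the Uint8Array does not span its whole buffer.
const dataView = new DataView(
  encodedBytes.buffer,
  encodedBytes.byteOffset,
  encodedBytes.byteLength
);

// Read the same bytes as big-endian integers.
const firstTwoBE = dataView.getUint16(0, false); // 0x4142
const allFourBE = dataView.getUint32(0, false);  // 0x41424344

console.log(firstTwoBE.toString(16), allFourBE.toString(16));
```

Passing the offset and length explicitly is a defensive habit: a typed array is not guaranteed to start at byte 0 of its buffer.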

Just keep in mind how the characters in the string map onto the encoding: for example, characters outside the ASCII range will be encoded in UTF-8 as two or more bytes instead of one.
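To make the byte counts concrete, here is a small sketch that measures the UTF-8 length of a few sample characters. The helper name utf8Length is hypothetical and the characters are arbitrary picks:

```javascript
const utf8Encoder = new TextEncoder();

// Number of bytes a string occupies once encoded as UTF-8.
function utf8Length(str) {
  return utf8Encoder.encode(str).length;
}

const utf8Lengths = {
  ascii: utf8Length("a"),         // U+0061, ASCII
  latin: utf8Length("\u00e9"),    // U+00E9, e with acute accent
  euro: utf8Length("\u20ac"),     // U+20AC, euro sign
  emoji: utf8Length("\u{1F600}"), // U+1F600, outside the BMP
};

console.log(utf8Lengths); // { ascii: 1, latin: 2, euro: 3, emoji: 4 }
```

So the size of the resulting Uint8Array is generally not the same as the string's length property.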

For general use, keep in mind that JavaScript strings themselves are UTF-16, which is also what string stores such as localStorage hold.

TextDecoder

Likewise, the opposite process uses the TextDecoder:

The TextDecoder interface represents a decoder for a specific method, that is a specific character encoding, like utf-8, iso-8859-2, koi8, cp1261, gbk, ... A decoder takes a stream of bytes as input and emits a stream of code points.

All available decoding types can be found here.

    if (!("TextDecoder" in window))
        alert("Sorry, this browser does not support TextDecoder...");

    var enc = new TextDecoder("utf-8");
    var arr = new Uint8Array([84,104,105,115,32,105,115,32,97,32,85,105,110,116,
                              56,65,114,114,97,121,32,99,111,110,118,101,114,116,
                              101,100,32,116,111,32,97,32,115,116,114,105,110,103]);
    console.log(enc.decode(arr));
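One thing worth knowing when decoding is what happens with bytes that are not valid in the chosen encoding. The following sketch (the byte values are made up for illustration) contrasts the default behaviour with the fatal option of the TextDecoder constructor:

```javascript
// "Hi" followed by 0xFF, a byte that can never occur in valid UTF-8.
const invalidBytes = new Uint8Array([0x48, 0x69, 0xff]);

// Default mode: invalid input is replaced with U+FFFD.
const lenientResult = new TextDecoder("utf-8").decode(invalidBytes);

// Fatal mode: invalid input makes decode() throw instead.
let strictError = null;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(invalidBytes);
} catch (err) {
  strictError = err;
}

console.log(JSON.stringify(lenientResult));
console.log(strictError instanceof TypeError);
```

Which mode is appropriate depends on whether silently substituted characters are acceptable for your data.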

The MDN StringView library

An alternative to these is to use the StringView library (licensed as lgpl-3.0), whose goals are:

  • to create a C-like interface for strings (i.e., an array of character codes — an ArrayBufferView in JavaScript) based upon the JavaScript ArrayBuffer interface
  • to create a highly extensible library that anyone can extend by adding methods to the object StringView.prototype
  • to create a collection of methods for such string-like objects (since now: stringViews) which work strictly on arrays of numbers rather than on creating new immutable JavaScript strings
  • to work with Unicode encodings other than JavaScript's default UTF-16 DOMStrings

This gives much more flexibility. However, it requires us to link to or embed the library, whereas TextEncoder/TextDecoder are built into modern browsers.

Support

As of July/2018:

TextEncoder (Experimental, On Standard Track)

 Chrome    | Edge      | Firefox   | IE        | Opera     | Safari
 ----------|-----------|-----------|-----------|-----------|-----------
     38    |     ?     |    19°    |     -     |     25    |     -

 Chrome/A  | Edge/mob  | Firefox/A | Opera/A   |Safari/iOS | Webview/A
 ----------|-----------|-----------|-----------|-----------|-----------
     38    |     ?     |    19°    |     ?     |     -     |     38

 °) 18: Firefox 18 implemented an earlier and slightly different version of the specification.

 WEB WORKER SUPPORT:

 Experimental, On Standard Track

 Chrome    | Edge      | Firefox   | IE        | Opera     | Safari
 ----------|-----------|-----------|-----------|-----------|-----------
     38    |     ?     |     20    |     -     |     25    |     -

 Chrome/A  | Edge/mob  | Firefox/A | Opera/A   |Safari/iOS | Webview/A
 ----------|-----------|-----------|-----------|-----------|-----------
     38    |     ?     |     20    |     ?     |     -     |     38

 Data from MDN - `npm i -g mdncomp` by epistemex


87

Although Dennis's and gengkev's solutions using Blob/FileReader work, I wouldn't suggest taking that approach. It is an async approach to a simple problem, and it is much slower than a direct solution. I've made a post on html5rocks with a simpler (and much faster) solution: http://updates.html5rocks.com/2012/06/How-to-convert-ArrayBuffer-to-and-from-String

And the solution is:

    function ab2str(buf) {
      return String.fromCharCode.apply(null, new Uint16Array(buf));
    }

    function str2ab(str) {
      var buf = new ArrayBuffer(str.length*2); // 2 bytes for each char
      var bufView = new Uint16Array(buf);
      for (var i=0, strLen=str.length; i<strLen; i++) {
        bufView[i] = str.charCodeAt(i);
      }
      return buf;
    }
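For reference, a quick round trip with these two functions might look like the sketch below. The functions are repeated so the snippet runs on its own, and the sample string is arbitrary. It also shows one limitation: a Uint16Array view needs a buffer with an even byte length.

```javascript
// Same technique as above: one 16-bit code unit per character.
function str2ab(str) {
  const buf = new ArrayBuffer(str.length * 2); // 2 bytes per UTF-16 code unit
  const bufView = new Uint16Array(buf);
  for (let i = 0; i < str.length; i++) {
    bufView[i] = str.charCodeAt(i);
  }
  return buf;
}

function ab2str(buf) {
  return String.fromCharCode.apply(null, new Uint16Array(buf));
}

const originalText = "h\u00e9llo \u20ac"; // 7 UTF-16 code units
const packedBuffer = str2ab(originalText);
const roundTripped = ab2str(packedBuffer);

console.log(packedBuffer.byteLength, roundTripped === originalText);

// A buffer with an odd byte length cannot be viewed as 16-bit units.
let oddLengthError = null;
try {
  ab2str(new ArrayBuffer(3));
} catch (err) {
  oddLengthError = err;
}
console.log(oddLengthError instanceof RangeError);
```

Note that the bytes are stored in the platform's native byte order, so this format is best suited to data that stays on the same machine.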

EDIT:

The Encoding API helps solve the string conversion problem. Check out the response from Jeff Posnick on Html5Rocks.com to the original article above.

Excerpt:

The Encoding API makes it simple to translate between raw bytes and native JavaScript strings, regardless of which of the many standard encodings you need to work with.

    <pre id="results"></pre>

    <script>
      if ('TextDecoder' in window) {
        // The local files to be fetched, mapped to the encoding that they're using.
        var filesToEncoding = {
          'utf8.bin': 'utf-8',
          'utf16le.bin': 'utf-16le',
          'macintosh.bin': 'macintosh'
        };

        Object.keys(filesToEncoding).forEach(function(file) {
          fetchAndDecode(file, filesToEncoding[file]);
        });
      } else {
        document.querySelector('#results').textContent = 'Your browser does not support the Encoding API.';
      }

      // Use XHR to fetch `file` and interpret its contents as being encoded with `encoding`.
      function fetchAndDecode(file, encoding) {
        var xhr = new XMLHttpRequest();
        xhr.open('GET', file);
        // Using 'arraybuffer' as the responseType ensures that the raw data is returned,
        // rather than letting XMLHttpRequest decode the data first.
        xhr.responseType = 'arraybuffer';
        xhr.onload = function() {
          if (this.status == 200) {
            // The decode() method takes a DataView as a parameter, which is a wrapper on top of the ArrayBuffer.
            var dataView = new DataView(this.response);
            // The TextDecoder interface is documented at http://encoding.spec.whatwg.org/#interface-textdecoder
            var decoder = new TextDecoder(encoding);
            var decodedString = decoder.decode(dataView);
            // Add the decoded file's text to the <pre> element on the page.
            document.querySelector('#results').textContent += decodedString + '\n';
          } else {
            console.error('Error while requesting', file, this);
          }
        };
        xhr.send();
      }
    </script>


72

You can use TextEncoder and TextDecoder from the Encoding standard, which is polyfilled by the stringencoding library, to convert strings to and from ArrayBuffers:

    var uint8array = new TextEncoder().encode(string);
    var string = new TextDecoder(encoding).decode(uint8array);

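If you need an actual ArrayBuffer rather than a Uint8Array, the two calls can be wrapped in a pair of helpers. This is only a sketch: the names stringToArrayBuffer and arrayBufferToString are hypothetical, and it assumes a native or polyfilled TextEncoder/TextDecoder is available:

```javascript
// String -> ArrayBuffer (UTF-8 bytes).
function stringToArrayBuffer(str) {
  const bytes = new TextEncoder().encode(str);
  // slice() copies exactly the encoded bytes into a standalone ArrayBuffer.
  return bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength);
}

// ArrayBuffer -> string; decode() accepts an ArrayBuffer or any view on one.
function arrayBufferToString(buffer, encoding = "utf-8") {
  return new TextDecoder(encoding).decode(buffer);
}

const sampleText = "caf\u00e9";
const sampleBuffer = stringToArrayBuffer(sampleText);
const decodedText = arrayBufferToString(sampleBuffer);

console.log(sampleBuffer.byteLength, decodedText === sampleText);
```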

70

Blob is much slower than String.fromCharCode.apply(null, array), but that call fails if the array buffer gets too big, because every element is passed as a separate argument. The best solution I have found is to use String.fromCharCode.apply(null, array) and split the work into chunks that won't blow the stack but are still faster than converting a single char at a time.

The best solution for large array buffers is:

    function arrayBufferToString(buffer){

        var bufView = new Uint16Array(buffer);
        var length = bufView.length;
        var result = '';
        var addition = Math.pow(2,16)-1;

        for(var i = 0;i<length;i+=addition){

            if(i + addition > length){
                addition = length - i;
            }
            result += String.fromCharCode.apply(null, bufView.subarray(i,i+addition));
        }

        return result;

    }

I found this to be about 20 times faster than using Blob. It also works for large strings of over 100 MB.
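A compact variant of the same chunking idea, with a quick check on an input larger than one chunk, could look like this. The function name and the chunk size of 0x8000 are assumptions chosen for illustration; the safe upper limit on arguments differs between engines:

```javascript
// Convert 16-bit code units to a string in fixed-size chunks so that
// String.fromCharCode never receives too many arguments at once.
function codeUnitsToString(codeUnits, chunkSize = 0x8000) {
  let result = "";
  for (let i = 0; i < codeUnits.length; i += chunkSize) {
    // subarray() clamps the end index, so the last chunk may be shorter.
    result += String.fromCharCode.apply(null, codeUnits.subarray(i, i + chunkSize));
  }
  return result;
}

// 200,000 code units: several chunks' worth of "A" with a final "Z".
const bigUnits = new Uint16Array(200000).fill(0x41);
bigUnits[bigUnits.length - 1] = 0x5a;

const bigString = codeUnitsToString(bigUnits);
console.log(bigString.length, bigString.slice(-3));
```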


58

Based on gengkev's answer, I created functions for both directions, because BlobBuilder can handle both String and ArrayBuffer:

    function string2ArrayBuffer(string, callback) {
        var bb = new BlobBuilder();
        bb.append(string);
        var f = new FileReader();
        f.onload = function(e) {
            callback(e.target.result);
        }
        f.readAsArrayBuffer(bb.getBlob());
    }

and

    function arrayBuffer2String(buf, callback) {
        var bb = new BlobBuilder();
        bb.append(buf);
        var f = new FileReader();
        f.onload = function(e) {
            callback(e.target.result)
        }
        f.readAsText(bb.getBlob());
    }

A simple test:

    string2ArrayBuffer("abc",
        function (buf) {
            var uInt8 = new Uint8Array(buf);
            console.log(uInt8); // Returns `Uint8Array { 0=97, 1=98, 2=99}`

            arrayBuffer2String(buf,
                function (string) {
                    console.log(string); // returns "abc"
                }
            )
        }
    )
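BlobBuilder has since been superseded by the Blob constructor, so on a current runtime the same idea could be sketched as below. This is not the original author's code: the function names are hypothetical, and it assumes an environment where Blob provides the promise-based arrayBuffer() and text() methods:

```javascript
// String -> ArrayBuffer via a Blob (the string is stored as UTF-8).
function stringToArrayBufferViaBlob(str) {
  return new Blob([str]).arrayBuffer();
}

// ArrayBuffer -> string via a Blob (the bytes are read back as UTF-8).
function arrayBufferToStringViaBlob(buf) {
  return new Blob([buf]).text();
}

// Round trip, mirroring the simple test above but with promises.
const blobRoundTrip = stringToArrayBufferViaBlob("abc").then(async (buf) => {
  const bytes = new Uint8Array(buf);
  const text = await arrayBufferToStringViaBlob(buf);
  return { bytes, text };
});

blobRoundTrip.then(({ bytes, text }) => console.log(Array.from(bytes), text));
```

It is still asynchronous, so the earlier point about Blob being a roundabout way to solve a simple problem applies here as well.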
