Passing a JavaScript string to a Rust function compiled to WebAssembly

Time: 2018-02-27 17:29:07

Tags: javascript string rust webassembly

I have this simple Rust function:

#[no_mangle]
pub fn compute(operator: &str, n1: i32, n2: i32) -> i32 {
    match operator {
        "SUM" => n1 + n2,
        "DIFF" => n1 - n2,
        "MULT" => n1 * n2,
        "DIV" => n1 / n2,
        _ => 0
    }
}

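I am successfully compiling this to WebAssembly, but I can't manage to pass the operator parameter from JS to Rust.

The JS line that calls the Rust function looks like this: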

instance.exports.compute(operator, n1, n2);

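operator is a JS String, and n1 and n2 are JS Numbers.

n1 and n2 are passed correctly and can be read inside the compiled function, so I guess the problem is how I pass the string. I imagine it is passed as a pointer from JS to WebAssembly, but I can't find evidence or material about how this works.

I am not using Emscripten and would like to keep it standalone (compilation target wasm32-unknown-unknown), but I see that they wrap their compiled functions in Module.cwrap; maybe that could help?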

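3 answers:

Answer 0 (score: 10)

The easiest and most idiomatic solution

Most people should use wasm-bindgen, which makes this whole process much simpler!

Low-level manual implementation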

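To transfer string data between JavaScript and Rust, you need to decide:

  1. The encoding of the text: UTF-8 (Rust native) or UTF-16 (JS native).
  2. Who will own the memory buffer: JS (the caller) or Rust (the callee).
  3. How to represent the string's data and length: NUL-terminated (C-style) or with a distinct length (Rust-style).
  4. How to communicate the data and length, if they are separate.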
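
The first and third choices can be made concrete with a small sketch. The snippet below is only an illustration, and all of its function names are made up here rather than taken from the answer. It shows that the same text can have different lengths in UTF-8 bytes and UTF-16 code units, and what the two length conventions look like as raw bytes.

```rust
use std::ffi::CString;

/// Number of bytes the text occupies as UTF-8 (Rust's native encoding).
pub fn utf8_len(text: &str) -> usize {
    text.len()
}

/// Number of 16-bit code units the text occupies as UTF-16
/// (this is what a JS string's `length` property reports).
pub fn utf16_len(text: &str) -> usize {
    text.encode_utf16().count()
}

/// C-style representation: the bytes followed by a single NUL terminator.
/// Returns `None` if the text itself contains a NUL byte, which this
/// representation cannot express.
pub fn nul_terminated(text: &str) -> Option<Vec<u8>> {
    CString::new(text).ok().map(|c| c.into_bytes_with_nul())
}

/// Rust-style representation: the bytes and the length travel separately.
pub fn data_and_len(text: &str) -> (&[u8], usize) {
    (text.as_bytes(), text.len())
}
```

For the ASCII operator names in the question the two encodings agree on length, so the choice mostly matters once arbitrary user text is passed.
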
Common setup

It's important to build C dylibs for WASM to help them be smaller in size.

Cargo.toml

    [package]
    name = "quick-maths"
    version = "0.1.0"
    authors = ["An Devloper <an.devloper@example.com>"]
    
    [lib]
    crate-type = ["cdylib"]
    

.cargo/config

    [target.wasm32-unknown-unknown]
    rustflags = [
        "-C", "link-args=--import-memory",
    ]
    

package.json

    {
      "name": "quick-maths",
      "version": "0.1.0",
      "main": "index.js",
      "author": "An Devloper <an.devloper@example.com>",
      "license": "MIT",
      "scripts": {
        "example": "node ./index.js"
      },
      "dependencies": {
        "fs-extra": "^8.0.1",
        "text-encoding": "^0.7.0"
      }
    }
    

I'm using NodeJS 12.1.0.

Execution

    $ rustup component add rust-std --target wasm32-unknown-unknown
    $ cargo build --release --target wasm32-unknown-unknown
    

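Solution 1

I decided:

1. To convert JS strings to UTF-8, which means that the TextEncoder JS API is the best fit.
2. The caller should own the memory buffer.
3. To have the length be a separate value.
4. Another struct and allocation should be made to hold the pointer and length.

src/lib.rs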

      // A struct with a known memory layout that we can pass string information in
      #[repr(C)]
      pub struct JsInteropString {
          data: *const u8,
          len: usize,
      }
      
      // Our FFI shim function    
      #[no_mangle]
      pub unsafe extern "C" fn compute(s: *const JsInteropString, n1: i32, n2: i32) -> i32 {
          // Check for NULL (see corresponding comment in JS)
          let s = match s.as_ref() {
              Some(s) => s,
              None => return -1,
          };
      
          // Convert the pointer and length to a `&[u8]`.
          let data = std::slice::from_raw_parts(s.data, s.len);
      
          // Convert the `&[u8]` to a `&str`    
          match std::str::from_utf8(data) {
              Ok(s) => real_code::compute(s, n1, n2),
              Err(_) => -2,
          }
      }
      
      // I advocate that you keep your interesting code in a different
      // crate for easy development and testing. Have a separate crate
      // with the FFI shims.
      mod real_code {
          pub fn compute(operator: &str, n1: i32, n2: i32) -> i32 {
              match operator {
                  "SUM"  => n1 + n2,
                  "DIFF" => n1 - n2,
                  "MULT" => n1 * n2,
                  "DIV"  => n1 / n2,
                  _ => 0,
              }
          }
      }
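
One way to check the shim's behaviour without going through JavaScript is to call it from ordinary Rust. The sketch below repeats the struct and the shim in condensed form (operator match inlined, `#[no_mangle]` left out) and adds a hypothetical helper, `call_compute`, that builds the `JsInteropString` the way the JS side is expected to. The helper and its name are assumptions for illustration only.

```rust
#[repr(C)]
pub struct JsInteropString {
    data: *const u8,
    len: usize,
}

/// Condensed copy of the FFI shim.
///
/// Safety: if `s` is non-NULL it must point to a valid `JsInteropString`
/// whose `data` points to `len` readable bytes.
pub unsafe extern "C" fn compute(s: *const JsInteropString, n1: i32, n2: i32) -> i32 {
    // A NULL pointer is reported as -1.
    let s = match unsafe { s.as_ref() } {
        Some(s) => s,
        None => return -1,
    };
    let data = unsafe { std::slice::from_raw_parts(s.data, s.len) };
    // Bytes that are not valid UTF-8 are reported as -2.
    match std::str::from_utf8(data) {
        Ok("SUM") => n1 + n2,
        Ok("DIFF") => n1 - n2,
        Ok("MULT") => n1 * n2,
        Ok("DIV") => n1 / n2,
        Ok(_) => 0,
        Err(_) => -2,
    }
}

/// Hypothetical helper: calls the shim with bytes owned by the Rust side.
pub fn call_compute(operator: &[u8], n1: i32, n2: i32) -> i32 {
    let s = JsInteropString {
        data: operator.as_ptr(),
        len: operator.len(),
    };
    // Sound because `s` and `operator` both outlive the call.
    unsafe { compute(&s, n1, n2) }
}
```

Note that -1, -2, and 0 are also legitimate arithmetic results, so a caller cannot always tell an error from an answer; that ambiguity is already present in the shim above.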
      

index.js

      const fs = require('fs-extra');
      const { TextEncoder } = require('text-encoding');
      
      // Allocate some memory.
      const memory = new WebAssembly.Memory({ initial: 20, maximum: 100 });
      
      // Connect these memory regions to the imported module
      const importObject = {
        env: { memory }
      };
      
      // Create an object that handles converting our strings for us
      const memoryManager = (memory) => {
        var base = 0;
      
        // NULL is conventionally at address 0, so we "use up" the first 4
        // bytes of address space to make our lives a bit simpler.
        base += 4;
      
        return {
          encodeString: (jsString) => {
            // Convert the JS String to UTF-8 data
            const encoder = new TextEncoder();
            const encodedString = encoder.encode(jsString);
      
            // Organize memory with space for the JsInteropString at the
            // beginning, followed by the UTF-8 string bytes.
            const asU32 = new Uint32Array(memory.buffer, base, 2);
            const asBytes = new Uint8Array(memory.buffer, asU32.byteOffset + asU32.byteLength, encodedString.length);
      
            // Copy the UTF-8 into the WASM memory.
            asBytes.set(encodedString);
      
            // Assign the data pointer and length values.
            asU32[0] = asBytes.byteOffset;
            asU32[1] = asBytes.length;
      
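            // Update our memory allocator base address for the next call.
            // Assign rather than add (byteOffset is already absolute), and
            // round up to a multiple of 4 so the next Uint32Array stays aligned.
            const originalBase = base;
            base = (asBytes.byteOffset + asBytes.byteLength + 3) & ~3;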
      
            return originalBase;
          }
        };
      };
      
      const myMemory = memoryManager(memory);
      
      fs.readFile('./target/wasm32-unknown-unknown/release/quick_maths.wasm')
        .then(bytes => WebAssembly.instantiate(bytes, importObject))
        .then(({ instance }) => {
          const argString = "MULT";
          const argN1 = 42;
          const argN2 = 100;
      
          const s = myMemory.encodeString(argString);
          const result = instance.exports.compute(s, argN1, argN2);
      
          console.log(result);
        });
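
The layout that `encodeString` writes can be mirrored in plain Rust to see exactly which bytes end up where. In this sketch a byte slice stands in for the WASM linear memory, and the function names (`encode_string`, `decode_string`, `read_u32`) are hypothetical. It assumes the 32-bit, little-endian layout that wasm32 uses for the two `JsInteropString` fields, and it rounds the next free address up to a multiple of 4.

```rust
/// Reads a little-endian u32 from `memory` at `addr`.
fn read_u32(memory: &[u8], addr: usize) -> u32 {
    let mut word = [0u8; 4];
    word.copy_from_slice(&memory[addr..addr + 4]);
    u32::from_le_bytes(word)
}

/// Writes an 8-byte header (data pointer, length) at `base`, followed by
/// the UTF-8 bytes of `text`. Returns the next free address, rounded up
/// to a multiple of 4 so that a following header stays aligned.
pub fn encode_string(memory: &mut [u8], base: usize, text: &str) -> usize {
    let data_ptr = base + 8;
    let bytes = text.as_bytes();
    memory[base..base + 4].copy_from_slice(&(data_ptr as u32).to_le_bytes());
    memory[base + 4..base + 8].copy_from_slice(&(bytes.len() as u32).to_le_bytes());
    memory[data_ptr..data_ptr + bytes.len()].copy_from_slice(bytes);
    (data_ptr + bytes.len() + 3) & !3
}

/// Reads the header at `addr` and returns the string it describes, or
/// `None` if the range is out of bounds or the bytes are not valid UTF-8.
pub fn decode_string(memory: &[u8], addr: usize) -> Option<&str> {
    let ptr = read_u32(memory, addr) as usize;
    let len = read_u32(memory, addr + 4) as usize;
    let bytes = memory.get(ptr..ptr + len)?;
    std::str::from_utf8(bytes).ok()
}
```

With a base of 4, the header for "MULT" occupies bytes 4 to 11 and the text bytes 12 to 15, which matches the two typed-array views created in the JavaScript above.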
      

Execution

      $ yarn run example
      4200
      

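Solution 2

I decided:

1. To convert JS strings to UTF-8, which means that the TextEncoder JS API is the best fit.
2. The module should own the memory buffer.
3. To have the length be a separate value.
4. To use a Box<String> as the underlying data structure. This allows the allocation to be further used by Rust code.

src/lib.rs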

        // Very important to use `transparent` to prevent ABI issues
        #[repr(transparent)]
        pub struct JsInteropString(*mut String);
        
        impl JsInteropString {
            // Unsafe because we create a string and say it's full of valid
            // UTF-8 data, but it isn't!
            unsafe fn with_capacity(cap: usize) -> Self {
                let mut d = Vec::with_capacity(cap);
                d.set_len(cap);
                let s = Box::new(String::from_utf8_unchecked(d));
                JsInteropString(Box::into_raw(s))
            }
        
            unsafe fn as_string(&self) -> &String {
                &*self.0
            }
        
            unsafe fn as_mut_string(&mut self) -> &mut String {
                &mut *self.0
            }
        
            unsafe fn into_boxed_string(self) -> Box<String> {
                Box::from_raw(self.0)
            }
        
            unsafe fn as_mut_ptr(&mut self) -> *mut u8 {
                self.as_mut_string().as_mut_vec().as_mut_ptr()
            }
        }
        
        #[no_mangle]
        pub unsafe extern "C" fn stringPrepare(cap: usize) -> JsInteropString {
            JsInteropString::with_capacity(cap)
        }
        
        #[no_mangle]
        pub unsafe extern "C" fn stringData(mut s: JsInteropString) -> *mut u8 {
            s.as_mut_ptr()
        }
        
        #[no_mangle]
        pub unsafe extern "C" fn stringLen(s: JsInteropString) -> usize {
            s.as_string().len()
        }
        
        #[no_mangle]
        pub unsafe extern "C" fn compute(s: JsInteropString, n1: i32, n2: i32) -> i32 {
            let s = s.into_boxed_string();
            real_code::compute(&s, n1, n2)
        }
        
        mod real_code {
            pub fn compute(operator: &str, n1: i32, n2: i32) -> i32 {
                match operator {
                    "SUM"  => n1 + n2,
                    "DIFF" => n1 - n2,
                    "MULT" => n1 * n2,
                    "DIV"  => n1 / n2,
                    _ => 0,
                }
            }
        }
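
The ownership hand-off in this solution (Rust allocates, JS fills in the bytes, Rust takes the allocation back and frees it) can be exercised natively. The sketch below is a simplified stand-in, not the answer's code: the names `string_prepare`, `string_data`, `consume`, and `copy_in` are hypothetical, it uses a raw `*mut Vec<u8>` handle instead of `JsInteropString`, and it zero-fills the buffer and validates the bytes instead of calling `set_len` and `from_utf8_unchecked`.

```rust
/// Allocates a zero-filled buffer of `cap` bytes and leaks it as a raw handle.
pub fn string_prepare(cap: usize) -> *mut Vec<u8> {
    Box::into_raw(Box::new(vec![0u8; cap]))
}

/// Returns a pointer to the first byte of the buffer behind `handle`.
///
/// Safety: `handle` must come from `string_prepare` and must not have
/// been consumed yet.
pub unsafe fn string_data(handle: *mut Vec<u8>) -> *mut u8 {
    unsafe { (*handle).as_mut_ptr() }
}

/// Takes ownership back and validates the bytes as UTF-8. The allocation
/// is freed when the returned `String` is dropped, or immediately if
/// validation fails.
///
/// Safety: `handle` must come from `string_prepare` and be consumed once.
pub unsafe fn consume(handle: *mut Vec<u8>) -> Option<String> {
    let bytes = unsafe { Box::from_raw(handle) };
    String::from_utf8(*bytes).ok()
}

/// Hypothetical stand-in for the JS side: copies `text` into a new buffer.
pub fn copy_in(text: &str) -> *mut Vec<u8> {
    let handle = string_prepare(text.len());
    unsafe {
        std::ptr::copy_nonoverlapping(text.as_ptr(), string_data(handle), text.len());
    }
    handle
}
```

As in the answer's `compute`, consuming the handle is what releases the memory, so each handle must be passed back exactly once.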
        

index.js

        const fs = require('fs-extra');
        const { TextEncoder } = require('text-encoding');
        
        class QuickMaths {
          constructor(instance) {
            this.instance = instance;
          }
        
          difference(n1, n2) {
            const { compute } = this.instance.exports;
            const op = this.copyJsStringToRust("DIFF");
            return compute(op, n1, n2);
          }
        
          copyJsStringToRust(jsString) {
            const { memory, stringPrepare, stringData, stringLen } = this.instance.exports;
        
            const encoder = new TextEncoder();
            const encodedString = encoder.encode(jsString);
        
            // Ask Rust code to allocate a string inside of the module's memory
            const rustString = stringPrepare(encodedString.length);
        
            // Get a JS view of the string data
            const rustStringData = stringData(rustString);
            const asBytes = new Uint8Array(memory.buffer, rustStringData, encodedString.length);
        
            // Copy the UTF-8 into the WASM memory.
            asBytes.set(encodedString);
        
            return rustString;
          }
        }
        
        async function main() {
          const bytes = await fs.readFile('./target/wasm32-unknown-unknown/release/quick_maths.wasm');
          const { instance } = await WebAssembly.instantiate(bytes);
          const maffs = new QuickMaths(instance);
        
          console.log(maffs.difference(100, 201));
        }
        
        main();
        

Execution

        $ yarn run example
        -101
        

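Note that this process can be used for other types. You "just" have to decide how to represent the data as a set of bytes that both sides agree on, then send it across.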


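Answer 1 (score: 1)

WebAssembly programs have their own memory space. This space is often managed by the WebAssembly program itself, with the help of an allocator library such as wee_alloc.

JavaScript can see and modify that memory space, but it has no way of knowing how the allocator library's structures are organized. So if we simply write to the WASM memory from JavaScript, we are likely to overwrite something important and mess things up. Therefore, the WebAssembly program itself must allocate the memory region first and pass it to JavaScript, and then JavaScript can fill that region with data.

In the example below we do just that: allocate a buffer in the WASM memory space, copy the UTF-8 bytes there, pass the buffer location to a Rust function, then free the buffer.

Rust: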

// `std::heap` was a nightly-only API; on stable Rust the same
// functionality lives in `std::alloc` (stable since Rust 1.28).
use std::alloc::Layout;

// `len` must be greater than zero: allocating a zero-sized layout
// through the global allocator is undefined behavior.
#[no_mangle]
pub fn alloc(len: i32) -> *mut u8 {
    let layout = Layout::from_size_align(len as usize, 1).expect("!from_size_align");
    let ptr = unsafe { std::alloc::alloc(layout) };
    assert!(!ptr.is_null(), "!alloc");
    ptr
}

#[no_mangle]
pub fn dealloc(ptr: *mut u8, len: i32) {
    let layout = Layout::from_size_align(len as usize, 1).expect("!from_size_align");
    unsafe { std::alloc::dealloc(ptr, layout) }
}

#[no_mangle]
pub fn is_foobar(buf: *const u8, len: i32) -> i32 {
    let js = unsafe { std::slice::from_raw_parts(buf, len as usize) };
    let js = unsafe { std::str::from_utf8_unchecked(js) };
    if js == "foobar" {
        1
    } else {
        0
    }
}
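
`is_foobar` trusts JavaScript to send valid UTF-8. If that trust is not warranted, a checked conversion is a small change. The following is a sketch of such a variant; the name `is_foobar_checked` and the use of -1 as an error code are assumptions made here, not part of the answer.

```rust
/// Like `is_foobar`, but validates its input first.
/// Returns 1 on a match, 0 on a mismatch, and -1 if `buf` is NULL,
/// `len` is negative, or the bytes are not valid UTF-8.
///
/// Safety: if `buf` is non-NULL it must point to `len` readable bytes.
pub unsafe extern "C" fn is_foobar_checked(buf: *const u8, len: i32) -> i32 {
    if buf.is_null() || len < 0 {
        return -1;
    }
    let bytes = unsafe { std::slice::from_raw_parts(buf, len as usize) };
    match std::str::from_utf8(bytes) {
        Ok("foobar") => 1,
        Ok(_) => 0,
        Err(_) => -1,
    }
}
```

The validation costs one pass over the bytes, which is usually negligible next to the copy into WASM memory.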

TypeScript:

// cf. https://github.com/Microsoft/TypeScript/issues/18099
declare class TextEncoder {constructor (label?: string); encode (input?: string): Uint8Array}
declare class TextDecoder {constructor (utfLabel?: string); decode (input?: ArrayBufferView): string}
// https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/types/webassembly-js-api/index.d.ts
declare namespace WebAssembly {
  class Instance {readonly exports: any}
  interface ResultObject {instance: Instance}
  function instantiateStreaming (file: Promise<Response>, options?: any): Promise<ResultObject>}

var main: {
  memory: {readonly buffer: ArrayBuffer}
  alloc (size: number): number
  dealloc (ptr: number, len: number): void
  is_foobar (buf: number, len: number): number}

function withRustString (str: string, cb: (ptr: number, len: number) => any): any {
  // Convert the JavaScript string to an array of UTF-8 bytes.
  const utf8 = (new TextEncoder()).encode (str)
  // Reserve a WASM memory buffer for the UTF-8 array.
  const rsBuf = main.alloc (utf8.length)
  // Copy the UTF-8 array into the WASM memory.
  new Uint8Array (main.memory.buffer, rsBuf, utf8.length) .set (utf8)
  // Pass the WASM memory location and size into the callback.
  const ret = cb (rsBuf, utf8.length)
  // Free the WASM memory buffer.
  main.dealloc (rsBuf, utf8.length)
  return ret}

WebAssembly.instantiateStreaming (fetch ('main.wasm')) .then (results => {
  main = results.instance.exports
  // Prints "foobar is_foobar? 1".
  console.log ('foobar is_foobar? ' +
    withRustString ("foobar", function (buf, len) {return main.is_foobar (buf, len)}))
  // Prints "woot is_foobar? 0".
  console.log ('woot is_foobar? ' +
    withRustString ("woot", function (buf, len) {return main.is_foobar (buf, len)}))})

P.S. Module._malloc in Emscripten might be semantically equivalent to the alloc function we implemented above. Under the "wasm32-unknown-emscripten" target you can use Module._malloc with Rust.

Answer 2 (score: -2)

As Shepmaster pointed out, only numbers can be passed to WebAssembly, so we need to convert the string to a Uint16Array.

To do this, we can use the following str2ab function:

function str2ab(str) {
  var buf = new ArrayBuffer(str.length*2); // 2 bytes for each char
  var bufView = new Uint16Array(buf);
  for (var i=0, strLen=str.length; i < strLen; i++) {
    bufView[i] = str.charCodeAt(i);
  }
  return buf;
}

This now works:

instance.exports.compute(
    str2ab(operator),
    n1, n2
);

because we are passing a reference to an array of unsigned integers.
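
If UTF-16 code units are what ends up in the module's memory, the Rust side has to decode them before it can match on the operator. A minimal sketch, assuming the units were written in little-endian byte order and that the function receives the raw bytes (the name `decode_utf16_le` is hypothetical):

```rust
/// Decodes little-endian UTF-16 bytes into a `String`.
/// Returns `None` for an odd byte count or for invalid surrogate pairs.
pub fn decode_utf16_le(bytes: &[u8]) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    // Reassemble each pair of bytes into one 16-bit code unit.
    let units: Vec<u16> = bytes
        .chunks(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    String::from_utf16(&units).ok()
}
```

The resulting `String` can then be passed to a function that takes `&str`, such as the `compute` from the question.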