Rust Trainings All in One

High-level Intro About Rust

Why Rust?

Let's talk about values and tradeoffs first

  • Approachability
  • Availability
  • Compatibility
  • Composability
  • Debuggability
  • Expressiveness
  • Extensibility
  • Interoperability
  • Integrity
  • Maintainability
  • Measurability
  • Operability
  • Performance
  • Portability
  • Productivity
  • Resiliency
  • Rigor
  • Safety
  • Security
  • Simplicity
  • Stability
  • Thoroughness
  • Transparency
  • Velocity

C

  • Approachability
  • Availability
  • Compatibility
  • Composability
  • Debuggability
  • Expressiveness
  • Extensibility
  • Interoperability
  • Integrity
  • Maintainability
  • Measurability
  • Operability
  • Performance
  • Portability
  • Productivity
  • Resiliency
  • Rigor
  • Safety
  • Security
  • Simplicity
  • Stability
  • Thoroughness
  • Transparency
  • Velocity

Erlang/Elixir

  • Approachability
  • Availability
  • Compatibility
  • Composability
  • Debuggability
  • Expressiveness
  • Extensibility
  • Interoperability
  • Integrity
  • Maintainability
  • Measurability
  • Operability
  • Performance
  • Portability
  • Productivity
  • Resiliency
  • Rigor
  • Safety
  • Security
  • Simplicity
  • Stability
  • Thoroughness
  • Transparency
  • Velocity

Python

  • Approachability
  • Availability
  • Compatibility
  • Composability
  • Debuggability
  • Expressiveness
  • Extensibility
  • Interoperability
  • Integrity
  • Maintainability
  • Measurability
  • Operability
  • Performance
  • Portability
  • Productivity
  • Resiliency
  • Rigor
  • Safety
  • Security
  • Simplicity
  • Stability
  • Thoroughness
  • Transparency
  • Velocity

Java (in early days)

  • Approachability
  • Availability
  • Compatibility
  • Composability
  • Debuggability
  • Expressiveness
  • Extensibility
  • Interoperability
  • Integrity
  • Maintainability
  • Measurability
  • Operability
  • Performance
  • Portability
  • Productivity
  • Resiliency
  • Rigor
  • Safety (memory)
  • Security
  • Simplicity
  • Stability
  • Thoroughness
  • Transparency
  • Velocity

Rust

  • Approachability
  • Availability
  • Compatibility
  • Composability
  • Debuggability
  • Expressiveness
  • Extensibility
  • Interoperability
  • Integrity
  • Maintainability
  • Measurability
  • Operability
  • Performance
  • Portability
  • Productivity
  • Resiliency
  • Rigor
  • Safety!!!
  • Security
  • Simplicity
  • Stability
  • Thoroughness
  • Transparency
  • Velocity

Why is safety important?

(Source: Memory Safety Issues Are Still the Leading Source of Security Vulnerabilities)

Safety is hard!

  • memory safety is not easy (you need to understand the corner cases)
  • concurrency safety is really hard (without certain tradeoffs)
  • Often you have to bear an extra layer of abstraction
    • normally it means a performance hit

Memory safety

  • Manually - C/C++: painful and error-prone
  • Smart Pointers - C++/ObjC/Swift: be aware of cyclical references
  • GC - Java/DotNet/Erlang: much bigger memory consumption, and STW (stop-the-world) pauses
  • Ownership - Rust: learning curve

Concurrency safety

  • single-threaded - JavaScript: cannot leverage multicore
  • GIL - Python/Ruby: multithreading is notoriously inefficient
  • Actor model - Erlang/Akka: at the cost of memory copy and heap allocation
  • CSP - Golang: at the cost of memory copy and heap allocation
  • Ownership + Type System - Rust: super elegant and no extra cost!

How Rust achieves

memory and concurrency safety

without extra cost?

Show me the code!

Recap

  • One and only one owner
  • Multiple immutable references
  • A mutable reference is mutually exclusive
  • A reference cannot outlive its owner
  • use type safety for thread safety
With these simple rules, Rust achieves safety with
zero-cost abstraction
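
A minimal sketch of these rules in action (an illustrative example, not from the original slides):

fn main() {
    let s = String::from("hello");
    let moved = s;              // one and only one owner: `s` is moved
    // println!("{}", s);       // error: borrow of moved value `s`

    let r1 = &moved;
    let r2 = &moved;            // multiple immutable references are fine
    println!("{} {}", r1, r2);

    let mut owned = String::from("world");
    let m = &mut owned;         // a mutable reference is exclusive...
    // let r3 = &owned;         // ...so this immutable borrow would not compile
    m.push('!');
    println!("{}", owned);      // the mutable borrow has ended; the owner is usable again
}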

A glance at Rust Type System

How's Productivity of Rust?

Things built with Rust

Should I use Rust?

  • Rust is ideal when you need a system that is reliable and performant
  • Sometimes you don't, sometimes you do, sometimes you need that later
  • it's all about tradeoffs

Rust for our use cases

  • parts of the system that are bottlenecks
    • bottleneck on computation
    • bottleneck on memory consumption
    • bottleneck on I/O
  • parser/decoder/encoder
  • want to leverage the existing C/C++/Rust ecosystem (e.g. you need blake3 for hashing)

Rust FFI

Learning Rust as a(n)...

  • Elixir eng: ownership model, type system, oh no mutation
  • Scala eng: ownership model, oh no mutation
  • Typescript eng: ownership model, multi-threaded programming
  • Swift/Java eng: ownership model
  • Python eng: ownership model, type system

The common misunderstandings

1. Rust is super hard to learn...

Rust is explicit

  • Lots of computer science knowledge suddenly becomes explicit to you
  • If the total pain of learning a language is 100%:
    • Rust:
      • The compiler helps reduce that to 90%
      • Then you suffer 70% of the pain in the first 3-6 months
      • Then the remaining 20% over 3-5 years
    • Other languages:
      • You suffer 10-30% in the first 3-6 months
      • Then 70-90% over the next 3-5 years

2. Unsafe Rust is evil...

References

Ownership, borrow check, and lifetime

Ownership/Borrow Rules Review

Lifetime, not a new idea

Lifetime: Stack memory

#include <stdio.h>
static int VALUE = 42;

void world(char *st, int num) {
    printf("%s(%d)\n", st, num);
}


void hello(int num) {
    char *st = "hello world";
    int v = VALUE+num;
    world(st, v);
}

int main() {
    hello(2);
}

Lifetime: Heap memory (tracing GC)

Lifetime: Heap memory (ARC)

How Rust handles lifetime?

Move semantics

What about Borrow?

Think about: why does this work in Rust, but not C/C++?

Rust lifetime checker prevents this...

Benefit of lifetime-constrained borrow

  • can borrow anything (stack object, heap object)
  • safety can be guaranteed at compile-time (no runtime lifetime bookkeeping)
  • Rust borrow checker is mostly a lifetime checker
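
For instance, the borrow checker rejects the following, because the reference would outlive its owner (a classic sketch, not the slide's demo):

fn main() {
    let r;
    {
        let x = 5;
        r = &x;           // borrow of `x`
    }                     // `x` is dropped here
    println!("r: {}", r); // error[E0597]: `x` does not live long enough
}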

Lifetime Annotation

  • similar to generics, but lowercase and starting with '
  • you only need to add annotations when there are conflicts (the compiler cannot infer them)
// need explicit lifetime
struct User<'a> {
    name: &'a str,
    ...
}
fn process<'a, 'b, T>(item1: &'a T, item2: &'b T) {}

// &'a User can be written as &User since there is no conflict
fn lifetime_example(user: &User) { // --+ Lifetime 'a
    if user.is_authed() {          //   |--+ Lifetime 'b
        let role = user.roles();   //   |  |
                                   //   |  |--+ Lifetime 'c
        verify(&role);             //   |  |  |
                                   //   |  |--+
    }                              //   |--+
}                                  // --+

fn verify(x: &Role) { /*...*/ }

Static Lifetime

  • 'static
  • data included in bss / data / text section
    • constants / static variables
    • string literals
    • functions
  • if used as trait bound:
    • the type does not contain any non-static references
    • owned data always passes a 'static lifetime bound, but reference to the owned data does not
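
A minimal sketch of the last two bullets (illustrative, not from the deck):

fn takes_static<T: 'static>(_t: T) {}

fn main() {
    // string literals live in the binary's read-only data: &'static str
    let s: &'static str = "hello";
    takes_static(s);                 // &'static str satisfies T: 'static

    let owned = String::from("hi");
    takes_static(owned);             // owned data passes a 'static bound...

    let local = String::from("hi");
    let r = &local;
    // takes_static(r);              // ...but a reference to local data does not compile
    println!("{}", r);
}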

Thread spawn

pub fn spawn<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T,
    F: Send + 'static,
    T: Send + 'static,
{
    Builder::new().spawn(f).expect("failed to spawn thread")
}

The 'static constraint is a way of saying, roughly, that no borrowed data is permitted in the closure.

RAII (Resource Acquisition Is Initialization)

  • initializing the object also ensures the underlying resource is acquired
  • releasing (dropping) the object also ensures the resource is released

Drop Trait

  • memory
  • file
  • socket
  • lock
  • any other OS resources
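
A minimal Drop sketch (illustrative; the actual demo isn't reproduced here):

struct Connection {
    addr: String,
}

impl Drop for Connection {
    fn drop(&mut self) {
        // release the underlying resource (socket, file handle, lock, ...)
        println!("closing connection to {}", self.addr);
    }
}

fn main() {
    let _conn = Connection { addr: "127.0.0.1:8080".to_string() };
    println!("doing work");
} // `_conn` goes out of scope here and `drop` runs automatically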

demo

Mental model

  • write the code and defer the complexity of ensuring the code is safe/correct
  • confront most of the safety/correctness problems upfront
  • Mutation can only happen when you own the data, or you have a mutable reference
    • either way, the thread is guaranteed to be the only one with access at a time
  • Fearless Refactoring
  • reinforce properties that well-behaved software exhibits
  • sometimes too strict: Rust isn't dogmatic about enforcing it

References

Cost of defects

  • Don't introduce a defect in the first place (impossible, because humans are fallible).
  • Detect and correct defect as soon as the bad key press occurs (within reason: you don't want the programmer to lose too much flow) (milliseconds later).
  • At next build / test time (seconds or minutes later).
  • When code is shared with others (maybe you push a branch and CI tells you something is wrong) (minutes to days later).
  • During code review (minutes to days later).
  • When code is integrated (e.g. merged) (minutes to days later).
  • When code is deployed (minutes to days or even months later).
  • When a bug is reported long after the code has been deployed (weeks to years later).

Ownership and Borrow rules

  • Use after free: no more (a reference can't point to a dropped value)
  • Buffer underruns, overflows, illegal memory access: no more (a reference must be valid and point to an owned value)
  • Memory-level data races: no more (single writer or multiple readers)
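
A small sketch of the "no data races" rule (my own example): mutable state shared across threads only compiles once it is synchronized.

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().unwrap() += 1; // single writer at a time
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    println!("count = {}", *counter.lock().unwrap());

    // Capturing a plain `&mut` to a local from two spawned threads would be
    // rejected at compile time - that is the memory-level data race Rust rules out.
}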

Typesystem and Generic Programming

How are types laid out in memory?

Trait (Typeclass)

Trait Object

  • Unlike Java, you can't assign a value to a trait directly (no implicit references!!!)
  • a trait object is a fat pointer (automatically converted)
    • a normal pointer to the value
    • vtable (vtable pointer)
      • unlike C++/Java, it is not a ptr in struct
  • dynamic dispatch
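
A minimal trait-object sketch (an illustrative example, not the slide's demo):

trait Shape {
    fn area(&self) -> f64;
}

struct Circle { r: f64 }
struct Square { s: f64 }

impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.r * self.r }
}
impl Shape for Square {
    fn area(&self) -> f64 { self.s * self.s }
}

fn main() {
    // A value must sit behind a pointer to become a trait object:
    // &dyn Shape / Box<dyn Shape> are fat pointers (data pointer + vtable pointer).
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Circle { r: 1.0 }),
        Box::new(Square { s: 2.0 }),
    ];
    for s in &shapes {
        println!("{}", s.area()); // dynamic dispatch through the vtable
    }

    // the fat pointer is twice the size of a thin pointer
    assert_eq!(
        std::mem::size_of::<&dyn Shape>(),
        2 * std::mem::size_of::<&Circle>()
    );
}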

More about trait

  • associated type
  • generics
  • supertrait
  • trait composition
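
A small sketch touching each of these features (illustrative names, not from the deck):

use std::fmt::Debug;

// associated type
trait Container {
    type Item;
    fn first_item(&self) -> Option<&Self::Item>;
}

// generics in the impl
impl<T> Container for Vec<T> {
    type Item = T;
    fn first_item(&self) -> Option<&T> {
        self.as_slice().first()
    }
}

// supertrait: anything Describable must also be Debug
trait Describable: Debug {
    fn describe(&self) -> String {
        format!("{:?}", self)
    }
}

#[derive(Debug, Clone)]
struct Point { x: i32, y: i32 }
impl Describable for Point {}

// trait composition as a bound
fn show<T: Describable + Clone>(t: &T) {
    println!("{}", t.describe());
}

fn main() {
    println!("{:?}", vec![1, 2, 3].first_item());
    show(&Point { x: 1, y: 2 });
}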

Generics

History of Generic Programming

(from: Fundamentals of Generic Programming)

Classification of Abstractions

  • Data Abstractions: data types and sets of operations defined on them
    • e.g. Vec<T>, HashMap<K, V>
  • Algorithmic abstractions: families of data abstractions that have a set of efficient algorithms in common (generic algorithms)
    • e.g. quicksort, binary_search
  • Structural abstractions: a data abstraction A belongs to a structural abstraction S if and only if S is an intersection of some of the algorithmic abstractions to which A belongs.
    • e.g. singly-linked-lists
  • Representational abstractions: mappings from one structural abstraction to another, creating a new type and implementing a set of operations on that type.
    • e.g. VecDeque<T>
  • Comes from: Generic Programming: http://stepanovpapers.com/genprog.pdf

Generics to Types

just like

Types to Values

Generic Programming Example

demo: Rust implementation
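
The demo itself isn't reproduced here; a minimal sketch of the idea - write the algorithm once against a trait bound, reuse it for any conforming type:

fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let mut best = iter.next()?;
    for x in iter {
        if x > best {
            best = x;
        }
    }
    Some(best)
}

fn main() {
    println!("{:?}", largest(&[3, 7, 2]));          // Some(7)
    println!("{:?}", largest(&[1.5f64, 0.2, 9.9])); // Some(9.9)
    println!("{:?}", largest::<i32>(&[]));          // None
}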

Real-world GP example

Functions delay binding: data structures induce binding.
Moral: Structure data late in the programming process.

— Epigrams on programming

References

Why not object oriented?

Class is awesome

  • Encapsulation
  • Access control
  • Abstraction
  • Namespace
  • Extensibility

Class has problems

  • inheritance is pretty limited - choose superclass well!
  • know what/how to override (and when not to)
  • superclass may have properties
    • you have to accept it
    • initialization burden
    • don't break assumptions of superclass
  • hard to reuse outside the hierarchy (composition over inheritance)

Concurrency - primitives

Let's solve a real-world problem

v1: Simple loop

v2: Multithread with shared state

v3: Optimize the lock

v4: Share memory by communicating

v5: Async Task

What have we used so far?

  • Mutex Lock
  • Channel/Actor
  • Async Task (coroutine, promise, etc.)

How is Mutex implemented?

A naive implementation

struct Lock<T> {
  locked: bool,
  data: T,
}

impl<T> Lock<T> {
  pub fn new(data: T) -> Lock<T> { ... }

  pub fn lock<R>(&mut self, op: impl FnOnce(&mut T) -> R) -> R {
    // spin if we can't get the lock
    while self.locked {} // **1
    // ha, we can lock and do our job
    self.locked = true; // **2
    // execute the op as we hold the lock
    let ret = op(&mut self.data); // **3
    // unlock
    self.locked = false; // **4
    ret
  }
}

// You may call it like this:
let mut l = Lock::new(0);
l.lock(|v| *v += 1);

Issues

  • Atomicity
    • In a multicore environment, there is a race condition between **1 and **2 - another thread may kick in
    • Even in a single-core environment, the OS may do preemptive multitasking, letting another thread kick in
  • OOO (out-of-order) execution
    • The compiler might generate optimized instructions that put **3 before **1
    • The CPU may do OOO execution to best utilize its pipeline, so **3 before **1 might also happen

How to solve this?

  • We need CPU instructions that guarantee atomicity and prevent OOO reordering
  • Algorithm: CAS (Compare-And-Swap)
  • data structure: AtomicXXX
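
A tiny sketch of what CAS looks like in Rust (illustrative): atomically "set to true, but only if it is currently false".

use std::sync::atomic::{AtomicBool, Ordering};

fn main() {
    let locked = AtomicBool::new(false);

    // first attempt succeeds: false -> true
    assert!(locked
        .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
        .is_ok());

    // second attempt fails: the value is already true
    assert!(locked
        .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
        .is_err());
}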

Atomics

Updated lock

struct Lock<T> {
  locked: AtomicBool, // ***
  data: UnsafeCell<T>, // ***
}
unsafe impl<T> Sync for Lock<T> where T: Send {} // need to explicitly impl `Sync`

impl<T> Lock<T> {
  pub fn new(data: T) -> Self { ... }
  pub fn lock<R>(&self, op: impl FnOnce(&mut T) -> R) -> R {
    // spin if we can't get the lock
    while self
      .locked
      .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
      .is_err() {}
    // execute the op as we hold the lock
    let ret = op(unsafe { &mut *self.data.get() }); // **3
    // unlock
    self.locked.store(false, Ordering::Release); // **4
    ret
  }
}

// You may call it like this:
let l = Lock::new(0);
l.lock(|v| *v += 1);

What does ordering mean?

  • Relaxed: no restrictions on the compiler/CPU; OOO is allowed
  • Release:
    • for the current thread, any read/write inst cannot be reordered after this inst (the store);
    • for other threads, if they use Acquire to read, they will see the changed value
  • Acquire:
    • for the current thread, any read/write inst cannot be reordered before this inst (the compare_exchange)
    • for other threads, if they use Release to update the data, the modification will be visible to the current thread
  • AcqRel: combination of Acquire and Release
  • SeqCst: besides AcqRel, all threads see the same order of operations.

Optimization


pub struct Lock<T> {
    locked: AtomicBool,
    data: UnsafeCell<T>,
}
unsafe impl<T> Sync for Lock<T> where T: Send {}

impl<T> Lock<T> {
  pub fn new(data: T) -> Self {...}
  pub fn lock<R>(&self, op: impl FnOnce(&mut T) -> R) -> R {
      while self
          .locked
          .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
          .is_err()
      {
          while self.locked.load(Ordering::Relaxed) {
              std::thread::yield_now(); // we may yield thread now
          }
      }
      let ret = op(unsafe { &mut *self.data.get() });
      self.locked.store(false, Ordering::Release);
      ret
  }
}

This is basically how

Mutex

works

Real world Mutex

Semaphore

In computer science, a semaphore is a variable or abstract data type used to control access to a common resource by multiple processes and avoid critical section problems in a concurrent system such as a multitasking operating system. A trivial semaphore is a plain variable that is changed (for example, incremented or decremented, or toggled) depending on programmer-defined conditions.

A useful way to think of a semaphore as used in a real-world system is as a record of how many units of a particular resource are available, coupled with operations to adjust that record safely (i.e., to avoid race conditions) as units are acquired or become free, and, if necessary, wait until a unit of the resource becomes available.

Semaphore as a generalized Mutex

Demo code: restricted HTTP client

code/primitives/src/http_semaphore.rs
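
The demo file isn't reproduced here; a rough sketch of the same idea using tokio's Semaphore (assumption: a tokio runtime; the real demo may differ):

use std::sync::Arc;
use tokio::sync::Semaphore;

#[tokio::main]
async fn main() {
    // at most 3 "requests" in flight at any time
    let permits = Arc::new(Semaphore::new(3));
    let mut handles = Vec::new();
    for i in 0..10 {
        let permits = Arc::clone(&permits);
        handles.push(tokio::spawn(async move {
            let _permit = permits.acquire_owned().await.unwrap();
            // a real client would issue the HTTP request here
            tokio::time::sleep(std::time::Duration::from_millis(100)).await;
            println!("request {} done", i);
            // the permit is released when `_permit` is dropped
        }));
    }
    for h in handles {
        h.await.unwrap();
    }
}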

Channel

Channel basics

Flavours of Channels

  • sync: sender can block, limited capacity
    • Mutex + Condvar + VecDeque
    • Atomic VecDeque (atomic queue) + thread::park + Thread::unpark
  • async: sender cannot block, unbounded
    • Mutex + Condvar + VecDeque
    • Mutex + Condvar + DoubleLinkedList
  • rendezvous: sync with capacity = 0. Used for thread sync.
    • Mutex + Condvar
  • oneshot: only one call to send(). e.g. Ctrl+C to stop all threads
    • atomic swap
  • async/await
    • basically same as sync but waker is different
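
As a taste of how these are built, a minimal unbounded ("async" flavour) channel from Mutex + Condvar + VecDeque (my own sketch, not the demo code referenced below):

use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};

struct Shared<T> {
    queue: Mutex<VecDeque<T>>,
    available: Condvar,
}

struct Sender<T>(Arc<Shared<T>>);
struct Receiver<T>(Arc<Shared<T>>);

fn channel<T>() -> (Sender<T>, Receiver<T>) {
    let shared = Arc::new(Shared {
        queue: Mutex::new(VecDeque::new()),
        available: Condvar::new(),
    });
    (Sender(Arc::clone(&shared)), Receiver(shared))
}

impl<T> Sender<T> {
    fn send(&self, value: T) {
        self.0.queue.lock().unwrap().push_back(value);
        self.0.available.notify_one(); // wake up a blocked receiver
    }
}

impl<T> Receiver<T> {
    fn recv(&self) -> T {
        let mut queue = self.0.queue.lock().unwrap();
        loop {
            match queue.pop_front() {
                Some(value) => return value,
                // nothing to read yet: release the lock and wait for a send
                None => queue = self.0.available.wait(queue).unwrap(),
            }
        }
    }
}

fn main() {
    let (tx, rx) = channel();
    std::thread::spawn(move || tx.send(42));
    println!("got {}", rx.recv());
}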

Demo code: naive MPSC

code/primitives/src/channel.rs

Demo code: naive actor

  • Questions:
    • Which type of channel shall we use? SPSC, SPMC, MPSC?
    • When creating an actor, what is its pid?
    • When sending a message to an actor, how does the actor reply (handle_call)?
  • Code: code/primitives/src/actor.rs

References

Things to do with atomics

  • lock free data structure
  • in memory metrics
  • id generation
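
A minimal sketch of the "id generation" use case (illustrative): a global, lock-free counter.

use std::sync::atomic::{AtomicU64, Ordering};

static NEXT_ID: AtomicU64 = AtomicU64::new(1);

fn next_id() -> u64 {
    // fetch_add is atomic, so concurrent callers never receive the same id
    NEXT_ID.fetch_add(1, Ordering::Relaxed)
}

fn main() {
    let handles: Vec<_> = (0..4)
        .map(|_| std::thread::spawn(|| println!("got id {}", next_id())))
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}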

Concurrency - async/await

Using threads in Rust

use std::thread;

fn main() {
    println!("So we start the program here!");
    let t1 = thread::spawn(move || {
        thread::sleep(std::time::Duration::from_millis(200));
        println!("We create tasks which gets run when they're finished!");
    });

    let t2 = thread::spawn(move || {
        thread::sleep(std::time::Duration::from_millis(100));
        println!("We can even chain callbacks...");
        let t3 = thread::spawn(move || {
            thread::sleep(std::time::Duration::from_millis(50));
            println!("...like this!");
        });
        t3.join().unwrap();
    });
    println!("While our tasks are executing we can do other stuff here.");

    t1.join().unwrap();
    t2.join().unwrap();
}

Drawbacks of threads

  • large stacks (not suitable for heavily loaded concurrent jobs - e.g. a web server)
  • context switches are out of your control
  • lots of syscalls involved (costly when the # of threads is high)

What are alternative solutions?

Green threads/processes

(stackful coroutine)

Golang/Erlang

Green Threads

  • Run some non-blocking code.
  • Make a blocking call to some external resource.
  • CPU "jumps" to the "main" thread which schedules a different thread to run and "jumps" to that stack.
  • Run some non-blocking code on the new thread until a new blocking call or the task is finished.
  • CPU "jumps" back to the "main" thread, schedules a new thread which is ready to make progress, and "jumps" to that thread.

Green Threads - pros/cons

  • Pros:
    • Simple to use. The code will look like it does when using OS threads.
    • A "context switch" is reasonably fast.
    • Each stack only gets a little memory to start with so you can have hundreds of thousands of green threads running.
    • It's easy to incorporate preemption which puts a lot of control in the hands of the runtime implementors.
  • Cons:
    • The stacks might need to grow. Solving this is not easy and will have a cost.
    • You need to save the CPU state on every switch.
    • It's not a zero cost abstraction (Rust had green threads early on and this was one of the reasons they were removed).
    • Complicated to implement correctly if you want to support many different platforms.

Poll based event loops

(stackless coroutine)

Javascript/Rust

Callback

setTimeout(() => {
  setTimeout(() => {
    setTimeout(() => {
      console.log("I'm the last one");
    }, 50);
  }, 100);
}, 200);

Promise

function timer(ms) {
    return new Promise(
      (resolve) => setTimeout(resolve, ms)
    );
}

timer(200)
.then(() => timer(100))
.then(() => timer(50))
.then(() => console.log("I'm the last one"));

Async/Await

async function run() {
    await timer(200);
    await timer(100);
    await timer(50);
    console.log("I'm the last one");
}

The Rust approach
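
The same timer chain from the JavaScript slides, written with Rust's async/await (a minimal sketch assuming the tokio runtime, which the demo below also uses):

use std::time::Duration;
use tokio::time::sleep;

#[tokio::main]
async fn main() {
    sleep(Duration::from_millis(200)).await;
    sleep(Duration::from_millis(100)).await;
    sleep(Duration::from_millis(50)).await;
    println!("I'm the last one");
}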

Demo: writing a server in Rust

It uses async-prost, tokio and prost

One more thing...

An event store example

References

Networking and security

Network Stack

App for centralized network

App for p2p network

Rust Network Stacks

Demo: Build a TCP server (sync/async)

deps: stdlib / tokio

Demo: HTTP Client/Server

deps: reqwest / actix-web

Demo: gRPC

deps: prost / tonic

Steps to write a server

  • data serialization: serde / protobuf / flatbuffer / capnp / etc.
  • transport protocol: tcp / http / websocket / quic / etc.
  • security layer: TLS / noise protocol / secio / etc.
  • application layer: your own application logic

Network Security

TLS (skip)

Noise Protocol

Demo for noise protocol

References

FFI with C/Elixir/Swift/Java

Why do we need cross-language interoperability?

  • Use the right tool for the right problem
  • (For Rust) We may want to leverage C/C++ libraries
  • (For other languages) We may want to bring in the Rust ecosystem to solve the problem
    • e.g. Elixir doesn't have a good lib to generate certs, so we leverage rcgen from Rust
  • (For other languages) We may want to improve performance for certain use cases
    • e.g. use Rust to do computation-intensive jobs in Python
  • avoid building the same logic on multiple platforms

Rust to use C

extern crate libc;
use libc::size_t;

#[link(name = "snappy")]
extern "C" {
    fn snappy_max_compressed_length(source_length: size_t) -> size_t;
}

fn main() {
    let len = 10000;
    let result = unsafe { snappy_max_compressed_length(len) };
    println!("max compressed length for {} byte buffer: {}", len, result);
}

Rust to be used by

Elixir: Rustler

#[rustler::nif]
fn eval(policy: &str, app: &str, country: &str, platform: &str) -> bool {
    eval_policy(policy, app, country, platform)
}

rustler::init!("Elixir.PolicyEngine.Nif", [eval]);
Rustler version: 0.22.0-rc.1

Python: PyO3

use pyo3::prelude::*;
use pyo3::wrap_pyfunction;

#[pyfunction]
fn num_cpus() -> PyResult<usize> {
    Ok(num_cpus::get())
}

#[pymodule]
fn rust_utils(_py: Python, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(num_cpus, m)?)?;

    Ok(())
}

Nodejs: neon

use neon::prelude::*;

fn hello(mut cx: FunctionContext) -> JsResult<JsString> {
    Ok(cx.string("hello node"))
}

fn num_cpus(mut cx: FunctionContext) -> JsResult<JsNumber> {
    Ok(cx.number(num_cpus::get() as f64))
}

#[neon::main]
fn main(mut cx: ModuleContext) -> NeonResult<()> {
    cx.export_function("hello", hello)?;
    cx.export_function("num_cpus", num_cpus)?;
    Ok(())
}

Java: robusta

use robusta_jni::bridge;
use robusta_jni::convert::Signature;

#[bridge]
mod jni {
    #[derive(Signature)]
    #[package(com.example.hello)]
    struct HelloWorld;

    impl HelloWorld {
        pub extern "jni" fn special(mut input1: Vec<i32>, input2: i32) -> Vec<String> {
            input1.push(input2);
            input1.iter().map(ToString::to_string).collect()
        }
    }
}

Things to consider

  • Avoid throwing panics across the FFI boundary (which is undefined behavior)
  • Translate Rust errors (and panics) into errors that the caller on the other side of the FFI can handle
  • Convert strings to/from Rust str
  • Pass non-string data back and forth between Rust and whatever the caller on the other side of the FFI is
    • use the data-structure conversion methods of the target language (e.g. repr(C))
    • or serialize with JSON / protobuf / flatbuffer / etc. (just like a js-bridge!)
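
A minimal sketch of the first two bullets (hypothetical `compute` function, not from the deck): catch panics before they cross the FFI boundary and turn them into an error code the caller can handle.

use std::panic;

#[no_mangle]
pub extern "C" fn compute(input: i32) -> i32 {
    let result = panic::catch_unwind(|| {
        // real work would go here; this stand-in panics on overflow
        input.checked_mul(2).expect("overflow")
    });
    match result {
        Ok(v) => v,
        Err(_) => -1, // signal failure instead of unwinding into the caller
    }
}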

A real-world example for Swift: Arch

Communication channel

PoC code (olorin)

References

WASM/WASI

Rust for real-world problems

May the Rust be with you