Building Rust Plugin Systems: A Pattern for Safe Dynamic Loading

Building a plugin system in Rust presents unique challenges. How do you load external code dynamically while maintaining Rust’s safety guarantees? How do you bridge async execution with C-compatible FFI?

The Rust community has evolved a pattern that solves these problems using macro-generated FFI, static metadata, and async runtime bridging. Here’s how this pattern works and how you can apply it to your own plugin systems.

The Big Picture: From Source to Execution

Plugin Development Flow:
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Plugin Dev    │    │   Compiler      │    │   .so File      │
│                 │    │                 │    │                 │
│ Write #[plugin_ │───▶│ Generate FFI    │───▶│ Compiled        │
│ module] code    │    │ symbols         │    │ library         │
└─────────────────┘    └─────────────────┘    └─────────────────┘


┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Host App      │    │  Plugin Loader  │    │  Async Runtime  │
│                 │    │                 │    │                 │
│ Register funcs  │◀───│ dlopen() +      │◀───│ Execute async   │
│ in registry     │    │ dlsym()         │    │ plugin code     │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       ▲
         │                       │                       │
         └───────────────────────┼───────────────────────┘

                    Call execute_function via FFI
                    Return JSON result via buffer

Step 1: Macro Code Generation

When you write this:

#[plugin_module]
pub mod data_processor {
    #[plugin_function]
    pub async fn process_data(input: &mut PluginContext) -> Result<(), PluginError> {
        // Your plugin logic here
        Ok(())
    }

    #[plugin_function]
    pub async fn validate_data(input: &mut PluginContext) -> Result<(), PluginError> {
        // More plugin logic
        Ok(())
    }
}

The macro generates two C-compatible export functions:

Metadata Export Function

#[no_mangle]
pub extern "C" fn get_plugin_metadata() -> *const PluginMetadata {
    &PLUGIN_METADATA  // Points to static data
}

Function Execution Export

#[no_mangle]
pub extern "C" fn execute_plugin_function(
    function_name: *const c_char,
    function_name_len: u32,
    input_json: *const c_char,
    input_len: u32,
    result_buffer: *mut u8,
    result_capacity: u32,
    result_len: *mut u32,
) -> i32 {
    // FFI boundary implementation
}

The key insight: all the async Rust code gets wrapped in sync FFI functions.

Step 2: Static Metadata Structures

The macro also generates static data structures that live for the program’s lifetime:

static FUNCTION_METADATA_ARRAY: [FunctionMetadata; 2] = [
    FunctionMetadata {
        index: 0,
        name: "process_data\0".as_ptr() as *const c_char,
        description: "Process input data\0".as_ptr() as *const c_char,
        signature: "fn process_data(input: &mut PluginContext) -> Result<(), PluginError>\0".as_ptr() as *const c_char,
    },
    FunctionMetadata {
        index: 1,
        name: "validate_data\0".as_ptr() as *const c_char,
        description: "Validate input data\0".as_ptr() as *const c_char,
        signature: "fn validate_data(input: &mut PluginContext) -> Result<(), PluginError>\0".as_ptr() as *const c_char,
    }
];

static PLUGIN_METADATA: PluginMetadata = PluginMetadata {
    function_count: 2,
    functions: FUNCTION_METADATA_ARRAY.as_ptr(),
    plugin_name: "data_processor\0".as_ptr() as *const c_char,
    version: "1.0.0\0".as_ptr() as *const c_char,
};

This static approach eliminates memory management complexity - no allocations, no cleanup needed.
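For reference, the C-compatible struct definitions these statics assume might look like the following sketch (field names and types are inferred from the usage above, not taken from any specific crate). Note the `unsafe impl Sync` lines: a `static` containing raw pointers won't compile without them, because raw pointers aren't `Sync` by default:

```rust
use std::os::raw::c_char;

// Hypothetical layout matching the statics above. #[repr(C)] guarantees a
// stable field order so the host can read these structs across the FFI boundary.
#[repr(C)]
pub struct FunctionMetadata {
    pub index: u32,
    pub name: *const c_char,
    pub description: *const c_char,
    pub signature: *const c_char,
}

#[repr(C)]
pub struct PluginMetadata {
    pub function_count: u32,
    pub functions: *const FunctionMetadata,
    pub plugin_name: *const c_char,
    pub version: *const c_char,
}

// Safety: these pointers only ever target 'static string/array data,
// so sharing the structs across threads is sound.
unsafe impl Sync for FunctionMetadata {}
unsafe impl Sync for PluginMetadata {}
```

The host needs an identical definition of these structs (typically from a shared crate) so both sides agree on the layout.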

Step 3: Dynamic Loading and Metadata Extraction

Here’s how the host application loads a compiled plugin:

// PluginLoader opens the .so file
let lib = unsafe { Library::new(library_path)? };

// Look up the metadata function
let get_metadata = unsafe {
    lib.get::<unsafe extern "C" fn() -> *const PluginMetadata>(
        "get_plugin_metadata".as_bytes()
    )?
};

// Call it to get the metadata pointer
let plugin_metadata = unsafe { get_metadata() };

// Convert C structures to Rust types
let metadata = unsafe { &*plugin_metadata };
let plugin_name = unsafe { CStr::from_ptr(metadata.plugin_name).to_str()? };

// Extract individual function metadata
let functions_slice = unsafe {
    std::slice::from_raw_parts(metadata.functions, metadata.function_count as usize)
};

The flow here is:

  1. dlopen() loads the library into memory
  2. dlsym() finds the metadata function by name
  3. Call the function to get a pointer to static metadata
  4. Walk the C structures to extract function information

Step 4: Function Execution Flow

When the host application needs to execute a plugin function, here’s the complete flow:

Function Execution Flow:

Host App     FFI Boundary    Async Bridge    Plugin Function
    │             │              │                │
    │ Call exec_  │              │                │
    │ function()  │              │                │
    └─────────────┼──────────────┼────────────────┘
                  │              │
                  ▼              ▼
              Parse C        Create tokio
              strings        Runtime
                  │              │
                  ▼              ▼
              Convert to     Execute async
              JSON Context   plugin logic
                  │              │
                  ▼              ▼
              Write JSON     Return Result
              to buffer      (success/error)
                  │              │
                  └──────────────┘


                    Return to Host
                    (status + JSON)

Step-by-step breakdown:

  1. Host calls execute_plugin_function("process_data", input_json)
  2. FFI Boundary parses function name and input from C strings
  3. Async Bridge creates tokio runtime and bridges to async world
  4. Plugin Function executes the actual async plugin logic
  5. Result flows back through the chain, converting to JSON
  6. Host receives status code and JSON result via buffer

The Execution Function Implementation

Here’s how the generated execution function actually works:

#[no_mangle]
pub extern "C" fn execute_plugin_function(
    function_name: *const c_char,
    function_name_len: u32,
    input_json: *const c_char,
    input_len: u32,
    result_buffer: *mut u8,
    result_capacity: u32,
    result_len: *mut u32,
) -> i32 {
    // 1. FFI Safety: Convert raw pointers to safe Rust. Note that `?` can't
    // be used here - the function returns a C status code, so every failure
    // is converted to an error result explicitly.
    let function_name_bytes = unsafe {
        std::slice::from_raw_parts(function_name as *const u8, function_name_len as usize)
    };
    let function_name_str = match std::str::from_utf8(function_name_bytes) {
        Ok(s) => s,
        Err(_) => return write_error_result("Function name is not valid UTF-8", result_buffer, result_capacity, result_len),
    };

    let input_bytes = unsafe {
        std::slice::from_raw_parts(input_json as *const u8, input_len as usize)
    };
    let input_str = match std::str::from_utf8(input_bytes) {
        Ok(s) => s,
        Err(_) => return write_error_result("Input JSON is not valid UTF-8", result_buffer, result_capacity, result_len),
    };

    // 2. JSON Deserialization: Recreate context from JSON
    let mut input_context = match PluginContext::from_json(input_str.to_string()) {
        Ok(ctx) => ctx,
        Err(e) => return write_error_result(&format!("Invalid input context: {}", e), result_buffer, result_capacity, result_len),
    };

    // 3. Async Runtime Creation: Bridge to async world
    let runtime = match tokio::runtime::Runtime::new() {
        Ok(rt) => rt,
        Err(e) => return write_error_result(&format!("Failed to create runtime: {}", e), result_buffer, result_capacity, result_len),
    };

    // 4. Function Dispatch: Match function name to implementation
    let function_result: Result<(), String> = runtime.block_on(async {
        match function_name_str {
            "process_data" => data_processor::process_data(&mut input_context).await.map_err(|e| e.to_string()),
            "validate_data" => data_processor::validate_data(&mut input_context).await.map_err(|e| e.to_string()),
            _ => Err(format!("Unknown function: {}", function_name_str)),
        }
    });

    // 5. Result Serialization: Convert the (possibly mutated) context back to JSON
    match function_result {
        Ok(()) => match input_context.to_json() {
            Ok(result_json) => match serde_json::from_str::<serde_json::Value>(&result_json) {
                Ok(result_value) => write_success_result(&result_value, result_buffer, result_capacity, result_len),
                Err(e) => write_error_result(&format!("Invalid result JSON: {}", e), result_buffer, result_capacity, result_len),
            },
            Err(e) => write_error_result(&format!("Serialization failed: {}", e), result_buffer, result_capacity, result_len),
        },
        Err(e) => write_error_result(&format!("Function failed: {}", e), result_buffer, result_capacity, result_len),
    }
}

Step 5: Buffer Management and Data Return

The FFI uses a simple buffer-passing protocol:

fn write_success_result(
    result: &serde_json::Value,
    buffer: *mut u8,
    capacity: u32,
    result_len: *mut u32,
) -> i32 {
    // Serialize result to JSON string (serializing a Value essentially
    // never fails, but handle the error path defensively)
    let json_str = match serde_json::to_string(result) {
        Ok(s) => s,
        Err(_) => return -2, // critical: can't even report the failure as JSON
    };
    let bytes = json_str.as_bytes();

    // Calculate how much we can write (truncate if needed)
    let len = bytes.len().min(capacity as usize);

    // Copy data to caller's buffer
    unsafe {
        std::ptr::copy_nonoverlapping(bytes.as_ptr(), buffer, len);
        *result_len = len as u32;
    }

    0 // Success code
}

The calling host application provides a buffer and gets back:

  • Status code (0 = success, -1 = error, -2 = critical error)
  • Actual length of data written
  • JSON data in the buffer
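To see the protocol from the host's side without loading a real library, here's a self-contained sketch that exercises the same buffer contract against an in-process stub. The stub's name and canned payload are illustrative; the signature matches the `execute_plugin_function` export assumed above:

```rust
use std::os::raw::c_char;

// In-process stand-in for the plugin's exported function: it ignores its
// inputs and writes a fixed JSON payload, following the same buffer protocol.
pub extern "C" fn stub_execute(
    _function_name: *const c_char,
    _function_name_len: u32,
    _input_json: *const c_char,
    _input_len: u32,
    result_buffer: *mut u8,
    result_capacity: u32,
    result_len: *mut u32,
) -> i32 {
    let payload = br#"{"status":"ok"}"#;
    let len = payload.len().min(result_capacity as usize);
    unsafe {
        std::ptr::copy_nonoverlapping(payload.as_ptr(), result_buffer, len);
        *result_len = len as u32;
    }
    0 // success
}

// Host-side wrapper: allocate the buffer, make the call, interpret the status.
fn call_plugin(name: &str, input: &str) -> Result<String, String> {
    let mut buf = vec![0u8; 64 * 1024]; // host owns the result buffer
    let mut written: u32 = 0;
    let status = stub_execute(
        name.as_ptr() as *const c_char, name.len() as u32,
        input.as_ptr() as *const c_char, input.len() as u32,
        buf.as_mut_ptr(), buf.len() as u32, &mut written,
    );
    let body = String::from_utf8_lossy(&buf[..written as usize]).into_owned();
    if status == 0 { Ok(body) } else { Err(body) }
}
```

With a real plugin, the only difference is that the function pointer comes from `lib.get(...)` instead of a local stub, and the call goes through `unsafe`.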

Key Architectural Decisions

Why Static Data Everywhere?

Using &'static str for all metadata eliminates complex lifetime management:

// This pointer is always valid - no cleanup needed
name: "process_data\0".as_ptr() as *const c_char,

Instead of:

// This would require careful memory management
name: CString::new("process_data")?.into_raw(),  // When do we free this?
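Concretely, handing out `CString::into_raw` pointers would also force you to export and document a matching free function for the host to call back into (a hypothetical sketch):

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Hypothetical export: memory allocated by the plugin must be freed by the
// plugin, since host and plugin may use different allocators.
#[no_mangle]
pub extern "C" fn plugin_free_string(ptr: *mut c_char) {
    if !ptr.is_null() {
        // Safety: must only be called with pointers produced by
        // CString::into_raw inside this same library. Reconstructing the
        // CString here lets Rust drop (free) it normally.
        unsafe { drop(CString::from_raw(ptr)); }
    }
}
```

Every such pointer now carries an ownership contract that the host can silently violate (double free, leak, wrong allocator), which is exactly the complexity the static approach avoids.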

Why Create Runtime Per Function Call?

let runtime = tokio::runtime::Runtime::new()?;
let function_result = runtime.block_on(async { /* ... */ });

This approach prioritizes simplicity and safety over maximum performance. Each function execution is completely isolated. For high-frequency plugin calls, you might want to share a runtime across calls, but that requires careful task isolation and cancellation handling.

Why JSON for Context Passing?

JSON provides a language-agnostic serialization format that crosses the FFI boundary safely:

// Host serializes input to JSON
let input_json = serde_json::to_string(&input)?;

// Passes JSON as bytes through FFI

// Plugin deserializes JSON back to input
let input = PluginContext::from_json(input_json)?;

This avoids complex binary serialization and struct layout compatibility issues.

Error Handling Across FFI

The system uses integer error codes with JSON error details:

fn write_error_result(error: &str, buffer: *mut u8, capacity: u32, result_len: *mut u32) -> i32 {
    let error_json = serde_json::json!({
        "error": error,
        "status": "error"
    });

    // Write error JSON to buffer same as success case. Serializing a
    // json! value cannot fail, so unwrap is safe here.
    let json_str = serde_json::to_string(&error_json).unwrap();
    let bytes = json_str.as_bytes();
    let len = bytes.len().min(capacity as usize);
    unsafe {
        std::ptr::copy_nonoverlapping(bytes.as_ptr(), buffer, len);
        *result_len = len as u32;
    }

    -1  // Error status code
}

The host application checks the return code, then parses the buffer to get the result details (either success data or error information).

Limitations and Trade-offs

Performance

  • Creating a runtime per function call adds some overhead (most noticeable for high-frequency calls)
  • JSON serialization/deserialization on every call

Memory

  • Static metadata duplicated in every plugin library
  • Runtime creation allocates significant memory
  • Buffer copying instead of zero-copy approaches

Safety

  • Still requires unsafe for FFI boundary (though isolated to the generated code)
  • No automatic ABI compatibility guarantees - compatibility must be verified through version checks and symbol validation
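One lightweight form of such a check (the symbol name and constant here are hypothetical, not part of the pattern above) is a dedicated ABI-version export that the host validates before trusting any other symbol:

```rust
// Shared between host and plugins, e.g. in a common crate.
// Bump this whenever a #[repr(C)] struct or exported signature changes.
pub const PLUGIN_ABI_VERSION: u32 = 1;

// Generated into every plugin alongside the other exports.
#[no_mangle]
pub extern "C" fn plugin_abi_version() -> u32 {
    PLUGIN_ABI_VERSION
}
```

The host looks up `plugin_abi_version` first, compares the result against its own `PLUGIN_ABI_VERSION`, and refuses to call `get_plugin_metadata` or `execute_plugin_function` on a mismatch.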

Why This Pattern Works

Despite the trade-offs, this architecture provides several key benefits:

  1. Clear separation of concerns: FFI boundary isolates unsafe code to generated functions
  2. Async compatibility: Runtime bridge allows normal Rust async/await patterns
  3. Type safety: JSON provides schema validation and cross-language compatibility
  4. Compile-time validation: Macro system can validate dependencies and generate consistent interfaces
  5. Cross-platform compatibility: Standard C calling conventions work everywhere
  6. Plugin isolation: Each plugin runs in its own context with clear boundaries

The pattern scales well because:

  • Plugin authors write idiomatic Rust code
  • Host applications get a simple, consistent interface
  • The macro system handles all the complex FFI generation
  • Static data eliminates most memory management concerns
  • JSON provides a stable serialization format

Alternative: Interpreted Plugin Languages

Before diving into this Rust-to-Rust FFI approach, it’s worth considering an alternative pattern: using interpreted languages like Python as your plugin language. This involves embedding an interpreter (like Python via PyO3) in your Rust host application.

Rust FFI Plugins vs. Interpreted Plugins

Rust FFI Plugins (this pattern):

  • Performance: Near-native performance, compiled code
  • Safety: Compile-time validation, type safety
  • Distribution: Binary files, platform-specific compilation needed
  • Development: Requires Rust knowledge, full toolchain setup
  • Dynamic behavior: Limited - plugins are compiled

Python/Interpreted Plugins:

  • Performance: Slower due to interpretation overhead
  • Safety: Runtime validation, dynamic typing
  • Distribution: Plain text files, platform-independent
  • Development: More accessible, no compilation needed
  • Dynamic behavior: Highly dynamic - can modify behavior at runtime

When to Choose Each Pattern

Choose Rust FFI plugins when:

  • Performance is critical (high-frequency execution, computational work)
  • You want compile-time validation of plugin logic
  • Your plugin authors are comfortable with Rust
  • You need the plugin to integrate tightly with Rust’s type system
  • Plugin logic is relatively stable once written

Choose interpreted plugins when:

  • Developer experience and accessibility are priorities
  • Plugins need frequent modification without recompilation
  • Plugin logic is primarily business rules or simple data transformation
  • You want to allow non-Rust developers to write plugins
  • Rapid prototyping and iteration are important

Many systems benefit from a hybrid approach - use Rust FFI plugins for performance-critical components and interpreted plugins for business logic that changes frequently.

Applying This Pattern

Key considerations for your own implementation:

  • Design your plugin context/input types carefully - they define your API
  • Consider runtime sharing vs. isolation based on your performance needs
  • Plan your error handling strategy across the FFI boundary
  • Think about plugin discovery, loading, and lifecycle management
  • While macros can significantly improve the developer experience for Rust plugins, they are notoriously hard to build, debug, and maintain

The result is a system where plugin authors write normal async Rust code, while the host application can load and execute plugins dynamically with good safety guarantees and reasonable performance.