felixcarmona

Understanding fork(), vfork(), exec(), clone(), and more

September 27, 2024

If you’ve been working with processes in Linux, you’ve probably heard of fork(), vfork(), exec(), and clone(). These are just some of the tools Linux gives us to create and manage processes, and each one does things a little differently. Understanding how they work can really help you make your code more efficient.

In this post, I’ll break them down one by one with examples, so you can see how they work and when it makes sense to use them. Whether you’re running tasks in parallel, or just trying to save some memory, knowing how to handle processes better is always useful.

fork()

Creates a new process that’s a copy of the parent but runs independently.

If you’re writing software that spawns other processes—like servers or daemons—fork() is a basic tool. It duplicates the current process (the parent) to create a new one (the child). The two processes run independently, although they start from the same code. The parent process receives the child’s PID as the return value, while the child gets 0. If something goes wrong, the parent receives an error code, and no child is created.

Example:

#include <stdio.h>
#include <unistd.h>
 
int main() {
    pid_t pid = fork();
 
    if (pid == 0) {
        printf("This is the child process.\n");
    } else if (pid > 0) {
        printf("This is the parent process. Child PID: %d\n", pid);
    } else {
        printf("Fork failed!\n");
    }
 
    return 0;
}

vfork()

Temporarily shares memory between parent and child until exec() or _exit() is called.

If you’re working in environments with limited memory (like embedded systems), vfork() can be useful. It optimizes memory usage by sharing the parent’s memory space temporarily with the child. This call was designed for situations where you immediately follow fork() with exec(). The parent process is suspended until the child calls exec() or _exit(). While modern memory management techniques have reduced the need for vfork(), it still has its place in resource-constrained environments.

Example:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
 
int main() {
    pid_t pid = vfork();
 
    if (pid == 0) {
        printf("Child: Calling exec...\n");
        execl("/bin/ls", "ls", NULL);
        _exit(0);  // Use _exit() to avoid corrupting parent's stack
    } else if (pid > 0) {
        printf("Parent: Waiting for child to exec...\n");
    } else {
        printf("vfork failed!\n");
    }
 
    return 0;
}

exec()

Replaces the current process with a new program.

If you’re dealing with process creation, exec() is something you’ll probably use after a fork(). It doesn’t create a new process but replaces the current one with a new program. Once exec() is called, the old program is gone, and the new one starts running. This call is fundamental for running new programs, making it essential for anyone developing tools like shells or systems that manage multiple programs.

Example:

#include <stdio.h>
#include <unistd.h>
 
int main() {
    printf("Running exec example...\n");
    execl("/bin/ls", "ls", NULL);
    printf("This line will never be printed if exec() succeeds.\n");
 
    return 0;
}

clone()

Allows fine-grained control over which parts of the parent process are shared with the child.

If you’re building multithreaded applications, clone() is the system call that underpins many thread creation libraries, like pthread_create(). With clone(), you have more control over what gets shared between the parent and child, like memory, file descriptors, or signal handlers. This is vital when working on applications that need efficient task management, such as web servers or high-performance computing systems.

Example:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
 
int child_func(void *arg) {
    printf("Inside child process\n");
    return 0;
}
 
int main() {
    const int STACK_SIZE = 1024 * 1024;
    void *stack = malloc(STACK_SIZE);
 
    if (!stack) {
        perror("malloc");
        exit(1);
    }
 
    pid_t pid = clone(child_func, stack + STACK_SIZE, SIGCHLD, NULL);
 
    if (pid == -1) {
        perror("clone");
        exit(1);
    }
 
    printf("Parent: Child PID: %d\n", pid);
    free(stack);
 
    return 0;
}

posix_spawn()

Combines the functionality of fork() and exec() into a single call.

If you’re working in embedded systems or platforms without a Memory Management Unit (MMU), posix_spawn() offers an efficient way to create a new process and run a program in one call. It’s a simpler, more portable option compared to fork(), especially when you need to conserve memory and CPU cycles. It’s often used in systems where performance matters more than flexibility.

Example:

#include <stdio.h>
#include <spawn.h>
#include <sys/wait.h>
#include <stdlib.h>
 
int main() {
    pid_t pid;
    char *argv[] = {"/bin/ls", NULL};
    extern char **environ;
 
    if (posix_spawn(&pid, "/bin/ls", NULL, NULL, argv, environ) == 0) {
        printf("Spawned child with PID: %d\n", pid);
        waitpid(pid, NULL, 0);
    } else {
        printf("posix_spawn failed!\n");
    }
 
    return 0;
}

pthread_create()

Creates a new thread that shares the same memory space with the parent.

If you’re dealing with concurrency or parallel programming, pthread_create() is your go-to. It’s used to create threads, which are lighter than processes and share the same memory space. This is key for writing efficient, multithreaded applications, especially in performance-critical areas like gaming, databases, or server applications.

Example:

#include <stdio.h>
#include <pthread.h>
 
void *thread_func(void *arg) {
    printf("Hello from the thread!\n");
    return NULL;
}
 
int main() {
    pthread_t thread;
 
    if (pthread_create(&thread, NULL, thread_func, NULL) != 0) {
        printf("Error creating thread\n");
        return 1;
    }
 
    pthread_join(thread, NULL);
    printf("Thread finished executing\n");
 
    return 0;
}

Share on 𝕏


Get updates on new posts