Thread

Definition

  • A thread is the basic unit of execution within a process
  • Each thread has its own
    • thread ID
    • program counter
    • register set
    • stack
  • Shared with the other threads of the same process
    • code section
    • data section
    • the heap (dynamically allocated memory)
    • open files and signals
  • A multithreaded process can do several things at once
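
A minimal sketch of the private/shared split, using Python's `threading` module purely for illustration (the names `shared_log`, `worker`, and `run_workers` are made up for this example). Each thread keeps its local variable on its own stack, while the list on the heap is visible to all of them.

```python
import threading

shared_log = []                 # heap data: visible to every thread
log_lock = threading.Lock()     # protects shared_log

def worker(tag):
    local_value = tag * 10      # local variable: private to this thread's stack
    with log_lock:
        shared_log.append((threading.get_ident(), local_value))

def run_workers(n=4):
    shared_log.clear()
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Every thread's contribution ends up in the single shared list.
    return sorted(value for _, value in shared_log)

if __name__ == "__main__":
    print(run_workers())
```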

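Advantages of Threads

  • Creating a thread is cheaper than creating a process
  • A context switch between threads is cheaper than one between processes
  • Threads share memory, so they do not need IPC to communicate
  • Powerful (but dangerous)
  • Responsiveness (the program can respond to many activities)
  • Scalability (better use of multicore machines)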

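Disadvantages

  • Weak isolation: if one thread segfaults, the whole program fails
  • This weakness motivates process-based concurrency
  • Memory-constrained: all threads must fit in the address space of a single process
  • Memory protection does not apply between threads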

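Challenges of Multithreading

  • Data dependencies and synchronization
  • Dividing tasks and data into activities that can be assigned to threads
  • Balancing load among threads
  • Testing and debugging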
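
The first three challenges can be sketched together in a few lines. This is an illustrative Python example, not a tuned implementation: `parallel_sum` is a hypothetical helper that splits the data into roughly equal slices (dividing work, balancing load) and uses a lock only around the shared update (synchronization). In CPython the global interpreter lock limits real parallelism for CPU-bound code, so the sketch shows the structure rather than a speedup.

```python
import threading

def parallel_sum(data, n_threads=4):
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        partial = sum(chunk)      # independent work on this thread's slice
        with lock:                # synchronize only the update of shared state
            total += partial

    # Split the data into roughly equal slices to balance the load.
    size = max(1, (len(data) + n_threads - 1) // n_threads)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    threads = [threading.Thread(target=work, args=(chunk,)) for chunk in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

if __name__ == "__main__":
    print(parallel_sum(list(range(1000))))
```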

User Threads vs. Kernel Threads

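  • Threads can be supported purely in user mode, managed by a user-level thread library (e.g., Java Green Threads)
  • Threads can also be supported by the kernel, which then needs the data structures and functionality for them
    • Linux does not distinguish between threads and processes in its data structures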

Many-to-One Model

  • Many user threads map to one kernel thread
  • Drawbacks: it cannot take advantage of a multicore architecture, and if one thread blocks, none of the others can run
  • Examples: Java Green Threads, GNU Portable Threads

One-to-One Model

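The most widely used model today.

It is simple, and hardware is cheap.

  • Each user thread maps to one kernel thread
  • Removes the drawbacks of the Many-to-One Model
  • Resource-hungry: creating a new thread means creating a new kernel thread, which takes more time
  • A context switch involves the corresponding kernel space (tk)
  • Linux, Windows, and Solaris 9 and later all use this model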

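Other, Less Common Models

  • Many-to-Many - when one thread blocks, a new kernel thread can be created so that the other threads are not stalled; creating a user thread does not necessarily require a new kernel thread
  • Two-Level - The user can say: “Bind this thread to its own kernel thread”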

Thread Library

  • Gives programmers a way to create threads in their own programs
  • In C/C++: pthreads and Win32 threads - implemented by the kernel
    • pthreads - Specification, not implementation
  • In C/C++: OpenMP
    • Identifies parallel regions
    • #pragma omp parallel creates as many threads as there are cores
  • Java Thread
    • Implemented by the JVM, which tracks thread state and does the scheduling
    • Old versions of the JVM used Green Threads (which cannot use multiple cores)
    • The JVM now provides native threads (mapped to kernel threads)

Threading Issues

Semantics of fork() & exec() system calls

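  • In a multithreaded process, a call to fork() can mean one of two things
    • A new process is created with a single thread (a copy of the thread that called fork())
    • A new process is created with copies of all the threads of the original process
  • Some OSes provide both options (Linux uses the first)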
  • If one calls exec() after fork(), all threads are “wiped out” anyway
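
The first option can be observed directly on a Unix system. The sketch below is illustrative only: `count_threads` and `threads_after_fork` are hypothetical helpers, and the thread count comes from `/proc/self/task` when that directory exists (Linux), otherwise from Python's own bookkeeping. Recent Python versions may print a warning when `os.fork()` is called while other threads are running.

```python
import os
import threading

def count_threads():
    """Threads in the calling process (Linux /proc if present, else Python's count)."""
    try:
        return len(os.listdir("/proc/self/task"))
    except OSError:
        return threading.active_count()

def threads_after_fork(n_workers=3):
    stop = threading.Event()
    workers = [threading.Thread(target=stop.wait) for _ in range(n_workers)]
    for w in workers:
        w.start()
    parent_count = count_threads()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only the thread that called fork() exists here.
        try:
            os.write(write_fd, str(count_threads()).encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    child_count = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)

    stop.set()                      # let the parked worker threads finish
    for w in workers:
        w.join()
    return parent_count, child_count

if __name__ == "__main__":
    print(threads_after_fork())
```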

Signal handling

  • In a multithreaded process, there are several options for where a signal is delivered
    • Deliver the signal to the thread to which the signal applies
    • Deliver the signal to every thread in the process
    • Deliver the signal to certain threads in the process
    • Assign a specific thread to receive all signals
  • Most UNIX versions: a thread can say which signals it accepts and which signals it doesn’t accept
  • On Linux - tricky
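
A small sketch of a thread choosing which signals it accepts, using the Unix-only functions in Python's `signal` module. The helper name `block_then_receive` and the choice of `SIGUSR1` are assumptions made for this example. The calling thread blocks the signal in its own mask, the signal is then directed at that thread, and it stays pending until the thread accepts it with `sigwait`.

```python
import signal
import threading

def block_then_receive(sig=signal.SIGUSR1):
    # Block the signal in the calling thread only; other threads keep their own masks.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {sig})
    try:
        # Direct the signal at this specific thread.
        signal.pthread_kill(threading.get_ident(), sig)
        was_pending = sig in signal.sigpending()   # blocked, so it stays pending
        received = signal.sigwait({sig})           # accept it synchronously
        return was_pending, received
    finally:
        # Restore the mask the thread had before.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)

if __name__ == "__main__":
    print(block_then_receive())
```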

Thread cancellation of target thread

  • Asynchronous - One thread terminates another immediately
  • Deferred - A thread periodically checks whether it should terminate
  • If a thread has cancellation disabled (off), cancellation remains pending until the thread enables it
  • Cancellation only occurs when the thread reaches a cancellation point (pthread_testcancel() in pthreads); then the cleanup handler is invoked
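
Python has no equivalent of pthread cancellation, so the following is only an analogy for the deferred style: a hypothetical `run_cancellable` helper in which the target thread polls a flag at one well-defined point (standing in for a cancellation point) and a `finally` block plays the role of the cleanup handler.

```python
import threading

def run_cancellable():
    cancel_requested = threading.Event()
    started = threading.Event()
    events = []

    def target():
        try:
            started.set()
            while True:
                # "Cancellation point": the only place the request is checked.
                if cancel_requested.wait(timeout=0.01):
                    events.append("cancelled")
                    return
        finally:
            events.append("cleanup")    # stands in for a cleanup handler

    t = threading.Thread(target=target)
    t.start()
    started.wait()
    cancel_requested.set()              # a request; the thread is not killed
    t.join()
    return events

if __name__ == "__main__":
    print(run_cancellable())
```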

Operating System Examples

  • Windows Threads
  • Linux Threads
    • The clone() syscall creates threads and processes
    • Shares execution context with its parent
    • For a multithreaded process, the PID is the ID of the leading thread; the other threads are kept in a linked list
  • User thread to kernel thread mapping
I did not understand the last point; it does not seem important.


Inter-Process Communications(IPCs)

Processes can be independent or cooperating.

Why cooperate: information sharing, computation speedup, modularity, convenience.

Real-world example: Chrome

Models of IPC

  • Message passing
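    • Suited to small amounts of data
    • Easy to implement
    • Sometimes cumbersome for the user, because send/recv operations end up scattered through the code
    • High overhead: every operation requires a syscall
  • Shared memory
    • Low overhead: syscalls are needed only for the initial setup, not afterwards
    • Simple for the user (just read and write RAM)
    • Hard to implement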
  • Signal
  • Pipe
  • Socket

Most OSes use both.


Shared Memory

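  • Processes create a shared memory region (segment) and attach it to their address spaces (this runs counter to the principle of memory isolation)
  • They exchange information by reading and writing this region (the processes themselves, not the OS, are responsible for avoiding conflicts)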
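
A minimal sketch of the create/attach/read-write cycle on a Unix system, using Python's `multiprocessing.shared_memory` (Python 3.8+) together with `os.fork()`. The helper name `share_bytes`, the 64-byte segment size, and the use of `waitpid` as the only synchronization are assumptions made for this example; the message must fit in the segment.

```python
import os
from multiprocessing import shared_memory

def share_bytes(message=b"hello from child"):
    # Parent creates the segment; it is mapped into the parent's address space.
    segment = shared_memory.SharedMemory(create=True, size=64)
    try:
        pid = os.fork()
        if pid == 0:
            try:
                # Child attaches to the same segment by name and writes into it.
                view = shared_memory.SharedMemory(name=segment.name)
                view.buf[:len(message)] = message
                view.close()
            finally:
                os._exit(0)
        os.waitpid(pid, 0)    # crude synchronization: wait until the writer is done
        return bytes(segment.buf[:len(message)])
    finally:
        segment.close()       # detach from this process
        segment.unlink()      # remove the segment itself

if __name__ == "__main__":
    print(share_bytes())
```

Once the segment is set up, the data moves with plain memory reads and writes; the OS does nothing to order them, so the processes have to coordinate on their own.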

Example - Bounded buffer
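
A sketch of a bounded buffer as a circular array with an in index and an out index. For brevity it runs a producer and a consumer as threads in one process rather than as two processes attached to a shared segment, and it uses a condition variable for waiting. The names `BoundedBuffer` and `produce_consume` are made up for this example.

```python
import threading

class BoundedBuffer:
    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.capacity = capacity
        self.in_idx = 0       # next free slot
        self.out_idx = 0      # next full slot
        self.count = 0        # number of full slots
        self.cond = threading.Condition()

    def put(self, item):
        with self.cond:
            while self.count == self.capacity:    # buffer full: producer waits
                self.cond.wait()
            self.slots[self.in_idx] = item
            self.in_idx = (self.in_idx + 1) % self.capacity
            self.count += 1
            self.cond.notify_all()

    def get(self):
        with self.cond:
            while self.count == 0:                # buffer empty: consumer waits
                self.cond.wait()
            item = self.slots[self.out_idx]
            self.out_idx = (self.out_idx + 1) % self.capacity
            self.count -= 1
            self.cond.notify_all()
            return item

def produce_consume(n_items=20, capacity=4):
    buf = BoundedBuffer(capacity)
    consumed = []

    def producer():
        for i in range(n_items):
            buf.put(i)

    def consumer():
        for _ in range(n_items):
            consumed.append(buf.get())

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed

if __name__ == "__main__":
    print(produce_consume())
```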

POSIX


Message Passing

  • Two fundamental operations:
    • send (P, message) – send a message to process P
    • receive(Q, message) – receive a message from process Q
  • Steps
    • Establish a “link” (this can be implemented in several ways)
    • Calls to send() and recv()
    • The “link” can then be closed
  • Message passing is key for distributed computing (different hosts cannot share memory)
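
The three steps can be sketched with a connected socket pair between a parent and a forked child on a Unix system. This is one possible way to implement the link, chosen here for illustration; `ping_pong` is a hypothetical helper and the messages are assumed to be small enough to arrive in one `recv` call.

```python
import os
import socket

def ping_pong(message=b"ping"):
    # Step 1: establish the link (a connected pair of sockets).
    parent_end, child_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        try:
            parent_end.close()
            request = child_end.recv(1024)        # receive(parent, message)
            child_end.sendall(request.upper())    # send(parent, message)
            child_end.close()
        finally:
            os._exit(0)
    child_end.close()
    # Step 2: exchange messages with send and recv.
    parent_end.sendall(message)
    reply = parent_end.recv(1024)
    # Step 3: close the link.
    parent_end.close()
    os.waitpid(pid, 0)
    return reply

if __name__ == "__main__":
    print(ping_pong())
```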

Physical:

  • Shared memory
  • Hardware bus
  • Network

Logical:

  • Direct or indirect
    • Direct - the sending and receiving processes must be named explicitly
    • indirect - mailbox (ports)
  • Synchronous or asynchronous
    • Blocking is considered synchronous
    • Non-blocking is considered asynchronous
  • Automatic or explicit buffering
    • Zero capacity/Bounded capacity/Unbounded capacity
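
Blocking versus non-blocking operations on a bounded buffer can be tried out with Python's `queue.Queue`, used here as a stand-in for a mailbox inside one process (the helper name `bounded_mailbox_demo` is made up). A `maxsize` of 0 would make the queue unbounded; a true zero-capacity rendezvous is not something `queue.Queue` provides.

```python
import queue

def bounded_mailbox_demo():
    mailbox = queue.Queue(maxsize=2)      # bounded capacity: at most 2 queued messages
    mailbox.put("m1")                     # blocking send; returns at once, there is room
    mailbox.put("m2")
    try:
        mailbox.put_nowait("m3")          # non-blocking send on a full buffer
        overflow = False
    except queue.Full:
        overflow = True                   # the sender is told instead of being blocked

    received = [mailbox.get(), mailbox.get()]   # blocking receives
    try:
        mailbox.get_nowait()              # non-blocking receive on an empty buffer
        empty = False
    except queue.Empty:
        empty = True
    return overflow, received, empty

if __name__ == "__main__":
    print(bounded_mailbox_demo())
```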

Examples

  • Signals are a UNIX form of IPC
  • Pipes
    • Ordinary pipes (anonymous pipes in Windows) - the two processes must be parent and child to communicate through the pipe
    • Named pipes
    • A UNIX pipe is unidirectional. The command ls | grep foo creates two processes that communicate via a pipe
      • The ls process writes on the write-end
      • The grep process reads on the read-end
  • Client-Server
    • Socket = ip address + port number
  • Remote Procedure Calls (RPCs) - done by a client stub
  • RMI (JAVA)

Messages are directed and received from mailboxes (also referred to as ports)

  • Each mailbox has a unique id
  • Processes can communicate only if they share a mailbox

Pipe: the parent passes data to the child

  • A process has at least one thread
  • share memory