Skip to content

Synchronization

会出一道大题,据说比生产者和消费者还简单(什么

Background

  • 程序可能被打断
  • 同时访问共享的数据可能造成数据不一致性

例子:两个线程交替执行 counter ++,结果比 loop 数少很多

因为 counter++ 不是原子操作,在 ++ 过程中可能发生这种情况:alt text

Race Condition

  • Definition - Several processes (or threads) access and manipulate the same data concurrently and the outcome of the execution depends on the particular order in which the access takes place
  • Race Condition in Kernel - 两个进程同时 fork,会得到同一个 pid

Critial-Section

  • General structure of process alt text
  • 只有 Critial-Section 会 change common variables, update table, write file, etc.
  • 因此只能有一个 Critial-Section 在运行
    • when one process in critical section, no other may be in its critical section
    • each process must ask permission to enter critical section in entry section
    • the permission should be released in exit section
  • Critical-Section Handling in OS
    • Single-core system: preventing interrupts
    • Multiple-processor: preventing interrupts are not feasible
    • preemptive or non-preemptive
  • 三个要求(重要)
    • multual exclusive 互斥访问
    • Progress 空闲让进
    • bounded-waiting 有限等待

Peterson solution

solves two-processes/threads synchronization

  • It assumes that LOAD and STORE are atomic
  • Two processes share two variables
    • boolean flag[2]: whether a process is ready to enter the critical section
    • int turn: whose turn it is to enter the critical section

alt text alt text

  • 每种情况下都满足三个条件

多线程乱序情况下,结果可能不一致


Hardware Support for Synchronization

Uniprocessors: disable interrupts(多核只需要 disable 一个核的 int)

Memory Barriers(不要求掌握)

Hardware Instruction

Test-and-Set Instruction

  • atomically
  • Set and return old value
bool test_set (bool *target)
{
    bool rv = *target;
    *target = TRUE;
    return rv;
}

用这个原子操作防止打断:

bool lock = FALSE
do {
    while (test_set(&lock));   // busy wait
    critical section
    lock = FALSE;
    remainder section 
} while (TRUE);

不能满足 bounded-waiting alt text 改进:轮流传递锁

alt text


Compare-and-Swap Instruction

现实实现了

  • x86 汇编里的 xchg/ARM
  • 当旧值和预期相同(没被其他进程改动过),才赋予新值
  • 返回旧值
int compare_and_swap(int *value, int expected, int new_value)
{
    int temp = *value;
    if (*value == expected)
        *value = new_value;
    return temp;
}
  • 如果 store 前别的进程改动了数据,则当前进程的改动回滚
while (true)
{
    while (compare_and_swap(&lock, 0, 1) != 0)
    ; /* do nothing */
    /* critical section */
    lock = 0;
    /* remainder section */
}

Atomic Variables

  • increment(&sequence); - 保证自增时不被打断
void increment(atomic_int *v) {
    int temp;
    do {
        temp = *v;
    } while (temp != (compare_and_swap(v,temp,temp+1)));
}

Mutex Locks

等同于 spin lock

  • Protect a critical section by first acquire() a lock then release() the lock
  • Calls to acquire() and release() must be atomic (硬件实现)
  • 需要 busy waiting
while(true){
    acquire lock// busy waiting

    critical section// busy waiting

    release lock// busy waiting

    remainder section
}
definition
bool locked = false;
acquire() {
    while (compare_and_swap(&locked, false, true))
    ; //busy waiting
}
release() {
    locked = false;
} 

问题:太多的 spin 浪费 CPU 时间

t0 获得锁,schedule 让 t1 执行,t1 的时间片被用来 busy waiting

解决方法:让没有取得锁的进程 sleep,不放在 ready queue 里 - Semaphore

definition
bool locked = false;
acquire() {
    while (test_set(&flag, 1) == 1)
        yield(); // give up the CPU
}

Semaphore

  • 多了一个 waiting queue
  • When the lock is locked, change process’s state to SLEEP, add to the queue, and call schedule()
  • Contain S – integer variable,用来表示资源数
    • 只能通过两个不可见的原子操作 wait() and signal() (Originally called P() and V() Dutch) 访问
      wait(S) { 
          while (S <= 0) ; // busy wait
          S--;
      }
      signal(S) { 
          S++;
      }
      
  • 类型
    • Counting Semaphore - S 可以有很大范围
    • Binary Semaphore - S 只有 0/1(与互斥锁相同
  • Two operations:
    • Block - 将进程 sleep
    • wakeup - 将一个进程放入 ready queue
typedef struct { 
    int value; 
    struct list_head * waiting_queue; 
} semaphore;

void init(){
    flag = 0;
}

void lock(){

}

wait(semaphore *S){
    S->value--;
    if(S->value < 0){
        add this process to S->list;
        block();
    }
}

signal(semaphore *S){
    S->value++;
    if(S->value <= 0){
        remove a proc.P from S->list;
        wakeup(P);
    }
}

wait/signal 需要是 atomic 的(用 spinlock)

原因是多个进程 value++/-- 会出错

因此存在 busy waiting

用法
Semaphore sem; //  initialized to 1
do {
    wait (sem);  
    critical section // No busy waiting on critical section
    signal (sem); 
    remainder section
} while (TRUE);   //while loop but not busy waiting

T1.wait(&sem);
T0.signal(&sem);

Comparison between mutex and semaphore

  • Mutex or spinlock
    • no blocking
    • Waste CPU on looping
    • short critical section
  • Semaphore
    • no looping
    • context switch is time-consuming
    • long critical section

deadlock -> starvation

Priority Inversion

  • 先获得锁的低优先级的进程比高优先级的先执行
  • solution:priority inheritance
    • 把等待队列种的最高优先级暂时赋给该进程

Linux Synchronization

atomic integers

spinlock

read-write lock/condition variable - 不要求掌握

reproduce

semaphores

  • on single-cpu system, spinlocks replaced by enabling/disabling kernel preemption
  • down(), up()

reader-writer locks

POSIX - userspace 的实现


Example

有一道大题,程序填空

Bounded-Buffer Problem

问题:两个进程,生产者和消费者,共享 n 个 buffer

  • 生产者往 buffer 里放 item
  • 消费者从 buffer 中拿 item
  • 保证生产者不会往满的 buffer 里放东西,消费者不会从空的 buffer 里拿

变量定义

  • n buffers, each can hold one item
  • semaphore mutex initialized to the value 1
  • semaphore full-slots initialized to the value 0
  • semaphore empty-slots initialized to the value N
producer
do{
//produce an item
    ···
    wait(empty-slots);
    wait(mutex);
    //add the item to the buffer
    ...
    signal(mutex);
    signal(full-slots);
}while(true)
consumer
do {
    wait(full-slots);
    wait(mutex);
    //remove an item from  buffer
    
    signal(mutex);
    signal(empty-slots);
    //consume the item
    
} while (TRUE);

不能换 - death-sleep

拥有资源的程序被 sleep,另一个程序 busy waiting

Readers-Writers Problem

问题:数据集被多个进程共享

  • readers: only read the data set; they do not perform any updates
  • writers: can both read and write
  • 目的:允许多个 reader shared access,只允许一个 writer exclusive access

变量定义

  • semaphore mutex 初始化为 1
  • semaphore write 初始化为 1
  • integer readcounter 初始化为 0
write
do{
    wait(write);
    // write the shared data
    ...
    signal(write);
} while(true)
reader
do{
    wait(mutex);
    readcount++;
    if(readcount == 1)//first reader block write
        wait(write);// read 与 write 互斥
    signal(mutex);

    //reading data
    ...
    wait(mutex);
    readcount--;
    if(readcount == 0)
        signal(write);// 最后一个 release 的 reader 释放 write 锁
    signal(mutex);
}while(true);

不同的优先级

  • Reader first(上面的实现
  • Writer first - 新的读请求 pending

Dining-Philosophere Problem

模拟 multi-resource synchronization

哲学家围坐在餐桌旁,需要两个筷子吃饭,吃完了放下

  • 先后拿起筷子 alt text

Solution (assuming 5 philosophers):

  • semaphore chopstick[5] initialized to 1
do  { 
    wait(chopstick[i]);
    wait(chopstick[(i+1)%5]);
    eat
    signal(chopstick[i]);
    signal(chopstick[(i+1)%5]);
    think
} while (TRUE);
  • 如果同时拿起一边的筷子,会陷入死锁

这个比较难,不要求会解决


Deadlock

Deadlock problem

Deadlock: a set of blocked processes each holding a resource and waiting to acquire a resource held by another process in the set

Bridge Crossing Example

  • 只能单向通行,每个车道可以看成一个 resource
  • 死锁发生时,可以通过车辆掉头解决 - 资源抢占和回滚
  • 大多数操作系统不解决死锁 alt text
/* thread one runs in this function */
void*do_work_one(void*param){
    pthread_mutex_lock(&first_mutex);
    pthread_mutex_lock(&second_mutex);
    /* Do some work*/
    pthread_mutex_unlock(&second_mutex);
    pthread_mutex_unlock(&first_mutex);
    pthread_exit(0);
}
/* thread two runs in this function */
void*do_work_two(void*param){
    pthread_mutex_lock(&second_mutex);
    pthread_mutex_lock(&first_mutex);
    /* Do some work*/
    pthread_mutex_unlock(&first_mutex);
    pthread_mutex_unlock(&second_mutex);
    pthread_exit(0);
}

resource allocation graph: alt text


System model

死锁的四个条件

resource R1, R2, ..., Rm 分别有 W1, W2, ..., Wn 个

  • Mutual exclusion: a resource can only be used by one process at a time
  • Hold and wait: a process holding at least one resource is waiting to acquire additional resources held by other processes
  • No preemption: a resource can be released only voluntarily by the process holding it, after it has completed its task
  • Circular wait: there exists a set of waiting processes {P0, P1, …, Pn}
    • P0 is waiting for a resource that is held by P1
    • P1 is waiting for a resource that is held by P2 …
    • Pn–1 is waiting for a resource that is held by Pn
    • Pn is waiting for a resource that is held by P0

Resource-Allocation Graph

  • request edge: directed edge Pi ➞ Rj
  • assignment edge: directed edge Rj ➞ Pi

alt text

circular wait does not necessarily lead to deadlock

  • 没环一定没死锁
  • 有环时
    • 如果每个资源只有一个,一定死锁
    • 有多个,可能死锁

alt text

Handling deadlocks

Ensure that the system will never enter a deadlock state (prevention 或者 avoidance)

Allow the system to enter a deadlock state and then recover - database (detection 和 recovery)

ignore

deadlock prevention

打破死锁的四个条件

  • prevent mutual exclusion - 不请求共用资源
  • prevent hold and wait
    • 在运行前分配所有所需资源
    • 请求资源时不能有资源
    • 资源利用率低,会饥饿
  • handle no preemption
    • 请求资源 not available 时将所有该资源释放
    • preempted resources are added to the list of resources it waits for
    • process will be restarted only when it can get all waiting resources
  • handle circular wait
    • 为资源排序
    • 只能以升序请求资源
    • 很多 os 用这个

deadlock avoidance

require extra information about how resources are to be requested

  • Each process declares a max number of resources it may need
  • Deadlock-avoidance algorithm ensure there can never be a circular-wait condition
  • Resource-allocation state
    • the number of available and allocated resources
    • the maximum demands of the processes

safe state - for each Pi, resources that Pi can still request can be satisfied by currently available resources + resources held by all the Pj

  • safe state 一定不死锁
  • unsafe state 可能死锁
  • deadlock avoidance - 保证不进入 unsafe state

deadlock avoidance algorithm

  • Single instance of each resource type ➠ use resource-allocation graph
    • claim edge Pi ➞ Rj indicates that process Pi may request resource R
    • resources must be claimed a priori in the system alt text
  • Multiple instances of a resource type ➠ use the banker’s algorithm

Banker's Algorithm

alt text

deadlock detection

  • Maintain a wait-for graph, nodes are processes
  • Pi➞Pj if Pi is waiting for Pj

alt text

detect 到了环路 -> 死锁

算法:

  • 一个每次走一步,一个每次走两步
  • 走两步的遇到了前者,则存在环
  • \(O(n^2)\)

deadlock recovery

kill process

  • 把所有死锁环里的程序都 kill
  • 一次 kill 一个,看死锁什么时候消除
    • Select a victim
    • Rollback
    • 可能会 starvation