Concurrency issues
Modern operating systems support multiple concurrent tasks. While concurrent execution improves resource utilization, it also introduces contention for shared resources. Consider, for example, the assembly code that a compiler generates for the C statement "count++;" when optimization is disabled.
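The exact output depends on the compiler, but without optimization the increment is typically split into a load, an increment, and a store. The following listing is a sketch of such a sequence (Intel syntax; the choice of eax is an assumption, matching the walkthrough below):

mov eax, [count]    ; load the value of count from memory into eax
inc eax             ; increment the copy held in the register
mov [count], eax    ; write the new value back to memory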
When the operating system executes this code in multiple processes at the same time, concurrency problems can arise.
Suppose count is initialized to 0. Process 1 executes "mov eax, [count]" and loads the value 0 into its eax register. At this moment process 2 is scheduled and preempts process 1, taking control of the CPU. Process 2 runs the complete assembly sequence for "count++;": it increments the value to 1 and writes it back to memory. Then process 1 is scheduled again and regains the CPU. It continues where it left off, increments the stale value 0 it had already loaded to 1, and writes 1 back to memory. Although processes 1 and 2 together performed the "count++;" operation twice, the final value of count in memory is 1, not 2!
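The same lost-update effect is easy to reproduce in user space with threads instead of processes. The following sketch (thread count and iteration count are arbitrary choices for illustration) increments an unprotected counter from two threads; on a multiprocessor the printed total is usually well below 2000000:

#include <pthread.h>
#include <stdio.h>

static int count = 0;                /* shared, unprotected counter */

static void *worker(void *arg)
{
    for (int i = 0; i < 1000000; i++)
        count++;                     /* non-atomic read-modify-write */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("count = %d\n", count);   /* expected 2000000, usually less */
    return 0;
}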
Uniprocessor atomic operations
One way to solve this problem is to have the "count++;" statement translated into a single instruction.
The Intel x86 instruction set supports the inc instruction with a memory operand, so "count++;" can be completed by a single instruction. Because a process context switch always takes place on an instruction boundary, that is, only after the current instruction has finished, the interleaving described above cannot occur. On a uniprocessor, a single instruction is therefore an atomic operation.
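As an illustration (not taken from the kernel source; the function name single_inc is invented), GCC inline assembly can be used to force the increment into one inc instruction on the memory operand:

/* Increment *p with a single "incl" on the memory operand.
 * Safe against preemption on a uniprocessor, but not against
 * other processors: there is no lock prefix here. */
static inline void single_inc(int *p)
{
    __asm__ __volatile__("incl %0" : "+m" (*p));
}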
Multiprocessor atomic operations
However, in a multiprocessor environment such as an SMP architecture, this conclusion no longer holds. Internally, the execution of "inc [count]" is divided into three steps:
1) Read the value of count from memory into the CPU.
2) Increment the value that was read.
3) Write the modified value back to count in memory.
This brings us back to a situation similar to the earlier concurrency problem, except that the concurrent actors are now processors rather than processes.
For this case the Intel x86 instruction set provides the lock instruction prefix, which locks the front-side bus (FSB) so that other processors cannot interfere while the instruction executes.
With the lock prefix, concurrent accesses (reads and writes) to the count memory location by other processors are blocked while the instruction runs, which guarantees that the instruction is atomic.
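For comparison, in user-space C the same effect can be obtained with GCC's __atomic builtins, which on x86 typically compile down to a lock-prefixed instruction. This is a sketch under that assumption, not code from the kernel:

#include <stdio.h>

static int count = 0;

static void atomic_count_inc(void)
{
    /* On x86 this usually compiles to a lock-prefixed add/inc,
     * giving the same guarantee as the lock prefix discussed above. */
    __atomic_fetch_add(&count, 1, __ATOMIC_SEQ_CST);
}

int main(void)
{
    atomic_count_inc();
    printf("count = %d\n", count);
    return 0;
}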
Atomic operations on x86
The Linux source code defines the atomic type for x86 in the following file:
linux-2.6/include/asm-i386/atomic.h
This file defines the atomic type atomic_t, which contains a single field, counter, holding a 32-bit value.
typedef struct {volatile int counter;} atomic_t;
The atomic increment is performed by the function atomic_inc.
/**
 * atomic_inc - increment atomic variable
 * @v: pointer of type atomic_t
 *
 * Atomically increments @v by 1.
 */
static __inline__ void atomic_inc(atomic_t *v)
{
        __asm__ __volatile__(
                LOCK "incl %0"
                : "=m" (v->counter)
                : "m" (v->counter));
}
The LOCK macro is defined as follows:
#ifdef CONFIG_SMP
#define LOCK "lock;"
#else
#define LOCK ""
#endif
So under a symmetric multiprocessor configuration, LOCK expands to the lock instruction prefix, whereas on a uniprocessor it expands to nothing.
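A typical use of atomic_t inside kernel code looks roughly like the sketch below (the names pkt_count, on_packet and read_packet_count are invented for illustration):

#include <asm/atomic.h>

static atomic_t pkt_count = ATOMIC_INIT(0);   /* atomic counter, starts at 0 */

void on_packet(void)
{
    atomic_inc(&pkt_count);          /* safe on both UP and SMP */
}

int read_packet_count(void)
{
    return atomic_read(&pkt_count);  /* read the current value */
}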
Atomic operations on ARM
The ARM instruction set has no lock instruction prefix, so how does it implement atomic operations?
The Linux source code defines the ARM atomic operations in the following file:
linux-2.6/include/asm-arm/atomic.h
The atomic add is implemented by the function atomic_add_return.
static inline int atomic_add_return(int i, atomic_t *v)
{
        unsigned long tmp;
        int result;

        __asm__ __volatile__("@ atomic_add_return\n"
"1:     ldrex   %0, [%2]\n"
"       add     %0, %0, %3\n"
"       strex   %1, %0, [%2]\n"
"       teq     %1, #0\n"
"       bne     1b"
        : "=&r" (result), "=&r" (tmp)
        : "r" (&v->counter), "Ir" (i)
        : "cc");

        return result;
}
In essence, the embedded assembly above has the following form:
1:
    ldrex   result, [v->counter]
    add     result, result, i
    strex   temp, result, [v->counter]
    teq     temp, #0
    bne     1b
The ldrex instruction loads the value of v->counter into result and sets the global "Exclusive" tag.
The add instruction computes result + i and stores the sum back into result.
The strex instruction first checks whether the global "Exclusive" tag is still set. If it is, it writes result back to v->counter, sets temp to 0 and clears the "Exclusive" tag; otherwise it performs no store and simply sets temp to 1.
The teq instruction tests whether temp is 0.
The bne instruction jumps back to label 1 if temp is not 0; the suffix b indicates a backward jump.
Overall, the assembly code keeps retrying the operation "v->counter += i" until temp ends up as 0.
Can the ldrex and strex instructions really guarantee that the add is atomic? Suppose two processes concurrently execute the "ldrex + add + strex" sequence. Process 1 executes ldrex and sets the global "Exclusive" tag. At that point the scheduler switches to process 2; the "Exclusive" tag was already set before its ldrex, which simply sets it again. Process 2 then executes add and strex and completes its accumulation, clearing the tag. When execution switches back to process 1, it executes its add, but when it reaches strex the "Exclusive" tag has already been cleared by process 2, so no store is performed and temp is set to 1. The following teq finds that temp is not 0, so execution jumps back to the beginning and repeats the sequence, and the accumulation eventually completes. Thus ldrex and strex can ensure synchronization between processes. The same reasoning applies across multiple processors, because the ARM atomic operations rely only on the "Exclusive" tag and do not need to lock the front-side bus.
Before ARMv6, atomic data exchange was implemented with the swp instruction, which locked the bus and hurt system performance. From ARMv6 onward, ldrex and strex are generally used in place of swp.
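The retry pattern above can be mimicked in portable C with a compare-and-swap loop. The sketch below (the function name my_atomic_add_return is invented; this is not the kernel implementation) relies on GCC's __atomic builtins, which on ARM are typically lowered to an ldrex/strex loop much like the one shown:

/* Add i to *counter and return the new value, retrying until the
 * update is applied without interference (analogous to ldrex/strex). */
static inline int my_atomic_add_return(int i, int *counter)
{
    int oldval, newval;
    do {
        oldval = __atomic_load_n(counter, __ATOMIC_RELAXED);
        newval = oldval + i;
        /* Store newval only if *counter still equals oldval;
         * otherwise another CPU or process interfered, so retry. */
    } while (!__atomic_compare_exchange_n(counter, &oldval, newval,
                                          1 /* weak */,
                                          __ATOMIC_SEQ_CST,
                                          __ATOMIC_RELAXED));
    return newval;
}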
Atomic operations in spinlocks
The Linux source code defines the x86 spinlock in the following file:
linux-2.6/include/asm-i386/spinlock.h
The function __raw_spin_lock performs the spinlock locking operation.
#define __raw_spin_lock_string \
        "\n1:\t" \
        "lock; decb %0\n\t" \
        "jns 3f\n" \
        "2:\t" \
        "rep; nop\n\t" \
        "cmpb $0, %0\n\t" \
        "jle 2b\n\t" \
        "jmp 1b\n" \
        "3:\n\t"

static inline void __raw_spin_lock(raw_spinlock_t *lock)
{
        __asm__ __volatile__(
                __raw_spin_lock_string
                : "=m" (lock->slock) : : "memory");
}
In essence, the code above has the following assembly form:
1:
    lock decb [lock->slock]
    jns 3f
2:
    rep nop
    cmpb $0, [lock->slock]
    jle 2b
    jmp 1b
3:
The lock->slock field is initialized to 1. The atomic decb instruction decrements it to 0; the sign flag is then 0, so the jns instruction jumps to label 3 and the spinlock is acquired.
If the spinlock is requested again while it is held, the atomic decb brings lock->slock to -1. The sign flag is now 1, so jns does not jump and execution falls through to label 2. After the rep; nop pause, lock->slock is compared with 0: if it is still less than or equal to 0, the code loops back to label 2 and keeps spinning; otherwise it jumps back to label 1 and tries to acquire the spinlock again, repeating until it succeeds.
When the spinlock is released, lock->slock is set back to 1, so that another process can acquire it.
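In kernel code the raw primitive is normally reached through the spin_lock()/spin_unlock() wrappers. A minimal usage sketch (the lock and counter names are invented) might look like this:

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(counter_lock);   /* statically initialized spinlock */
static int shared_counter;

void bump_counter(void)
{
    spin_lock(&counter_lock);           /* spins until the lock is acquired */
    shared_counter++;                   /* critical section */
    spin_unlock(&counter_lock);         /* sets slock back to 1 */
}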
Atomic operations in semaphores
The Linux source code defines the x86 semaphore in the following file:
linux-2.6/include/asm-i386/semaphore.h
Acquiring a semaphore is implemented by the function down.
/*
 * This is ugly, but we want the default case to fall through.
 * "__down_failed" is a special asm handler that calls the C
 * routine that actually waits. See arch/i386/kernel/semaphore.c
 */
static inline void down(struct semaphore *sem)
{
        might_sleep();
        __asm__ __volatile__(
                "# atomic down operation\n\t"
                LOCK "decl %0\n\t"      /* --sem->count */
                "js 2f\n"
                "1:\n"
                LOCK_SECTION_START("")
                "2:\tlea %0, %%eax\n\t"
                "call __down_failed\n\t"
                "jmp 1b\n"
                LOCK_SECTION_END
                : "=m" (sem->count)
                :
                : "memory", "ax");
}
In essence, the assembly form is:
    lock decl [sem->count]
    js 2f
1:
<========== another section ==========>
2:
    lea eax, [sem->count]
    call __down_failed
    jmp 1b
The semaphore field sem->count is usually initialized to a positive integer. When the semaphore is requested, the atomic decl instruction decrements sem->count by 1. If the result is negative (the sign flag is 1), execution jumps to label 2 in the other section; otherwise the semaphore has been acquired successfully.
Label 2 is assembled into a separate section. On entering label 2, the lea instruction loads the address of sem->count into the eax register as the argument, and then __down_failed is called to handle the failed acquisition, adding the process to the wait queue. Finally execution jumps back to label 1, where the semaphore acquisition path ends.
Releasing the semaphore is implemented by the function up.
/*
 * Note! This is subtle. We jump to wake people up only if
 * the semaphore was negative (== somebody was waiting on it).
 * The default case (no contention) will result in NO
 * jumps for both down() and up().
 */
static inline void up(struct semaphore *sem)
{
        __asm__ __volatile__(
                "# atomic up operation\n\t"
                LOCK "incl %0\n\t"      /* ++sem->count */
                "jle 2f\n"
                "1:\n"
                LOCK_SECTION_START("")
                "2:\tlea %0, %%eax\n\t"
                "call __up_wakeup\n\t"
                "jmp 1b\n"
                LOCK_SECTION_END
                ".subsection 0\n"
                : "=m" (sem->count)
                :
                : "memory", "ax");
}
In essence, the assembly form is:
    lock incl [sem->count]
    jle 2f
1:
<========== another section ==========>
2:
    lea eax, [sem->count]
    call __up_wakeup
    jmp 1b
When the semaphore is released, the atomic incl instruction increments sem->count by 1. If the result is still less than or equal to 0, the wait queue contains blocked processes that need to be woken up, so execution jumps to label 2; otherwise the semaphore has been released successfully.
Label 2 is assembled into a separate section. On entering label 2, the lea instruction loads the address of sem->count into the eax register as the argument, and then __up_wakeup is called to wake up a process from the wait queue. Finally execution jumps back to label 1 and the semaphore release completes.
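Typical driver code pairs down() with up() around a critical section. The sketch below assumes the 2.6-era asm/semaphore.h interface; the names dev_sem and dev_write_config are invented for illustration:

#include <asm/semaphore.h>

static DECLARE_MUTEX(dev_sem);   /* counting semaphore initialized to 1 */

void dev_write_config(void)
{
    down(&dev_sem);              /* may sleep until the semaphore is free */
    /* ... critical section: update shared device state ... */
    up(&dev_sem);                /* release; wakes a waiter if one is blocked */
}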
To sum up
Starting from the concurrency problems that arise inside an operating system, this article examined how atomic operations are implemented, discussed the Linux implementations of atomic operations on different architectures, and finally described how Linux uses atomic operations to build common process synchronization mechanisms such as spinlocks and semaphores. I hope it is helpful.