Improving IPC by Kernel Design

Paper Information

Links: http://sci-hub.cc/10.1145/173668.168633
Author: Jochen Liedtke
German National Research Center for Computer Science (GMD)
Citation: Jochen Liedtke. 1993. Improving IPC by kernel design. ACM SIGOPS Operating Systems Review 27, 5 (January 1993), 175–188. DOI:http://dx.doi.org/10.1145/173668.168633

Outline

Abstract

1 The IPC Dilemma
2 Related Work
3 L3 – The Workbench
4 Principles and Methods
5 A Concrete Design
5.1 Performance Objective
5.2 Architectural Level
5.2.1 System Calls
5.2.2 Messages
5.2.3 Direct Transfer by Temporary Mapping
5.2.4 Strict Process Orientation
5.2.5 Control Blocks as Virtual Objects
5.3 Algorithmic Level
5.3.1 Thread Identifier
5.3.2 Handling Virtual Queues
5.3.3 Timeouts and Wakeups
5.3.4 Lazy Scheduling
5.3.5 Direct Process Switch
5.3.6 Short Messages via Registers
5.4 Interface Level
5.4.1 Avoiding Unnecessary Copies
5.4.2 Parameter Passing
5.5 Coding Level
5.5.1 Reducing Cache Misses
5.5.2 Minimizing TLB Misses
5.5.3 Segment Registers
5.5.4 General Registers
5.5.5 Avoiding Jumps and Checks
5.5.6 Process Switch
5.6 Summary of Techniques
6 Results
7 Remarks
7.1 Introducing Ports
7.2 Dash-like Message Passing
7.3 Cache
7.4 Processor Dependencies
8 Conclusions

Acknowledgements

Abstract

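The paper argues that IPC has to be fast and efficient; otherwise programmers will find it hard to make proper use of RPC, multithreading, and multitasking. IPC performance is especially critical for microkernels, yet most microkernel IPC implementations perform poorly: on a 50 MHz machine, transferring a short message can take as long as 100 µs.

In contrast, the paper presents an approach that reaches 5 µs, a twentyfold improvement.

The paper describes both the methods and the principles behind them. The approach cannot be reduced to a single trick; design and implementation have to be improved together. Using the L3 kernel as its example, the paper shows that the approach improves IPC performance substantially: compared with Mach, L3 is 22 times faster for 8-byte messages and 3 times faster for 4-Kbyte messages.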
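To put the abstract's figures in perspective, the short sketch below converts the quoted latencies into clock cycles at 50 MHz and recomputes the speedup. Only the 100 µs, 5 µs, and 50 MHz values come from the abstract; the helper name `cycles` is an illustrative assumption.

```python
CLOCK_MHZ = 50  # clock rate quoted in the abstract


def cycles(latency_us: int, clock_mhz: int = CLOCK_MHZ) -> int:
    """Convert a latency in microseconds into clock cycles.

    1 MHz is one cycle per microsecond, so cycles = microseconds * MHz.
    """
    return latency_us * clock_mhz


typical_us = 100  # short message on a typical microkernel
l3_us = 5         # short message with the paper's approach

speedup = typical_us / l3_us

print(cycles(typical_us))  # 5000
print(cycles(l3_us))       # 250
print(speedup)             # 20.0
```

In other words, a 5 µs transfer leaves a budget of only 250 cycles for the whole path, which is why the design principles insist on counting costs at every level.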

Conclusions

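The paper shows that fast cross-address-space IPC is achievable by applying these principles:

Performance-based reasoning
Hunting for new techniques if necessary
Consideration of synergetic effects and concreteness

These principles have to be applied at every level, from architecture down to coding, and a concrete performance goal has to be set at the very beginning.

The techniques in this paper can be applied to other microkernels.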

The achievable quantitative gain, up to a factor of 22, is large enough to be regarded as a qualitative improvement.

Design Principles

IPC performance is the Master.
All design decisions require a performance discussion.
If something performs poorly, look for new techniques.
Synergetic effects have to be taken into consideration.
The design has to cover all levels from architecture down to coding.
The design has to be made on a concrete basis.
The design has to aim at a concrete performance goal.

Objective

IPC performance is measured in the following setup: thread A sends a null message to thread B, which is ready to receive. Both threads run in user mode and live in different address spaces.
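The measurement setup can be approximated in user space with a ping-pong test. The sketch below is only an analogy, under several assumptions: it uses POSIX pipes between a forked parent and child (two address spaces) rather than L3 IPC, it sends one byte per message because a pipe cannot carry a truly empty message, and the function name `measure_one_way_us` is made up for illustration. The numbers it produces say something about the host system, not about L3.

```python
import os
import time


def measure_one_way_us(rounds: int = 1000) -> float:
    """Average one-way latency in microseconds of a 1-byte ping-pong
    between two processes connected by a pair of pipes (POSIX only)."""
    a_to_b_r, a_to_b_w = os.pipe()
    b_to_a_r, b_to_a_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child plays "thread B": it waits for a message and replies.
        try:
            os.close(a_to_b_w)
            os.close(b_to_a_r)
            for _ in range(rounds):
                os.read(a_to_b_r, 1)
                os.write(b_to_a_w, b"x")
        finally:
            os._exit(0)  # leave without running the parent's cleanup code

    # Parent plays "thread A": it sends and waits for the reply.
    os.close(a_to_b_r)
    os.close(b_to_a_w)
    start = time.perf_counter()
    for _ in range(rounds):
        os.write(a_to_b_w, b"x")
        os.read(b_to_a_r, 1)
    elapsed = time.perf_counter() - start

    os.waitpid(pid, 0)
    os.close(a_to_b_w)
    os.close(b_to_a_r)

    # Each round is two one-way transfers (request and reply).
    return elapsed / (2 * rounds) * 1e6


if __name__ == "__main__":
    print(f"{measure_one_way_us():.2f} us per one-way message")
```

Dividing the round-trip time by two gives a figure that is comparable in spirit to the one-way short-message time the paper reports.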

Summary of Techniques

add new system calls (5.2.1)
rich message structure (5.2.2)
symmetry of send & receive buffers (5.2.3)
single copy through temporary mapping (5.2.3)
kernel stack per thread (5.2.4)
control blocks held in virtual memory (5.2.5)
thread uid structure (5.3.1)
unlink tcbs from queues when unmapping (5.3.2)
optimized timeout bookkeeping (5.3.3)
lazy scheduling (5.3.4)
direct process switch (5.3.5)
pass short messages in registers (5.3.6)
reduce cache misses (5.5.1)
reduce TLB misses (careful placement) (5.5.2)
optimize use of segment registers (5.5.3)
make best use of general registers (5.5.4)
avoid jumps and checks (5.5.5)
minimize process switch activities (5.5.6)
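Two entries in the list above, lazy scheduling (5.3.4) and direct process switch (5.3.5), are easier to follow with a toy model. The sketch below is not L3 code; `ToyKernel`, `Thread`, and all method names are hypothetical. It assumes the basic idea that an IPC only flips the ready flag in each thread control block and switches straight to the receiver, while the ready queue is brought up to date later, when the scheduler actually runs.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    ready: bool = False   # state kept in the thread control block
    queued: bool = False  # whether the TCB is linked into the ready queue


class ToyKernel:
    def __init__(self) -> None:
        self.ready_queue = deque()
        self.current = None
        self.queue_ops = 0  # counts ready-queue insertions and removals

    def _enqueue(self, t: Thread) -> None:
        if not t.queued:
            self.ready_queue.append(t)
            t.queued = True
            self.queue_ops += 1

    def start(self, t: Thread) -> None:
        t.ready = True
        self._enqueue(t)

    def ipc(self, sender: Thread, receiver: Thread) -> None:
        """Sender hands control to a receiver that is waiting for it."""
        sender.ready = False     # lazy: sender stays linked in the queue
        receiver.ready = True    # lazy: receiver is not inserted now
        self.current = receiver  # direct switch, scheduler not consulted

    def preempt(self):
        """Time slice expired: only now is the queue repaired."""
        if self.current is not None and self.current.ready:
            self._enqueue(self.current)
        while self.ready_queue:
            head = self.ready_queue[0]
            if head.ready:
                self.ready_queue.rotate(-1)  # round robin: move to the tail
                self.current = head
                return head
            self.ready_queue.popleft()       # stale entry, drop it now
            head.queued = False
            self.queue_ops += 1
        self.current = None
        return None


kernel = ToyKernel()
client, server = Thread("client"), Thread("server")
kernel.start(client)
kernel.current = client

for _ in range(1000):            # 1000 call/reply round trips
    kernel.ipc(client, server)
    kernel.ipc(server, client)

ops_after_ipc = kernel.queue_ops
print(ops_after_ipc)             # 1
```

The only queue operation is the initial insertion of the client: 2000 IPCs cause no further queue work. The deferred cost is paid in `preempt`, which inserts the running thread if needed and drops entries whose threads are no longer ready.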
