2014-06-15

作業系統筆記 (2)

精簡

Page-replacement algorithm

FIFO Algorithm
最先載入的 Page (即：Loading Time最小者)，優先視為 Victim Page。
Optimal Algorithm(OPT)
以 “將來長期不會使用的 Page Page” 視為Victim Page。
Least Recently Used Algorithm (LRU)
以 “最近不常使用的 Page Page” 視為Victim Page。
- 緣由：LRU製作成本過高
- 作法：
  - Second econd Chance ( 二次機會法則 )
  - Enhance nhance Second Chance ( 加強式二次機會法則 )
- 有可能退化成 FIFO FIFO，會遇到 Belady 異常情況
Second Chance (二次機會法則)
以 FIFO 法則為基礎，搭配 Reference Bit 使用，參照過兩次以上的 page 將不會被置換出去，除非全部都是參照兩次以上，將會退化成 FIFO。
Enhance Second Chance (加強式二次機會法則)
使用(Reference Bit Bit, Modification Bit Bit) 配對值作為挑選 Victim Page 的依據。對於沒有沒修改的 page 優先被置換，減少置換的成本。
Belady’s anomaly 當 Process 分配到較多的 Frame 數量，有時其 Page Fault Ratio 卻不降反升。

Rank	Algorithm	Suffer from Belady’s anomaly
1	Optimal	no
2	LRU	no
3	Second-chance	yes
4	FIFO	yes

Thrashing(振盪)

當 CPU 效能低時，系統會想引入更多的 process 讓 CPU 盡可能地工作。但當存有太多 process 時，大部分的工作將會花費在 page fault 造成的 Page Replacement，致使 CPU 效率下降，最後造成 CPU 的效能越來越低。

解決方法

[方法 1]
降低 Multiprogramming Degree。
[方法 2]
利用 Page Fault Frequency (Ratio) 控制來防止 Thrashing。
[方法 3]
利用 Working Set Model “預估 ” 各 Process 在不同執行時期所需的頁框數，並依此提供足夠的頁框數，以防止 Thrashing。
- 作法
  OS 設定一個Working Set Window (工作集視窗；Δ) 大小，以Δ 次記憶體存取中，所存取到的不同Page之集合，此一集合稱為 Working Set 。而Working Set中的Process個數，稱為 Working Set Size ( 工作集大小； WSS) 。

Fragmentation

分成外部和內部碎裂 External & Internal Fragmentation

Deadlock(死結)

成因
1. mutual exclusion 資源互斥
  一個資源只能分配給一個 process，不能同時分配給多個 process 同時使用。
2. hold-and-wait 擁有並等待
  擁有部分請求的資源，同時等待別的 process 所擁有資源
3. no preemption 不可搶先
  不可搶奪別的 process 已經擁有的資源
4. circular wait 循環等待
  存有循環等待的情況

解決方法

Dead Lock Prevention
不會存有死結成因四條中的任何一條。
Dead Lock Avoidance
對運行的情況做模擬，找一個簡單的算法查看是否有解死結，但這算法可能會導致可解的情況，歸屬在不可解的情況。
- Use a. to get: $\sum Need + \sum Allocation < m + n$
- Use c. to get: $\sum Need_{i} + m < m + n$
- Rewrite to get: $\sum Need_{i} < n$
Dead Lock Detection & Recovery
Recovery 成本太高

小問題

Propose two approaches to protect files in a file system.
檔案系統如何保護檔案？請提供兩種方式。

對檔案做加密動作，或者將檔案藏起－不可讀取。
The system will enter deadlock if you cannot find a safe sequence for
it, YES or NO? Please draw a diagram to explain your answer.
如果找不到一個安全序列，是否就代表系統進入死結？

不會，因為找尋安全序列的方式通常是精心設計的通則，也就是說每個程序要求資源的方式都是可預期的運作，但實際上並不是如此，而且順序和運作都是各自獨立且不可預期，有可能途中就會釋放資源。
spinlock mutex semaphore 三者的區分？

spinlock(自旋鎖)是一種佔有 CPU 效能的一種 busy waiting，而 mutex 和 semaphore 是一種根據一個資料物件來達到互斥的效果。

就以 spinlock 來說，比較適合鎖較短的指令。相反地，mutex 和 semaphore 用於鎖住區域 (critical section) 的指令代碼。

回過頭看 mutex 和 semaphore，mutex 只能限制單一程序進入 critical section，而 semaphore 可以自訂有多少程序可以進度 critical section，就以 binary semaphore 來說，其功能效應等同於 mutex。

Read More +

2014-05-30

學校課程/作業系統

向 NachOS 4.0 作業進發 (2)

接續

接續上一篇的向 NachOS 4.0 作業進發 (1)，我們繼續完成排程處理，在原始的 NachOS 4.0 中，基礎排程是 Round-robin，它的運作方式藉由定時呼叫 threads/alarm.h 中的 Alarm::CallBack()，而呼叫的時脈來源設定為 machine/stats.h 中的 const int TimerTicks = 100; 決定，數值越小表示呼叫頻率越高，置換的次數就會上升。

目標 Scheduler

FIFO (FCFS) 先來先服務
SJF 最短工作優先
Priority 最小優先權優先
RR (Round-robin)
multi-level queue 最後的大魔王

開始

第一步 - 測試

先來說說怎麼測試自己的寫的代碼正確性，不然怎麼寫都只是理論腦補，雖然會動，但是不知道是否寫對。

第一種方式為下達

$ ./nachos -e ../test/test1 -e ../test/test2 ... 多個

這樣必須自己設置優先權、Burst Time，這方面撰寫方式有兩種，其一是來亂，其二是較為正式，而在 Burst Time 的計算又分成兩種，一是執行前已知、二是執行時猜測。

其一來亂寫法，

$ ./nachos -e ../test/test1 -e ../test/test2 ... 多個

不改變原本的的執行方式，直接在代碼中決策 thread fork 出來時所擁有的優先權和 Burst Time。

static int testPriority = 0;
void 
Thread::Fork(VoidFunctionPtr func, void *arg)
{
// morris add
    testPriority++;
    if(testPriority == 1)
        setPriority(64);
    else if(testPriority == 2)
        setPriority(128);
    else
        setPriority(64);
// end morris add

因此，您可能會寫出以上的代碼，當然我不建議這麼寫，雖然是一種測試方式，但這並不好，一開始就這麼錯得相當離譜。

其二寫法，增加下達參數，不過這些參數格式要去查閱一下他們的參數傳遞建造，我想會寫成大概的樣子為
```
$ ./nachos -e ../test/test1 -pw 64 -e ../test/test2  -pw 128 ... 
```
多個為了不增加工作量，這裡就沒有親自去實作。某方面來說，可能要寫很多 small program 進行測試，並且撰寫時還要預測會跑多久，這難度會稍微提高。

第二種方式按照軟工的寫法
來寫 test code 吧！如果細心一點，就會看到相關的 Class::SelfTest() 在很多 class 都有的，因此我們需要把 test code 寫到 SelfTest 中被呼叫。因此，不需要有實體的 process 運行，只是單純地測試我們寫的 scheduler 的排程順序是否正確。

threads/thread.cc

void
threadBody() {
    Thread *thread = kernel->currentThread;
    while (thread->getBurstTime() > 0) {
        thread->setBurstTime(thread->getBurstTime() - 1);
        kernel->interrupt->OneTick();
        printf("%s: remaining %d\n", kernel->currentThread->getName(), kernel->currentThread->getBurstTime());
    }
}
void
Thread::SchedulingTest()
{
    const int thread_num = 4;
    char *name[thread_num] = {"A", "B", "C", "D"};
    int thread_priority[thread_num] = {5, 1, 3, 2};
    int thread_burst[thread_num] = {3, 9, 7, 3};
    
    Thread *t;
    for (int i = 0; i < thread_num; i ++) {
        t = new Thread(name[i]);
        t->setPriority(thread_priority[i]);
        t->setBurstTime(thread_burst[i]);
        t->Fork((VoidFunctionPtr) threadBody, (void *)NULL);
    }
    kernel->currentThread->Yield();
}

請自行在 class Thread 增加 setPriority(), setBurstTime(), SchedulingTest() … 等 method header and body。

threads/thread.h

class Thread {
  private:
   ...
  public:
   ...
    // morris add
    void setBurstTime(int t)	{burstTime = t;}
    int getBurstTime()		{return burstTime;}
    void setStartTime(int t)	{startTime = t;}
    int getStartTime()		{return startTime;}
    void setPriority(int t)	{execPriority = t;}
    int getPriority()		{return execPriority;}
    static void SchedulingTest();
  private:
    // some of the private data for this class is listed above
    // morris add
    int burstTime;	// predicted burst time
    int startTime;	// the start time of the thread
    int execPriority;	// the execute priority of the thread
    ...
};

接著是需要 call testcode 加入到 ThreadedKernel 中。

threads/kernel.cc

void
ThreadedKernel::SelfTest() {
   Semaphore *semaphore;
   SynchList<int> *synchList;
   
   LibSelfTest();		// test library routines
   
   currentThread->SelfTest();	// test thread switching
   Thread::SchedulingTest();   
   				// test semaphore operation
   semaphore = new Semaphore("test", 0);
    ...
}

因此我們可以利用單一 module 就能進行檢查。

$ cd code/threads
$ ./nachos

但是這種方式，並沒有決定我們要測試哪一種類型的 scheduler，額外需要設定參數，讓其選擇建造哪一種的 readyList(Thread*)

$ cd code/threads
$ ./nachos FCFS
$ ./nachos SJF
$ ./nachos Priority
$ ./nachos RR

預計要修成如上述的測試方式。

threads/main.cc

int
main(int argc, char **argv)
{
    ...
    debug = new Debug(debugArg);
    
    DEBUG(dbgThread, "Entering main");
//
    SchedulerType type = RR;
    if(strcmp(argv[1], "FCFS") == 0) {
    type = FIFO;
    } else if (strcmp(argv[1], "SJF") == 0) {
    type = SJF;
    } else if (strcmp(argv[1], "PRIORITY") == 0) {
    type = Priority;
    } else if (strcmp(argv[1], "RR") == 0) {
    type = RR;
    }
//
    kernel = new KernelType(argc, argv);
    kernel->Initialize(type);
    
    CallOnUserAbort(Cleanup);		// if user hits ctl-C
    kernel->SelfTest();
    kernel->Run();
    
    return 0;
}

第一步測試撰寫就這麼結束，比起撰寫新的 code，不會進行測試也是徒勞。

雖然這麼說，實際運作也是先寫 code 再寫 test code，因為當時還不太懂這些，所以如果已經先寫了一些也不用難過，大家都是這麼走過來的。

如果不知道怎麼執行，哪還有什麼撰寫的欲望，如果不知道怎麼測試，那就是在黑暗中漫步。

第二步 - 撰寫

在撰寫開始前，如果使用走第一種測試方式的人，需要將實體記憶體用大一點，否則需要實作虛擬記憶體的 context-swith。

machine/machine.h

1
2
3

const unsigned int NumPhysPages = 64;  // morris modify, old value=32
const int MemorySize = (NumPhysPages * PageSize);
const int TLBSize = 4;			// if there is a TLB, make it small

若完成以上的修改，在 /userprog 下，執行

1	$ ./nachos -e ../test/test1 -e ../test/test2 -e ../test/test1

就不會跑出 segment fault 。

threads/scheduler.h

enum SchedulerType {
        RR,     // Round Robin
        SJF,
        Priority,
        FIFO
};
...
class Scheduler {
  public:
    Scheduler();
    Scheduler(SchedulerType type);		// Initialize list of ready threads 
    ...
    	// morris add
    SchedulerType getSchedulerType() {return schedulerType;}
    void setSchedulerType(SchedulerType t) {schedulerType = t;}
  private:
    ...
};

如果有機會查閱其他代碼，很常見會把 List<Thread *> *readyList; 改成 SortedList<Thread *> *readyList; 但是其實不用的，可以利用多形來完成，畢竟 SortedList 繼承 List。因此是不需要更動的。

接著，我們使用多形，在建構子中決定要用哪一種類型的排程，並且宣告相對應的 compare function，參照如下。

threads/scheduler.cc

int SJFCompare(Thread *a, Thread *b) {
    if(a->getBurstTime() == b->getBurstTime())
        return 0;
    return a->getBurstTime() > b->getBurstTime() ? 1 : -1;
}
int PriorityCompare(Thread *a, Thread *b) {
    if(a->getPriority() == b->getPriority())
        return 0;
    return a->getPriority() > b->getPriority() ? 1 : -1;
}
int FIFOCompare(Thread *a, Thread *b) {
    return 1;
}
//----------------------------------------------------------------------
// Scheduler::Scheduler
// 	Initialize the list of ready but not running threads.
//	Initially, no ready threads.
//----------------------------------------------------------------------
Scheduler::Scheduler() {
    Scheduler(RR);
}
Scheduler::Scheduler(SchedulerType type)
{
    schedulerType = type;
    switch(schedulerType) {
    case RR:
        readyList = new List<Thread *>;
        break;
    case SJF:
        readyList = new SortedList<Thread *>(SJFCompare);
        break;
    case Priority:
        readyList = new SortedList<Thread *>(PriorityCompare);
        break;
    case FIFO:
        readyList = new SortedList<Thread *>(FIFOCompare);
    }
    toBeDestroyed = NULL;
}

上述的寫法是因為沒有辦法宣告 default argument value for class。

如果需要搶先 (Preemptive) 設計，則在 Alarm::CallBack() 決定是否要呼叫 interrupt->YieldOnReturn() 查看是否有更需要優先的 process 要執行。

threads/alarm.cc

void 
Alarm::CallBack() 
{
    Interrupt *interrupt = kernel->interrupt;
    MachineStatus status = interrupt->getStatus();
    bool woken = _bedroom.MorningCall();
    kernel->currentThread->setPriority(kernel->currentThread->getPriority() - 1);
    if (status == IdleMode && !woken && _bedroom.IsEmpty()) {// is it time to quit?
        if (!interrupt->AnyFutureInterrupts()) {
        timer->Disable();	// turn off the timer
    }
    } else {			// there's someone to preempt
    if(kernel->scheduler->getSchedulerType() == RR ||
        kernel->scheduler->getSchedulerType() == Priority ) {
//		interrupt->YieldOnReturn();
        cout << "=== interrupt->YieldOnReturn ===" << endl;
        interrupt->YieldOnReturn();
        }
    }
}

在 Burst Time 計算上，如果採用運行時估計，可以在以下代碼中進行計算。kernel->stats->userTicks 是執行 user program 的時間累加器 (也存有一個 system program 的時間累加器，運行 thread context-free, thread scheduling … 等時間用途。)

threads/alarm.cc

void 
Alarm::WaitUntil(int x) {
    IntStatus oldLevel = kernel->interrupt->SetLevel(IntOff);
    Thread* t = kernel->currentThread;
    // burst time
    int worktime = kernel->stats->userTicks - t->getStartTime();
    t->setBurstTime(t->getBurstTime() + worktime);
    t->setStartTime(kernel->stats->userTicks);
    cout << "Alarm::WaitUntil go sleep" << endl;
    _bedroom.PutToBed(t, x);
    kernel->interrupt->SetLevel(oldLevel);
}

其實到這裡已經完成，接下來就放幾張測試結果。

multi-programming

FCFS(1)
FCFS(2)

RR(1)
RR(2)

SJF(1)
SJF(2)

Priority(1)

最後

如果在

1 2	$ cd code $ make

編譯時發生錯誤，想必是兩個地方的 Initialize() 發生錯誤，所以請參照以下代碼進行修改。這問題是發生於我們在 /threads 下修改 main.cc 的關係，所以必須在每一個 kernel 宣告地方都給一個決定 scheduler 類型的參數。

其一,

userprog/userkernel.h

#include "../threads/scheduler.h"
class SynchDisk;
class UserProgKernel : public ThreadedKernel {
  public:
  	...
    void Initialize();		// initialize the kernel 
    void Initialize(SchedulerType type);
    ...

其二,

userprog/userkernel.cc

void
UserProgKernel::Initialize()
{
    Initialize(RR);
}
void
UserProgKernel::Initialize(SchedulerType type)
{
    ThreadedKernel::Initialize(type);	// init multithreading
    machine = new Machine(debugUserProg);
    fileSystem = new FileSystem();
#ifdef FILESYS
    synchDisk = new SynchDisk("New SynchDisk");
#endif // FILESYS
}

其三,

network/netkernel.h

#include "../threads/scheduler.h"
class PostOfficeInput;
class PostOfficeOutput;
class NetKernel : public UserProgKernel {
  public:
    ...
    void Initialize();		// initialize the kernel 
    void Initialize(SchedulerType);
    ...

其四,

network/netkernel.cc

void 
NetKernel::Initialize() {
    Initialize(RR);
}
void
NetKernel::Initialize(SchedulerType type)
{
    UserProgKernel::Initialize(type);	// init other kernel data structs
    postOfficeIn = new PostOfficeInput(10);
    postOfficeOut = new PostOfficeOutput(reliability, 10);
}

致網路資源

感謝網路上前人們所遺留的遺產。雖然在看不懂的文字中摸索，但仍抓到了一點線索。

我一無所有，直到見到了你們。

Read More +

2014-05-25

學校課程/作業系統

作業系統 multiprocess & multithread 作業

A 班程式要求

請使用所熟悉的程式開發環境設計單一行程產生子行程 ( Child Process ) 與多執行緒 ( Multi-Thread) 的程式，意即完成下列傳統 for loop 之列印工作，將其改寫成多行程與多執行緒，由不同的行程與執行緒列印不同 i 值。

1 2	for (int i = 0; i < 10; i++) printf("output i = %d\n", i);

注意事項

嚴禁從網路抄襲作業 。每份程式作業將被程式比對軟體檢查，一旦發現抄襲，該份作業以零分計，並且這學期作業系統作業成績以零分計。
後略

這種作業不抄來抄去也難。但是作業格式不清楚，所以也有不同理解方式就是了。

開始動手

對於將 for 迴圈改寫，我們必須認知到幾項

是否要按照順序輸出？
平台 M$, Linux, or Max OS？

此篇假設目標是一模一樣的輸出

multiprocess

child process 屬於 multiprocess 中的一種方式。為什麼這麼說呢？multiprocess 只是要求一個 CPU 中有多個工作同時在進行，藉由排程器去進行工作分配。

先不管這些，要實作 child process 看起來只有走向 fork() 一途。而之前 inker 大神則是走非 child process 使用類似 ./nachos -e ../test/test1 -e ../test/test1 的方式去運行。

為了實作按照順序輸出，我們需要 process 共同擁有的 shared memory，這個 shared memory 將要紀錄原本 for 裡面的 i 值和一個 sem_t，來保證 process 和 process 之間不會同時搶先動作，sem_t 也許是不需要的，這裡留給實驗者去測試。

為了使輸出一致，在 fork() 完之後，使用 wait() 來等待所有子程序結束。這樣會造成當 for loop 需要跑 n 次，就會總共 n 個 process，也就是一整條鏈下去。

思考 wait() 和 sem_t 都不是這麼必要？// 這個要看寫法

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <semaphore.h>
#ifndef  MAP_ANONYMOUS
#ifdef  MAP_ANON
#define MAP_ANONYMOUS MAP_ANON
#endif
#endif
static int *glob_var;
static int n = 32;
int main(void)
{
    glob_var = (int *)mmap(NULL, sizeof *glob_var, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    
    sem_t *sema = (sem_t *)mmap(NULL, sizeof(sema),
                       PROT_READ |PROT_WRITE,MAP_SHARED|MAP_ANONYMOUS,
                       -1, 0);
    sem_init(sema, 1, 0);
    *glob_var = 0;
    while (*glob_var < n) {
        sem_wait(sema);
        int pid = fork();
        int output = (*glob_var)++;
        if (output < n)
            printf("%d, pid = %d\n", output, pid);
        sem_post(sema);
        int wpid, status = 0;
        while ((wpid = wait(&status)) > 0)
        {
            //printf("Exit status of %d was %d (%s)\n", (int)wpid, status, (status > 0) ? "accept" : "reject");
        }
    }
    munmap(glob_var, sizeof *glob_var);
    return 0;
}

multithread

Thread 就相當簡單，因為它們是共用同一個記憶體區段，因此沒有 share memory 的問題。
雖然使用 sem_wait(&sem) 和 sem_post(&sem) 來保證區域內的代碼不會同時有多個 thread 在運行，但仍遇到 i = 0 同時在多個 Thread 噴出的情況，這一點百思不得其解，為了解決這一切問題，使用課本提到的 Producer–consumer problem 寫法。

希望能從代碼中看懂。

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <semaphore.h>
#include <pthread.h>
int i = 0, n = 16;
int count = 0;
sem_t sem;
const int MAX_THREAD = 4;
void *Producer(void *arg) {
    for(int i = 0; i < n;) {
        sem_wait(&sem);
        if(count == 1) {
            sem_post(&sem);
            continue;
        }
        count++;
        // printf("\nProducer is :%d\n", count);
        sem_post(&sem);
        i++;
    }
    pthread_exit(NULL);
}
void *runPrint(void *arg) {
    int id = *((int *) arg);
    for (; i < n ;) {
        sem_wait(&sem);
        if (count == 0) {
            sem_post(&sem);
            continue;
        }
        count--;
        i++;
        printf("loop i = %d, thread id %d\n", i, id);
        sem_post(&sem);
    }
    pthread_exit(NULL);
}
int main() {
    pthread_t *threads;
    sem_init(&sem, 0, 0);
    threads = (pthread_t *) malloc(MAX_THREAD * sizeof(pthread_t));
    for (int i = 0; i < MAX_THREAD; i++) {
        int *p = (int *) malloc(sizeof(int));
        *p = i;
        pthread_create(&threads[i], NULL, runPrint, (void *)(p));
    }
    pthread_t producer;
    int *p = (int *) malloc(sizeof(int));
    *p = -1;
    pthread_create(&producer, NULL, Producer, (void *)(p));
    for (i = 0; i < MAX_THREAD; i++) {
        pthread_join(threads[i], NULL);
    }
    sem_destroy(&sem);
    return 0;
}

後記

代碼沒有寫得很完整，也就是能達到輸出相同且用符合需求，這一點是有很多漏洞的，前人也是鑽漏洞交作業的。相較之下，我們 B 班的作業略顯兇殘，但是網路上資料還真是少呢，雖然還是找地方抄和理解。

測試的時候，為了保證輸出是穩定的，記得得多測試幾次，並且把數字範圍作變調，當然能不能讓一個 create thread 跑完整個迴圈呢？畢竟加上 main thread 就算是 multithread？haha

執行

編譯

Mac OSX
```
$ clang -pthread x.cpp -o x
```
Linux
```
$ gcc -lpthread x.cpp -o x
```
運行
Linux & Max OSX
```
$ ./x
```

Mac 運行問題

#ifndef  MAP_ANONYMOUS
#ifdef  MAP_ANON
#define MAP_ANONYMOUS MAP_ANON
#endif
#endif

不知道為什麼 clang 中沒有定義 MAP_ANONYMOUS，增加上述代碼後正確運行。

其他作業

B 班程式作業 1
B 班程式作業 2

Read More +

2014-05-24

學校課程/作業系統

向 NachOS 4.0 作業進發 (1)

關於 NachOS 4.0

作為教學用作業系統，需要實現簡單並且儘量縮小與實際作業系統之間的差距，所以我們採用 Nachos 作為作業系統課程的教學實踐平臺。Nachos 是美國加州大學伯克萊分校在作業系統課程中已多次使用的作業系統課程設計平臺，在美國很多大學中得到了應用，它在作業系統教學方面具有一下幾個突出的優點：

採用通用虛擬機器
簡單而易於擴展
物件導向性
… 略

簡單地來說，就是一個模擬作業系統的軟體

安裝

課程頁面：http://networklab.csie.ncu.edu.tw/2014_os/osweb.html#download

Platform: Linux or Linux over VMware
```
RedHat Linux 9.0 
```
在 Ubuntu 上面運行無法

Install steps

Get NachOS-4.0
```
wget ../nachos-4.0.tar.gz
```
Get Cross Compiler
```
wget ../mips-decstation.linux-xgcc.tgz
```
Move Cross Compiler to /
```
mv ./mips-decstation.linux-xgcc.tgz /
```

Untar Cross Compiler

tar zxvf ./mips-decstation.linux-xgcc.tgz

Untar NachOS-4.0
```
tar zxvf ./nachos-4.0.tar.gz
```
Make NachOS-4.0
```
cd ./nachos-4.0/code
make
```

Test if installation is succeeded

cd ./userprog
./nachos -e ../test/test1
./nachos -e ../test/test2

預期運行結果

Total threads number is 1
Thread ../test/test1 is executing.
Print integer:9
Print integer:8
Print integer:7
Print integer:6
return value:0
No threads ready or runnable, and no pending interrupts.
Assuming the program completed.
Machine halting!
                                                                                
Ticks: total 200, idle 66, system 40, user 94
Disk I/O: reads 0, writes 0
Console I/O: reads 0, writes 0
Paging: faults 0
Network I/O: packets received 0, sent 0
Total threads number is 1
Thread ../test/test2 is executing.
Print integer:20
Print integer:21
Print integer:22
Print integer:23
Print integer:24
Print integer:25
return value:0
No threads ready or runnable, and no pending interrupts.
Assuming the program completed.
Machine halting!
Ticks: total 200, idle 32, system 40, user 128
Disk I/O: reads 0, writes 0
Console I/O: reads 0, writes 0
Paging: faults 0
Network I/O: packets received 0, sent 0

作業階段

Multiprogramming

執行下述指令

1	./nachos –e ../test/test1 –e ../test/test2

test1.c

#include "syscall.h"
main() {
    int n;
    for (n=9;n>5;n--)
        PrintInt(n);
}

test2.c

#include "syscall.h"
main() {
    int n;
    for (n=20;n<=25;n++)
        PrintInt(n);
}

在開始動工前，發現在 Multithread 操作會有錯誤，test1 程序會遞減輸出，而 test2 程序會遞增輸出，但是根據運行結果，兩個程式都會遞增輸出。

目標運行結果如下：

Total threads number is 2
Thread ../test/test1 is executing.
Thread ../test/test2 is executing.
Print integer:9
Print integer:8
Print integer:7
Print integer:20
Print integer:21
Print integer:22
Print integer:23
Print integer:24
Print integer:6
return value:0
Print integer:25
return value:0
No threads ready or runnable, and no pending interrupts.
Assuming the program completed.
Machine halting!
 
Ticks: total 300, idle 8, system 70, user 222
Disk I/O: reads 0, writes 0
Console I/O: reads 0, writes 0
Paging: faults 0
Network I/O: packets received 0, sent 0

為什麼兩個程式都會遞增輸出？
明顯地是兩個程式匯入時，操作到相同區域的 code segment。如果發現各別遞增和遞減，但是輸出 n 的結果錯誤，則可以想到是 context-switch 上發生問題。

解決

需要為 physical address 做標記，這個 physical address 就是在 main memory 裡面的位置。而每一個 process 都會有一個 AddrSpace，然後紀錄 logical address 所對應在 physical address，也就是 pageTable 對應到 pageTable[i].physicalPage。

也就是說，當要執行 process 某一頁的程序，則將會去找 pageTable 所對應的 physicalPage，然後去運行。在作業一開始給的代碼中，會發現所有程序的 physicalPage 都共用，如此一來當然會執行到同一份 code。

首先，我們需要一個全局的標記，來記錄 main memory page 的情況。

usrprog/addrspace.h

class AddrSpace {
  public:
    ...
    static bool usedPhyPage[NumPhysPages];
  private:
    ...
};

接著，在 addrspace.cc 宣告實體，並且初始化是 0，表示全部都還沒有用過。

usrprog/addrspace.cc

#include "copyright.h"
#include "main.h"
#include "addrspace.h"
#include "machine.h"
#include "noff.h"
    ...
bool AddrSpace::usedPhyPage[NumPhysPages] = {0};
    ...

當 process 執行成功後，將會釋放資源。此時將要把標記使用的 page 設回未使用。

usrprog/addrspace.cc

AddrSpace::~AddrSpace()
{
    for(int i = 0; i < numPages; i++)
        AddrSpace::usedPhyPage[pageTable[i].physicalPage] = false;
   delete pageTable;
}

最後，就是稍微麻煩的地方，當將程序載入記憶體時，要去填入 pageTable[] 對應的 physicalPage，這裡使用線性時間去搜索第一個未使用的 page 進行填入。

載入確定後，便要開始執行，執行的時候要去算程序進入點，進入點的位置即是 main memory address。

mainMemory[pageTable[noffH.code.virtualAddr/PageSize].physicalPage * PageSize + (noffH.code.virtualAddr%PageSize)]

首先算出第幾頁，然後乘上 PageSize 就是 page base，而 page offset 則是 code.address mod PageSize。

最後，page base + page offset 就是我們所需要的程序進入點。

usrprog/addrspace.cc

bool AddrSpace::Load(char *fileName) {
    OpenFile *executable = kernel->fileSystem->Open(fileName);
    NoffHeader noffH;
    unsigned int size;
    if (executable == NULL) {
    cerr << "Unable to open file " << fileName << "\n";
    return FALSE;
    }
    executable->ReadAt((char *)&noffH, sizeof(noffH), 0);
    if ((noffH.noffMagic != NOFFMAGIC) && 
        (WordToHost(noffH.noffMagic) == NOFFMAGIC))
    	SwapHeader(&noffH);
    ASSERT(noffH.noffMagic == NOFFMAGIC);
// how big is address space?
    size = noffH.code.size + noffH.initData.size + noffH.uninitData.size 
            + UserStackSize;	// we need to increase the size
                        // to leave room for the stack
    numPages = divRoundUp(size, PageSize);
//	cout << "number of pages of " << fileName<< " is "<< numPages << endl;
// morris add
    pageTable = new TranslationEntry[numPages];
    for(unsigned int i = 0, j = 0; i < numPages; i++) {
        pageTable[i].virtualPage = i;
        while(j < NumPhysPages && AddrSpace::usedPhyPage[j] == true)
            j++;
        AddrSpace::usedPhyPage[j] = true;
        pageTable[i].physicalPage = j;
        pageTable[i].valid = true;
        pageTable[i].use = false;
        pageTable[i].dirty = false;
        pageTable[i].readOnly = false;
    }
// end morris add
    size = numPages * PageSize;
    ASSERT(numPages <= NumPhysPages);		// check we're not trying
                        // to run anything too big --
                        // at least until we have
                        // virtual memory
    DEBUG(dbgAddr, "Initializing address space: " << numPages << ", " << size);
// then, copy in the code and data segments into memory
    if (noffH.code.size > 0) {
        DEBUG(dbgAddr, "Initializing code segment.");
    DEBUG(dbgAddr, noffH.code.virtualAddr << ", " << noffH.code.size);
// morris add
        	executable->ReadAt(
        &(kernel->machine->mainMemory[pageTable[noffH.code.virtualAddr/PageSize].physicalPage * PageSize + (noffH.code.virtualAddr%PageSize)]), 
            noffH.code.size, noffH.code.inFileAddr);
// end morris add
    }
    if (noffH.initData.size > 0) {
        DEBUG(dbgAddr, "Initializing data segment.");
    DEBUG(dbgAddr, noffH.initData.virtualAddr << ", " << noffH.initData.size);
// morris add
        executable->ReadAt(
        &(kernel->machine->mainMemory[pageTable[noffH.initData.virtualAddr/PageSize].physicalPage * PageSize + (noffH.code.virtualAddr%PageSize)]),
            noffH.initData.size, noffH.initData.inFileAddr);
// end morris add
    }
    delete executable;			// close file
    return TRUE;			// success
}

Sleep() System call

撰寫自己的 Sleep function，將該 Thread 進入休眠。

測試檔案

與一開始的 test1.c 很像，這裡先做個簡單測試。

/code/test/sleep.c

#include "syscall.h"
main() {
    int i;
    for(i = 0; i < 5; i++) {
        Sleep(10000000);
        PrintInt(50);
    }
    return 0;
}

因為整個測試檔案是要能在 nachOS 上面跑，所以不能使用一般的編譯方式？所以把這些訊息寫入 Makefile 中，之後下 $ make 產出 sleep 即可。

/code/test/Makefile

...
all: halt shell matmult sort test1 test2 sleep
...
sleep: sleep.o start.o
    <span>$(LD)$</span><!-- Has MathJax -->(LDFLAGS) start.o sleep.o -o sleep.coff
    ../bin/coff2noff sleep.coff sleep
...

如果無法在 /code/test 下 $ make 順利產出，則請參照 其他指令 ，將原本下載的 mips-decstation.linux-xgcc.gz 解壓縮產生的 /usr 移動到正確位置。

解決

首先，要模仿 PrintInt()，找到切入的宣告位置。努力地尾隨前人。

/code/userprog/syscall.h

...
#define SC_PrintInt	11
#define SC_Sleep	12
...
void PrintInt(int number);	//my System Call
void Sleep(int number);         // morris add
...

接著，找一下 ASM。
以下部分就不得而知了，為什麼會出現在 /code/test 目錄下。

/code/test/start.s

    ...
PrintInt:
    addiu   <span>$2,$</span><!-- Has MathJax -->0,SC_PrintInt
    syscall
    j       $31
    .end    PrintInt
    .globl  Sleep
    .ent	Sleep
Sleep:
    addiu   <span>$2,$</span><!-- Has MathJax -->0,SC_Sleep
    syscall
    j	$31
    .end	Sleep
    ...

最後，確定 nachOS 會處理我們的例外操作。

/code/userprog/exception.cc

    ...
case SC_PrintInt:
    val=kernel->machine->ReadRegister(4);
    cout << "Print integer:" << val << endl;
    return;
case SC_Sleep:
    val=kernel->machine->ReadRegister(4);
    cout << "Sleep Time " << val << "(ms) " << endl;
    kernel->alarm->WaitUntil(val);
    return;
    ...

到這裡需要喘口氣，剛找了一陣子的代碼。接著才是重頭戲，因為你看到了 kernel->alarm->WaitUntil(val); 出現了。沒錯！我們需要撰寫那邊的代碼。

既然 kernel 存有 alarm，也就是說每隔固定一段間，就會呼叫 Alarm::CallBack()，因此，對於這個鬧鐘來個累加器 Bedroom::_current_interrupt，全局去記數，當作毫秒 (ms) 看待就可以，每加一次就相當於過了 1 毫秒 (ms)。

假設現在某一個 time = _current_interrupt 呼叫 Sleep(x) 後，將 (Thread address, _current_interrupt + x) 丟入 List，則期望在 _current_interrupt + x 的時候甦醒。因此，每當 _current_interrupt++ 就去檢查 List 中，誰的預期時間小於 _current_interrupt，就將其從 List 中清除。

這裡先不考慮 _current_interrupt Overflow 的問題，根據電腦效率，可能在 5 ~ 10 秒內就發生一次，此時會造成所有 Thread 沉眠很久。若察覺 Sleep 太久，則是發生 Overflow。

當有程序呼叫 Sleep() 時，會呼叫 WaitUntil()，然後將其丟入 Bedroom 安眠。
然後在 CallBlack() 被呼叫時，來去 Bedroom 檢查誰該醒來了。

/code/threads/alarm.h

#include "copyright.h"
#include "utility.h"
#include "callback.h"
#include "timer.h"
#include <list>
#include "thread.h"
class Bedroom {
    public:
        Bedroom():_current_interrupt(0) {};
        void PutToBed(Thread *t, int x);
    bool MorningCall();
    bool IsEmpty();
    private:
        class Bed {
            public:
                Bed(Thread* t, int x):
                    sleeper(t), when(x) {};
                Thread* sleeper;
                int when;
        };
    
    int _current_interrupt;
    std::list<Bed> _beds;
};
// The following class defines a software alarm clock. 
class Alarm : public CallBackObj {
  public:
    Alarm(bool doRandomYield);	// Initialize the timer, and callback 
                // to "toCall" every time slice.
    ~Alarm() { delete timer; }
    
    void WaitUntil(int x);	// suspend execution until time > now + x
  private:
    Timer *timer;		// the hardware timer device
    Bedroom _bedroom;
    void CallBack();		// called when the hardware
                // timer generates an interrupt
};

確定有哪些 method 就可以開始實作了。

/code/threads/alarm.cc

void Alarm::CallBack() {
    Interrupt *interrupt = kernel->interrupt;
    MachineStatus status = interrupt->getStatus();
    bool woken = _bedroom.MorningCall();
    
    if (status == IdleMode && !woken && _bedroom.IsEmpty()) {// is it time to quit?
        if (!interrupt->AnyFutureInterrupts()) {
        timer->Disable();	// turn off the timer
    }
    } else {			// there's someone to preempt
    interrupt->YieldOnReturn();
    }
}
void Alarm::WaitUntil(int x) {
    IntStatus oldLevel = kernel->interrupt->SetLevel(IntOff);
    Thread* t = kernel->currentThread;
    cout << "Alarm::WaitUntil go sleep" << endl;
    _bedroom.PutToBed(t, x);
    kernel->interrupt->SetLevel(oldLevel);
}
bool Bedroom::IsEmpty() {
    return _beds.size() == 0;
}
void Bedroom::PutToBed(Thread*t, int x) {
    ASSERT(kernel->interrupt->getLevel() == IntOff);
    _beds.push_back(Bed(t, _current_interrupt + x));
    t->Sleep(false);
}
bool Bedroom::MorningCall() {
    bool woken = false;
    _current_interrupt ++;
    for(std::list<Bed>::iterator it = _beds.begin(); 
        it != _beds.end(); ) {
        if(_current_interrupt >= it->when) {
            woken = true;
            cout << "Bedroom::MorningCall Thread woken" << endl;
            kernel->scheduler->ReadyToRun(it->sleeper);
            it = _beds.erase(it);
        } else {
            it++;
        }
    }
    return woken;
}

到這裡我們就完成 Sleep() 製作。

在下一階段，將要加入排程的製作。

Scheduling

目標完成 SJF, RR, FIFO。
請參閱 “向 NachOS 4.0 作業進發 (2)”

其他指令

打包轉移 // 交作業前，將檔案包出傳送，此指令不是壓縮用途
1
$ tar cvf FileName.tar DirName
複製資料夾 // 將 nachos gcc library 搬移至 /usr/local/nachos ...
1
$ cp -a DirName TARGET_PATH
轉移至 root 權限 // 如果指令權限不夠，需要下達此指令。
1
$ su

備忘

網路上有幾個 Java 版本的 NachOS，架構稍微不同，如果發現文章說明有點怪怪的，請看一下他是用 JAVA 版本還是 C++ 版本。
至於怎麼將檔案從 Red Hat，這有點惱人，可以想辦法安裝其他瀏覽器，然後開 e-mail 送出，或者使用 terminal 上的 wget 安裝 … 一堆辦法。最後我自己用 nodejs 寫一個簡單的 upload server 傳出來。

代碼下載

Github

Read More +

2014-05-18

學校課程/作業系統

作業系統 Thread 程式作業

C++ Thread 練習

作業要求

作業一:應用 Thread 撰寫任意程式(語言不限)，繳交方式 FTP:140.115.155.192(選擇自己學號的資料夾上傳) 帳:os2014 密碼不知道者請再問助教，檔案名稱: “作業一學號姓名_第幾版” ex:”作業一10252200x林大乖_v1” 繳交內容:壓縮檔(報告word 以及程式碼,執行檔)，截止日期:2014/05/25(日)23:59

程式功能

兩個 n*n 矩陣相乘，在最基礎的 O(n^3) 效能上，使用 Thread 等相關技巧使其加速。

運行

Mac OSX
```
$ clang -pthread x.cpp -o x
```
Linux
```
$ gcc -lpthread x.cpp -o x
```

測試效率

Mac OSX and Linux
```
$ time ./a.out
```

測試結果

測試環境為 2013 Mac Air 1.3 GHz Intel Core i5 上對 1024 x 1024 的矩陣進行相乘的統計結果。

matrix.cpp
最原始的一般寫法，直接使用 O(n^3) 。運行時間約在 20 秒左右完成。
matrix[v2].cpp
使用轉置矩陣後，在進行運算，這種方式是加快作業系統中的快取層，速度提升至約在 4 秒左右完成。
matrix[v3].cpp
建立在 matrix[v2].cpp 的基礎上，使用 4 條 thread，分別對處理的 column 進行分割操作，速度約在 2 秒內完成。
matrix[v4].cpp
與 matrix[v3].cpp 類似，調整 thread 的數量，查看效率變化，但是速度沒有更快的趨勢。

結論

Thread 多不代表會比較快，而是該程式佔有 CPU 的機會比較高，使用的資源就會比較多。在整個作業系統中，Thread 多上下文切換的次數就會上升，對於作業系統整體效率是下降的，但是又不能沒有 Thread，看起來像是所有程式同時在運行，也要防止落入死機無法動彈的情況。

而在 thread 上面會發現，當數量達到一定程度後，速度就不會上升，有一部分是因為 CPU 超頻運作。

超頻（英語：Overclocking）是把一個電子配件的時脈速度提升至高於廠方所定的速度運作，從而提升性能的方法，但此舉有可能導致該配件穩定性下降。

可能是長時間閒置的 CPU 發現有大規模的工作運行，把時脈速度升起。

代碼

matrix.cpp

#define MAX_N 1024
double A[MAX_N][MAX_N];
double B[MAX_N][MAX_N];
double C[MAX_N][MAX_N];
void multiply(double A[][MAX_N], double B[][MAX_N], double C[][MAX_N]) {
    for (int i = 0; i < MAX_N; i++) {
        for (int j = 0; j < MAX_N; j++) {
            double sum = 0;
            for (int k = 0; k < MAX_N; k++) {
                sum += A[i][k] * B[k][j];
            }
            C[i][j] = sum;
        }
    }
}
int main(void) {
    //clock_t start, finish;
    //start = clock(); // start time clock
    for (int i = 0; i < MAX_N; i++) {
        for (int j = 0; j < MAX_N; j++) {
            A[i][j] = i;
            B[i][j] = j;
        }
    }
    multiply(A, B, C);
    //finish = clock(); // finish time clock
    //printf("Time: %lf\n", (double)(finish - start)/1000);
    return 0;
}

matrix[v2].cpp

#include <stdio.h>
#include <stdlib.h>
#include <algorithm>
#include <time.h>
using namespace std;
#define MAX_N 1024
double A[MAX_N][MAX_N];
double B[MAX_N][MAX_N];
double C[MAX_N][MAX_N];
void trans(double A[][MAX_N]) {
    for (int i = 0; i < MAX_N; i++) {
        for (int j = i+1; j < MAX_N; j++) {
            swap(A[i][j], A[j][i]);
        }
    }
}
void multiply(double A[][MAX_N], double B[][MAX_N], double C[][MAX_N]) {
    trans(B);
    for (int i = 0; i < MAX_N; i++) {
        for (int j = 0; j < MAX_N; j++) {
            double sum = 0;
            for (int k = 0; k < MAX_N; k++) {
                sum += A[i][k] * B[j][k];
            }
            C[i][j] = sum;
        }
    }
}
int main(void) {
    //clock_t start, finish;
    //start = clock(); // start time clock
    for (int i = 0; i < MAX_N; i++) {
        for (int j = 0; j < MAX_N; j++) {
            A[i][j] = i;
            B[i][j] = j;
        }
    }
    multiply(A, B, C);
    //finish = clock(); // finish time clock
    //printf("Time: %lf\n", (double)(finish - start)/1000);
    return 0;
}

matrix[v3].c

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <pthread.h>
#define MAX_N 1024
double A[MAX_N][MAX_N];
double B[MAX_N][MAX_N];
double C[MAX_N][MAX_N];
void trans(double A[][MAX_N]) {
    double x;
    int i, j;
    for (i = 0; i < MAX_N; i++) {
        for (j = i+1; j < MAX_N; j++) {
            x = A[i][j];
            A[i][j] = A[j][i];
            A[j][i] = x;
        }
    }
}
#define MAX_THREAD 4
void *multiply(void *arg) {
    int id = *((int *)arg);
    int st = id * MAX_N / MAX_THREAD;
    int ed = (id + 1) * MAX_N / MAX_THREAD - 1;
    int i, j, k;
    for (i = st; i <= ed; i++) {
        for (j = 0; j < MAX_N; j++) {
            double sum = 0;
            for (k = 0; k < MAX_N; k++) {
                sum += A[i][k] * B[j][k];
            }
            C[i][j] = sum;
        }
    }
    return NULL;
}
int main(void) {
    puts("Thread version");
    //clock_t start, finish;
    //start = clock(); // start time clock
    int i, j;
    for (i = 0; i < MAX_N; i++) {
        for (j = 0; j < MAX_N; j++) {
            A[i][j] = i;
            B[i][j] = j;
        }
    }
    trans(B);
    
    pthread_t *threads;
    threads = (pthread_t *) malloc(MAX_THREAD * sizeof(pthread_t));
    
    for (int i = 0; i < MAX_THREAD; i++) {
        int *p = (int *) malloc(sizeof(int));
        *p = i;
        pthread_create(&threads[i], NULL, multiply, (void *)(p));
    }
    for (i = 0; i < MAX_THREAD; i++) {
        pthread_join(threads[i], NULL);
    }
    //finish = clock(); // finish time clock
    //printf("Time: %lf\n", (double)(finish - start)/1000);
    return 0;
}

其他

雖然作業可以有其他語言的相關 Thread，在 Java 作品方面可以查看 Github。

Read More +

2014-04-16

學校課程/作業系統

作業系統筆記 (1)

讀物

現在恐龍書不知道第幾版了，而且中文翻譯有點慘。看英文原文又太慢的情況，剛好查到聯合大學陳士杰教授寫的 ppt。繁體中文的說明，採用恐龍書第七版的內容。

下附章節簡報和課後習題解答。

稍微看過每一章的內容後，就可以著手看習題解答。

CH 1
CH 2
CH 3
CH 4
CH 5
CH 6
CH 7
CH 8
CH 9
CH 10

恐龍書課後習題解答感謝大大的提供

之前也有好幾次參照教授寫的中文簡報，相當有助益。

可能遭遇的問題

個人拙見，請自行過濾

Round-Robin Scheduling

這個問題在寫考題的時候，略顯著的疑惑。
對於 Round-Robin Scheduling 的部分
1
2
3
4
5
6
7
8
9
10
queue Q
while(true) {
w = Q.front()
w = work(w)
if(hasProcess())
Q.push(nextPorcess())
if(w != null)
Q.push(w)
time++
}
詳見代碼後，由於超過時限後，還要將變數存起來，所以丟入 queue 的時間會晚一點，也就是說在同一時刻會慢於新抵達的 process。之後的程序就按照 queue 的順序去運行。

考古題(課本後習題)

(=ﾟωﾟ)= 下附解答僅個人拙見，請勿隨意採信，請發揮您的善心，救救笨蛋

95 期中考

What is multiprogramming ? what is time-sharing ?(10%)
- Multiprogramming(多重程式)：
  - 定義：
    內存有多個 processs 同時執行，一般電腦的運行方式。透過 CPU scheduling 將發生中斷 (eg. wait for I/O complete, resource not available, etc.) 的行程切換給可執行的運作。
  - 特色：
    可以避免 CPU idle，提高 CPU utilization(使用率)
  - 其他：
    1. Multiprogramming Degree：系統內存在執行的 process 數目
    2. 通常 Multiprogramming Degree 越高，則 CPU utilization 越高(p.s. ch7 Thrashing 除外)
    3. 多個 process 同時執行，mode 有兩種 Concurrent(並行)、Parallel(平行)
- Time-sharing:
  - 定義：
    Multiprogramming 的一種，在 CPU 排班法則方面，其使用 RR(Round-Robin) 法則。
    // RR 法則－CPU time quantum，若 process 在取得 CPU 後，未能於 quantum
    // 內完成工作，則必須被迫放棄 CPU，等待下一次輪迴。
  - 特色：
    對每個 user 皆是公平的，適用在 user interactive 且 response time (反應時間)要求較短的系統環境，透過 resource sharing 技術(eg. CPU scheduling, memory sharing, spooling 達到 I/O Device 共享)，使得每個user皆認為有專屬的系統存在。
Describe the differences between symmetric and asymmetric multiprocessing. What are three advantages and one disadvantage of multiprocessor systems ?
In Distributed System - Multiprocessor - Tightly-Coupled System(different from Loosely-Coupled Distributed System)
- Symmetric Multiprocessors (SMP) 對稱式多處理器
  - 定義：
    每個 processor 的能力皆相同，即可負責的功能完全一樣。萬一某個 processor 壞了，其上工作可由其他processor 接手，系統不會整個 crash，只是整體效能下降而已。
  - 特色：
    Reliability (可靠性)大幅提升，強調 load balancing (負載平衡)(每個CPU的工作負擔相同)
- Asymmetric Multiprocessors (ASMP) 非對稱式多處理器
  - 定義：
    不同(群)顆的 processor，其負擔的功能不盡相同，在此架構下，通常含有一個Master CPU，負責工作的分派與協調，其他 CPUs 稱為 Slave CPU (稱為 Master-Slave Architecture)
  - 特色：
    效能通常較SMP高，但可靠度較差
What are the differences between a trap and an interrupt? What is the use of each function ?
- Trap:
  軟體產生 system call
  an exception in a user process.
  ex. system call (software divide-by-zero)
- Interrupt:
  硬體產生 signal
  something generated by the hardware.
  ex. IO-complete interrupt (hardware generated)
- 兩者相同之處
  從 User Mode 切換到 Kernel Mode。
  兩者都算是 interrupt 中斷指令。
What is the purpose of system calls?
System calls allow user-level processes to request services of the operating system.
System calls 提供 使用者級別 的程序來請求作業系統給的服務。
// process control, file Management, device management, information maintenance, communication
Describe the differences among short-term, medium-term, and long-term scheduling.
- short-term (CPU scheduler) —
  CPU 工作排程
  selects from jobs in memory those jobs that are ready to execute and allocates the CPU to them.
- medium-term —
  常用於分時系統的排程器。當記憶體不夠的時候，需要暫時將部分的執行程序搬出，將新來的程序載入，置換機制被建置用來移除部份在記憶體執行中及恢復那些被暫時移除的程式。
  used especially with time-sharing systems as an intermediate scheduling level. A swapping scheme is implemented to remove partially run programs from memory and reinstate them later to continue where they left off.
- long-term (job scheduler) —
  將 job 變成 process 的工作排程，將工作從硬碟載入到主記憶體中。
  determines which jobs are brought into memory for processing.
Can a multithreaded solution using multiple user-level threads achieve better performance on a multiprocessor system than on a single-processor system? Why?
- A multithreaded system comprising of multiple user-level threads cannot make use of the different processors in a multiprocessor system simultaneously. The operating system sees only a single process and will not schedule the different threads of the process on separate processors. Consequently, there is no performance benefit associated with executing multiple user-level threads on a multiprocessor system.
- 以 user-level hreads 實現 multithread 的程式，在多處理器系統中不能比在單處理機系統中有更好的效率，因在多處理機系統中之作業系統不會將多個 CPU 同時分配給該 user-level multithread 程式。
Consider the following set of processes, with the length of the CPU-burst time given in milliseconds:
```
Process           Burst Time         Priority
  P1                 10                  3
  P2                  1                  1
  P3                  2                  3
  P4                  1                  4
  P5                  5                  2
```
The processes are assumed to have arrived in the order P1,P2,P3,P4,P5, all at time 0.
- (a) Draw four Gantt charts illustrating the execution of these processes using FCFS, SJF, a nonpreempitive priority(a small priority number implies a higher priority), and RR(quantum = 1) scheduling.(5%)
- (b) What is the turnaround time of each process for each of the scheduling algorithms in part(a).(5%)
- (c) What is the waiting time of each process for each of the scheduling algorithms in part(a).(5%)
- (d) Which of the schedules in part (a) results in the minimal average waiting time(over all processes)? (5%)
```
請參照上附的習題解法。
```

96 期中考

FCFS, SJF, Priority, Round-Robin and Multilevel Queue
Which can be preemptive ? (5%)
- preemptive 可搶先
  SRTF (剩餘最短優先)、Priority (優先權優先)、Multilevel Queue (多層佇列)、RR
- non-preemptive 不可搶先
  FCFS (先到先服務)、SJF (最短工作優先)、Priority (優先權優先)
- Priority (優先權優先) 屬於可搶先跟不可搶先。
- 可搶先的定義為非自願性被踢出 CPU。
What are the four main purposes of an operating system ? (10%)
- Managing programs
- Managing Memory
- Handling input and output
- User Interface
Please give a detail description on the differences among the terms multiprocessing, multitasking and multithreading. (15%)
- multiprocessing
  具有多個處理器，在同一時間可以在各自處理器上運行程序。
- multitasking
  執行單一使用者的多個行程，利用排程器讓每一個行程都在平均時間內解決。
- multithreading
  可以將程序切成好幾個執行緒，若硬體可以同時執行多個執行緒，效率將會上升，如果不能，由於速度夠快看起來也很像多個執行緒同時運行。
What are the differences between a trap and an interrupt ? What is the use if each function ? (20%)
- Trap:
  軟體產生 system call
  an exception in a user process.
  ex. system call (software divide-by-zero)
- Interrupt:
  硬體產生 signal
  something generated by the hardware.
  ex. IO-complete interrupt (hardware generated)
- 兩者相同之處
  從 User Mode 切換到 Kernel Mode。
  兩者都算是 interrupt 中斷指令。
What is the purpose of system calls, Please describe three general methods that be used to pass parameters to the operating system. (10%).
- System call 由作業系統提供 API interface 給 user program，可以藉由 system call 向作業系統請求服務，作業系統接收到請求後，執行完報告結果給 user program。
- [方法1] Registers
  利用Registers ( 暫存器 )儲存這些參數
  優：速度快
  (∵不用作記憶體的存取 )
  缺：不適用於參數個數多的情況
- [方法2] Memory
  將參數利用 Table 或 Block 的方式儲存在 Memory 中，同時利用一個 Register 記錄此 Table 或 Block 的起始位址，並傳給O.S.
  優：適用於參數個數較多的情況
  缺：速度慢
- [方法3] Stack
  利用System Stack (堆疊 )。要存放參數時，將之Push 到Stack Stack，再由O.S.從Stack中Pop 取出參數。
  Stack在系統中比Register多一些(可用Mem. 或其它硬體(如：Register)來實作此Stack)
What is a process ? Draw the diagram of Process State. And describe every state. (15%)

What are the differences between thread and process ? Why do threads have much shorter context switching times ? (10%)

|        Thread                   |      Process              |
|---------------------------------|---------------------------|
| Light Weight Process            | Heavy Weight Process      |
|                                 |                           |
| 同一個Task (或 Process ) 內的   |  不同的Process之間無共享的| 
| Threads可以共享 Code Section,   |  Address Space，互為獨立。|
| Data Section, 及O.S. Resources  |                           |
|                                 |
| Context Switching 負擔輕        |  負擔重
|                                 |
| Thread的管理(Management)成本低  |  成本高
| (管理項目： Creation,           |
| Scheduling, Context, Switching  |
| Switching…etc.)                 |
|                                 |
| 一個Task內有多條Threads存在     |  一個Task內只有一條Thread
|  Process (Task) 內的某 Thread   |
| 被 Block，則可切到其它Thread    |
| 執行。此時，若 Process 內只要還有
| Thread 在執行，則 Process 不會被 Block

What is Multilevel Feedback-Queue scheduling ? (5%)

定義：
與Multilevel Queue的定義相似，差別在於允許 Process 在各佇列之間移動 ，以避免Starvation 的情況。
* 採取類似 “Aging” 技術，毎隔一段時間就將 Process 往上提升到上一層 Queue 中。∴在經過有限時間後，在 Lower Priority Queue 中的 Process 會被置於 Highest Priority Queue 中。故無 Starvation。
* 亦可配合降級之動作。當上層 Queue 中的 Process 取得 CPU 後，若未能在 Quantum 內完成工作，則此 Process 在放棄 CPU 後，會被置於較下層的 Queue 中。

Please draw the Gantt charts for the following processes for the problems shown below.
(a) Round-Robinm time quantum = 1 (5%)
(b) Shortest-remaining-time-first (Preemptive SJF) (5%)

            Arrival Time    Brust Time

    P1      0               4
    P2      1               2
    P3      2               3
    P4      5               4
    P5      6               3

(a) Time : 0
    Queue: P1

    0   1
    +---+
      P1

    先丟入 P2, P1 因為超時丟入，此時執行 P2
    Time : 1
    Queue: P2, P1

    0   1   2
    +---+---+
     P1  P2

    丟入 P3, P2 因為超時丟入，此時執行 P1
    Time : 2
    Queue: P1, P3, P2

    0   1   2   3
    +---+---+---+
     P1  P2  P1

    P1 因為超時丟入，此時執行 P3
    Time : 3
    Queue: P3, P2, P1

    0   1   2   3   4
    +---+---+---+---+
     P1  P2  P1  P3

    P3 因為超時丟入，此時執行 P2
    Time : 4
    Queue: P2, P1, P3

    0   1   2   3   4   5
    +---+---+---+---+---+
     P1  P2  P1  P3  P2

    P2 完成，P4 抵達丟入，此時執行 P1
    Time : 5
    Queue: P1, P3, P4

    0   1   2   3   4   5   6
    +---+---+---+---+---+---+
     P1  P2  P1  P3  P2  P1

    P5 抵達丟入，P1 超時丟入，此時執行 P3
    Time : 6
    Queue: P3, P4, P5, P1

    0   1   2   3   4   5   6   7
    +---+---+---+---+---+---+---+
     P1  P2  P1  P3  P2  P1  P3

    P3 超時丟入，此時執行 P4
    Time : 7
    Queue: P4, P5, P1, P3

    0   1   2   3   4   5   6   7   8
    +---+---+---+---+---+---+---+---+
     P1  P2  P1  P3  P2  P1  P3  P4

    P4 超時丟入，此時執行 P5
    Time : 8
    Queue: P5, P1, P3, P4

    0   1   2   3   4   5   6   7   8   9
    +---+---+---+---+---+---+---+---+---+
     P1  P2  P1  P3  P2  P1  P3  P4  P5

    P5 超時丟入，此時執行 P1
    Time : 6
    Queue: P1, P3, P4, P5

    0   1   2   3   4   5   6   7   8   9  10
    +---+---+---+---+---+---+---+---+---+---+
     P1  P2  P1  P3  P2  P1  P3  P4  P5  P1

    P1 完成，此時執行 P3
    Time : 10
    Queue: P3, P4, P5

    0   1   2   3   4   5   6   7   8   9  10  11
    +---+---+---+---+---+---+---+---+---+---+---+
     P1  P2  P1  P3  P2  P1  P3  P4  P5  P1  P3

    P3 完成，此時執行 P4
    Time : 11
    Queue: P4, P5

    0   1   2   3   4   5   6   7   8   9  10  11  12
    +---+---+---+---+---+---+---+---+---+---+---+---+
     P1  P2  P1  P3  P2  P1  P3  P4  P5  P1  P3  P4

    ...

    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
    +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
     P1  P2  P1  P3  P2  P1  P3  P4  P5  P1  P3  P4  P5  P4  P5  P4

課本習題快取區

ch 5

5.9 Why is it important for the scheduler to distinguish I/O-bound programs
from CPU-bound programs?
為什麼排程器需要需分 IO 密集程式 和 CPU 綁定程式 ?

Answer: I/O-bound programs have the property of performing only a small amount of computation before performing I/O. Such programs typically do not use up their entire CPU quantum. CPU-bound programs, on the other hand, use their entire quantum without performing any blocking I/O operations. Consequently, one could make better use of the computer’s resouces by giving higher priority to I/O-bound programs and allow them to execute ahead of the CPU-bound programs.
IO 密集程式占用 CPU 的時間比較少，大部分時間等在等待 IO。相反地 CPU 綁定程式則會大幅度占用 CPU 的時候，因為沒有 IO 操作的堵塞。

而 CPU bound 的程序因此而得到了很多調度機會並且每次都能把CPU run完。故在這樣的系統裡要給 I/O bound 的程序更高的優先級使其能被調度得更多些。

5.10 Discuss how the following pairs of scheduling criteria conflict in certain
settings.
a. CPU utilization and response time // CPU 使用率和回應時間的比較
b. Average turnaround time and maximum waiting time // 平均運轉時間和最大等待時間的比較
c. I/O device utilization and CPU utilization // IO 裝置使用率和 CPU 使用率的比較

Answer:
a. CPU utilization and response time: CPU utilization is increased if the overheads associatedwith context switching isminimized. The context switching overheads could be lowered by performing context switches infrequently. This could, however, result in increasing
the response time for processes.
CPU 的使用率(utilization)上升時，切換工作的次數(context-switches)就會少，由於切換次數少，則回應時間(response time)就會上升。

b. Average turnaround time and maximum waiting time: Average turnaround time is minimized by executing the shortest tasks first. Such a scheduling policy could, however, starve long-running tasks and thereby increase their waiting time.
平均運作時間(average turnaround time) 的最小化可以藉由最短工作優先的方式，但是這將會使得長工作飢餓，如此會增加最大等待時間。

c. I/O device utilization and CPU utilization: CPU utilization is maximized by running long-running CPU-bound tasks without performing context switches. I/O device utilization is maximized by scheduling I/O-bound jobs as soon as they become ready to run, thereby incurring the overheads of context switches.
CPU 使用率的上升盡可能讓 CPU 密集工作長時間運行 (減少 IO 或者其他的中斷所引發的上下文切換)，IO 裝置使用率的上升靠的是讓排入的 IO 密集工作立即地運行，但這會增加上下文切換的開銷。

5.11 Consider the exponential average formula used to predict the length of the next CPU burst.What are the implications of assigning the following values to the parameters used by the algorithm?
使用指數平均公式去預測下一次的 CPU burst。 t0 預估時間。
a. a = 0 and t0 = 100 milliseconds
b. a = 0.99 and t0 = 10 milliseconds

Answer: When a = 0 and t0 = 100 milliseconds, the formula always makes a prediction of 100 milliseconds for the next CPU burst. When a = 0.99 and t0 = 10 milliseconds, themost recent behavior of the process is given much higher weight than the past history associated with the process. Consequently, the scheduling algorithm is almost memoryless, and simply predicts the length of the previous burst for the next quantum of CPU execution.
當 a = 0 時，預測的時間總會是 100 milliseconds，而當 a = 0.99 時，將會高度依賴近期幾次的結果，而對於歷史關聯只有較輕的權重。
// t(n+1) = a * 上一次實際時間 + (1 - a) * t(n)

Reference

Kernel Mode 與 User Mode 的概念
 恐龍書中文 ppt 介紹聯合大學陳士杰

Read More +