MIT 6.1810 Operating System Engineering
课程地址:https://pdos.csail.mit.edu/6.1810/2022/index.html
新坑,MIT 6.1810
# 6.1810 2022 Lecture 1: O/S overview
# Overview
6.1810 goalsUnderstand operating system (O/S) design and implementation
Hands-on experience extending a small O/S
Hands-on experience writing systems software
What is the purpose of an O/S?Abstract the hardware for convenience and portability
Multiplex the hardware among many applications
Isolate applications in order to contain bugs
Allow sharing among cooperating applications
Control sharing for security
Don’t get in the way of high performance
Support a wide range of applications
Organization: layered picture[user/kernel diagram]
- user applications: vi,gcc,DB,&c
- kernel services
- h/w: CPU,RAM,disk,net,&c
we care a lot about the interfaces and internal kernal structure
What services does ans O/S kernel typically provide?process (a running program)
memory allocation
file contents
file names,directories
access control (security)
many others: users,IPC,network,time,terminals
What’s the application / kernel interface?“System calls"
Examples,in C,from UNIX (e.g. Linux,macOS,FreeBSD):
fd = open(out”,1);
write(fd,“hello\n”,6);
pid = fork();
There look like function calls but they aren’t"
Why is O/S design+implementation hard and interesting?many design tensions:
- efficient vs abstract/portable/general-purpose
- powerful vs simple interfaces
- flexible vs secure
features interact:
fd = open;
fork()
uses are varied: laptops,smart-phones,cloud,virtual machines,embedded
evolving hardware: NVRAM,multi-core,fast networks
unforgiving environment: quirky h/w,hard to debug
You’ll be glad you took this coourse if you…care about what gose on under the hood
like infrastructure
need to track down bugs or security problems
care about high performance
# Class structure
Online course information:https://pdos.csail.mit.edu/6.1810/ – schedule,assignments,labs
Piazza – announcements,discussion,lab help
LecturesO/S ideas
case study of xv6,a small O/S,via code and xv6 book
lab background
O/S papers
submit a question about each reading,before lecture.
Labs:The point: hands-on experience
Mostly one week each.
Three kinds:
Systems programming (due next week…)
O/S primitives,e.g. thread switching.
O/S kernel extensions to xv6,e.g. network.
Use piazza to ask/answer lab questions.
Discussion is great,but please do not look at others’ solutions!
Grading:70% labs,based on tests (the same tests you run).
20% lab check-oof meetings: we’ll ask you about randomly-selected labs.
10% home-work and class/piazza discussion.
No exams,no quizzes.
Note that most of the grade is from labs. Start them early!
# Introduction to UNIX system calls
Introduction to UNIX system calls
Applications see the O/S via system calls; that interface will be a big focus.
let’s start by looking at how programs use system calls.
you’ll use these system calls in the first lab.
and extend and improve them in subsequent labs.
I’ll show some examples, and run them on xv6.
xv6 has similar structure to UNIX systems such as Linux.
but much simpler -- you'll be able to digest all of xv6
accompanying book explains how xv6 works, and why
why UNIX?
open source, well documented, clean design, widely used
studying xv6 will help if you ever need to look inside Linux
xv6 has two roles in 6.1810:
example of core functions: virtual memory, multi-core, interrupts, &c
starting point for most of the labs
xv6 runs on RISC-V, as in 6.004
you’ll run xv6 under the qemu machine emulator
copy.c
copy input to output
read bytes from input, write them to the output
copy |
copy.c is written in C
Kernighan and Ritchie (K&R) book is good for learning C
you can find these example programs via the schedule on the web site
read() and write() are system calls
passed to kernel to tell it which “open file” to read/write
must previously have been opened
an FD connects to a file/device/socket/&c
a process can open many files, have many FDs
UNIX convention: fd 0 is “standard input”, 1 is “standard output”
read() may read less, but not more
UNIX I/O is 8-bit bytes
interpretation is application-specific, e.g. database records, C source, &c
where do file descriptors come from?
open.c
create a file
open |
open() creates a file,
FD is a small integer
FD indexes into a per-process table maintained by kernel
[user/kernel diagram]
these examples ignore errors – don’t be this sloppy!
Figure 1.2 in the xv6 book lists system call arguments/return
or look at UNIX man pages, e.g. “man 2 open”
what happens when a program calls a system call like open()?
looks like a function call, but it’s actually a special instruction
hardware saves some user registers
hardware increases privilege level
hardware jumps to a known “entry point” in the kernel
now running C code in the kernel
kernel calls system call implementation
sys_open() looks up name in file system
it might wait for the disk
it updates kernel data structures (file block cache, FD table)
restore user registers
reduce privilege level
jump back to calling point in the program, which resumes
we’ll see more detail later in the course
shell
I’ve been typing to UNIX’s command-line interface, the shell.
the shell prints the “$” prompts.
the shell lets you run UNIX command-line utilities
useful for system management, messing with files, development, scripting
ls |
UNIX supports other styles of interaction too
window systems, GUIs, servers, routers, &c.
but time-sharing via the shell was the original focus of UNIX.
we can exercise many system calls via the shell.
fork.c
create a new process
the shell creates a new process for each command you type, e. g. for
echo hello |
fork |
the kernel makes a copy of the calling process
instructions, data, registers, file descriptors, current directory
“parent” and “child” process
only difference:
thus:
fork.c’s “fork() returned” executes in both processes
the “if(pid == 0)” allows code to distinguish
ok, fork lets us create a new process
how can we run a program in that process?
exec.c
replace calling process with an executable file
how does the shell run a program, e. g.
echo a b c |
a program is stored in a file: instructions and initial memory created by the compiler and linker
so there’s a file called echo, containing instructions
exec |
exec() replaces current process with an executable file
discards instructions and memory from the file
preserves file descriptors
exec(filename, argument-array)
argument-array holds command-line arguments; exec passes to main()
cat uesr/echo.c
echo.c shows how a program looks at its command-line arguments
forkexec.c
fork() a new process,exec() a program
forkexec |
forkexec.c contains a common UNIX idiom:
fork() a child process
exec() a command in the child
parent wait()s for child to finish
the shell does fork/exec/wait for every command you type
after wait(), the shell prints the next prompt
to run in the background – & – the shell skips the wait()
exit(status) -> wait(&status)
status convention: 0 = success, 1 = command encountered an error
note: the fork() copies, but exec() discards the copied memory
this may seem wasteful
you’ll transparently eliminate the copy in the “copy-on-write” lab
refirect.c
redirect the output of a command
what does the shell do for this?
echo hello > out |
answer: fork, change FD 1 in child, exec echo
redirect |
note: open() always chooses lowest unused FD; 1 due to close(1).
fork, FDs, and exec interact nicely to implement I/O redirection
separate fork-then-exec give child a chance to change FDs before exec FDs provide indirection
commands just use FDs 0 and 1, don’t have to know where they go exec preserves the FDs that sh set up
thus: only sh has to know about I/O redirection, not each program
design decisions
It’s worth asking “why” about design decisions:
Why these I/O and process abstractions?
Why not something else?
Why provide a file system?
Why not let programs ues the disk their own way?
Why FDs?
Why not pass a filename to write()?
Why are files streams of bytes, not disk blocks or formatted records?
Why not combine fork() and exec()?
The UNIX design works well, but we will see other designs!
pipe1.c
communicate through a pipe
how does the shell implement
ls | grep x |
an FD can refer to a “pipe”, as well as a file
the pipe() system call creates two FDs
read from the first FD
write to the seconnd FD
the hernel maintains a buffer for each pipe
[u/k diagram]
write() appends to the buffer
read() waits until there is data
pipe2.c
communicate between process
pipes combine well with fork() to implement ls | grep x
shell creates a pipe,
then forks (twice),
then connects ls’s FD 1 to pipe’s write FD,
and grep’s FD 0 to the pipe
[diagram]
pipe -- a simplified version |
pipes are a separate abstraction, but combine well w/ fork()
list.c
list files in a directory
how does ls get a list of the files in a directory?
you can open a directory and read it -> file names
“.” is a pseudo-name for a process’s current directory
see ls.c for more details
Summary
We’ve looked at UNIX’s I/O, file system, and process abstractions.
The interfaces are simple – just integers and I/O buffers.
The abstractions combine well, e. g. for I/O rediretion.
// copy.c: copy input to output. |
|
// exec.c: replace a process with an executable file |
// fork.c: create a new process |
|
|
// open.c: create a file, write to it. |
// pipe1.c: communication over a pipe |
|
|
# Lab util: Unix utilities
requirement
You can do these labs on an Athena machine or on your own computer. If you use your own computer, have a look at the lab tools page for setup tips.
If you use Athena, you must use an x86 machine; that is, uname -a
should mention i386 GNU/Linux
or i686 GNU/Linux
or x86_64 GNU/Linux. You can log into a public Athena host with
ssh -X athena.dialup.mit.edu |
We have set up the appropriate compilers and simulators for you on Athena. To use them, run
add -f 6.828 |
You must run this command every time you log in (or add it to your ~/.environment
file). If you get obscure errors while compiling or running qemu
, check that you added the course locker.
Fetch the git repository for the xv6 source for the lab:
git clone git://g.csail.mit.edu/xv6-labs-2022 |
The repo is setup so that git checkouts the util
branch when cloning the repo.
git status |
The xv6-labs-2022 repository differs slightly from the book’s xv6-riscv; it mostly adds some files. If you are curious look at the git log:
git log |
The files you will need for this and subsequent lab assignments are distributed using the Git version control system. For each of the labs you will checkout (git checkout util) a version of xv6 tailored for that lab. To learn more about Git, take a look at the Git user’s manual, or, you may find this CS-oriented overview of Git useful. Git allows you to keep track of the changes you make to the code. For example, if you are finished with one of the exercises, and want to checkpoint your progress, you can commit your changes by running:
git commit -am 'my solution for util lab exercise 1' |
You can keep track of your changes by using the git diff
command. Running git diff
will display the changes to your code since your last commit, and git diff origin/util
will display the changes relative to the initial util code. Here, origin/util is the name of the git branch with the initial code you downloaded for the class.
Build and run xv6:
make qemu |
If you type ls
at the prompt, you should see output similar to the following:
ls |
These are the files that mkfs
includess in the initial file system; most are programs you can run. You just ran one of them:ls.
xv6 has no ps
command, but, if you type Ctrl - p, the kernel will print information about each process. If you try it now, you’ll see two lines: one for init
, and one for sh
.
To quit qemu type: Ctrl-a x (press Ctrl and a at the same time, followed by x).
# Grading and hand-in procedure
You can run make grade
to test your solutions with the grading program. The TAs will use the same grading program to assign your lab submission a grade. Separately, we will also have check-off meetings for labs(see Grading policy).
The lab code comes with GNU Make rules to make submission easier. After committing your final changes to the lab, type make handin
to subtime your lab. For detailed instructions on how to submit see below.
code
关于实验
因为阿里云服务器挂载网站并且是 centos 操作系统,虚拟机又会导致我的电脑卡顿,因而我选择了在云服务器利用 docker 容器安装 Ubuntu 完成整个实验。(后续会写 docker 学习文章,一定不鸽,咕咕咕)
首先,自己配置好环境哈!
然后安装课程所需软件
我的环境是 Ubuntu 20.04,因此运行命令
sudo apt-get install git build-essential gdb-multiarch qemu-system-misc gcc-riscv64-linux-gnu binutils-riscv64-linux-gnu |
之后你可以按照上面的 note 中提供的 shell 命令去运行!
如果你不使用 Athena,请忽略前两条指令;请注意我在部分代码块中加入了预期的运行结果,你可以通过进行 $
区分。
requirement
Implement the UNIX program sleep
for xv6; your sleep
should pause for a user-specified number of ticks. A tick is a notion of time defined dy the xv6 hernel, namely the time between two interrupts from the timer chip. Your solution should be in the file user/sleep.c
.
Some hints:
- Before you start coding, read Chapter 1 of the xv6 book.
- Look at some of the other programs in
user/
(e.g.,user/echo.c
,user/grep.c
,anduser/rm.c
) to see how you can obtain the command-line arguments passed to a proggram. - If the user forgets to pass an argument, sleep should print an error message.
- Use the system call
sleep
. - See
kernel/sysproc.c
for the xv6 kernel code that implements thesleep
system call (look forsys_sleep
),user/user.h
for the C definition ofsleep
callable from a user program, anduser/usys.s
for the assembler code that jumps from user code into the kernel forsleep
. main
should callexit(0)
when it is done.- Add your
sleep
program toUPROGS
in Makefile; once you’ve done that,make qemu
will compile your program and you’ll be able to run it from the xv6 shell. - Look at Kernighan and Ritchie’s book The C programming language (second edition)(K&R) to learn about C.
Run the program from the xv6 shell:
make qemu |
Your solution is correct if your program pauses when run as shown above. Run make grade
to see if you indeed pass the sleep tests.
Note that make grade
runs all tests, including the ones for the assignments below. If you want to run the grade tests for one assignment, type:
./grade-lab-util sleep |
This will run the grade tests that match “sleep”. Or, you can type:
make GRADEFLAGS=sleep grade |
which does the same.
code
|
requirement
Write a program that uses UNIX system calls to “ping-pong” abyte between two processes over a pair of pipes, one for each direction. The parent should send a byte to the child; the child should print “user/pingpong.c
Some hints:
- Use
pipe
to create a pipe. - Use
fork
to create a child. - Use
read
to from a pipe, andwrite
to a pipe. - Use
getpid
to find the process ID of the calling process. - Add the program to `UPROGS’ in Makefile.
- User programs on xv6 have a limited set of library functions available to them. You can see the list in
user/user.h
; the source (other than for system calls) is inuser/ulib.c, user/printf.c,
anduser/umalloc.c
.
Run the program from the xv6 shell and it should produce the following output:
make qemu |
Your solution is correct if your program exchanges a byte between two processes and produces output as shown above.
code
首先,很明显,我们需要两条管道传递信息。
对于管道传参 int pipefd[2]
, pipefd[0]
为管道读端, pipefd[1]
为管道写端。
而且读取成功的情况下, pipe
函数会返回读取到的字节数。
记得使用 fork
创建子进程,以及使用 read, write, close
对管道进行操作。
|
requirement
Write a concurrent version of prime sieve using pipes. This idea is due to Doug Mcllroy, inventor of Unix pipes. The picture halfway down this page and the surrounding text explain how to do it. Your solution should be in the file user/primes.c
.
Your goal is to use pipe
and fork
to set up the pipeline. The first process feeds the numbers 2 through 35 into the pipeline. For each prime number, you will arrange to create one process that reads from its left neighbor over a pipe and writes to its right neighbor over another pipe. Since xv6 has limited number of file descriptors and processes, the first process can stop at 35.
Some hints:
- Be careful to close file descriptors that a process doesn’t need, because otherwise your program will run xv6 out of resources before the first process reaches 35.
- Once the first process reaches 35, it should wait until the entire pipeline terminates, including all children, grandchildren, &c. Thus the main primes process should only exit after all the output has been printed, and after all the other primes processes have exited.
- Hint:
read
returns zero when the write-side of a pipe is closed. - It’s simplest to directly write 32-bit (4-byte)
int
s to the pipes, rather than using formattedASCII I/O
. - You should create the processes in the pipeline only as they are needed.
- Add the program to
UPROGS
in Makefile.
Your solution is correct if it implements a pipe-based sieve and produces the following output:
make qemu |
code
这个实验的关键是利用 pipe
和 fork
,详细思想请看上个 note
给出的链接。
主要思想就是保证管道头部的数字为素数并删除在管道中是该数字倍数的数,其余的数写入下一个管道,如此反复。
|
requirement
Write a simple version of the UNIX find program: find all the files in a directory tree with a specific name. Your solution should be in the file user/find.c
.
Some hints:
- Look at
user/ls.c
to see how to read directories. - Use recursion to allow find to descend into sub-directories.
- Don’t recurse into “.” and “…”.
- Changes to the file system persist across runs of qemu; to get a clean file system run
make clean
and thenmake qemu
. - You’ll need to use C strings. Have a look at K&R (the C book), for example Section 5.5.
- Note that == does not compare strings like in Python. Use strcmp() instead.
- Add the program to
UPROGS
in Makefile.
Your solution is correct if produces the following output (when the file system contains the files b, a/b
and a/aa/b
):
make qemu |
code
根据提示,我们可以根据 ls.c
文件编写 find.c
文件。
find 和 ls 的异同
- find 需要找到指定目录中所有文件名为 filename 的文件,并打印路径 + 文件名
- ls 打印出指定目录中的所有目录项
- 都需要输入
path
参数 - 都需要判断
path
参数所指向的文件类型 - 都需要遍历目录项并读取目录项的名字
- find 还需要给定
filename
参数 - find 需要递归遍历指定目录中的所有子目录
因此,我们实现 find 需要在 ls 的基础上添加递归遍历子目录并查找指定文件名的操作,同时需要忽略掉 .
、 ..
,防止重复递归。
|
requirement
Write a simple version of the UNIX xargs program: its arguments describe a command to run, it reads lines from the standard input, and it runs the command for each line, appending the line to the command’s arguments. Your solution should be in the file user/xargs.c
.
The following example illustrates xarg’s behavior:
echo hello too | xargs echo bye |
Note that the command here is “echo bye” and the additional arguments are “hello too”, making the command “echo bye hello too”, which outputs “bye hello too”.
Please note that xargs on UNIX makes an optimization where it will feed more than argument to the command at a time.
We don’t expect you to make this optimization. To make xargs on UNIX behave the way we want it to for this lab, please run it with the -n option set to 1. For instance
(echo 1 : echo 2) | xargs -n 1 echo |
Some hints:
- Use
fork
andexec
to invoke the command on each line of input. Usewait
in the parent to wait for the child to complete the command. - To read individual lines of input, read a character at a time until a newline (’\n’) appears.
- lernel/param.h declars MAXARG, which may be useful if you need to declare an argv array.
- Add the program to
UPROGS
in Makefile. - Changes to the file system persist across runs of qemu; to get a clean file system run make clean and then make qemu.
xargs, find, and grep combine well:
find . b | xargs grep hello |
will run “grep hello” on each file named b in the directories below “.”.
To test your solution for xargs, run the shell script xargstest.sh. Your solution is correct if it produces the following output:
make qemu |
You may have to go back and fix bugs in your find program. The output has many $ because the xv6 shell doesn’t realize it is processing commands from a file instead of from the console, and prints a $ for each command in the file.
code
xargs
命令是一种管道命令, |
即为管道,他会将管道前面的命令的输出作为后面命令的标准输入。
|
requirement
- Write an uptime program that prints the uptime in terms of ticks using the
uptime
system call.(easy) - Support regular expressions in name matching for
find. grep.c
has some primitive support for regular expressions.(easy) - The xv6 shell(user/sh.c) is just another user program and you can improve it. It is a minimal shell and lacks many features found in real shell. For example, modify the shell to not print a $ when processing shell commands from a file(moderate), modify the shell to support wait(easy), modify the shell to support lists of commands, separated by “;”(moderate), modify the shell to support sub-shells by implementing “(” and “)” (moderate), modify the shell to support tab completion(easy), modify the shell to keep a history of passed shell commands(moderate), or anything else you would like your shell to do.(If you are very ambitious, you may have to modify the kermel to support the kernel features you need; xv6 doesn’t support mach.)
月缺不改光,剑折不改钢
共矜然诺心,各负纵横志