CSCE 5313
Advanced Operating Systems
Fall, 2007
Course Syllabus
Instructor: Amy Apon
Office: JBHT 515
Phone: 575-6794
Email: aapon@uark.edu
Office Hours: TBD, by appointment, or whenever you can find me
Course Homepage: http://comp.uark.edu/~aapon/courses/gradOS/index.html
Posted Notes: http://comp.uark.edu/~aapon/courses/gradOS/Notes/
Prerequisite: CSCE 3613 (i.e., a good introductory operating systems course)
Required Text: “Distributed Systems: An Algorithmic Approach” by Sukumar Ghosh, Chapman & Hall/CRC, 2007.
Required Papers: This semester we will read approximately 20 research papers: some classical papers, some papers that expand on material in the text, and some recent papers.
Optional Supplemental Texts: Several difficult concepts and algorithms will be covered in this class. Supplemental material may prove helpful in providing alternative explanations of a particular topic, algorithm, or paper we discuss in class. Two appropriate supplemental texts are:
- "Distributed Operating Systems and Algorithms" by Randy Chow and Theodore Johnson, Addison Wesley Publishing Company, 1998.
- "Advanced Concepts in Operating Systems" by Mukesh Singhal and Niranjan G. Shivaratri, McGraw-Hill Publishing Company, 1997.
Grading:
- Paper summaries and class discussion: 10%
- Quizzes: 30% (best 8 quizzes)
- Two programming homework exercises: 10%
- Project and presentation: 30%
- Comprehensive final exam: 20%
Overall Class Structure: Operating systems consist of a federation of important topic threads. Each week, readings from the text and from research papers will be assigned for the topic at hand. These are to be completed before they are discussed in class. The class will be interactive and discussion oriented. Quizzes and written paper summaries will reinforce the material.
Paper Summaries and Class Discussion: Students are to turn in a hardcopy summary of each paper prior to our discussion of that paper. (Written summaries of assigned textbook readings are not required, although the text will help explain some of the concepts covered by the papers, and vice versa.) Each paper summary is to be no longer than five sentences. Following each summary, pose two questions that you would like to ask the author(s). These questions will be used to stimulate our discussion of the paper. Each student is also expected to engage in our class discussions as appropriate. The purpose of this component is to motivate quality time spent reading and thinking about each paper prior to each class.
Quizzes: Periodically, a 15-minute quiz will be given at the beginning of class. Quizzes will be announced in advance and will cover the paper and text material from the previous week, after that material has been discussed in class. The quizzes will usually be short answer, often problem based. For example, they may consist of a short example of an algorithm presented in one of the papers or in the text. The purpose of this component is to motivate reflection and digestion of each paper after it has been discussed.
Programming Exercises: At least two programming exercises will be required and graded during the semester. The purpose of the programming exercises is to gain practical experience of the concepts and algorithms that are described in the papers. Programming exercises are to be done on your own unless stated otherwise. It is OK to talk to another student about the exercise or about the concepts in the exercise, but code should not be shared. Programs will be graded via demonstration during a scheduled meeting in my office.
Project and Presentation: Students are expected to work in two-person teams. Each team is to select a topic. Throughout the semester, each team will research its chosen topic and prepare a paper, which is expected to be submitted to an appropriate conference for publication. At the end of the semester, each team will prepare (and present to the class) a 15-minute PowerPoint presentation that would be appropriate to give at the conference. The purpose of this component is to stimulate the process of going from critically reading research papers written by others to the generation of research papers written by new research students.
Comprehensive Final Exam: On Monday, December 10, from 10:00am-12noon, a comprehensive final examination will be given. Since this is a required course, it is important that each student demonstrate mastery of the material covered in the course. The exam will likely follow the format of the quizzes. The purpose of this component is to revisit each of the topics and papers that have been covered in the course to distill their primary contributions and their larger impact to be remembered beyond this course.
Tentative Schedule: Since this course is very much a work in progress, the various topics/papers/dates are subject to change and will be posted throughout the semester.
Week 1: Remote Procedure Call
Text material: Ghosh, chapter 2
Papers:
- Aug. 23: Andrew Birrell and Bruce Nelson, “Implementing Remote Procedure Calls,” ACM Transactions on Computer Systems, Volume 2(1), February, 1984, Pages 39-59. http://comp.uark.edu/~aapon/courses/gradOS/rpc.pdf
- If time: Brian Bershad, Thomas Anderson, Edward Lazowska, and Henry Levy, “Lightweight Remote Procedure Call,” Transactions on Computer Systems, Volume (1), February, 1990, Pages 37-55. http://comp.uark.edu/~aapon/courses/gradOS/lrpc.pdf
- Aug. 28: Programming using Remote Method Invocation on your own: http://java.sun.com/j2se/1.5.0/docs/guide/rmi/hello/hello-world.html
Programming Assignment One: Implement the RMI client and server in the Java tutorial. Modify the sample code by having the server implement a single state variable, count, that is incremented each time it is contacted. Return this value to the client and print it as output each time you execute the client. I will grade this by having you demonstrate your code to me during the week of Sept. 3.
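The following is a minimal sketch of the idea behind Programming Assignment One, not the tutorial's code and not a reference solution. It assumes a hypothetical remote interface named Counter with a single method contact(), a registry on a caller-chosen port, and, to stay self-contained, a server and client that run in the same JVM (in the assignment the client is a separate program that is executed repeatedly). The names CounterDemo, CounterImpl, and runDemo are illustrative only.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

public class CounterDemo {

    /** Hypothetical remote interface; the tutorial uses a different one. */
    public interface Counter extends Remote {
        int contact() throws RemoteException;
    }

    /** Server-side object that holds the single state variable, count. */
    static class CounterImpl implements Counter {
        private int count = 0;

        // synchronized because RMI may dispatch concurrent calls on different threads
        @Override
        public synchronized int contact() {
            count++;
            return count;
        }
    }

    /** Starts a registry on the given port, then contacts the server 'calls' times. */
    public static int[] runDemo(int port, int calls) throws Exception {
        // Assumption: everything runs on the loopback interface of one machine.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");

        // "Server" side: export the object and bind its stub under a name.
        CounterImpl impl = new CounterImpl();
        Counter stub = (Counter) UnicastRemoteObject.exportObject(impl, 0);
        Registry registry = LocateRegistry.createRegistry(port);
        try {
            registry.rebind("Counter", stub);

            // "Client" side: look the stub up by name and invoke it remotely.
            Registry clientView = LocateRegistry.getRegistry("127.0.0.1", port);
            Counter remote = (Counter) clientView.lookup("Counter");
            int[] seen = new int[calls];
            for (int i = 0; i < calls; i++) {
                seen[i] = remote.contact();
                System.out.println("count = " + seen[i]);
            }
            return seen;
        } finally {
            // Unexport so the JVM can exit once the demo is finished.
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        runDemo(1099, 3);
    }
}
```

Because the count lives in the server object rather than in the client, each new client contact sees the value left by the previous one, which is the behavior the assignment asks you to demonstrate.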
Weeks 2 and 3: Mutual Exclusion
Text material: Ghosh, chapter 7
Papers:
- August 30: C.A.R. Hoare, "Monitors: an operating systems structuring concept," Communications of the ACM. Volume 17 (10). October, 1974. Pages 549-557. http://comp.uark.edu/~aapon/courses/gradOS/monitor.pdf
- September 4: Programming using threads
Programming Assignment Two: Use the thread library of your choice (e.g., pthreads, Java threads, OpenMP, …) to implement the following:
Suppose that you've just been hired by Mother Nature to help her out with the chemical reaction to form water, which she doesn't seem to be able to get right due to synchronization problems. The trick is to get two H atoms and one O atom all together at the same time. The atoms are threads, as follows:
- Each H atom thread executes a procedure hReady. Use sleep to cause the thread to wait until it is ready to react.
- Each O atom thread executes a procedure oReady, and also sleeps until it is ready to react.
- After they are ready to react, the threads must delay until there are at least two H atoms and one O atom present, and then one of the threads must call the procedure makeWater (which just prints out a debug message that water was made). Use condition variables (or waiting on an object in the case of Java) for synchronization. Be sure that you wait on the condition or object, and do not use busy waiting.
- After the makeWater call, two instances of hReady and one instance of oReady should call thread_exit().
- The main driver program should use join to wait for all threads to exit.
Write the code for hReady, oReady, the procedure makeWater(), and the main driver program. To make your program easier to grade, print out a message each time that a thread is created, each time that it awakes from sleep, and just before it exits. Have main create at least 20 "H atoms" and at least 10 "O atoms".
I will grade this by having you demonstrate your code to me during the week of Sept. 10.
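One possible shape for the synchronization in Programming Assignment Two is sketched below using Java's built-in monitors (wait and notifyAll on an object). This is a sketch under stated assumptions, not a reference solution: the class names WaterDemo and Reaction, the helper tryReact, the release counters, and the random sleep of under 50 ms are all illustrative choices. In Java a thread exits by returning from its run method, which stands in for thread_exit() here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class WaterDemo {

    /** Monitor that pairs two ready H atoms with one ready O atom. */
    static class Reaction {
        private int waitingH = 0, waitingO = 0; // ready atoms not yet in a molecule
        private int releaseH = 0, releaseO = 0; // atoms allowed to leave
        private int molecules = 0;

        synchronized void hReady() throws InterruptedException {
            waitingH++;
            tryReact();
            while (releaseH == 0) wait();       // blocks on the monitor, no busy waiting
            releaseH--;
        }

        synchronized void oReady() throws InterruptedException {
            waitingO++;
            tryReact();
            while (releaseO == 0) wait();
            releaseO--;
        }

        // Called with the lock held by the atom that has just become ready.
        private void tryReact() {
            if (waitingH >= 2 && waitingO >= 1) {
                waitingH -= 2;
                waitingO -= 1;
                releaseH += 2;
                releaseO += 1;
                makeWater();
                notifyAll();                    // wake the atoms that may now leave
            }
        }

        private void makeWater() {
            molecules++;
            System.out.println("water made (molecule " + molecules + ")");
        }

        synchronized int molecules() { return molecules; }
    }

    static Thread atom(String name, Reaction r, boolean isH) {
        System.out.println(name + " created");
        return new Thread(() -> {
            try {
                Thread.sleep(ThreadLocalRandom.current().nextInt(50));
                System.out.println(name + " awake");
                if (isH) r.hReady(); else r.oReady();
                System.out.println(name + " exiting");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
    }

    /** Terminates only when hCount == 2 * oCount; leftover atoms would wait forever. */
    public static int runSimulation(int hCount, int oCount) throws InterruptedException {
        Reaction r = new Reaction();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < hCount; i++) threads.add(atom("H" + i, r, true));
        for (int i = 0; i < oCount; i++) threads.add(atom("O" + i, r, false));
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();      // main waits for every atom to exit
        return r.molecules();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("molecules made: " + runSimulation(20, 10));
    }
}
```

The release counters are one way to guarantee that exactly two H threads and one O thread leave per makeWater call; which particular waiting threads leave is not fixed, and the assignment as stated does not appear to require that it be.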
- September 11: Mamoru Maekawa, "A sqrt(N) algorithm for mutual exclusion in decentralized systems," ACM Transactions on Computer Systems. Volume 3 (2). May, 1985. Pages 145-159. http://comp.uark.edu/~aapon/courses/gradOS/maekawa.pdf
- September 18: Quiz 3 and talk by Patricia Kirkwood
Week 4: Program Correctness and Self-Stabilization
Text material: Ghosh, chapters 5 and 17
Paper:
- September 20: Edsger W. Dijkstra, "Self-stabilizing Systems in Spite of Distributed Control," Communications of the ACM, Volume 17 (11), November, 1974, Pages 643-644. http://comp.uark.edu/~aapon/courses/gradOS/dijkstra.pdf
Week 5: Time
Text material: Ghosh, chapter 6
Paper:
- September 25: Leslie Lamport, "Time, clocks, and the ordering of events in a distributed system," Communications of the ACM. Volume 21 (7), July, 1978. Pages 558-565. http://comp.uark.edu/~aapon/courses/gradOS/clocks.pdf
Week 6: Snapshots
Text material: Ghosh, chapter 8
Papers:
- September 27: K. Mani Chandy and Leslie Lamport, “Distributed snapshots: determining global states of distributed systems,” ACM Transactions on Computer Systems. Volume 3 (1), 1985. Pages 63-75.
- (On your own: Edgar Knapp, "Deadlock Detection in Distributed Databases," ACM Computing Surveys, Volume 19 (4), December, 1987, Pages 303-328. http://comp.uark.edu/~aapon/courses/gradOS/knapp.pdf )
Week 7: Global State and Deadlock
Text material: Ghosh, chapter 9
Papers:
- October 4: K. Mani Chandy, Jayadev Misra, and Laura M. Haas, "Distributed Deadlock Detection," ACM Transactions on Computer Systems. Volume 1 (2). May, 1983. Pages 144-156. http://comp.uark.edu/~aapon/courses/gradOS/chandy.pdf
- October 9: Jayadev Misra and K.M. Chandy, "Termination Detection of Diffusing Computations in Communicating Sequential Processes," ACM Transactions on Programming Languages and Systems, Volume 4 (1), January, 1982, Pages 37-43. http://comp.uark.edu/~aapon/courses/gradOS/misra.pdf
Week 8: Consensus
Text material: Ghosh, chapters 11 and 13
Papers:
- October 11: Leslie Lamport, Robert Shostak, and Marshall Pease, "The Byzantine Generals Problem," ACM Transactions on Programming Languages and Systems. Volume 4 (3), July, 1982. Pages 382-401. http://comp.uark.edu/~aapon/courses/gradOS/byzantine.pdf
- (On your own: Miguel Castro and Barbara Liskov, “Practical Byzantine Fault Tolerance,” Proceedings of the Third Symposium on Operating Systems Design and Implementation, New Orleans, LA, February 1999. http://comp.uark.edu/~aapon/courses/gradOS/castro.pdf )
- October 16: Review of shared memory programming. Assign Shared Memory Program Three, http://comp.uark.edu/~aapon/courses/gradOS/sharedmemATM.html This will be due on Tuesday, October 30. Email your solution to me. Class will be cancelled on Tuesday, October 30. This is worth 10 points and the grade will replace your grade on Shared Memory Program Two.
Week 9: Scheduling
Papers:
- October 18: Carl Waldspurger and William Weihl, “Lottery Scheduling: Flexible Proportional-Share Resource Management,” Proceedings of the First Symposium on Operating Systems Design and Implementation, Usenix, November 1994. http://comp.uark.edu/~aapon/courses/gradOS/waldspurger.pdf
- (On your own: Carl Waldspurger and William Weihl, “Stride Scheduling: Deterministic Proportional-Share Resource Management,” http://comp.uark.edu/~aapon/courses/gradOS/waldspurger.weihl.pdf )
Week 10: Data Management and File Systems
Text material: Ghosh, chapter 16
Papers:
- October 23: Text material: Ghosh, chapter 16
- October 23: Ghemawat, Gobioff, and Leung, “The Google File System,” Symposium on Operating Systems Principles, 2003. http://citeseer.ist.psu.edu/cache/papers/cs2/704/http:zSzzSzwww.cs.rochester.eduzSzsosp2003zSzpaperszSzp125-ghemawat.pdf/ghemawat03google.pdf
- (On your own: John Howard, Michael Kazar, Sherri Menees, David Nichols, M. Satyanarayanan, Robert Sidebotham, and Michael West, “Scale and Performance in a Distributed File System,” ACM Transactions on Computer Systems, Volume 6(1), February, 1988, Pages 51-81. http://www.cs.duke.edu/education/courses/spring06/cps210/papers/s11.pdf)
- (On your own: James Kistler and M. Satyanarayanan, “Disconnected Operation in the Coda File System,” ACM Transactions on Computer Systems, Volume 10(1), February, 1992, Pages 3-25. http://comp.uark.edu/~aapon/courses/gradOS/kistler.pdf )
- October 30: No class. Do the take-home quiz: Answer only the odd-numbered questions on the GFS handout: http://comp.uark.edu/~aapon/courses/gradOS/Notes/1023gfs.handout.pdf Turn this in to me on paper by putting it into my mailbox in the CSCE office. This should be done on your own.
Week 11: Distributed Shared Memory
Text material:
Papers:
- November 1: Jelica Protic,
Tentative Programming Assignment Four: Using RMI, implement a simple object-based DSM system. We will discuss this in class and possibly do this exercise, depending on time constraints, project constraints, and other factors. If we do this it will take the place of 2-3 quiz grades.
Week 12: Performance
Text material:
Papers:
- November 8: Thomas Anderson, Brian Bershad, Edward Lazowska, and Henry Levy, "Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism," ACM Transactions on Computer Systems. Volume 10 (1). February 1992. Pages 53-79. http://comp.uark.edu/~aapon/courses/gradOS/anderson.pdf
- November 13: Edmund Nightingale, Peter Chen, and Jason Flinn, “Speculative Execution in a Distributed File System,” SOSP’05, October, 2005, Brighton, UK. http://comp.uark.edu/~aapon/courses/gradOS/nightingale.pdf
Week 13: P2P Networks
Text material: Ghosh, chapter 21
Papers:
- November 20: Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan, “Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications,” SIGCOMM’01,
- November 27: Stefan Saroiu, Krishna P. Gummadi, Richard J. Dunn, Steven D. Gribble, and Henry M. Levy, “An Analysis of Internet Content Delivery Systems,” Proceedings of the 5th Symposium on Operating Systems Design and Implementation, Boston, Massachusetts, December 2002. http://comp.uark.edu/~aapon/courses/gradOS/saroiu.pdf
Week 14: Project Presentations
- November 29
- December 4
Dead Day is December 5!
Week 15: Comprehensive Final Exam: Monday, December 10, 10:00am-12noon
Some other good papers:
November 6th: Thorsten von Eicken, Anindya Basu, Vineet Buch, and Werner Vogels, “U-Net: A User-Level Network Interface for Parallel and Distributed Computing,” Proceedings of the 15th ACM Symposium on Operating Systems Principles,
November 27th: Mike Chen, Anthony Accardi, Emre Kiciman, Jim Lloyd, Dave Patterson, Armando Fox, and Eric Brewer, “Path-Based Failure and Evolution Management,” Proceedings of the First Symposium on Networked Systems Design and Implementation, San Francisco, CA, March 2004. http://comp.uark.edu/~aapon/courses/gradOS/chen.pdf