Brief notes on the classes I took at UBC: what I learned, the most important lessons I'd like to remember, and some fun facts.
Yes, you read that right. As the instructor Nick Harvey's prologue puts it, "All algorithms equally matter, but some matter more than others. - George Orwell (paraphrased)", this class was about algorithms used in practice. The class was particularly interesting to me because I had always wanted algorithmic insight into systems at scale. We covered many topics relevant to systems deployed today, including MapReduce, submodular optimization, graph sketching, locality-sensitive hashing, online bipartite matching, AdWords, and erasure codes, among many others. We had several assignments to solidify the concepts, plus a final project.
For the final project, I teamed up with two other students to apply submodular optimization techniques to placing a TensorFlow graph in a distributed environment (many GPUs, CPUs, and potentially TPUs). Our intuition was that the node-to-device placement problem can be reduced to a submodular maximization problem, which can then be solved to find the placement with the shortest model training time. You can find more details in our final report.
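To give a flavor of submodular maximization, here is a minimal sketch of the classic greedy heuristic for a monotone submodular objective under a cardinality budget. It is not the method from our report: the toy coverage objective, the `COVERS` data, and the names `greedy_maximize` and `coverage` are all assumptions made up for illustration.

```python
from typing import Callable, FrozenSet, Iterable, Set


def greedy_maximize(
    ground_set: Iterable[str],
    value: Callable[[FrozenSet[str]], float],
    budget: int,
) -> Set[str]:
    """Pick up to `budget` elements, each time adding the one with the largest marginal gain."""
    chosen: Set[str] = set()
    remaining = set(ground_set)
    for _ in range(budget):
        base = value(frozenset(chosen))
        best, best_gain = None, 0.0
        for item in sorted(remaining):  # sorted only to make tie-breaking deterministic
            gain = value(frozenset(chosen | {item})) - base
            if gain > best_gain:
                best, best_gain = item, gain
        if best is None:  # nothing left adds any value
            break
        chosen.add(best)
        remaining.remove(best)
    return chosen


# Hypothetical toy data: each candidate "covers" a set of items.
COVERS = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}


def coverage(selection: FrozenSet[str]) -> float:
    """Number of distinct items covered; a standard monotone submodular function."""
    covered = set()
    for name in selection:
        covered |= COVERS[name]
    return float(len(covered))


picked = greedy_maximize(COVERS, coverage, budget=2)
```

With this toy data the greedy rule picks `c` first (it covers four items) and then `a` (three new items). For monotone submodular objectives with a cardinality constraint, this greedy rule is known to reach at least a (1 - 1/e) fraction of the optimum.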
This was a lecture, assignment, exam, and presentation based class taught by Ronald Garcia. The entire course can be summarized in a single sentence, as the instructor put it: "[...] we model programs as mathematical objects in set theory, and programming languages as sets of programs and their meanings", which pretty much describes everything we learned in the class. If this sentence sounds too abstract to you, as it did to me before I took the class, here is the list of topics this set-theoretic approach serves as a foundation for: language semantics (both structural operational and big-step), proof by induction, reduction semantics, induction and coinduction, divergence, Floyd-Hoare program logics, procedures and recursion, type systems, static analysis, and some more. At the end of the class, two classmates and I closely studied a paper, NetKAT: Semantic Foundations for Networks from POPL 2014, and presented it using the concepts we had been introduced to during the class. Download our presentation here.
Yet another interesting seminar-style, assignment and project based Systems class, taught by Ivan Beschastnikh. The papers covered many topics from a distributed systems angle, including clocks and snapshots in distributed systems, state machine replication, memory coherence, Paxos, fault tolerance, consistency, the CAP theorem, DHTs, and P2P systems. The assignments were fun: time synchronization in distributed systems using the Berkeley algorithm, leader election, and a load-balanced key-value service. The fun part of the assignments was the requirement to use only the Go programming language. This was my first time programming in Go, and I'm very likely to stick with the language as much as I can. It's a fast, well-designed, GitHub-friendly (write once, read many times), and fun language.
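For readers who haven't met the Berkeley algorithm: a coordinator polls every node's clock, averages the readings while ignoring obvious outliers, and tells each node how much to adjust. Below is a rough sketch of that averaging step, written in Python for brevity rather than the Go we used in class. It is not the assignment solution; the function name, the `max_skew` cutoff, and the `"master"` key are assumptions for illustration, and it ignores network round-trip delay, which a real implementation has to estimate.

```python
from statistics import mean
from typing import Dict


def berkeley_adjustments(
    master_time: float,
    follower_times: Dict[str, float],
    max_skew: float,
) -> Dict[str, float]:
    """Return the clock correction (in seconds) each node should apply.

    Assumes no follower is literally named "master" and that the
    reported times are already compensated for network delay.
    """
    # Offsets of every clock relative to the master's clock.
    offsets = {"master": 0.0}
    for name, reported in follower_times.items():
        offsets[name] = reported - master_time

    # Average only the clocks that are within max_skew of the master,
    # so one wildly wrong clock does not drag everyone along.
    trusted = [off for off in offsets.values() if abs(off) <= max_skew]
    average = mean(trusted)  # never empty: the master's own offset is 0.0

    # Every node, including outliers and the master, moves to the average.
    return {name: average - off for name, off in offsets.items()}


adjustments = berkeley_adjustments(
    master_time=100.0,
    follower_times={"n1": 106.0, "n2": 97.0, "n3": 500.0},
    max_skew=50.0,
)
```

Here `n3` is excluded from the average as an outlier but still receives a correction, and the master adjusts its own clock too, which is what distinguishes this scheme from simply following one reference clock.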
Our class project, done with a fellow grad student, was about managing virtual middlebox state in a container environment. The idea was to deploy and orchestrate middlebox functionality with containers (Docker) and the Kubernetes cluster manager. The insight was that Kubernetes provides good abstractions for deploying a wide range of applications, including middleboxes. In particular, a pod is well suited to wrapping any middlebox function (or a chain of them) and managing it with a replication controller. The replication controller comes in very handy for scaling middleboxes up and down in response to dynamic load. Finally, services provide a good abstraction for gluing pods together, including placing a middlebox pod layer in the application processing pipeline. We developed a sample rate-limiting firewall as a Docker container, wrapped it in pods, and scaled it using a replication controller. We then used etcd to manage the firewall's shared state and measured the overhead (latency) imposed on the firewall by the externally managed shared state. Our source code is available on GitHub; here are the slides and the report.
This was a mixed lecture-style (but with papers), assignment and project based class taught by the very enthusiastic professor Alan Hu. We read papers to get a general knowledge of formal verification methods, and solidified it by doing assignments. The assignments were well designed: a large part of the code was written by Alan, but with purposefully injected bugs. Students were expected to find the bug(s) using techniques introduced in the lectures and the papers. Topics we covered include boolean (binary) decision diagrams for program state representation, static analysis, (bounded) model checking, concrete and symbolic execution, symbolic reachability, constraint (SAT and SMT) solvers, predicate abstraction, and formal methods to represent discrete, continuous, and hybrid systems, among others.
My class project was about using SMT solvers for the data center resource allocation problem. I ended up reproducing previous research from FMCAD'13, and made an attempt to use a different SMT solver (SAT Modulo Monotonic Theories) to improve performance. Check out the project report, presentation, and code on GitHub. Oh, by the way, as a bonus, Alan arranged our class project presentations at Whistler, one of the best ski resorts in the world! Both the project itself and the presentation were fun!
This was another interesting seminar-style, project-based Systems class taught by Ivan Beschastnikh. The papers covered a wide range of software engineering topics and programming paradigms/tools for building distributed systems, debugging them, analyzing their performance, and so on. The course was a great blend of software engineering and (distributed) systems. For the class project, a fellow grad student and I decided to continue my earlier (Data at Scale) class project, which studied the increased number of syscalls made to file systems. This time we wanted to conduct a much deeper study and find the reason behind the bloated system calls. The earlier study was done on Mac OS X, which was impossible to analyze in depth because of its proprietary closed-source code. Thus, we moved to a Solaris 11 environment, where we took an open-source editor and analyzed syscalls from the application layer all the way down to the drivers. We concluded that libraries are primarily responsible for the bloated calls; the file system API does not contribute much to the increased number of syscalls. Our scripts and report are available online.
Also known as the Game Theory grad class, this course covered everything from simple two-player games all the way to mechanism design and auctions, and was taught by award-winning instructor Kevin Leyton-Brown. This was the most mathematically challenging class so far. I'm sure my brain has been structurally transformed (especially while solving assignment problems); there must be lots of additional folds in it after this class. Once again I confirmed for myself that an intuitive approach (rather than equation-driven math) is a much more efficient (and fun) way of solving a problem.
My class project was on Social Choice and was based on a literature review. I described the axiomatic approach to ranking systems. Although Google's PageRank algorithm was the main subject of the paper, I touched on eBay's reputation system and personalized ranking systems as well. My presentation is available here. Also, while deciding on the project topic, I got interested in voting rules (another area of Social Choice). I then set up a small study of UBC's most recent (2014) AMS election. I collected the public results and wrote Python scripts to analyze the votes, basically replicating their mechanism. Along the way, I became closely familiar with voting mechanisms, too. My scripts are available here.
This was my first Systems course, taught by Andrew Warfield. We read everything from classic to the most recent papers about file systems, distributed computing, data center networking, and almost everything else that has to do with large-scale systems. This class improved my academic paper reading and analysis skills by more than an order of magnitude. I've become a bit more confident at presenting, too. There is still lots of room for improvement.
As a class project, a fellow grad student and I decided to do a measurement study of file system use by modern applications. Our work was inspired by the best paper at the SOSP'11 conference, A File is Not a File: Understanding the I/O Behavior of Apple Desktop Applications. We analyzed the file system calls of several text editors and two web browsers. As a result, we not only reproduced results from the original paper, but also observed some quite different behavior. Our project report is available here.
This was an introductory class on HCI product design, taught by Karon MacLean. Most of our effort went into the class project, where 3 grad students and I developed a mobile application to connect people moving in or out with movers. Later we discovered UShip.com and ended up developing a mobile interface for it. Inspired by LinkedIn, our project was called TruckedIn, a peer-to-peer connection between movees and movers.
The focus of the class project was not the development of the application itself, but experiencing the several stages of HCI development: finding a human problem that needs better support, running a user study to clarify the actual needs and pain points, and only then designing a user-centric prototype to evaluate the usability of our solution. Surprisingly (really!) we won the Best Recruiting award for working with truck drivers. Indeed, it took a lot of effort to find them and work with them. It was fun too: the movers were excited to learn about our solution and try our cool iOS app. The project report is available here.