Interested in doing research for your summer job? The Undergraduate Student Research Awards (USRA) are meant to stimulate your interest in research in the natural sciences and engineering. Faculty often have projects that might be of interest; please see the projects listed below from previous years. Projects for 2017-2018 will be posted soon. Keep checking back, as more projects may be posted at any time. Students are also welcome to contact our faculty members and suggest a topic of interest. In the end, the best projects are those of interest to both faculty and students.
If you are interested in this opportunity, please contact WahMing Wong or any CMS faculty member. Applications for USRAs open early in January every year. Click here to find out more about USRAs and how to apply.
2017-2018 USRA Projects
2017-2018 USRA Projects in CMS
UTeach: A Community Curated Peer Instruction Resource Repository for CS Education
CS education, particularly at the secondary and early post-secondary levels, often requires educators to spend a great deal of time designing and developing curricula and materials. Improving access to high-quality materials is a key step in improving access to CS education, particularly for students from underrepresented communities. This project seeks to develop and evaluate methods of sharing educational materials and developing a community of practice among CS educators.
Assessment of the predictive power of pedagogical elements on outcomes in computer science
With increasing enrolment in CS programs, it is becoming increasingly important to evaluate students early in their educational careers in order to accurately predict their probability of success in a program. It is also important from a pedagogical point of view that we develop and use modes of assessment that accurately reflect not only what a student has learned, but also how well they will be able to apply their knowledge in future. This project aims to evaluate the ability of various factors to predict future success in computer science degrees.
Knot Invariants
The student will explore questions in knot theory (starting with the textbook by W.B.R. Lickorish). One fundamental question is how to define invariants of knots using Chern-Simons gauge theory (as in Witten's fundamental 1989 paper "Quantum Field Theory and the Jones Polynomial"). Witten defines the Jones polynomial as a path integral over a space of connections on a three-manifold such as the three-sphere, which is the same as three-dimensional space with an extra point added at infinity. A basic goal is to understand what kinds of knot invariants can be defined in this way. The student will spend the first few weeks learning background material (from Lickorish's book and the book "The Geometry and Physics of Knots" by Michael Atiyah). The latter part of the project will be spent on defining and calculating knot invariants as integrals.
Please contact one of our CMS faculty members and suggest a topic of interest. 
Previous Years (2016-2017) USRA Projects
2016-2017 USRA Projects in CMS

Polytopes and moduli spaces
The space of representations of the fundamental group of a 2-manifold into the group SU(2) of unitary 2x2 matrices with determinant 1 acquires a circle action on an open dense set once we choose a simple closed curve C in the 2-manifold. If two such curves are disjoint, the circle actions commute, so the group that acts is U(1) x U(1). The moment maps for these actions are known (see Jeffrey and Weitsman, Commun. Math. Phys. 1992). For an oriented 2-manifold of genus g > 1, it is possible to choose 3g-3 such disjoint curves, which define a decomposition of the 2-manifold into copies of the 3-holed sphere (a "pants decomposition"). The purpose of the project is to explicitly study the images of the moduli space under the moment map, for concrete examples of the curves C.
Gamification of CS1 courses and its effect on student learning
Introductory computer science courses are often faced with the challenge of presenting material in a way that is both accessible to novice programmers and interesting to those with more experience. Furthermore, research has shown that one of the best predictors of student success in first-year courses is a student’s ability to monitor their learning in a meaningful way. This project will research various ways in which computer science pedagogy can be improved through gamification (the use of game-like mechanics and reward systems to improve user engagement). The main focus of the project will be on developing tools and techniques that allow students to self-monitor their progress through introductory computer science courses, and on developing experimental setups for evaluating the efficacy of these tools.
Using natural language processing to understand political arguments
This project will develop automatic methods that can look at arguments made by politicians in the Canadian Parliament on various issues and figure out not just what the speaker is advocating but the specific reasons and, ideally, the structure and logic of the argument being made. The goal of the project is to provide political scientists and historians with NLP tools for large-scale analysis of complex political texts, including the Canadian Parliamentary proceedings back to the year 1900. The student will work with a senior graduate student on annotation and analysis of argumentative texts.
Using natural language processing to determine cause of death in a verbal autopsy
When someone dies, knowing the cause of death is important for statistical purposes in public health planning. In developing countries, "verbal autopsies" are interviews with family members about the death that substitute for a physical autopsy. This project, working in conjunction with the Million Death Study (http://www.cghr.org/projects/milliondeathstudyproject/), is developing automatic NLP methods to read these verbal autopsies and automatically determine the cause of death. As one facet of this, the student will research entity extraction techniques to apply to the narrative text of the interviews.
_______________________________________________________________________________________________________
Past USRA Projects
(2015-2016) Projects
2015-2016 USRA Projects in CMS

Using statistics / machine learning to improve storage system reliability
The storage backend of a computer system is one of its most critical components, as its reliability is crucial: even small amounts of data loss can be devastating for an enterprise. The goal of this project is to improve storage system reliability by better understanding the characteristics of one of its key building blocks, hard disk drives. We have obtained a dataset covering detailed trace information about more than 40,000 disk drives in a production environment. As a first step, we want to perform a statistical analysis to identify factors that are correlated with drive failures (e.g. temperature, workload). As a second step, we want to build a predictor that can serve as an early warning for impending drive failures. Note: No background knowledge about disk drive technology (or computer hardware) is required. Background in statistics or machine learning is important.

Behaviour of functions under maps of two-manifolds
Outline: The purpose of this project is to understand the following question. Find a function f on the two-sphere with only one maximum and one minimum. Compute the integral of this function over the subset of the two-sphere where its value is less than a given value c. This gives a function of the parameter c. This question would be easy to answer for the standard two-sphere, but will be more challenging for a surface that is only known to be a two-sphere on general topological grounds.
Student's Role: The student will first spend some time learning basic differential topology (including how to use differential forms). Some later work will be numerical, using software such as Maple, Matlab or Mathematica.
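For orientation, the easy case mentioned in the outline can be worked out by hand. The computation below is only an illustrative sketch: it assumes the standard unit sphere in three-dimensional space and takes f to be the height function f(x, y, z) = z, which has one maximum (the north pole) and one minimum (the south pole). Neither choice is prescribed by the project description.

```latex
% Assumptions (illustrative only): S^2 is the unit sphere in R^3 and f(x,y,z) = z.
% In the coordinates (z, \theta) the area form of the unit sphere is dA = dz \, d\theta,
% so the band between heights z and z + dz has area 2\pi \, dz.
\[
  I(c) \;=\; \int_{\{f < c\}} f \, dA
        \;=\; \int_{0}^{2\pi} \int_{-1}^{c} z \, dz \, d\theta
        \;=\; \pi \left( c^{2} - 1 \right),
  \qquad -1 \le c \le 1 .
\]
% Consistency check: I(-1) = 0 (empty region) and I(1) = 0 (the height function
% integrates to zero over the whole sphere by symmetry).
```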

Improving storage performance in virtualized datacenters
While computer systems have for a long time relied on caching to improve I/O performance (remember, for example, the operating system buffer cache from your operating systems course), caching is becoming more complicated in today's virtualized systems, for several reasons. First, multiple workloads (tenants) will share the same caches, and one might affect the cache performance of another. Second, there is typically a hierarchy of caches, e.g. one at the host and one at the storage server. The goal of this project is to revisit traditional caching policies (such as LRU, least-recently-used) and see how they can be improved for modern data centres running cloud workloads.
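As a rough illustration of the LRU policy and of the tenant interference described above, here is a minimal Python sketch. The names `LRUCache` and `hit_ratio` and the toy access traces are hypothetical and chosen only for this example; they are not part of the project.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least-recently-used block."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.blocks = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def access(self, block_id):
        """Record one access; return True on a hit, False on a miss."""
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # now the most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used
        self.blocks[block_id] = True
        return False


def hit_ratio(trace, capacity):
    """Replay a sequence of block ids and return the fraction of hits."""
    cache = LRUCache(capacity)
    for block_id in trace:
        cache.access(block_id)
    return cache.hits / len(trace) if trace else 0.0


# Hypothetical traces: tenant A reuses two blocks; tenant B scans new blocks.
alone = [1, 2, 1, 2, 1, 2]
shared = [1, "b0", 2, "b1", 1, "b2", 2, "b3"]
print(hit_ratio(alone, capacity=2))
print(hit_ratio(shared, capacity=2))
```

In this toy setup, tenant A's two-block working set fits the cache when it runs alone, but interleaving it with a scanning tenant evicts A's blocks before they are reused. This is one simple instance of the interference between tenants that motivates revisiting plain LRU.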

Predicting failures of MapReduce jobs
MapReduce has become a very popular programming model for parallelizing the analysis of large data. While lots of research has been done on improving the *performance* of MapReduce jobs, our preliminary work indicates that more attention should be paid to their *fault tolerance*. We find that a large fraction of computer cycles in MapReduce clusters is wasted on jobs that later fail or get killed. The goal of this project is to investigate predictors for job failures. Such predictions could, for example, be used to turn on fault tolerance mechanisms, such as checkpointing, in jobs that are likely to fail, or to run those jobs at a lower priority, so that they don't take away resources from other jobs. To facilitate this work, we have a large dataset of job traces recorded at Google, with detailed information for each job, such as the name of the application and user, the number of tasks in the job, the resource requirements, etc. Note: No background on MapReduce is needed. This is a pure machine learning and data mining project.

Gamification of CS1 courses and its effect on student learning
Introductory computer science courses are often faced with the challenge of presenting material in a way that is both accessible to novice programmers and interesting to those with more experience. Furthermore, research has shown that one of the best predictors of student success in first-year courses is a student’s ability to monitor their learning in a meaningful way. This project will research various ways in which computer science pedagogy can be improved through gamification (the use of game-like mechanics and reward systems to improve user engagement). The main focus of the project will be on developing tools and techniques that allow students to self-monitor their progress through introductory computer science courses, and on developing experimental setups for evaluating the efficacy of these tools.
_______________________________________________________________________________________________________
(2014-2015) Projects

Emulating a database processing unit.
Hardware-accelerated database engines promise unique capabilities and excellent power/performance characteristics, but require an intensive design process. In this project, part of an ongoing "Bionic database engine" project, the USRA student would coordinate with a PhD student to bring up a software framework that will emulate our proposed hardware/software hybrid database engine. The USRA project would include helping implement the framework, but also using it to explore the various tradeoffs that will inform the eventual hardware design. Qualified students would have experience with C/C++ programming and software design, as well as a willingness to interact with large code bases.
Hardware accelerators for database workloads.
A hybrid hardware-software database engine provides a unique opportunity to offload various pieces of functionality to an FPGA in order to improve power/performance or exploit unique capabilities offered by hardware. This USRA project would involve selecting one or two (out of several possible) pieces of database functionality and implementing them in hardware using a hardware description language such as Verilog or SystemC. The work will be performed in collaboration with a PhD student in the context of an ongoing "Bionic database engine" project. Qualified students would have some experience with FPGA-based hardware design, including basic knowledge of hardware logic and of hardware description languages such as Verilog.
The Statistical Analysis of Structural Equation Models
Structural equation modeling is a statistical methodology used in a wide variety of disciplines such as psychology and economics. There are a variety of computational and inferential problems associated with these models. The intent of the project is to develop and apply new computational and inferential methodology in this area, building on some recent work with factor analysis models.
Netty - a Prover's Assistant
A Prover's Assistant is a tool designed to aid in the construction and verification of proofs. A Prover's Assistant called Netty is currently under development in the Software Engineering group at the University of Toronto. We are looking for USRA participants to work on several projects that involve Netty:
1) Extending Netty for proving correctness of Probabilistic Algorithms, Quantum Algorithms, and Quantum Communication Protocols. Candidates must be comfortable with logic, proofs, and basic linear algebra, and have strong programming skills (Java). No background in Quantum Computing is required. Familiarity with theorem proving is a plus.
2) Implementing a number of heuristics to improve the proofs generated with Netty: expression simplification, suggestion generation, etc. Candidates must be comfortable with logic and proofs and have strong programming skills (Java). Familiarity with theorem proving is a plus.
3) Investigating the possibility of amending Netty for use in introductory CS courses, such as CSCA67, CSCB36, CSCB63, etc. Candidates must be comfortable with logic and proofs and have strong programming skills (Java). Familiarity with theorem proving is a plus.
Geometric Algorithms
A soap film on a loop of wire naturally forms a surface of minimal area. There are algorithms to determine the area of a surface like this, but if we make the problem more complicated by raising the dimension or asking for surfaces of a particular form, we don't know the answer. Are there good algorithms for solving problems like this, or is the problem possibly NP-complete?
Differential forms on the 2-sphere
Take coordinates on the 2-sphere and a map from the 2-sphere to itself. If we pull back the standard volume form on the 2-sphere under this map, find conditions for this pullback to be a multiple of the volume form by a ... This project involves learning about differential topology (a self-contained reference is Guillemin and Pollack, Introduction to Differential Topology). The only background required is MATB42.