Colloquia are typically hosted in Halligan Hall, room 102, on Thursdays from 2:50 to 4:00 pm. Refreshments are served at 2:50 pm, and the talk starts at 3:00 pm. Please check the details below to confirm times and locations. Email any questions to email@example.com.
April 25 at 1 PM: Doctoral Thesis Defense: Neda Hassanpour
Title: Doctoral Thesis Defense: Computational Methods to Advance Directed Evolution of Enzymes and Metabolomics Data Analysis
Speaker: Neda Hassanpour, Tufts University
Time: 1:00 PM
Location: Halligan 209
Host: Soha Hassoun
Abstract: The engineering of living cells promises to advance many applications including synthetic biology and personalized medicine. Experimental efforts, however, can be costly and time-consuming, requiring large efforts to interpret collected data and many iterative design-and-test cycles to achieve desired results. Computational efforts that harness the exponential growth of computing power and catalogued biological data can advance biological system design by interpreting measurements, efficiently exploring the design space and expediting biological discoveries.
This thesis advances the state of the art in the engineering and analysis of cellular metabolism by computationally addressing two challenges. The first challenge concerns the lack of systematic ways to design selection pathways in directed evolution of enzymes, an iterative process of creating mutant libraries and choosing desired phenotypes through screening or selection until the enzymatic activity reaches a desired goal. Identifying high-throughput screens or selections to isolate the variant(s) with the desired property is the biggest challenge in directed enzyme evolution, as there are currently no known generalized strategies or computational techniques to do so. This thesis presents a computational metabolic engineering framework, termed Selection Finder (SelFi), to construct a selection pathway from a desired enzymatic product to a cellular host and to couple the pathway with cell survival. The second challenge concerns the interpretation of data measured through untargeted metabolomics, where molecular masses of thousands of small molecules are measured simultaneously via mass spectrometry. Annotating the masses by assigning them a chemical identity and interpreting their biological relevance is challenging, as a particular mass may be associated with multiple chemical compounds. This thesis contributes to solving the metabolite interpretation challenge in two ways. First, it presents a novel computational workflow, termed Expanded Metabolic Model based Annotation (EMMA). EMMA constructs a biological filter consisting of an Expanded Metabolic Model (EMM) that includes not only the canonical substrates and products of enzymes, but also metabolites that can form due to substrate promiscuity, where an enzyme transforms other substrates in addition to its natural substrate.
This expanded model is used to reduce the number of candidate chemical identities from large chemical databases that can be assigned to the measurements. Further contributing to metabolite interpretation, this thesis presents a novel probabilistic approach, termed Probabilistic modeling for Untargeted Metabolomics Analysis (PUMA), for predicting the likelihood of activity of metabolic pathways by assigning measurements directly to metabolic pathways and then deriving a probabilistic assignment of measurements to candidate chemical identities.
April 25 at 2 PM: Master's Thesis Defense: Melissa Iori
Title: Master's Thesis Defense: Behavior-Based Cognitive Trait Modeling Via Node-Link Diagram Interactions
Speaker: Melissa Iori, Tufts University
Time: 2:00 PM
Location: Halligan 102
Host: Remco Chang
Abstract: Individual traits are not always considered when designing data analytic software. However, previous research has shown that individual differences do matter when presenting visual content. Our experimental results demonstrate the ability to determine a user's extraversion in real time by analyzing their interactions with a data visualization found on many platforms, the node-link diagram. The results show that users with higher extraversion moused over more nodes than those with lower extraversion, a distinct difference in behavior between user trait groups. I then discuss the results of the experiment and the impact this finding has on user trait modeling.
April 26 at 9:30 AM: Undergraduate Senior Honors Thesis Defense: Exploring the Role of Pfam Families in Protein Function
Title: Undergraduate Senior Honors Thesis Defense: Exploring the Role of Pfam Families in Protein Function
Speaker: Daniel Meyer
Time: 9:30 AM
Location: Halligan 209
Host: Lenore Cowen
Abstract: It is widely accepted that the structure of a protein determines its function. Many computational inference methods that act on proteins or sets of proteins rely on assigned functional labels from a popular ontology, often the Gene Ontology (GO). On the other hand, there is rich, easily obtained structural motif information that is captured for proteins using libraries of Hidden Markov Models (HMMs), such as Pfam. Because structure relates to function, and many of these HMM-based motifs are actually signatures of 3-dimensional substructures within the overall protein fold, it would follow that annotation of these HMM domains in a protein structure should aid in inference of its correct GO functional labels. However, creating a useful Pfam-to-GO mapping remains a difficult endeavor. First, it is certainly not a one-to-one mapping. Second, different Pfam-derived domains within a protein structure, either individually or as a set, might yield different amounts of specificity with regard to the set of possible GO labels that are appropriate. Estimating the amount of specificity that a single Pfam-derived domain, or a set of them, gives with regard to GO labeling is confounded by the unequal representation and/or the lack of coverage of annotation in both domains across the protein universe. We revisit issues of sequence patterns, diversity, and representation in the light of all the new data in current sequence databases. We have developed a suite of parsers and an Object-Relational Mapping using Python and SQLAlchemy to represent selected information on proteins and families from the UniProt and Pfam databases, respectively, while making it easy to access and reason about information stored in the graph structures of the GO and the Evidence Code Ontology (ECO). We use this framework to compare dcGO (Fang and Gough, 2013) and GODM (Alborzi et al., 2017), which are designed to optimize different tradeoffs between coverage and false positives.
April 26 at 11 AM: Undergraduate Senior Honors Thesis Defense: Detangling PPI Networks to Uncover Functionally Meaningful Clusters
Title: Undergraduate Senior Honors Thesis Defense: Detangling PPI Networks to Uncover Functionally Meaningful Clusters
Speaker: Sarah Hall-Swan
Time: 11:00 AM
Location: Halligan 209
Host: Lenore Cowen
Abstract: We compare computational methods for decomposing a PPI network into non-overlapping modules. A method is preferred if it results in a large proportion of nodes being assigned to functionally meaningful modules, as measured by functional enrichment over terms from the Gene Ontology (GO). We compare the performance of three popular community detection algorithms that produce non-overlapping clusters with the same algorithms run after the network is pre-processed by removing and reweighting edges based on the diffusion state distance (DSD) between pairs of nodes in the network. We call this “detangling” the network. In some cases, we find that detangling the network based on DSD reweighting provides more meaningful clusters. We also look at extending this approach to methods that produce overlapping clusters.
April 26 at 3 PM: Understanding the Anonymity Ecosystem
Speaker: Micah Sherr, Georgetown University
Time: 3:00 PM
Location: Halligan 102
Host: Susan Landau
Abstract: Depending upon whom you ask, anonymity systems such as Tor are either (1) used by activists, journalists, and ordinary users to gain unfettered access to the Internet and bypass restrictive censorship systems, or (2) used by criminals to perform network attacks, buy and sell illegal narcotics and weapons, and post and access illicit/illegal content (e.g., on the "Dark Web"). While privacy advocates stress the personal freedom and privacy afforded by anonymity systems, others claim that anonymity networks are rife with abuse. Despite some anecdotal evidence (in both directions), in truth, we know very little about who actually uses anonymity systems such as Tor and how these systems are actually being used.
This talk presents recent and ongoing research, done in collaboration with researchers at the U.S. Naval Research Laboratory (NRL) and the University of New South Wales Sydney, that attempts to shine some light on the users (in the aggregate) of these anonymity systems. A key scientific challenge of this line of work is to perform measurements in a safe manner that does not endanger the users of anonymity networks by potentially exposing their identities (and possibly subjecting them to physical harm). At the same time, since anonymity networks are often volunteer-operated and thus subject to malicious participants, measurement techniques should also provide strong integrity guarantees that resist manipulation by malicious insiders.
Finally, the talk will describe ongoing work that examines the Internet's open proxy servers -- hosts that allow any Internet user to relay traffic through them. Open proxies are often used for anonymous communication or to avoid regional restrictions placed on content (e.g., streaming movies and sporting events). We present early results that show that misbehavior abounds on the Internet's open proxies. In particular, we highlight instances in which open proxies perform TLS man-in-the-middle attacks, inject or modify content, and otherwise interfere with users' end-to-end communication.
Bio: Micah Sherr is Provost's Distinguished Associate Professor in the Computer Science Department at Georgetown University and director of the Georgetown Institute for Information Assurance. His academic interests include privacy-preserving technologies, electronic voting, wiretap systems, and network security. He participated in two large-scale studies of electronic voting machine systems, and helped to disclose numerous architectural vulnerabilities in U.S. election systems. His current research examines the security properties of legally authorized wiretap (interception) systems and investigates methods for achieving scalable, high-performance anonymous routing. Micah received his B.S.E., M.S.E., and Ph.D. degrees from the University of Pennsylvania. He is a recipient of the NSF CAREER award.