OSDI 2021 Accepted Papers

In the Ethereum network, decentralized Ethereum clients reach consensus by transitioning to the same blockchain states according to the Ethereum specification. Commonly used log archival and compression tools like Gzip provide a high compression ratio, yet searching archived logs is a slow and painful process because it first requires decompressing the logs. Perennial 2.0 makes this possible by introducing several techniques to formalize GoJournal's specification and to manage the complexity in the proof of GoJournal's implementation. We identify that current systems for learning the embeddings of large-scale graphs are bottlenecked by data movement, which results in poor resource utilization and inefficient training. Secure hardware enclaves have been widely used for protecting security-critical applications in the cloud. In contrast, CLP achieves a significantly higher compression ratio than all commonly used compressors, yet delivers fast search performance that is comparable to, or even better than, Elasticsearch and Splunk Enterprise. One classical approach is to increase the efficiency of an allocator to minimize the cycles spent in the allocator code. To remedy this, we introduce DeSearch, the first decentralized search engine that guarantees the integrity and privacy of search results for decentralized services and blockchain apps. PLDI seeks outstanding research that extends and/or applies programming-language concepts to advance the field of computing. HotNets provides a venue for discussing innovative ideas and for debating future research agendas in networking. MAGE outperforms the OS virtual memory system by up to an order of magnitude, and in many cases runs secure computation (SC) workloads that do not fit in memory at nearly the same speed as if the underlying machines had unbounded physical memory to fit the entire computation. Based on this observation, P3 proposes a new approach for distributed GNN training. Manuela M. Veloso is the Head of J.P. Morgan AI Research, which pursues fundamental research in areas of core relevance to financial services, including data mining and cryptography, machine learning, explainability, and human-AI interaction. She has also made contributions in network security, including scalable data expiration, distributed algorithms despite malicious participants, and DDoS prevention techniques. Existing systems that hide voice call metadata either require trusted intermediaries in the network or scale to only tens of users. Existing frameworks optimize tensor programs by applying fully equivalent transformations, which maintain equivalence on every element of the output tensors. Kernel code requires manual memory management and type-unsafe code and must efficiently handle complex, asynchronous events. Marius is open-sourced at www.marius-project.org. Yuke Wang, Boyuan Feng, Gushu Li, Shuangchen Li, Lei Deng, Yuan Xie, and Yufei Ding, University of California, Santa Barbara. We convert five state-of-the-art PM indexes using Nap. 23 artifacts received the Artifacts Functional badge (88%). However, with increasingly fast transactions and queries enabled by large memory and fast interconnects, commodity HTAP systems have to trade off data freshness against performance. Moreover, to handle dynamic workloads, Nap adopts a fast NAL switch mechanism. This year, there were only two accepted papers from UK institutions.
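To make the CLP idea above concrete, here is a minimal sketch of dictionary-based log compression that answers searches by consulting the dictionaries before touching the encoded messages. The tokenization rule, class name, and encoding below are illustrative assumptions for this sketch, not CLP's actual on-disk format or algorithm.

```python
# Minimal sketch of dictionary-based log compression with search on the
# compressed form (inspired by CLP's high-level idea; the encoding here is
# illustrative only).
import re

class TinyLogStore:
    def __init__(self):
        self.templates = {}      # template string -> template id
        self.variables = {}      # variable string -> variable id
        self.encoded = []        # list of (template_id, tuple of variable ids)

    def _intern(self, table, key):
        return table.setdefault(key, len(table))

    def append(self, message):
        # Treat tokens containing digits as variables; the rest forms the template.
        tokens = message.split()
        vars_ = [t for t in tokens if re.search(r"\d", t)]
        template = " ".join("<VAR>" if re.search(r"\d", t) else t for t in tokens)
        tid = self._intern(self.templates, template)
        vids = tuple(self._intern(self.variables, v) for v in vars_)
        self.encoded.append((tid, vids))

    def search(self, substring):
        # Consult the dictionaries first; only entries whose template or
        # variables can possibly match are reported.
        tids = {i for t, i in self.templates.items() if substring in t}
        vids = {i for v, i in self.variables.items() if substring in v}
        return [n for n, (t, vs) in enumerate(self.encoded)
                if t in tids or any(v in vids for v in vs)]

store = TinyLogStore()
store.append("job 1841 failed on node rack-07")
store.append("job 1842 completed on node rack-03")
print(store.search("rack-07"))   # -> [0]
```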
Ankit Bhardwaj and Chinmay Kulkarni, University of Utah; Reto Achermann, University of British Columbia; Irina Calciu, VMware Research; Sanidhya Kashyap, EPFL; Ryan Stutsman, University of Utah; Amy Tai and Gerd Zellweger, VMware Research. The main contribution of this paper is GoJournal, a verified, concurrent journaling system that provides atomicity for storage applications, together with Perennial 2.0, a framework for formally specifying and verifying concurrent crash-safe systems. Because DistAI starts with the strongest possible invariants, if the SMT solver fails, DistAI does not need to discard failed invariants; it knows to monotonically weaken them and try again with the solver, repeating the process until it eventually succeeds. However, Addra improves message latency, a key performance metric for voice calls, in this architecture. Hence, kernel developers are constantly refining synchronization within OS kernels to improve scalability, at the risk of introducing subtle bugs. A graph embedding is a fixed-length vector representation for each node (and/or edge type) in a graph and has emerged as the de facto approach to applying modern machine learning to graphs. We present Nap, a black-box approach that converts concurrent persistent memory (PM) indexes into NUMA-aware counterparts. Welcome to the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) submissions site. This paper presents Zeph, a system that enables users to set privacy preferences on how their data can be shared and processed. Today, privacy controls are enforced by data curators with full access to data in the clear. We observe that the scalability challenges in training GNNs are fundamentally different from those in training classical deep neural networks and in distributed graph processing, and that commonly used techniques, such as intelligent partitioning of the graph, do not yield the desired results. Last year, 70% of accepted OSDI papers participated in the artifact evaluation process. If in doubt about whether your submission to OSDI 2021 and your upcoming submission to SOSP are the same paper or not, please contact the PC chairs by email. Paper Submission Information: All submissions must be received by 11:59 PM AoE (UTC-12) on the day of the corresponding deadline. Session Chairs: Dushyanth Narayanan, Microsoft Research, and Gala Yadgar, Technion–Israel Institute of Technology. Jinhyung Koo, Junsu Im, Jooyoung Song, and Juhyung Park, DGIST; Eunji Lee, Soongsil University; Bryan S. Kim, Syracuse University; Sungjin Lee, DGIST. We describe PrivateKube, an extension to the popular Kubernetes datacenter orchestrator that adds privacy as a new type of resource to be managed alongside other traditional compute resources, such as CPU, GPU, and memory. OSDI '21 Technical Sessions: all times listed below are in Pacific Daylight Time (PDT). Federated Learning (FL) is an emerging direction in distributed machine learning (ML) that enables in-situ model training and testing on edge data. The OSDI '21 program co-chairs have agreed not to submit their work to OSDI '21. We also welcome work that explores the interface to related areas such as computer architecture, networking, programming languages, analytics, and databases. With the aim of improving time-to-accuracy performance in model training, Oort prioritizes the use of those clients that have both data offering the greatest utility in improving model accuracy and the capability to run training quickly.
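The DistAI loop described above (start from the strongest candidates, weaken on failure rather than discard) can be illustrated with a small sketch. The predicates, sample states, and the `smt_verifies` stub below are hypothetical stand-ins; a real run would discharge the inductiveness query to an SMT solver, which this sketch does not do.

```python
# Sketch of an enumerate-then-weaken invariant search in the spirit of DistAI.
# Candidates are disjunctions of predicates; shorter (stronger) candidates are
# tried first, and failures simply fall through to weaker ones.
from itertools import combinations

PREDICATES = {
    "no_two_leaders": lambda s: len(s["leaders"]) <= 1,
    "leader_has_quorum": lambda s: not s["leaders"] or s["votes"] > s["n"] // 2,
    "no_votes_without_leader": lambda s: s["leaders"] or s["votes"] == 0,
}

def holds_on_samples(disjuncts, samples):
    # A disjunction holds on a sampled state if any literal is true there.
    return all(any(PREDICATES[p](s) for p in disjuncts) for s in samples)

def smt_verifies(disjuncts):
    # Placeholder for the real SMT query proving the candidate inductive.
    return "no_two_leaders" in disjuncts

def learn_invariant(samples, max_len=2):
    # Enumerate from strongest (fewest disjuncts) to weakest.
    for k in range(1, max_len + 1):
        for cand in combinations(PREDICATES, k):
            if not holds_on_samples(cand, samples):
                continue                      # contradicted by data: skip
            if smt_verifies(cand):
                return set(cand)              # proven inductive
            # Otherwise keep going: weaker supersets come later.
    return None

samples = [{"leaders": ["n1"], "votes": 3, "n": 5},
           {"leaders": [], "votes": 0, "n": 5}]
print(learn_invariant(samples))
```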
OSDI takes a broad view of the systems area and solicits contributions from many fields of systems practice, including, but not limited to, operating systems, file and storage systems, distributed systems, cloud computing, mobile systems, secure and reliable systems, systems aspects of big data, embedded systems, virtualization, networking as it relates to operating systems, and management and troubleshooting of complex systems. Furthermore, to enable automatic runtime optimization, GNNAdvisor incorporates a lightweight analytical model for an effective design parameter search. These scripts often make pages slow to load, partly due to a fundamental inefficiency in how browsers process JavaScript content: browsers make it easy for web developers to reason about page state by serially executing all scripts on any frame in a page, but as a result they fail to leverage the multiple CPU cores that are readily available even on low-end phones. Authors must limit their responses to (a) correcting factual errors in the reviews or (b) directly addressing questions posed by reviewers. Session Chairs: Ryan Huang, Johns Hopkins University, and Manos Kapritsos, University of Michigan. Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh, Suman Jana, and Gabriel Ryan, Columbia University. He joined Intel Research at Berkeley in April 2002 as a principal architect of PlanetLab, an open, shared platform for developing and deploying planetary-scale services. When registering your abstract, you must provide information about conflicts with PC members. Professor Veloso earned Bachelor's and Master of Science degrees in Electrical and Computer Engineering from Instituto Superior Técnico in Lisbon, Portugal, a Master of Arts in Computer Science from Boston University, and Master of Science and PhD degrees in Computer Science from Carnegie Mellon University. This is unfortunate because good OS design has always been driven by the underlying hardware, and right now that hardware is almost unrecognizable from ten years ago, let alone from the 1960s when Unix was written. Many application domains can benefit from hybrid transaction/analytical processing (HTAP) by executing queries on real-time datasets produced by concurrent transactions. A scientific paper consists of a constellation of artifacts that extend beyond the document itself: software, hardware, evaluation data and documentation, raw survey results, mechanized proofs, models, test suites, benchmarks, and so on. As has been standard practice in OSDI and SOSP in recent years, we will allow authors to submit quick responses to PC reviews: they will be made available to the PC before the final online discussion and PC meeting. Finding the inductive invariant of a distributed protocol is a critical step in verifying the correctness of distributed systems, but it takes a long time even for simple protocols. Message from the Program Co-Chairs. DistAI generates data by simulating the distributed protocol at different instance sizes and recording states as samples.
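As a rough illustration of the kind of lightweight analytical parameter search mentioned for GNNAdvisor, the sketch below scores a handful of candidate kernel parameters with a made-up cost model. The parameter names (neighbor-group size, warps per block) and the cost formula are assumptions for illustration only, not GNNAdvisor's actual model.

```python
# Sketch of a lightweight analytical search over GNN kernel parameters.
# The cost model is invented: it trades wasted lanes in a neighbor group
# against the overhead of scheduling many small groups.
def analytical_cost(avg_degree, feat_dim, group_size, warps_per_block):
    waste = (group_size - avg_degree % group_size) % group_size
    groups = (avg_degree + group_size - 1) // group_size
    parallelism = min(warps_per_block * 32, feat_dim)
    return (groups * (group_size + waste)) * feat_dim / parallelism

def pick_parameters(avg_degree, feat_dim):
    # Enumerate a small candidate grid and keep the cheapest configuration.
    candidates = [(g, w) for g in (2, 4, 8, 16) for w in (1, 2, 4, 8)]
    return min(candidates,
               key=lambda gw: analytical_cost(avg_degree, feat_dim, *gw))

print(pick_parameters(avg_degree=12, feat_dim=64))
```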
The 15th USENIX Symposium on Operating Systems Design and Implementation seeks to present innovative, exciting research in computer systems. Petuum's CASL research and engineering team was awarded an OSDI 2021 Best Paper for Pollux, its work on adaptive scheduling for goodput-optimized deep learning. These results outperform state-of-the-art HTAP systems by several orders of magnitude on transactional performance, while incurring only a small performance slowdown (5% over pure OLTP workloads) and still enjoying data freshness for analytical queries (less than 20 ms of maximum delay) in the failure-free case. Using this property, MAGE calculates the memory access pattern ahead of time and uses it to produce a memory management plan. With her students, she has led research in AI, with a focus on robotics and machine learning, having concretely researched and developed a variety of autonomous robots, including teams of soccer robots and mobile service robots. Submissions violating the detailed formatting and anonymization rules will not be considered for review. Devices employ adaptive interrupt coalescing heuristics that try to balance between these opposing goals. Leveraging this information, Pollux dynamically (re-)assigns resources to improve cluster-wide goodput, while respecting fairness and continually optimizing each DL job to better utilize those resources. DeSearch then introduces a witness mechanism to ensure that completed tasks can be reused across different pipelines, and to make the final search results verifiable by end users. Tao Luo, Mingen Pan, Pierre Tholoniat, Asaf Cidon, and Roxana Geambasu, Columbia University; Mathias Lécuyer, Microsoft Research. In this paper, we propose a software-hardware co-design to support dynamic, fine-grained, large-scale secure memory as well as fast initialization. With the help of thousands of Lambda threads, Dorylus scales GNN training to billion-edge graphs. OSDI is a highly selective flagship conference in computer science, especially on the topic of computer systems. AI enables principled representation of knowledge, complex strategy optimization, learning from data, and support for human decision making. Owing to the sequential write-only zone scheme of the ZNS, the log-structured file system (LFS) is required to access ZNS solid-state drives (SSDs). We propose PET, the first DNN framework that optimizes tensor programs with partially equivalent transformations and automated corrections. Jiang Zhang, University of Southern California; Shuai Wang, HKUST; Manuel Rigger, Pinjia He, and Zhendong Su, ETH Zurich. For more details on the submission process, and for templates to use with LaTeX, Word, etc., authors should consult the detailed submission requirements. We introduce a hybrid cryptographic protocol for privacy-adhering transformations of encrypted data. Sam Kumar, David E. Culler, and Raluca Ada Popa, University of California, Berkeley. Our evaluation shows that DistAI successfully verifies 13 common distributed protocols automatically and outperforms alternative methods both in the number of protocols it verifies and in the speed at which it does so, in some cases by more than two orders of magnitude. We will look at various problems and approaches, and for each, see if blockchain would help.
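The MAGE sentence above relies on the memory access pattern of a secure computation being known in advance, which makes ahead-of-time planning possible. The sketch below illustrates that property with the classic furthest-in-future (Belady MIN) eviction rule over a known trace; it is an illustration of the idea only, not MAGE's actual planner or data layout.

```python
# Sketch of planning fetches and evictions when the full access sequence is
# known ahead of time. Uses the textbook furthest-in-future rule.
def plan_memory(accesses, capacity):
    plan, resident = [], set()
    for i, page in enumerate(accesses):
        if page in resident:
            continue
        if len(resident) == capacity:
            # Evict the resident page whose next use is furthest in the future.
            def next_use(p):
                rest = accesses[i + 1:]
                return rest.index(p) if p in rest else float("inf")
            victim = max(resident, key=next_use)
            resident.remove(victim)
            plan.append(("evict", victim, "before access", i))
        resident.add(page)
        plan.append(("fetch", page, "before access", i))
    return plan

trace = ["A", "B", "C", "A", "D", "B", "A"]
for step in plan_memory(trace, capacity=2):
    print(step)
```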
A hardware-accelerated thread scheduler makes sub-nanosecond decisions, leading to high CPU utilization and low tail response time for RPCs. The co-chairs may then share that paper with the workshop's organizers and discuss it with them. How can we design systems that will be reliable despite misbehaving participants? An evaluation of Addra on a cluster of 80 machines on AWS demonstrates that it can serve 32K users with a 99th percentile message latency of 726 ms, a 7× improvement over a prior system for text messaging in the same threat model. Hence, CLP enables efficient search and analytics on archived logs, something that was impossible without it. Contact the program co-chairs, osdi21chairs@usenix.org, or the USENIX office, submissionspolicy@usenix.org. Distributed Trust: Is Blockchain the Answer? The key insight in blk-switch is that Linux's multi-queue storage design, along with multi-queue network and storage hardware, makes the storage stack conceptually similar to a network switch. Despite their extensive use for debugging and vulnerability discovery, sanitizer checks often induce a high runtime cost. Despite having the same end goals as traditional ML, FL executions differ significantly in scale, spanning thousands to millions of participating devices. Manuela will present examples and discuss the scope of AI in her research in the finance domain. If you have any questions about conflicts, please contact the program co-chairs. Concretely, Dorylus is 1.22× faster and 4.83× cheaper than GPU servers for massive sparse graphs. We implement a variant of a log-structured merge tree in the storage device that not only indexes file objects, but also supports transactions and manages physical storage space. PC members are not required to read supplementary material when reviewing the paper, so each paper should stand alone without it. Fortunately, we observe that the backups for high availability in modern distributed OLTP systems can be retrofitted to bridge the analytical queries and transactions in HTAP workloads. This is the first OSDI in an odd year as OSDI moves to a yearly cadence. Jason Mohoney and Roger Waleffe, University of Wisconsin–Madison; Henry Xu, University of Maryland, College Park; Theodoros Rekatsinas and Shivaram Venkataraman, University of Wisconsin–Madison. Professor Veloso is on leave from Carnegie Mellon University as the Herbert A. Simon University Professor in the School of Computer Science, and the past Head of the Machine Learning Department. Program Co-Chairs: Angela Demke Brown, University of Toronto, and Jay Lorch, Microsoft Research. The papers will be available online to everyone beginning on the first day of the conference, July 14, 2021. DMon speeds up PostgreSQL, one of the most popular database systems, by 6.64% on average (up to 17.48%).
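For the in-storage log-structured merge index mentioned above, the following is a minimal sketch of the LSM idea: an in-memory memtable flushed into sorted immutable runs, with newest-first lookups. The class name and flush policy are illustrative assumptions, not the paper's device-level design.

```python
# Minimal sketch of a log-structured merge index: a memtable that is flushed
# into sorted immutable runs, with lookups checking the memtable first and
# then runs from newest to oldest.
import bisect

class TinyLSM:
    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.runs = []                 # newest run first: sorted (key, value) lists
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:          # newest run wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

db = TinyLSM()
for k in ["inode:7", "inode:9", "dir:/a", "dir:/b", "inode:7"]:
    db.put(k, f"meta-of-{k}")
print(db.get("inode:7"), db.get("missing"))
```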
Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning
Oort: Efficient Federated Learning via Guided Participant Selection
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
Modernizing File System through In-Storage Indexing
Nap: A Black-Box Approach to NUMA-Aware Persistent Memory Indexes
Rearchitecting Linux Storage Stack for µs Latency and High Throughput
Optimizing Storage Performance with Calibrated Interrupts
ZNS+: Advanced Zoned Namespace Interface for Supporting In-Storage Zone Compaction
DMon: Efficient Detection and Correction of Data Locality Problems Using Selective Profiling
CLP: Efficient and Scalable Search on Compressed Text Logs
Polyjuice: High-Performance Transactions via Learned Concurrency Control
Retrofitting High Availability Mechanism to Tame Hybrid Transaction/Analytical Processing
The nanoPU: A Nanosecond Network Stack for Datacenters
Beyond malloc efficiency to fleet efficiency: a hugepage-aware memory allocator
Scalable Memory Protection in the PENGLAI Enclave
NrOS: Effective Replication and Sharing in an Operating System
Addra: Metadata-private voice communication over fully untrusted infrastructure
Bringing Decentralized Search to Decentralized Services
Finding Consensus Bugs in Ethereum via Multi-transaction Differential Fuzzing
MAGE: Nearly Zero-Cost Virtual Memory for Secure Computation
Zeph: Cryptographic Enforcement of End-to-End Data Privacy
It's Time for Operating Systems to Rediscover Hardware
DistAI: Data-Driven Automated Invariant Learning for Distributed Protocols
GoJournal: a verified, concurrent, crash-safe journaling system
STORM: Refinement Types for Secure Web Applications
Horcrux: Automatic JavaScript Parallelism for Resource-Efficient Web Computation
SANRAZOR: Reducing Redundant Sanitizer Checks in C/C++ Programs
Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads
GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs
Marius: Learning Massive Graph Embeddings on a Single Machine
P3: Distributed Deep Graph Learning at Scale

In experiments with real DL jobs and with trace-driven simulations, Pollux reduces average job completion times by 37–50% relative to state-of-the-art DL schedulers, even when they are provided with ideal resource and training configurations for every job. DeSearch uses trusted hardware to build a network of workers that execute a pipeline of small search engine tasks (crawl, index, aggregate, rank, query). Swapnil Gandhi and Anand Padmanabha Iyer, Microsoft Research. This paper presents the design and implementation of CLP, a tool capable of losslessly compressing unstructured text logs while enabling fast searches directly on the compressed data. Upon these two primitives, our system can scale to thousands of concurrent enclaves with high resource utilization and eliminate the high-cost initialization of secure memory using fork-style enclave creation without weakening the security guarantees. The file system performance of the proposed ZNS+ storage system was 1.33–2.91 times better than that of the normal ZNS-based storage system.
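To illustrate the DeSearch-style decomposition of a search engine into small, reusable tasks (crawl, index, aggregate, then query with a toy ranking step), here is a sketch whose intermediate output is fingerprinted so it could, in principle, be reused across pipelines. The functions and hashing are illustrative assumptions; this is not DeSearch's witness mechanism or enclave code.

```python
# Toy pipeline of small search-engine tasks with a fingerprinted intermediate
# result. Purely illustrative of task decomposition and reuse.
import hashlib, json

def crawl(seed_docs):
    return [{"id": i, "text": t} for i, t in enumerate(seed_docs)]

def index(docs):
    inv = {}
    for d in docs:
        for word in d["text"].lower().split():
            inv.setdefault(word, set()).add(d["id"])
    return inv

def aggregate(indexes):
    merged = {}
    for inv in indexes:
        for word, ids in inv.items():
            merged.setdefault(word, set()).update(ids)
    return merged

def query(merged, docs, term):
    hits = merged.get(term.lower(), set())
    # Rank hits by term frequency (a toy ranking function).
    return sorted(hits, key=lambda i: -docs[i]["text"].lower().split().count(term.lower()))

def fingerprint(inverted_index):
    canon = json.dumps(sorted((k, sorted(v)) for k, v in inverted_index.items()))
    return hashlib.sha256(canon.encode()).hexdigest()[:16]

docs = crawl(["ethereum smart contracts", "decentralized search for ethereum"])
merged = aggregate([index(docs)])
print(query(merged, docs, "ethereum"), fingerprint(merged))
```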
Session Chairs: Deniz Altınbüken, Google, and Rashmi Vinayak, Carnegie Mellon University. Tanvir Ahmed Khan and Ian Neal, University of Michigan; Gilles Pokam, Intel Corporation; Barzan Mozafari and Baris Kasikci, University of Michigan. We present the results of a 1% experiment at fleet scale as well as the longitudinal rollout in Google's warehouse-scale computers. There are two major obstacles to GNN training: 1) it relies on high-end servers with many GPUs, which are expensive to purchase and maintain, and 2) the limited memory on GPUs cannot scale to today's billion-edge graphs. We argue that a key-value interface between a file system and an SSD is superior to the legacy block interface by presenting KEVIN. We present NrOS, a new OS kernel with a safer approach to synchronization that runs many POSIX programs. We implement and evaluate a suite of applications, including MICA, Raft, and Set Algebra for document retrieval, and we demonstrate that the nanoPU can be used as a high-performance, programmable alternative for one-sided RDMA operations. SanRazor adopts a novel hybrid approach: it captures both dynamic code coverage and static data dependencies of checks, and uses the extracted information to perform a redundant check analysis. Computation separation makes it possible to construct a deep, bounded-asynchronous pipeline where graph and tensor parallel tasks can fully overlap, effectively hiding the network latency incurred by Lambdas. The OSDI Symposium emphasizes innovative research as well as quantified or insightful experiences in systems design and implementation. Under different configurations of TPC-C and TPC-E, Polyjuice can achieve throughput numbers higher than the best of existing algorithms by 15% to 56%. As increasingly more sensitive data is being collected to gain valuable insights, the need to natively integrate privacy controls into data analytics frameworks is growing in importance. Paper abstracts and proceedings front matter are available to everyone now. All deadline times are 23:59 hrs UTC. Based on the observation that invariants are often concise in practice, DistAI starts with small invariant formulas and enumerates all strongest possible invariants that hold for all samples. For instance, FAST '21 and NSDI '21 have author-notification dates after the OSDI '21 abstract-registration deadline. This kernel is scaled across NUMA nodes using node replication, a scheme inspired by state machine replication in distributed systems. Additionally, there is no assurance that data processing and handling comply with the claimed privacy policies. Our evaluation shows that PET outperforms existing systems by up to 2.5×, by unlocking previously missed opportunities from partially equivalent transformations. Second, Fluffy uses multiple existing Ethereum clients that independently implement the specification as cross-referencing oracles. The program co-chairs will work with the steering committee to ensure that the symposium program will accommodate presentations for all accepted papers.
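The node-replication scheme mentioned above (a shared operation log with a replica per NUMA node) can be sketched as follows. This is a simplified single-threaded model over a toy key-value state, assumed for illustration; it is not NrOS's actual Rust implementation or its combining details.

```python
# Sketch of node replication: writers append operations to a shared log, and
# each per-node replica replays the log up to its tail before serving reads.
class SharedLog:
    def __init__(self):
        self.ops = []

    def append(self, op):
        self.ops.append(op)
        return len(self.ops)

class Replica:
    def __init__(self, log):
        self.log = log
        self.applied = 0
        self.state = {}                # toy key-value namespace

    def _sync(self):
        # Replay any operations this replica has not yet applied.
        for op, key, value in self.log.ops[self.applied:]:
            if op == "set":
                self.state[key] = value
            elif op == "delete":
                self.state.pop(key, None)
        self.applied = len(self.log.ops)

    def write(self, op, key, value=None):
        self.log.append((op, key, value))
        self._sync()

    def read(self, key):
        self._sync()                   # reads are served locally after catching up
        return self.state.get(key)

log = SharedLog()
node0, node1 = Replica(log), Replica(log)
node0.write("set", "/proc/cwd", "/home")
print(node1.read("/proc/cwd"))         # node1 catches up from the shared log
```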
Researchers from the Software Systems Laboratory won a Best Paper Award at the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2021). While verifying GoJournal, we found one serious concurrency bug, even though GoJournal has many unit tests. Our approach outperforms existing file systems on a block SSD by a wide margin: 6.2× on average for metadata-intensive benchmarks. We present the nanoPU, a new NIC-CPU co-design to accelerate an increasingly pervasive class of datacenter applications: those that utilize many small Remote Procedure Calls (RPCs) with very short (µs-scale) processing times. Kyuhwa Han, Sungkyunkwan University and Samsung Electronics; Hyunho Gwak and Dongkun Shin, Sungkyunkwan University; Jooyoung Hwang, Samsung Electronics. Dorylus is up to 3.8× faster and 10.7× cheaper compared to existing sampling-based systems. This paper demonstrates that it is possible to achieve µs-scale latency using the Linux kernel storage stack, even when tens of latency-sensitive applications compete for host resources with throughput-bound applications that perform read/write operations at throughput close to hardware capacity. If you submit a paper to either of those venues, you may not also submit it to OSDI '21. One important reason for the high cost is, as we observe in this paper, that many sanitizer checks are redundant: the same safety property is repeatedly checked, leading to unnecessarily wasted computing resources. Extensive experiments show that GNNAdvisor outperforms state-of-the-art GNN computing frameworks, such as Deep Graph Library (3.02× faster on average) and NeuGraph (up to 4.10× faster), on mainstream GNN architectures across various datasets. When uploading your OSDI 2021 reviews for your submission to SOSP, you can optionally append a note about how you addressed the reviews and comments. Our further evaluation on 38 CVEs from 10 commonly used programs shows that SanRazor-reduced checks suffice to detect at least 33 out of the 38 CVEs. The program co-chairs will use this information at their discretion to preserve the anonymity of the review process without jeopardizing the outcome of the current OSDI submission. The full program will be available in May 2021. Compared to existing baselines, DPF allows training more models under the same global privacy guarantee. Poor data locality hurts an application's performance. In practice, developers typically find data locality issues via dynamic profiling and repair them manually. Reviews will be available for response on Wednesday, March 3, 2021. OSDI 2021 papers summary. We present case studies and end-to-end applications that show how Storm lets developers specify diverse policies while centralizing the trusted code to under 1% of the application, and how it statically enforces security with modest type-annotation overhead and no run-time cost. By monitoring the status of each job during training, Pollux models how each job's goodput (a novel metric we introduce that combines system throughput with statistical efficiency) would change by adding or removing resources. We conclude with a discussion of additional techniques for improving the allocator development process and potential optimization strategies for future memory allocators.
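The goodput metric described above (system throughput combined with statistical efficiency) can be illustrated with a tiny model. The throughput and efficiency curves below are invented placeholders, not Pollux's fitted models; the point is only how a scheduler could compare candidate allocations by their goodput.

```python
# Tiny sketch of the goodput idea: goodput = system throughput x statistical
# efficiency, evaluated over a small grid of candidate allocations.
def throughput(gpus, per_gpu_batch):
    # Diminishing returns as more GPUs add communication overhead (made up).
    return per_gpu_batch * gpus / (1.0 + 0.1 * (gpus - 1))

def statistical_efficiency(batch_size, gradient_noise_scale=512):
    # Larger batches make progress per example less efficient (made up).
    return gradient_noise_scale / (gradient_noise_scale + batch_size)

def goodput(gpus, per_gpu_batch):
    total_batch = gpus * per_gpu_batch
    return throughput(gpus, per_gpu_batch) * statistical_efficiency(total_batch)

best = max(((g, b) for g in (1, 2, 4, 8) for b in (32, 64, 128)),
           key=lambda gb: goodput(*gb))
print("best (gpus, per-GPU batch):", best, "goodput:", round(goodput(*best), 1))
```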
We present Storm, a web framework that allows developers to build MVC applications with compile-time enforcement of centrally specified data-dependent security policies. This paper presents Dorylus: a distributed system for training GNNs. Papers so short as to be considered extended abstracts will not receive full consideration. Each new model trained with DP increases the bound on data leakage and can be seen as consuming part of a global privacy budget that should not be exceeded. Collaboration: You have a collaboration on a project, publication, grant proposal, program co-chairship, or editorship within the past two years (December 2018 through March 2021). Youngseok Yang, Seoul National University; Taesoo Kim, Georgia Institute of Technology; Byung-Gon Chun, Seoul National University and FriendliAI. OSDI will provide an opportunity for authors to respond to reviews prior to final consideration of the papers at the program committee meeting. Registering abstracts a week before paper submission is an essential part of the paper-reviewing process, as PC members use this time to identify which papers they are qualified to review.
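Treating the global privacy budget as a consumable resource, as described above for DPF and PrivateKube, can be sketched with a simple accountant that denies any request that would exceed the budget. Plain epsilon addition (basic composition) is a simplifying assumption here, not the systems' actual composition or allocation algorithm.

```python
# Sketch of a global differential-privacy budget accountant: each training
# task requests some epsilon, and requests that would exceed the global
# guarantee are denied.
class PrivacyBudget:
    def __init__(self, global_epsilon):
        self.remaining = global_epsilon
        self.granted = []

    def request(self, task, epsilon):
        if epsilon > self.remaining:
            return False               # would exceed the global guarantee
        self.remaining -= epsilon
        self.granted.append((task, epsilon))
        return True

budget = PrivacyBudget(global_epsilon=3.0)
for task, eps in [("model-A", 1.0), ("model-B", 1.5), ("model-C", 1.0)]:
    ok = budget.request(task, eps)
    print(task, "granted" if ok else "denied", "remaining:", budget.remaining)
```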
