CNS 2017 News
- CNS Researchers Help Google Fight Abusive Pins on Google Maps
A partnership between computer scientists in the Center for Networked Systems (CNS) at UC San Diego and Google has allowed the search giant to reduce fraudulent business listings in Google Maps by 70 percent. The researchers worked together to analyze more than 100,000 fraudulent listings to determine how scammers had been able to avoid detection, albeit for a limited amount of time, and how they made money.
The team presented their findings at the 26th International Conference on the World Wide Web in Australia earlier this month.
The computer scientists identified what they describe as a “new form of blackhat search engine optimization that targets local listing services” such as Google Maps. They also describe how these scammers were able to make money.
“Location-based search is increasingly becoming the way people interact with online content, even if you’re not using a mapping application,” said Alex C. Snoeren, a professor in the Department of Computer Science and Engineering at UC San Diego and a senior author of the study.
For example, when people run a search on their mobile phone, the search engine uses their physical location as one of the inputs to decide which results to display, Snoeren explained.
The scammers take advantage of this by using fake locations to make it look like their business is in close proximity to the user doing the search. This was particularly true of on-call contractors, notably plumbers and locksmiths. Researchers found that 40 percent of all fake listings on Google Maps belong to that category.
“I might find seven listings for locksmiths in my neighborhood,” said Danny Huang, the paper’s first author and a Ph.D. student in computer science at the Jacobs School of Engineering at UC San Diego. “But in fact, none of those listings are real.”
In all, researchers found that 11 percent of overall search results for locksmiths were fraudulent. In New York, that percentage went up to 15.6 percent. And it went up to an astonishing 83.3 percent in West Harrison, New York.
Scammers are able to make money when they get called to help a user based on a fake listing. Scammers might quote a low price when called on the phone, only to charge a higher fee when they show up. They might not be licensed but get the business anyway.
In another scheme, scammers set up fake pins for real hotels or restaurants on Google Maps. They set up websites where customers make reservations, which are connected to the business’ real website or to a travel agency, which is not part of the scam. This allows scammers to make money either by getting a commission for each reservation or for referring traffic to the businesses’ real websites. The researchers found that roughly 13 percent of the fraudulent listings had real hotel and restaurant addresses, but were not created by these businesses.
All these fraud schemes were possible primarily because scammers found a way to get around Google’s verification process.
Businesses can register for Google Maps online for free. But before a listing goes live, Google sends a postcard with a verification code to the business’ address. The business inputs this verification code and the listing is then approved to go live.
Partly thanks to these measures, Google is able to detect 85 percent of fake listings before they even appear on Google Maps. The fake listings that make it past the verification process are suspended, on average, 8.6 days after they are created.
Scammers got around verification requirements mainly by leasing PO boxes and using those addresses to receive their verification codes. They also added fake suite numbers to a specific address so Google wouldn’t get suspicious about a large number of businesses located at the same address. Researchers note that there are legitimate reasons for a large number of businesses to have the same address—big office buildings in Manhattan come to mind.
Researchers also noted that a large percentage of fraudulent listings changed their address or the category they belonged to (from hotel to locksmith, for example) after verification.
To tamp down on abuse, Google has taken a number of measures, which the company details in a post on its research blog. Steps include: prohibiting bulk registration at most addresses; preventing businesses from changing their addresses to a location that is impossibly far from the original without additional verification; and detecting and ignoring intentionally mangled text in address fields designed to confuse Google’s algorithms. The company also fine-tuned its anti-spam machine learning systems to detect data discrepancies that are common in fake or deceptive listings.
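The fake-suite-number trick described above suggests one simple signal: strip the suite or unit part of each address and count how many listings share the same base address. The following is a minimal sketch of that idea only, not Google's actual detection method; the pattern, the function names `base_address` and `flag_crowded`, the threshold, and the sample listings are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical pattern for suite/unit designators such as "Suite 400", "Ste. 12B" or "#7".
SUITE_PATTERN = re.compile(r"(?:\b(?:suite|ste|unit)\b\.?|#)\s*[\w-]+", re.IGNORECASE)


def base_address(address):
    """Lowercase an address and drop any suite/unit designator and punctuation."""
    stripped = SUITE_PATTERN.sub("", address.lower())
    stripped = re.sub(r"[.,]", " ", stripped)
    return " ".join(stripped.split())


def flag_crowded(listings, threshold):
    """Return base addresses shared by at least `threshold` listings."""
    counts = Counter(base_address(addr) for _, addr in listings)
    return {addr: n for addr, n in counts.items() if n >= threshold}


# Made-up listings for illustration only.
listings = [
    ("Ace Locksmith", "123 Main St Suite 400"),
    ("Best Locksmith", "123 Main St, Ste. 12B"),
    ("City Plumbing", "123 Main St #7"),
    ("Elm Cafe", "9 Elm Ave"),
]

crowded = flag_crowded(listings, threshold=3)
print(crowded)
```

Because large office buildings legitimately host many businesses, a count like this could only serve as one input among several, never as a verdict on its own.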
The research was partially funded by a grant from the National Science Foundation.
*D.Y. Huang, D. Grundman, K. Thomas, A. Kumar, E. Bursztein, K. Levchenko and A.C. Snoeren, “Pinning Down Abuse on Google Maps,” Proc. of the International Conference on World Wide Web (WWW), April 3-7, 2017, Perth, Australia.
- Recent Computer Science Faculty Hire Joins Center for Networked Systems
Arun Kumar Works on Advanced Analytics at Intersection of Data Management and Machine Learning
On April 3, Computer Science and Engineering (CSE) assistant professor Arun Kumar began teaching his first undergraduate course since joining the UC San Diego faculty in 2016. CSE 190D covers topics in database system implementation. It is a hands-on, systems-focused course and the first at UC San Diego to teach the systems guts of a relational database management system (DBMS).
“Faculty in our Database group hope that this course will eventually be mainstreamed as 132C,” said Kumar. “It would complete a solid triad of database courses for undergraduates covering principles, applications and, finally, implementation.”
Kumar joined CSE after completing his Ph.D. at the University of Wisconsin-Madison last summer, with a focus on data management and analytics. His research explores the intersection of data management and machine learning (ML), an area increasingly called advanced analytics. He also aims to create a pipeline of students coming into this burgeoning field, which was the subject of the first graduate course he taught, CSE 291, during the winter quarter. “Advanced analytics is a brand-new field and companies require a lot of talent in this space,” he observed. “The dearth of engineers who understand machine learning is staggering, and a lot of companies are offering large salaries for people who understand software engineering, data systems and machine learning under the now-famous job title: data scientist.”
Advanced analytics is also the subject of a presentation Kumar will give for the Center for Networked Systems (CNS) on Tuesday, April 11 at 1pm in room 4140 of the CSE Building. His talk, “Democratizing Distributed Advanced Analytics,” will explore large-scale data analytics using statistical machine learning and how it is becoming increasingly critical for many data-driven applications.
“The data management, machine learning and systems communities are working on scalable and fast implementations of ML algorithms,” said Kumar. “However, several orthogonal bottlenecks in the end-to-end process of building and deploying ML models for data analytics have largely been ignored, leading to wasted resources and poor productivity of data scientists.”
CNS’s newest member will introduce three new projects to his audience and he hopes to solicit critical feedback. Kumar also foresees more collaborations with CNS and other CSE faculty. With CSE Prof. Kamalika Chaudhuri, he is already collaborating on the issue of differential privacy for machine learning. He is also working with two other CNS members: CSE Prof. Tajana Rosing, on understanding the tradeoffs facing machine-learning algorithms in the Internet of Things; and CSE Prof. Ranjit Jhala, on applying program analysis to bring new data-driven optimizations to advanced analytics codebases. As for other collaborators in CSE, Kumar is collaborating with CSE Prof. Lawrence Saul and fellow new hire, CSE Prof. Ndapa Nakashole, on using speech recognition to improve database usability.
“A couple of my upcoming projects will involve working on top of popular, distributed machine learning and data-processing systems such as Spark and TensorFlow to exploit the massive parallelism they offer for new abstractions that I create,” said Kumar. “I suspect this will eventually get me digging into the internals of these networked systems and perhaps optimizing them for the workloads that I care about. This could involve publishing with CNS co-authors, so becoming a member of the center seemed a no-brainer.”
Kumar wants to make it easier and faster to build and use ML algorithms to analyze large and complex datasets. “My work over the next few years is going to focus on building tools, software and abstractions to make it easier to use machine learning in practice,” he predicted. “I want to do so from the perspective of the data scientist’s productivity, the runtime performance and research efficiency, as well as other issues such as privacy.”
Kumar notes that systems and ideas based on his dissertation and research at UW-Madison have been released as part of the MADlib open-source library, used internally by Facebook, LogicBlox and Microsoft, and shipped in products from EMC, IBM, Oracle and Cloudera. “It’s been nice to work with industry about the practical applications of my work,” he noted. “The practical relevance of my work can impact what people do today and from them I can learn what the challenges tomorrow will be, and how we as computer-science researchers can stay one step ahead by anticipating what comes next.”
Kumar’s dissertation focused on training machine learning models based on data sets from multiple tables. “Data scientists usually combine all these tables into a massive single table,” he said. “These operations are called relational joins, and specifically key-foreign-key joins. Now the single table contains all the attributes of all the tables. This was the state of the art before I looked at this problem.”
Yet as Kumar confirmed, joining multiple tables together introduces redundancy into the data. “Consider a popular application of machine learning in enterprise domains: predicting customer churn,” he suggested. “You have a customers table joined with, say, a table about employers and another table about areas indexed by zip code. You could have a thousand customers employed by the same company, which means the record with the employer’s attributes (called its feature vector), gets repeated a thousand times after the join. The same could happen with the zip codes.” Result: the output of this join could be several times bigger than the input data. In one case at Microsoft, Kumar recalls, once they joined all their input tables for a Web security-related ML task to make one massive table, it blew up by a factor of ten. “A task that should have taken half an hour ended up taking a whole day on the cluster because the team overshot the storage space allotted to them — bringing down the shared cluster,” observed Kumar. “So storage becomes a major issue, as does the extra time wasted by the redundant computations performed by an ML algorithm over the redundant data.”
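The redundancy Kumar describes can be seen in a toy version of the churn example. This is only an illustrative sketch: the table contents, column names, and helper functions are made up, and plain Python dictionaries stand in for database tables.

```python
# Hypothetical toy data: 1,000 customers who all work for the same employer.
customers = [
    {"customer_id": i, "employer_id": 1, "age": 30 + i % 20}
    for i in range(1000)
]
employers = {1: {"revenue": 5.0, "headcount": 1200, "sector": "tech"}}


def join(customers, employers):
    """Key-foreign-key join: copy the employer's attributes onto every customer row."""
    return [{**c, **employers[c["employer_id"]]} for c in customers]


def count_values(rows):
    """Count stored attribute values as a rough proxy for storage."""
    return sum(len(r) for r in rows)


joined = join(customers, employers)

input_values = count_values(customers) + count_values(employers.values())
output_values = count_values(joined)
employer_copies = sum(1 for r in joined if r["sector"] == "tech")

print(input_values, output_values, employer_copies)
```

In this toy case the single employer record is copied onto all 1,000 customer rows, so the joined table stores roughly twice as many values as the two input tables combined. Wider employer and zip-code tables would inflate the output further.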
Kumar’s dissertation came up with two orthogonal new techniques. The first technique, called ‘avoiding the join physically,’ pushes down the machine learning computation to the input data in a multi-table format rather than having a single table with all the attributes. The challenge was to do so without affecting the accuracy of the ML model’s predictions. “That is a guarantee we provide and we have a proof for it,” confirmed Kumar. “We proved that the accuracy is unaffected. This mitigates the storage issue, because you don’t need the single table, and it mitigates the maintenance issue because you operate on the data as-is, and it mitigates the performance issue because you save a lot of runtime when you operate on the smaller input of the joins.”
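One way to picture the push-down is with the inner products a linear model computes for each example. The sketch below is not Kumar's actual algorithm; it assumes a linear model whose weights split cleanly into a customer part and an employer part, and all data, weights, and function names are hypothetical. Integer values are used so the two results can be compared exactly.

```python
def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical base tables: (customer_id, employer_id, customer features).
customers = [(cid, cid % 3, [cid % 7, 1]) for cid in range(1000)]
employers = {0: [1, 2, 3], 1: [5, 0, 4], 2: [2, 2, 2]}  # employer_id -> features

w_cust = [2, -1]    # weights for the customer features
w_emp = [3, 0, -1]  # weights for the employer features


def scores_joined():
    """Materialize the join, then compute w . x over each wide row."""
    joined = [feats + employers[eid] for _, eid, feats in customers]
    return [dot(w_cust + w_emp, x) for x in joined]


def scores_factorized():
    """Compute each employer's partial inner product once and reuse it per customer."""
    partial = {eid: dot(w_emp, feats) for eid, feats in employers.items()}
    return [dot(w_cust, feats) + partial[eid] for _, eid, feats in customers]


same_result = scores_joined() == scores_factorized()

# Multiplications performed by each approach.
mults_joined = len(customers) * (len(w_cust) + len(w_emp))
mults_factorized = len(employers) * len(w_emp) + len(customers) * len(w_cust)

print(same_result, mults_joined, mults_factorized)
```

Both functions return the same scores, but the second never builds the wide table and does the employer-side arithmetic once per employer rather than once per customer.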
One additional benefit of Kumar’s new paradigm: “Today many of the computations for machine learning happen in the cloud,” he said. “You purchase storage or computation runtime, and by reducing both, users can save a lot of money as well.”
The second part of his thesis focused on omitting unnecessary tables. “We showed that in many settings, for many ML models, some tables can be completely ignored,” explained Kumar. “We call it ‘avoiding the join logically’ because we are pretending that a table doesn’t even exist. If you omit a table, your runtime goes down, your storage goes down, and the data scientist’s productivity can go up because you have fewer tables and fewer attributes to manage.”
Kumar showed that omitting such a table not only leaves prediction accuracy intact but can also cut the runtime by up to two orders of magnitude, making the computation up to 100 times faster.
Among his many honors, Kumar received a 2016 Google Faculty Research Award, and the same year took home a graduate student research award from the University of Wisconsin for his dissertation research. He was also a recipient of the Best Paper award at SIGMOD 2014.
Kumar recognizes that he joined UC San Diego at an important turning point for anyone working in the general field of data science. CSE is about to launch its first major and minor in Data Science and Engineering, and the campus is developing a Data Science Institute thanks to a $75 million gift from CSE lecturer and alumnus Taner Halicioglu, announced last week. “I am excited that UC San Diego is taking data science seriously,” mused Kumar. “Democratizing data science is a grand challenge that transcends disciplines and requires bridging the gaps between the fields of data management, systems, machine learning, statistics, math, human-computer interaction, and several other fields, including myriad application domains. The generous gift from our alumnus is truly spectacular and I hope it will help accelerate UC San Diego’s research and education in this important area.”
Meantime, Kumar will focus on his teaching and research, and recruiting graduate students for his lab. Two M.S. students from his Winter 2017 course on advanced analytics are now working as research assistants in his group. “I had set a tough filter for enrollment: reviewing a research paper and answering some open-ended research questions,” he said. “This seems to have scared away many students but it ensured a high-quality atmosphere in class. Some of the students even managed to submit research papers on their course projects, one to KDD and another to a SIGMOD workshop, which has already been accepted, while two more are working on solidifying their work for submission to VLDB/SIGMOD. These are all top venues in this research area.”
In addition to teaching the undergraduate course on implementing relational database management systems, this Spring Kumar is also organizing a CSE 290 seminar for grad students on Advanced Data Science. For the seminar, students will read and present papers and articles on advanced data science applications and tools.
Arun Kumar Website
Computer Science and Engineering, University of California San Diego
CSE 190 Topics in Database System Implementation
CSE 290 Seminar on Advanced Data Science
CSE 291 Topics in Advanced Analytics
- Computer Scientists Honored for ‘Tracing’ Research That Stood 10-Year Test of Time
Faculty from UC San Diego, Brown University, and UC Berkeley Share in Networked Systems Award
At the USENIX Symposium on Networked Systems Design and Implementation (NSDI) this week in Boston, Mass., a team of researchers accepted an award for the most influential paper among those presented a decade ago at the annual conference. The 2017 NSDI Test of Time Award was presented during a luncheon on March 26 to two former graduate students at UC Berkeley who co-authored the paper published at NSDI 2007, along with their three UC Berkeley advisors.
Rodrigo Fonseca and George Porter are now professors of computer science, respectively, at Brown University and the University of California San Diego. They accepted the award for their paper*, “X-Trace: A Pervasive Network Tracing Framework,” along with one of their former advisors, professor Ion Stoica. (Other co-authors on the paper – UC Berkeley professors Randy H. Katz and Scott Shenker – did not attend the award ceremony.)
Porter and Fonseca were still at UC Berkeley when they worked on the original paper. “We wrote X-Trace while we were Ph.D. students,” recalled Porter. “It was really an honor to work with my colleagues on this project, which formed the basis of Rodrigo’s and my Ph.D. dissertations.” Stoica remains a professor of computer science in the Electrical Engineering and Computer Science department of UC Berkeley. (It’s not Stoica’s first Test of Time award: he received the SIGCOMM Test of Time Award in 2011.)
Modern Internet systems often combine different applications, span different administrative domains, and function in the context of network mechanisms (tunnels, VPNs, overlays and so on). In their 2007 paper, the co-authors argued that “diagnosing these complex systems is a daunting challenge.” “Many diagnostic tools existed at the time, but none existed for reconstructing a comprehensive view of service behavior,” said Brown’s Fonseca.
X-Trace was not the first tracing framework, but it was influential given that it was effectively the first framework for end-to-end tracing to focus on generality and pervasiveness. “It was based on the observation that an increasing number of systems would be built from heterogeneous components, built and operated by different people,” explained Fonseca. “In contrast, existing tracing frameworks required a specific language, or were targeted to a particular system.”
The researchers implemented X-Trace in protocols and software systems, and in their prize-winning paper, they set out to explain three different use scenarios: domain name system (DNS) resolution; a three-tiered photo-hosting website; and a service accessed through an overlay network.
Hari Balakrishnan, who co-chaired NSDI in 2007, broke the news of the Test of Time Award to the recipients. “We’re very pleased to share that your X-Trace paper from NSDI 2007 has been selected for an NSDI Test of Time Award,” he wrote. “The award honors a paper published ten years earlier at NSDI with retrospectively the most impact on research or practice.”
Indeed, the X-Trace paper has proved to be prescient – in both research and practice. “Today many Internet-scale backend systems are built using a ‘microservices’ approach, with hundreds of loosely connected components tied together to offer larger services,” noted Porter. “Debugging these systems effectively requires what X-Trace provided: the ability to correlate events in one component to events in other arbitrary components, even if they were many steps far removed from the first.”
The rapid adoption of tracing began with Google’s introduction of Dapper in 2010 (see graphic), which offered a similar primitive to X-Trace. Twitter’s Zipkin and Cloudera’s HTrace were open-source implementations of Dapper. Another current competitor in the market, called Traceview, also has X-Trace in its DNA after a series of startups and acquisitions dating back to 2010.
“By 2015 many companies such as Netflix, Baidu, Uber, Facebook and Etsy were deploying internal trace solutions very similar to our ideas presented in the X-Trace paper,” observed Fonseca. “And the interest persists in a rather recent initiative called OpenTracing, which is trying to standardize end-to-end tracing.”
The NSDI award is not Fonseca’s first for his work on tracing: he co-authored a paper on ‘pivot tracing’ that received a Best Paper award at the 2015 Symposium on Operating Systems Principles. That same year, Fonseca won an NSF CAREER Award for his work on ‘causal tracing’ to elucidate understanding of the performance of distributed systems. (Causal tracing covers a wide variety of tracing systems and frameworks, including X-Trace itself, as well as Dapper, Zipkin, HTrace, and many others.)
“It’s becoming increasingly difficult to understand how a system behaves, and, especially, how and why it fails,” said Fonseca. “Causal tracing is a technique that captures the causality of events across all components, layers and machines, and it eases the task of understanding complex distributed systems.”
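The core idea can be sketched in a few lines: every component passes along a small piece of metadata naming the task and the most recent operation, and each recorded event points back to its parent, so the causal tree can be rebuilt afterward. This is a simplified illustration, not the X-Trace API; the in-memory `reports` list, the function names, and the three simulated components (loosely modeled on a tiered photo-hosting site) are all hypothetical.

```python
import itertools
from collections import defaultdict

_op_ids = itertools.count(1)
reports = []  # stand-in for a central report collector


def record(meta, component, label):
    """Log one event and return the metadata to hand to the next operation."""
    op_id = next(_op_ids)
    reports.append({
        "task_id": meta["task_id"],
        "op_id": op_id,
        "parent": meta["op_id"],  # causal link to the previous operation
        "component": component,
        "label": label,
    })
    return {"task_id": meta["task_id"], "op_id": op_id}


def database(meta):
    record(meta, "db", "query")


def app_server(meta):
    meta = record(meta, "app", "handle request")
    database(meta)  # each call carries the metadata across the component boundary
    database(meta)


def web_frontend(task_id):
    meta = record({"task_id": task_id, "op_id": None}, "web", "GET /photos")
    app_server(meta)


def build_tree(task_id):
    """Group a task's reports by parent operation."""
    children = defaultdict(list)
    for r in reports:
        if r["task_id"] == task_id:
            children[r["parent"]].append(r)
    return children


def render(children, parent=None, depth=0):
    """Flatten the causal tree into indented lines."""
    lines = []
    for r in children[parent]:
        lines.append("  " * depth + f'{r["component"]}: {r["label"]}')
        lines.extend(render(children, r["op_id"], depth + 1))
    return lines


web_frontend("task-1")
tree_lines = render(build_tree("task-1"))
print("\n".join(tree_lines))
```

The rendered tree shows the web request at the root, the application server beneath it, and the two database queries as its children, even though no component knows anything about the others beyond the metadata it was handed.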
Now a co-director of UC San Diego’s Center for Networked Systems (CNS), George Porter’s research encompasses the fields of computer networking, data-intensive computing and computer systems, with a specific focus on data center networking. “I work to reduce the barrier to developing, deploying and managing applications that are able to process massive amounts of data,” said Porter. “At the same time, we aim to ensure that the resulting systems are practical, low-cost and energy efficient.”
Porter also received an NSF CAREER Award (in 2016) for work on a scalable multiplane data center network. He plans to demonstrate a hybrid electrical-optical network topology that will scale to hundreds of thousands of servers – at link rates reaching 1.6 terabits per second.
Meanwhile, the excitement surrounding tracing continues unabated. In 2017, for example, Amazon released X-Ray, which offers distributed tracing for Amazon Web Services, and another company, Datadog, also released an end-to-end tracing product earlier this year.
*Rodrigo Fonseca, George Porter, Randy H. Katz, Scott Shenker, Ion Stoica, “X-Trace: A Pervasive Network Tracing Framework,” Proc. 4th USENIX Conference on Networked Systems Design and Implementation (NSDI), April 2007, Cambridge, MA.
- CNS Espresso Prize for Excellence in Networking 2017 Awardee
Every academic year, the Computer Science and Engineering department offers the class CSE 123, Computer Networks. In this class, students are introduced to concepts, principles, and practice of computer communication networks with examples from existing architectures, protocols, and standards. Students are expected to complete a final project showing how they use the concepts they have learned to resolve a problem posed by the instructor.
Dr. George Varghese, a former CSE professor, taught CSE 123 for almost a decade and always enjoyed seeing the many ways that students implemented their final projects. When Dr. Varghese departed from UC San Diego in 2013, he left behind a gift to fund an annual prize to be awarded to the students who produce the best final projects in CSE 123.
The CNS Espresso Prize for Excellence in Networking is awarded by the current professor for CSE 123, Alex C. Snoeren, based upon criteria set by him for the given final project assigned each year. Professor Snoeren awarded the prize this year to UCSD undergraduate student Yihan Zhang for his outstanding final project.
Previous Recipients of the CNS Espresso Prize for Excellence in Networking:
2016 Undergraduate recipient: Conner Johnston
2014 Undergraduate recipient: Aaron Yip Ming Wong
2014 Visiting Undergraduate recipient: Matheus Venturyne Xavier Ferreira
2013 Undergraduate recipient: Jacob Maskiewicz
2013 Graduate recipient: Vidya Kirupanidhi
- Using Batteries to Cut Utility Costs
CNS postdoctoral researcher Alper Sinan Akyurek developed an algorithm for controlling batteries that can decrease the utility cost of an actual building by up to 50 percent compared to a building powered without the use of batteries.
Akyurek (Ph.D. ’17) – who completed his doctorate in January – still works in the Systems Energy Efficiency Laboratory of CSE Prof. Tajana Rosing (who has an adjunct appointment in Electrical and Computer Engineering, Akyurek’s previous department). Together they published their findings in a paper on “Optimal Distributed Nonlinear Battery Control” in the December 2016 issue of the IEEE Journal of Emerging and Selected Topics in Power Electronics*.
As the researchers noted in their article, energy storage systems enable on-demand dispatch of energy to compensate for volatility in the generation and consumption — supply and demand — for power. “Our optimal distributed battery control handles multiple batteries with low computational complexity,” they noted.
Compared to previous work, they used a higher-accuracy nonlinear battery model with only two percent error. “We show in a case study that optimal algorithms designed for a linear battery model induce an error of up to 60 percent in terms of cost reduction… [while] for the case of a constant load profile, we show that this error exceeds 150 percent,” said Akyurek.
Comparing the latest algorithm to the state-of-the-art load-following battery management technique, the new algorithm produced a 30 percent improvement in utility cost. Furthermore, the algorithm obtains the solution for multiple batteries in a decentralized way with guaranteed convergence.
Funding for the control research came from TerraSwarm, one of six centers of the Semiconductor Research Corporation’s STARnet program funded by the Defense Advanced Research Projects Agency (DARPA), Microelectronics Advanced Research Corp. (MARCO) and DARPA-E (for Energy). SRC is backed by companies including Intel, IBM, Micron and Texas Instruments. Professor Rosing co-led TerraSwarm’s Smart Cities effort, on which Akyurek worked for three years until it ended in October 2015.
Akyurek’s primary research related to CNS involves context-aware optimization in Internet of Things (IoT) systems. His research extends to optimized control in the Smart Grid for energy efficiency, and he has developed a range of control algorithms for purposes ranging from communication and prediction to controlling energy storage.
Prior to his Ph.D. at UC San Diego, the postdoctoral researcher completed his B.Sc. (’08) and M.Sc. (’11) at Middle East Technical University in Ankara, Turkey, where he was a member of its Communication Networks Research Group. Akyurek also worked as a senior design engineer on wireless networks for the Turkish company, Aselsan, Inc., before enrolling at UC San Diego.
Looking to the future, Akyurek hopes to continue his current line of research. “We are working to extend our optimal nonlinear distributed control solution to other areas in the Smart Grid,” he noted. “We want to modify it for use in other Internet of Things ecosystems such as sensor networks, user-in-the-loop control systems, and managing the maintenance of devices.”
*A.S. Akyurek and T. Simunic Rosing, “Optimal Distributed Nonlinear Battery Control”, IEEE Journal of Emerging and Selected Topics in Power Electronics, December 2016.
- Center for Networked Systems Adds New Faculty Members
The Center for Networked Systems (CNS) at the University of California San Diego now has 22 faculty members following the addition of two new professors to its ranks. Both newcomers – Deian Stefan and Aaron Schulman – joined the Computer Science and Engineering (CSE) faculty as assistant professors recently, with Stefan starting to teach last fall, and Schulman this winter.
“Professors Schulman and Stefan both work in the systems area, but their research interests also go well beyond networked systems,” said CNS co-director George Porter. “Both share an interest in secure systems. Schulman’s interests extend to embedded systems and even operating systems, and Stefan’s other major research focus is on programming languages. Both have a lot to bring to CNS’s research agenda.”
While still doing a postdoc at Stanford, Aaron Schulman founded a company called Mellow Research, LLC, to build BattOr, a power monitor he invented to track how much energy different features of applications use while running on mobile phones. For his part, Deian Stefan delayed his start at UC San Diego by a year to finish launching a web security startup called Intrinsic (formerly GitStar), in which he continues to hold the part-time job of Chief Scientist. “At Intrinsic we’ve been transferring research into practice by building systems, tools and languages that ultimately make it easier for developers to build and deploy Node.js web applications with minimal trust,” said Stefan.
Both Stefan and Aaron Schulman came to UC San Diego from Stanford University. Stefan earned his Ph.D. in Computer Science in 2015, while Schulman was a postdoctoral researcher from 2013 to 2016 in the lab of Stanford professor Sachin Katti. Schulman earned his Ph.D. from the University of Maryland, College Park, in 2013 (with a thesis on the reliability of Internet last-mile links that later won him the SIGCOMM Doctoral Dissertation Award).
According to Stefan, his primary research interest is in “building principled and practical secure systems.” He builds browsers and language runtime systems by applying programming language techniques and analysis. Among the secure systems Stefan has also helped to build: a secure package manager; a browser confinement system designed for modern web applications; a security-centric framework for building web platforms; a dynamic information flow control system; and a programming language for writing secure, constant-time code.
The professor serves as editor of the COWL specification, and he participates more broadly in developing specs as a member of the W3C WebAppSec and Node.js Security working groups. “By working on specifications,” said Stefan, “we’re trying to broadly influence browser and runtime systems that will ultimately make the web a safer place.”
Schulman started on July 1, 2016, but delayed making the move from Palo Alto until late in the year. As of this winter, he is teaching his first course at UC San Diego — a graduate-level course on topics in mobile computing and communication (CSE 291).
In his syllabus for the course, Schulman notes that students are learning about the challenges facing smartphones, wearables and smart devices that have overtaken PCs as the dominant platform for computing and communication. “Mobile devices have severely constrained energy capacity, their network connectivity is exclusively provided by unreliable, bandwidth-constrained wireless links, and they carry a standard set of sensors that are seemingly insufficient for certain applications and also can inadvertently leak private information about their users,” explained Schulman. “We discuss research that addresses the challenges introduced by the mobile platform by blurring the lines between traditional research areas in computer science.”
In past work, Schulman has improved the efficiency of wireless networks, cellular network flexibility, and the energy efficiency of mobile applications. He also quantified residential Internet network reliability, made progress in securing the web’s public key infrastructure, and identified privacy leaks in mobile devices.
- Former CSE/CNS Professor Elected to National Academy of Engineering
Former UC San Diego computer science and engineering and Center for Networked Systems professor George Varghese has been elected to membership in the National Academy of Engineering. He is among the 84 new U.S. members (and 22 foreign members) elected to the organization in 2017. Varghese was cited for his contributions to “network algorithmics that make the Internet faster, more secure, and more reliable.”
Varghese — who was on the UC San Diego faculty from 2000 to 2012 — is currently a Chancellor’s Professor in the Department of Computer Science at UCLA. He returned to the University of California in August 2016, roughly four years after stepping down from his full professorship at UC San Diego to work for Microsoft Research in Silicon Valley.
While still at UC San Diego, Varghese took a leave of absence in 2004 to co-found NetSift, Inc., with his Ph.D. student Sumeet Singh (Varghese as president, Singh as NetSift’s chief scientist). The company developed automated techniques for learning and detecting attack signatures. Barely a year later, in 2005, NetSift was acquired by Cisco Systems, and Varghese extended his faculty leave to help Cisco transition the NetSift technology to a 20-Gigabit-per-second chip called Hawkeye. (Singh went on to work for Cisco for seven years.) CNS co-director Stefan Savage co-authored some of the early work on the NetSift technology, as did Varghese’s Ph.D. student Cristian Estan, who is now at Google.
Among Varghese’s honors, he received the Koji Kobayashi Award for Computers and Communications in 2014 for his work in network algorithmics and its applications to high-speed packet networks. The same year, he received the SIGCOMM Lifetime Award for “sustained and diverse contributions to network algorithmics, with far-reaching impact in both research and industry.”
Varghese completed his Ph.D. at MIT in 1993, after doing his Master’s degree at North Carolina State. He did his undergraduate work at the Indian Institute of Technology (IIT) Bombay, which awarded Varghese its Distinguished Alumnus Award in 2015. In 2002 he was elected a Fellow of the ACM.
- CNS Invites Applications for Second Alan Turing Memorial Scholarship; Feb. 6 Deadline
The Center for Networked Systems (CNS) in UC San Diego’s Jacobs School of Engineering is once again looking for an undergraduate student who is interested in networked systems – and also active in supporting the LGBT community. “Our goal is to use this scholarship to further boost diversity and inclusiveness in the field of systems and networking and give undergraduates an opportunity to work on top-notch research projects before they get to grad school,” said CNS co-director George Porter, a professor in the Computer Science and Engineering department.
CNS has invited undergraduates to apply for its Alan Turing Memorial Scholarship for the 2017-2018 academic year. The scholarship will be awarded this spring to a student majoring in a field that touches on networked systems, including computer science, computer engineering, public policy, communication or related programs.
According to Porter, CNS will give preference to “students with demonstrated academic merit, financial need and experience or interest in research.”
All applications must be submitted through the online application at https://ucsd.academicworks.com/ . Anyone with questions about the application process can get more information from the UC San Diego Scholarship Office by emailing firstname.lastname@example.org . Applications are due no later than Monday, February 6, 2017.
In addition to the $10,000 scholarship, the recipient will have the opportunity to carry out guided research under the direction of one of CNS’s faculty mentors.
The scholarship pays homage to Alan Turing, the British mathematician and founder of the computer science field whose code-breaking work contributed substantially to the Allied victory in World War II (notably by breaking Germany’s Enigma code). Turing’s brilliant career was tragically cut short after the war, when he suffered outright persecution for his activities as a gay man. He died by suicide in 1954.
CNS is also making it easier for alumni, staff and other potential donors to give to the Alan Turing Memorial Scholarship fund with an outright gift or a payment pledge. Donations can be made online through the UC San Diego Online Giving portal. To give to the scholarship program, make your gift online at https://giveto.ucsd.edu/make-a-gift?id=a6a587f2-5000-4ca5-b643-ca84554e61bd&ct=t .
The first recipient of the $10,000 scholarship, Valeria Gonzalez, received the award last spring for the 2016-2017 academic year. “It’s great to see the CNS is taking the initiative to highlight the importance of bringing diversity to computer science and engineering beyond ethnicity and the gender binary,” said Gonzalez on receiving the inaugural award. “The LGBT community encompasses people with an array of talents and abilities, people such as Alan Turing himself… and knowing that your LGBT identity is acknowledged and accepted not only lets you direct all your focus into working hard but also allows you to connect more with the community you’re part of.” A transfer student from Cypress College, a community college near Los Angeles, Gonzalez has been an undergraduate student researcher in the Integrated Electronics and Biointerfaces Laboratory of Electrical and Computer Engineering professor Shadi Dayeh. She has also been a leader in the UC San Diego Women’s Center, which promotes an inclusive and equitable campus community through the educational, professional and personal development of diverse groups of women.
- CNS at NSDI 2017: Innovating in Networked Systems
Researchers affiliated with the Center for Networked Systems (CNS) at the University of California San Diego have been selected to present some of their most up-to-date research at the 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2017).
NSDI focuses on the design principles, implementation and practical evaluation of networked and distributed systems. The annual conference will take place March 27-29, 2017, in Boston, MA, and four papers with co-authors from CNS and the Computer Science and Engineering (CSE) department of the Jacobs School of Engineering have been accepted for presentation at the prestigious meeting.
CNS co-director George Porter co-authored two of the papers. “NSDI is one of the most important conferences for us, because just like CNS, the symposium brings together researchers from across the networking and systems community,” said Porter. “Our papers accepted to the 2017 symposium are in line with NSDI’s stated goal of pushing architectural boundaries of network services, and promoting the research dialogue on networked systems.”
CSE Ph.D. student Michael Wei and CSE professor Steven Swanson co-authored a paper on “vCorfu: Large-Scale Data Stores over a Shared Log” with collaborators at VMware Research (where Wei is currently a researcher) and Princeton University.
vCorfu is a strongly consistent, cloud-scale object store built over a shared log. It augments the traditional replication scheme of a shared log to provide fast reads, and it leverages a new technique – composable state machine replication – to compose large state machines from smaller ones. “This enables the use of state machine replication to be used efficiently in huge data stores,” said Wei. “We will show that vCorfu outperforms Cassandra, the popular, state-of-the-art NoSQL database for cloud apps. It does so while also providing strong consistency in opacity and read-own-writes, efficient transactions, and global snapshots at the scale of the cloud.”
vCorfu is available as an open-source project on GitHub at github.com/CorfuDB.
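For readers unfamiliar with the underlying idea, state machine replication over a shared log can be sketched in a few lines of Python. This is a toy illustration only, not vCorfu's API or design: the class names `SharedLog` and `ReplicatedMap` are hypothetical, the log lives in local memory rather than in a distributed system, and the sketch does not attempt the composable replication technique described in the paper.

```python
class SharedLog:
    """Toy in-memory append-only log (a stand-in for a distributed shared log)."""

    def __init__(self):
        self._entries = []

    def append(self, entry):
        self._entries.append(entry)
        return len(self._entries) - 1  # position assigned to the new entry

    def read_from(self, position):
        return self._entries[position:]


class ReplicatedMap:
    """Hypothetical replica that derives its state by replaying the log in order."""

    def __init__(self, log):
        self._log = log
        self._applied = 0   # number of log entries already applied locally
        self._state = {}

    def put(self, key, value):
        # Writes are recorded in the log; local state changes only on replay.
        self._log.append(("put", key, value))

    def get(self, key):
        self._sync()
        return self._state.get(key)

    def _sync(self):
        # Apply every entry this replica has not yet seen, in log order.
        for op, key, value in self._log.read_from(self._applied):
            if op == "put":
                self._state[key] = value
            self._applied += 1


log = SharedLog()
replica_a = ReplicatedMap(log)
replica_b = ReplicatedMap(log)
replica_a.put("x", 1)
replica_b.put("x", 2)
replica_a.put("y", 3)
```

Because every replica applies the same entries in the same order, `replica_a` and `replica_b` agree on the value of each key regardless of which one accepted the write.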
Datacenter Fault Detection
CSE Ph.D. student Arjun Roy expects to complete his doctorate in 2017, and he collaborated with his advisor, CSE professor Alex C. Snoeren, on the paper to be presented at NSDI on “Passive Realtime Datacenter Fault Detection.” It reflects joint work with Facebook researchers Hongyi Zeng and Jasmeet Bagga, who are also co-authors on the paper. (The two Facebook engineers previously co-authored a paper at SIGCOMM 2015 with Roy and professors Snoeren and Porter on “Inside the Social Network’s (Datacenter) Network”.) Roy also did internships at Facebook in the summers of 2012, 2013 and 2014.
According to the paper’s abstract, “datacenters are characterized by their large scale, stringent reliability requirements, and significant application diversity. However, the realities of employing hardware with small but non-zero failure rates mean that datacenters are subject to significant numbers of failures.” Subsets of packets can be dropped or delayed without triggering a fault signal, so traditional fault detection techniques (involving end-host or router-based statistics) may not identify such errors.
In their paper, Roy and Snoeren describe how to expedite the process of detecting and localizing partial datacenter faults. Their approach uses an end-host method generalizable to most datacenter applications. “We correlate transport-layer flow metrics and the delay incurred by network-input/output system calls at end hosts with the path that traffic takes through the datacenter,” said Roy. “Then we apply statistical analysis techniques to identify outliers and localize the faulty link and/or switch or switches.”
The paper will detail how the researchers evaluated their novel approach in a production datacenter (Facebook’s) carrying a workload servicing more than 100 million users.
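The general idea of correlating flow metrics with paths and then flagging outliers can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration: the article does not say which metrics or statistical tests the researchers use, so the sketch picks retransmission rate as the metric and a median-based modified z-score as the outlier test, and the function name `find_outlier_links` and the link names are hypothetical. It also assumes each flow record has already been attributed to one of several equivalent links that should behave alike when healthy.

```python
from collections import defaultdict
from statistics import median


def find_outlier_links(flow_records, threshold=3.5):
    """Return links whose retransmission rate is unusually high among their peers.

    flow_records: iterable of (link, packets_sent, retransmits) tuples.
    The metric and the outlier test are illustrative assumptions.
    """
    sent = defaultdict(int)
    retx = defaultdict(int)
    for link, packets, retransmits in flow_records:
        sent[link] += packets
        retx[link] += retransmits

    rates = {link: retx[link] / sent[link] for link in sent if sent[link] > 0}
    if not rates:
        return []

    center = median(rates.values())
    mad = median([abs(rate - center) for rate in rates.values()])
    if mad == 0:
        return []  # no spread among peers, so nothing can be called an outlier

    # Modified z-score: 0.6745 scales the MAD to be comparable to a standard deviation.
    return sorted(
        link for link, rate in rates.items()
        if 0.6745 * (rate - center) / mad > threshold
    )


records = [
    ("core-1", 1000, 2), ("core-1", 1000, 2),
    ("core-2", 1000, 3), ("core-2", 1000, 2),
    ("core-3", 1000, 1), ("core-3", 1000, 2),
    ("core-4", 1000, 40), ("core-4", 1000, 44),
    ("core-5", 1000, 2), ("core-5", 1000, 2),
]
suspects = find_outlier_links(records)
```

In this made-up data, only the flows that crossed `core-4` show a retransmission rate far above that of the other links, so it is the one link flagged for investigation.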
In light of the massive explosion in video content on the Internet and for virtual reality, a team of two CSE Master’s students advised by professor George Porter has come up with a new approach to processing video with minimal delays. Second-year M.S. student Karthikeyan Vasuki Balasubramaniam (who is Porter’s teaching assistant this quarter in CSE 124 on Networked Services) and recent graduate Rahul Bhalerao (M.S. ’16) have had experience in industry (both at Amazon — Balasubramaniam as an intern at Amazon Prime, and Bhalerao currently working at Amazon Web Services).
The paper accepted to NSDI is entitled “Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads.” In it, the researchers describe ExCamera, a system that can edit, transform and encode a video, including ultra-high-resolution 4K video (four times the resolution of high-definition TV) and stereoscopic virtual reality (VR) material, dozens of times faster than cutting-edge production systems at the largest providers.
The co-authors lay claim to two major contributions. First, “our coauthors at Stanford developed a novel encoding strategy focusing on fine-grained parallelism, which is rather unique in the encoding space,” explained Balasubramaniam.
Separately, noted Bhalerao, “ExCamera orchestrates encoding and other video-processing pipelines across the Amazon Web Services Lambda service. The system invokes thousands of threads in parallel, each handling only a fraction of a second of the video.” The UC San Diego work was done in collaboration with researchers at Stanford University.
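The pattern Bhalerao describes, splitting a video into very short pieces and handing each piece to its own worker, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than ExCamera itself: it uses a local thread pool instead of AWS Lambda, the function names are hypothetical, and `encode_chunk` is a placeholder that only tags each frame instead of performing real video encoding.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(frames, chunk_size):
    """Cut the frame sequence into consecutive chunks of at most chunk_size frames."""
    return [frames[i:i + chunk_size] for i in range(0, len(frames), chunk_size)]


def encode_chunk(chunk):
    # Placeholder for real encoding work (hypothetical stand-in).
    return [f"enc({frame})" for frame in chunk]


def encode_parallel(frames, chunk_size=6, max_workers=8):
    """Encode all chunks concurrently, then stitch the results back in order."""
    chunks = split_into_chunks(frames, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the input chunks.
        encoded_chunks = list(pool.map(encode_chunk, chunks))
    return [frame for chunk in encoded_chunks for frame in chunk]


frames = list(range(48))           # e.g. two seconds of video at 24 frames per second
encoded = encode_parallel(frames)  # each 6-frame chunk covers a quarter of a second
```

The sketch treats chunks as fully independent. A real encoder has to deal with dependencies between frames across chunk boundaries, which is the problem the encoding strategy credited to the Stanford co-authors is meant to address.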
MegaSwitch is a multi-fiber ring optical fabric that exploits space-division multiplexing across multiple fibers to deliver rearrangeably non-blocking communications to 30-plus racks and 6,000-plus servers. CNS’s George Porter co-authored the paper on “Enabling Widespread Communications on Optical Fabric with MegaSwitch” with researchers at the Hong Kong University of Science and Technology, SUNY Buffalo and Yale University, as well as Omnisense Photonics and CoAdna Photonics. (No UC San Diego students worked on the paper.)
According to Porter, “we were seeking an optical interconnect that can enable unconstrained communications within a computing cluster of thousands of servers.” Indeed, existing wired optical interconnects are not ideal for widespread communications in production clusters, and recent efforts to reduce the time it takes to reconfigure the optical circuit from milliseconds to microseconds only partially mitigated the problem (by rapidly time-sharing optical circuits across more nodes).
“We were still limited by the total number of parallel circuits available simultaneously,” explained Porter. “However, we wanted to evaluate the potential of WDM to scale to a large number of endpoints.”
USENIX Symposium on Networked Systems Design and Implementation http://www.usenix.org/conference/nsdi17
Computer Science and Engineering Department http://cse.ucsd.edu/about/news/uc-san-diego-center-nsdi-2017-innovating-networked-systems
- KC Claffy among “10 Women to Know in Networking/Communications”
CNS faculty member and principal investigator/founding director of the Center for Applied Internet Data Analysis (CAIDA) at the San Diego Supercomputer Center (SDSC), KC Claffy, has been named to the second annual “10 Women in Networking/Communications That You Should Know” list.
Now in its second year, the list is compiled and coordinated by N2 Women (Networking/Networking Women), a discipline-specific community for researchers in the communications and networking research fields. The organization’s main goal is to foster connections among under-represented women in computer networking and related research fields. The full list of this year’s award recipients is available from N2 Women.
Nominations are solicited both from the N2 Women community and through several mailing lists related to networking and communications. More than 150 people from around the world submitted nominations, resulting in over 140 distinct names of accomplished women in the field, according to the organization.
A committee of five N2 Women board members selected this year’s 10 honorees. “Many people from around the world submitted one or more nominations for this list, and it was very difficult to choose only 10 amazing women,” said Oana Iova, a postdoctoral researcher in the D3S research group with the Department of Information Engineering and Computer Science (DISI) at the University of Trento, Italy, and the awards co-chair who led the nomination and selection processes this year. “We focused on women who have had a major impact in networking and/or communications. We also wanted a list that reflected our diversity, and specifically the diversity in the area of networking/communications.”
“I am honored to join such a distinguished group on this year’s N2 Women’s list,” said Claffy, who founded CAIDA in 1997 as a collaboration among commercial, government and academic research sectors to promote greater cooperation in the engineering and maintenance of a robust, scalable global internet infrastructure. “I encourage other women working in networking and communications to attend or help organize an N2Women event at their next ACM, IEEE, or other relevant conference or workshop.”
Today, CAIDA’s research interests include internet cartography, or detailed analyses of the changing nature of the Internet’s topology, routing and traffic dynamics. CAIDA also investigates the implications of these changes on network science, architecture, infrastructure security and stability, and public policy.
Earlier this year CAIDA was awarded a $1.4 million grant from the U.S. Department of Homeland Security to demonstrate and illuminate structural and dynamic aspects of the Internet infrastructure relevant to cybersecurity vulnerabilities. These aspects include macroscopic stability and resiliency analyses, grey markets for IPv4 addressing resources, and on-demand router-level topology inference.
In 2015, Claffy received the IEEE Internet Award for her “seminal contributions to the field of Internet measurement, including security and network data analysis, and for distinguished leadership in and service to the Internet community by providing open-access data and tools,” according to a notice published by the institute.