CSE Students and Professors Stage Major Presence at SIGMOD 2017

CSE had a major presence at this year’s ACM Special Interest Group on Management of Data (SIGMOD), the premier venue for research in data management. The 2017 meeting took place in mid-May in Chicago jointly with PODS, the premier international conference on the theoretical aspects of database systems. CSE/CNS Database Lab faculty Yannis Papakonstantinou, Alin Deutsch, Arun Kumar and postdoctoral researcher Yannis Katsis all served on the SIGMOD research track program committee, and Kumar was a judge for the inaugural SIGMOD Student Research Competition. (He also chaired a Research Track session on Versions and Incremental Maintenance.)

However, it was the research that took center stage, with UC San Diego computer science faculty and students out in force with five major papers in the main conference. CSE/CNS professors Yannis Papakonstantinou and Steven Swanson and Ph.D. students Chunbin Lin and Jianguo Wang (who delivered the paper) presented their research on “An Experimental Study of Bitmap Compression vs. Inverted List Compression.”

Papakonstantinou also had a joint paper with colleagues from Stanford University, Vasilis Verroios and Hector Garcia-Molina. They unveiled “Waldo: An Adaptive Human Interface for Crowd Entity Resolution.”

A newcomer to SIGMOD, CSE professor Kamalika Chaudhuri had two high-profile papers on the agenda. She and fellow CSE/CNS professor Arun Kumar were co-authors on a paper titled “Bolt-on Differential Privacy for Scalable Stochastic Gradient Descent-based Analytics.” Their co-authors were all former colleagues of Kumar at the University of Wisconsin-Madison before he joined the CSE faculty this year. Professor Chaudhuri was also senior author on a paper presented by her Ph.D. student, Shuang Song, titled “Pufferfish Privacy Mechanisms for Correlated Data.”

“Kamalika Chaudhuri was one of two people who dominated the SIGMOD data privacy session this year, each of them with two papers in that session,” noted CSE’s Kumar. “One of her papers, which I think was her first SIGMOD submission, got accepted without any revisions!” Kumar notes that he and Chaudhuri are planning to collaborate on new problems in the data, analytics and privacy space, especially on data cleaning and analytics systems. Database Lab members also invited Chaudhuri to become a member of the lab, and she accepted.

The final CSE-related paper in the main research track was co-authored by Ph.D. student Vineet Pandey, who works in the Design Lab with his advisor, CSE professor Scott Klemmer. The paper, “Concerto: A High Concurrency Key-Value Store with Integrity,” recapped research done at Microsoft, where Pandey spent a summer. Another UC San Diego student, Pingfan Meng (M.S., Ph.D. ’11, ’16), also spent a summer there and is listed as a co-author; now an alumnus, he is a research scientist at Intel Labs. Microsoft researchers listed as co-authors on the Concerto paper included Arvind Arasu, Ken Eguro, Raghav Kaushik, Donald Kossmann and Ravi Ramamurthy, with Arasu delivering the presentation.

Tutorials and Workshops

With a big conference like SIGMOD, however, the main sessions are only part of the action. CSE’s Arun Kumar co-presented a tutorial on systems, techniques and challenges in the space of data management and machine learning. “The tutorial attracted a packed audience with a mix of industry folks, professors and students,” recalled Kumar. “It was well-appreciated and stirred a lot of discussion.” (Slides and video from the tutorial are available on the SIGMOD tutorials page.)

Then there were the workshops co-located with SIGMOD 2017, and professor Kumar was heavily involved in three of them. He presented the invited academic keynote at the First Workshop on Data Management for End-to-End Machine Learning (DEEM). His talk focused on emerging research opportunities and challenges for the data management community in democratizing advanced analytics, beyond just building faster or more scalable ML algorithm implementations. It was well attended and well received by both researchers and practitioners. During the same DEEM Workshop, Kumar also had a joint paper with former colleagues at the University of Wisconsin-Madison (Lingjiao Chen and Paraschos Koutris). The paper explored “Model-based Pricing: Do Not Pay for More than What You Learn!”

Kumar and fellow CSE professor Lawrence Saul also co-authored a paper with graduate students Dharmil Chandarana and Vraj Shah. CSE M.S. student Shah presented the paper on “SpeakQL: Towards Speech-driven Multi-modal Querying” in the Workshop on Human-in-the-Loop Data Analytics (HILDA).
CSE postdoc Yannis Katsis also presented a paper during the HILDA workshop: “Assisting Discovery in Public Health,” co-authored with professor Papakonstantinou, Ph.D. student Nikos Koulouris, and Qualcomm Institute researcher and UC San Diego School of Medicine professor Kevin Patrick.

VLDB 2017

With SIGMOD 2017 now history, Database Lab members are looking ahead to the other major database conference of the year, the International Conference on Very Large Data Bases (VLDB 2017). It’s scheduled for August 28-September 1 in Munich, Germany. CSE’s database researchers are promising another banner presence for the group at the meeting.