Edith Cohen's Research Page
Announcements
Research Interests
I am a computer scientist.
My research is motivated by challenges in managing and mining massive data sets, networks,
and information systems. I am interested in modeling and in designing
solutions which enhance performance and functionality. I prefer a
principled approach and apply and develop tools from algorithms design, statistics,
and optimization. I am
always looking to learn more, do more, and expand my horizons.
In the span of my research career, I developed models and scalable algorithms
in a wide range of areas including query processing and
optimization, content search and delivery, caching, prefetching, routing,
streaming, parallelization, and fundamental graph algorithms.
More recently, I led the development of highly scalable algorithms
for the analysis and mining of massive graphs, such as social and
other interaction graphs. My work facilitates the computation of
centralities, similarities, and influence measures at a scale and
accuracy that were not possible before. I also led the development
of sampling,
sketching, and estimation techniques which optimize tradeoffs between
storage size, computation, and query speed and accuracy. My work
enables the leveraging of much larger data sets than previously
possible.
More details on my research work can be found here:
Google
Scholar,
DBLP
Professional Background
I am a research scientist at Google (Mountain View). I am also a (visiting) full professor at the School of
Computer Science at Tel Aviv University in Israel.
I attended
Tel Aviv University
(Israel)
for my undergraduate studies in math, physics, and
computer science, graduating in 1985, and continued to obtain an M.Sc.
in computer science in 1986, supervised by Michael Tarsi. I then headed to
Stanford, CA, where I
worked towards a Ph.D. in Computer
Science,
with
Andrew Goldberg and Nimrod Megiddo, graduating in 1991.
From 1991 to 2012, I was a member of the research staff, first at the
legendary
AT&T Bell Laboratories , and after a 1997 split, at
AT&T Labs
Research.
In 1997 I also visited
the Computer Science Division at UC Berkeley.
From 2012 to 2014 I was a principal researcher at
Microsoft Research (Silicon Valley).
Surveys
Recent presentations
-
Beyond distinct counting: LogLog composable sketches of frequency
statistics. Given at the
2nd TOCA-SV Day
May 2017
(
pdf ,
pptx )
-
The Magic of Random Sampling: From Surveys to Big Data.
Talk
Given at Harvard IACS seminar
November 2016.
(
pdf ,
pptx )
-
Monotone Estimation Framework and Applications for Scalable Analytics of Large Data Sets.
Given at "Learning, Algorithm Design and Beyond Worst-Case Analysis" workshop held at the Simons Institute
November 2016.
(
pdf ,
pptx)
-
Weighted Sampling for Scalable Analytics of Large Data Sets.
Talk
Given at Women in Theory (WIT)
workshop held at the Simons Institute
May 2016.
(
handouts pdf ,
animated pdf)
-
New Tools for Scalable Weighted Sampling: Frequency Capping,
Multi-Objective, and more. Given at
Tel-Aviv University CS
Colloquium
October 2015.
(
handouts pdf ,
animated pdf)
-
Scalable Mining of Massive Networks: Distance-based Centrality,
Similarity, and Influence. Given at
Tel-Aviv University CS
Colloquium and at the RAIN seminar at Stanford. January 2015.
(
pptx ,
pdf)
-
Computing Classic Closeness Centrality, at Scale. Given at
Stanford Theory Lunch (January 2015) and at Yahoo NYC and IBM
Almaden (November 2014).
(
pptx ,
pdf)
Students
I was fortunate to work with some great students and interns over
the years.
Teaching
Patents
-
Method and Apparatus for Estimating Transitive Closure and Reachability.
(issued May 12, 1998: US005752241 ).
-
Detecting the Sub-Rate of A Punctured Data Packet for a Multi-Rate
Transmission Scheme. Joint with H.-L. Lou. (issued August 29, 2000: US06111912 ).
-
Improved Retrieval System and Method. Joint with David D. Lewis
(issued September 7, 1999: US05950189 )
-
Method and Apparatus for Improving End to End Performance of a Data Network.
Joint with B. Krishnamurthy and J.L. Rexford
(issued December 11, 2001: US6330561)
-
Method for Preconnecting to a Server on a Network.
Joint with H. Kaplan and U. Zwick.
(issued August 12, 2003 US6606645)
-
Method and Apparatus for Improving End to End Performance of a Data Network.
Joint with B. Krishnamurthy and J.L. Rexford
(issued June 15, 2004: US6751608)
-
Method and Apparatus for Efficient Routing of
Variable Traffic.
Joint with D. Applegate
(issued June 17, 2008: US7388842)
-
Method and Apparatus for Improving End to End Performance of a Data Network.
Joint with B. Krishnamurthy and J. Rexford
(issued Febraury 2, 2010: US7657553)
-
Algorithms and Estimators for Accurate Summarizations of Unaggregated Data Streams.
Joint with N. Duffield, H. Kaplan, C. Lund, and M. Thorup
(issued June 29, 2010: US7746808 )
-
Algorithms and Estimators for Summarizations of Unaggregated Data Streams.
Joint with N. Duffield, H. Kaplan, C. Lund, and M. Thorup
(issued July 27, 2010: US7764625 )
-
Sampling and Analyzing Packets in a Network.
Joint with C. Lund, N. Duffield, A. Gerber, A. Hersh, A. Spatscheck, M. Thorup, and F. True
(issued December 14, 2010: US7852785 )
-
Methods and Apparatus to Bound Network
Traffic Estimation Error for Multistage Measurement Sampling and
Aggregation.
Joint with C. Lund, N. Duffield, and M. Thorup
(issued August 3, 2011: US7990982 )
-
Variance-Optimal Sampling-Based Estimation of Subset Sums.
Joint with N. Duffield, H. Kaplan, C. Lund, and M. Thorup
(issued August 23, 2011: US8005949 )
-
Method And Apparatus For Efficient Routing Of Variable Traffic.
Joint with D. Applegate (issued October 4, 2011:
US8031635 )
-
Method and Apparatus for Improving End to End Performance of a Data Network.
Joint with B. Krishnamurthy and J. Rexford
(issued November 29, 2011: US8069150)
-
Systems, Devices, and/or Methods for Determining Dataset Estimators.
Joint with H. Kaplan
(issued March 20, 2012: US8140539)
-
Systems, Devices, and/or Methods for Managing Data.
Joint with H. Kaplan
(issued April 24, 2012: US8166047)
-
Method for summarizing data in unaggregated data streams.
Joint with N. Duffield, H. Kaplan, C. Lund, M. Thorup
(issued June 5, 2012: US8195710)
-
Cache validation using smart source selection in a data network.
(issued Febraury 11, 2014: US8650266 B2)
-
Method and apparatus for processing of top-k queries from samples.
Joint with H. Kaplan and Nadav Grossaug
(issued April 22, 2014: US8706737)
-
Method and systems to estimate query responses based on data set sketches.
Joint with H. Kaplan
(issued May 27, 2014: US8738618)
-
Method and apparatus to sample data connections.
Joint with C. Cormode and N. Duffield
(issued August 25, 2015: US9116958)
-
Estimating influence using sketches.
Joint with D. Delling, T. Pajor and R.F. Werneck
(issued September 13, 2016: US9443034)
-
Performing graph operations using historic inverse probability estimators
(issued June 27, 2017: US9690827)
Program Commitees
-
STOC 2019
(The ACM Symposium on Theory of Computing, Phoenix, 2019)
-
SIGMETRICS
2019
(ACM SIGMETRICS International Conference on Measurement and Modeling
of Computer Systems, Phoenix, 2019)
-
KDD 2018
(24th SIGKDD Conference on Knowledge Discovery and Data Mining,
London, 2018)
-
SIGMETRICS
2018
(ACM SIGMETRICS International Conference on Measurement and Modeling
of Computer Systems, Irvine, 2018)
-
PODS 2018
(ACM SIGMOD/PODS International Conference on Management of Data,
Houston, 2018)
-
KDD 2017
(23rd SIGKDD Conference on Knowledge Discovery and Data Mining,
Halifax, 2017)
-
STOC 2017
(The ACM Symposium on Theory of Computing, Montreal, 2017)
-
STOC 2015
(The ACM Symposium on Theory of Computing, Portland, Oregon, 2015)
-
PODS 2015
(The ACM SIGMOD/PODS conference, Melbourne, Australia, 2015)
-
ICDT 2015
(The International Conference on Database Theory, Brussels, Belgium, 2015)
-
ESA 2014
(European Symposium on Algorithms, Wroctaw, Poland.)
-
STACS 2014
(Symposium on Theoretical Aspects of Computer Science, Lyon, France.)
-
SODA 2014
(Symposium on Discrete Algorithms, Portland, OR, USA.)
-
PODC 2012
(31st Annual ACM SIGACT-SIGOPS Symposium on
Principles of Distributed Computing, Madeira, Portugal)
-
ICDT 2012
(15th International Conference on Database Theory, Berlin, Germany.)
-
ICALP 2011
(International Colloqium on Automata, Languages, and Programming, Zurich, Switzerland.)
-
PODC 2011
(Symposium on Principles of Distributed Systems, San Jose, CA, USA.)
-
SIGMETRICS 2011
(International Conference on Measurement and Modeling og Computer Systems, San Jose, CA.)
-
ICDT 2011
(14th International Conference on Database Theory, Uppsala, Sweden.)
-
SODA 2011
(Symposium on Discrete Algorithms, San Francisco, CA, USA.)
-
ACM SIGMETRICS 2010
(the International Conference on Measurement and Modeling of Computer Systems, New York, NY, USA.)
-
CSR 2010
(The 5th International Computer Science Symposium, Kazan, Russia.)
-
ALENEX 2010
(SIAM workshop on algorithm engineering and experiments, Austin, Texas, USA)
-
ICDT 2010
(13th International Conference on Database Theory, Lausanne, Switzerland)
-
VLDB 2009
(35th International conference on Very Large Data Bases, Lyon, france)
-
ACM SIGMETRICS 2009
(the International Conference on Measurement and Modeling of Computer Systems, Seattle, WA, USA.)
-
VLDB 2008
(34th International Conference on Very Large Data Bases, Auckland, New Zealand.)
-
IPDPS 2008
(22nd IEEE International Parallel and Distributed Processing Symposium, Miami, FL, USA)
-
VLDB 2007
(33rd International Conference on Very Large Data Bases, Vienna, Austria.)
-
ACM SIGMETRICS 2007
(the International Conference on Measurement and Modeling of Computer Systems, San Diego, CA, USA.)
-
AAA-IDEA 2006
(International workshop on Advanced Architectures and Algorithms for Internet Delivery and Applications, Pisa, Italy.)
-
AAA-IDEA 2005
(International workshop on Advanced Architectures and Algorithms for Internet Delivery and Applications, Orlando, Florida.)
-
AAIM2005
(1st Annual International Conference on Algorithmic Applications in Management, Xi'an, Shaanxi, China.)
-
CAAN2004
(Workshop on Combinatorial and Algorithmic Aspects of Networking, Banff International Research Station, Canada)
-
SIGMOD 2004
(2004 ACM SIGMOD International Conference on Management of Data,
Paris, France)
-
PODC2004
(23rd Annual ACM SIGACT-SIGOPS Symposium on Principles Of
Distributed Computing, St. John's, Newfoundland, Canada)
-
WWW2004
(The 13th International World Wide Web conference,
New York, New York)
-
ICDCS 2004
(The 24th International Conference on Distributed Computing Systems,
Tokyo, Japan)
-
ESA 2003
(The 11th Annual European Symposium on Algorithms,
Budapest, Hungary)
-
WWW 2003
(The 12th International World Wide Web conference,
Budapest, Hungary)
-
ALENEX 2003
(The 5th Workshop on Algorithm Engineering and Experiments,
Baltimore, MD)
-
FOCS 2002
(The 43rd IEEE Symposium on Foundations of Computer Science,
Vancouver, Canada)
-
WWW2002
(The 11th International World Wide Web conference,
Honolulu, Hawaii)
-
SODA 2K
(The 11th ACM-SIAM Symposium on Discrete Algorithms,
San Francisco, California)
-
FOCS 98
(The 39th IEEE Symposium on Foundations of Computer Science,
Palo Alto, California)
-
ISTCS 96
(The 4th Israeli Symposium on
Theory of Computing and Systems, Jerusalem, Israel, June 1996.)
-
FOCS 95
(The 36th IEEE Symposium on Foundations of Computer Science,
Milwaukee, Wisconsin)
Editorial boards
Honors and Grant Awards
Personal Information
Outside of work, I greatly enjoy my family (my
husband Alex and our four lovely boys) and the outdoors.
Links
Edith Cohen, edith (at) cohenwang (period) com
www.cohewang.com
Last Edited: August, 2018