CSI 445/660. Topics in Data Management Systems

Parallel Data Processing Systems

  1. [dean04] ``MapReduce: Simplified Data Processing on Large Clusters’’, Jeffrey Dean, Sanjay Ghemawat, OSDI 2004: 137-150.

  2. [malewicz10] ``Pregel: a system for large-scale graph processing’’, Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, Grzegorz Czajkowski, SIGMOD Conference 2010: 135-146.

Trajectory Data Management

  1. [cudre-mauroux10] ``TrajStore: An adaptive storage system for very large trajectory data sets’’, Philippe Cudré-Mauroux, Eugene Wu, Samuel Madden, ICDE 2010: 109-120.

Parallel/Distributed Database Systems

  1. [chang06] ``Bigtable: A Distributed Storage System for Structured Data’’, Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Michael Burrows, Tushar Chandra, Andrew Fikes, Robert Gruber, OSDI 2006: 205-218 (Awarded Best Paper!).

  2. [jones10] ``Low overhead concurrency control for partitioned main memory databases’’, Evan P. C. Jones, Daniel J. Abadi, Samuel Madden, SIGMOD Conference 2010: 603-614.

  3. [thomson10] ``The Case for Determinism in Database Systems’’, Alexander Thomson, Daniel J. Abadi, PVLDB 3(1): 70-80 (2010).

  4. [nehme11] ``Automated partitioning design in parallel database systems’’, Rimma V. Nehme, Nicolas Bruno, SIGMOD Conference 2011: 1137-1148.

  5. [thomson12] ``Calvin: fast distributed transactions for partitioned database systems’’, Alexander Thomson, Thaddeus Diamond, Shu-Chun Weng, Kun Ren, Philip Shao, Daniel J. Abadi, SIGMOD Conference 2012: 1-12.

  6. [zhou12] ``Advanced partitioning techniques for massively distributed computation’’, Jingren Zhou, Nicolas Bruno, Wei Lin, SIGMOD Conference 2012: 13-24.

  7. [kwon12] ``SkewTune: mitigating skew in mapreduce applications’’, YongChul Kwon, Magdalena Balazinska, Bill Howe, Jerome A. Rolia, SIGMOD Conference 2012: 25-36.

Semantic Web

  1. [huang11] ``Scalable SPARQL Querying of Large RDF Graphs’’, Jiewen Huang, Daniel J. Abadi, Kun Ren, PVLDB 4(11): 1123-1134 (2011).

Graph Data Management

  1. [karypis98] ``A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs’’, George Karypis, Vipin Kumar, SIAM Journal on Scientific Computing 20(1): 359-392 (1998).

  2. [zhang10] ``Discovery-driven graph summarization’’, Ning Zhang, Yuanyuan Tian, Jignesh M. Patel, ICDE 2010: 880-891.

  3. [ren11] ``On Querying Historical Evolving Graph Sequences’’, Chenghui Ren, Eric Lo, Ben Kao, Xinjie Zhu, Reynold Cheng, PVLDB 4(11): 726-737 (2011).

  4. [jin12] ``SCARAB: scaling reachability computation on large graphs’’, Ruoming Jin, Ning Ruan, Saikat Dey, Jeffrey Xu Yu, SIGMOD Conference 2012: 169-180.

  5. [fan12a] ``Query preserving graph compression’’, Wenfei Fan, Jianzhong Li, Xin Wang, Yinghui Wu, SIGMOD Conference 2012: 157-168.

  6. [guan12] ``Measuring Two-Event Structural Correlations on Graphs’’, Ziyu Guan, Xifeng Yan, Lance M. Kaplan, PVLDB 5(11): 1400-1411 (2012).

  7. [fan12b] ``Performance Guarantees for Distributed Reachability Queries’’, Wenfei Fan, Xin Wang, Yinghui Wu, PVLDB 5(11): 1304-1315 (2012).

  8. [cheng12] ``K-Reach: Who is in Your Small World’’, James Cheng, Zechao Shang, Hong Cheng, Haixun Wang, Jeffrey Xu Yu, PVLDB 5(11): 1292-1303 (2012).

Query Processing/Optimization

  1. [vernica10] ``Efficient parallel set-similarity joins using MapReduce’’, Rares Vernica, Michael J. Carey, Chen Li, SIGMOD Conference 2010: 495-506.

  2. [herodotou11] ``Query optimization techniques for partitioned tables’’, Herodotos Herodotou, Nedyalko Borisov, Shivnath Babu, SIGMOD Conference 2011: 49-60.

  3. [elmore11] ``Zephyr: live migration in shared nothing databases for elastic cloud platforms’’, Aaron J. Elmore, Sudipto Das, Divyakant Agrawal, Amr El Abbadi, SIGMOD Conference 2011: 301-312.

  4. [cheung12] ``Automatic Partitioning of Database Applications’’, Alvin Cheung, Owen Arden, Samuel Madden, Andrew C. Myers, PVLDB 5(11): 1471-1482 (2012).

Main Memory Database Systems

  1. [grund10] ``HYRISE - A Main Memory Hybrid Storage Engine’’, Martin Grund, Jens Krüger, Hasso Plattner, Alexander Zeier, Philippe Cudré-Mauroux, Samuel Madden, PVLDB 4(2): 105-116 (2010).

  2. [kemper11] ``HyPer: A hybrid OLTP&OLAP main memory database system based on virtual memory snapshots’’, Alfons Kemper, Thomas Neumann, ICDE 2011: 195-206.

High Availability

  1. [yang10] ``Osprey: Implementing MapReduce-style fault tolerance in a shared-nothing distributed database’’, Christopher Yang, Christine Yen, Ceryen Tan, Samuel Madden, ICDE 2010: 657-668.

  2. [johnson10] ``Aether: A Scalable Approach to Logging’’, Ryan Johnson, Ippokratis Pandis, Radu Stoica, Manos Athanassoulis, Anastasia Ailamaki, PVLDB 3(1): 681-692 (2010).

  3. [cao11] ``Fast checkpoint recovery algorithms for frequently consistent applications’’, Tuan Cao, Marcos Antonio Vaz Salles, Benjamin Sowell, Yao Yue, Alan J. Demers, Johannes Gehrke, Walker M. White, SIGMOD Conference 2011: 265-276.

Human in the Loop

  1. [parameswaran11] ``Human-assisted graph search: it's okay to ask questions’’, Aditya G. Parameswaran, Anish Das Sarma, Hector Garcia-Molina, Neoklis Polyzotis, Jennifer Widom, PVLDB 4(5): 267-278 (2011).

  2. [franklin11] ``CrowdDB: answering queries with crowdsourcing’’, Michael J. Franklin, Donald Kossmann, Tim Kraska, Sukriti Ramesh, Reynold Xin, SIGMOD Conference 2011: 61-72.