CSE 712 Principles of Data Models and Query Languages

Registration #22700

Instructor

Dr. Jan Chomicki, Associate Professor

Time and location

Mon 4:00-7:00, Davis 310 (changed).

Piazza

Class page

Talks

Date | Topics | Presenter | References
Jan 21 | No class | None | None
Jan 28 | Preference Queries | Jan Chomicki | J. Chomicki: Logical Foundations of Preference Queries. DEBull 2011.
Feb 4 | Missing Query Answers | Ying Yang | A. Chapman, H. Jagadish: Why Not?, SIGMOD 2009.
Feb 4 | RDF Graph Matching | L. B. Kandlakunta | L. Zou et al: gstore: Answering SPARQL Queries via Subgraph Matching. VLDB 2011.
Feb 11 | GraphQL | S. S. Saley | H. He, A. Singh: Graphs-at-a-Time: Query Language and Access Methods for Graph Databases. SIGMOD 2008.
Feb 11 | Representative Skylines | N. Meneghetti | A. Das Sarma et al: Representative Skylines using Threshold-based Preference Distributions. ICDE 2011.
Feb 18 | Query Result Presentation | Kiran Ramachandra | B. Liu, H. V. Jagadish: Using Trees to Depict a Forest. VLDB 2009.
Feb 18 | Keyword Search and Forms | A. Vasudeva | E. Chiu: Combining Keyword Search and Forms for Ad Hoc Querying of Databases. SIGMOD 2009.
Feb 25 | CrowdER | Varun Ranga | J. Wang et al: CrowdER: Crowdsourcing Entity Resolution. VLDB 2012.
Feb 25 | Crowdsourcing | Patricia Ortega | M. J. Franklin et al.: CrowdDB: answering queries with crowdsourcing. SIGMOD 2011.
March 18 | Entangled Queries | Satyaditya M | N. Gupta et al: Entangled Queries: Enabling Declarative Data-Driven Coordination. SIGMOD 2011.
March 25 | SODA | N. B. Karnati | L. Blunschi et al.: SODA: Generating SQL for Business Users. VLDB 2012.
March 25 | Trade-offs in Preference Queries | S. C. Yennamani | Ch. Lofi et al: Efficient Computation of Trade-Off Skylines. EDBT 2010.
April 1 | YAGO Ontology | S. Mehra | F. Suchanek et al: Yago - A Core of Semantic Knowledge, WWW 2007.
April 1 | YAGO2 | M. C. Vishnubhotla | J. Hoffart et al: YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia, AI Journal, January 2013.
April 8 | Deduplication | Shilpa Murthy Ayyala Somayajula | A. K. Elmagarmid et al: Duplicate record detection: A survey. TKDE 2007.
April 8 | Reverse Top-K Queries | Kalpesh Kagresha | A. Vlachou et al: Identifying the Most Influential Data Objects with Reverse Top-k Queries. VLDB 2010.
April 15, 5:00pm | One algorithm to rule them all: One join query at a time | Atri Rudra | PODS'12 Best Paper Award. (No critique required.)
April 29 | K*SQL | S. V. Adivikolanu | B. Mozafari et al: From Regular Expressions to Nested Words: Unifying Languages and Query Execution for Relational and XML Sequences, VLDB 2010.
TBD | Temporal Graphs | TBD | C. Ren et al: On Querying Historical Evolving Graph Sequences. VLDB 2011.
TBD | SPARQL Query Relaxation | TBD | A. Poulovassilis, P. Wood: Combining Approximation and Relaxation in Semantic Web Path Queries, ISWC 2010.
TBD | Copying Relationships | TBD | X. L. Dong: Global Detection of Complex Copying Relationships Between Sources. VLDB 2010.
TBD | Certain Fixes | TBD | W. Fan et al: Towards Certain Fixes with Editing Rules and Master Data, VLDB Journal 2012 (shorter version: VLDB 2010).
TBD | Conditional Functional Dependencies | TBD | W. Fan et al: Conditional functional dependencies for capturing data inconsistencies. TODS 2008 (shorter version: ICDE 2007).
TBD | Creating Best Packages | TBD | Q. Wan et al: Creating Competitive Products. VLDB 2009.
TBD | Reverse Query Processing | TBD | C. Binnig et al: Reverse Query Processing. ICDE 2007.
TBD | Probabilistic Skyline | TBD | I. Bartolini et al: The Skyline of a Probabilistic Relation. TKDE 2012.

Summary

The seminar focuses on current issues in data models and query languages, including graph databases, the Semantic Web, data quality and provenance, and preference queries. A tentative list of topics:
  1. Graph databases:
    • query languages
    • query evaluation
    • applications: ontologies, social networks.
  2. Semantic Web:
    • query approximation and relaxation
    • description logics
    • ontologies.
  3. Provenance:
    • basic models: why- and how-provenance
    • semiring representation
    • why-not provenance.
  4. Data quality:
    • data cleaning
    • new classes of integrity constraints
    • crowdsourcing
    • entity resolution
    • copying detection
    • truth, corroboration, inconsistency
    • trust, reputation.
  5. Preference queries:
    • preference learning
    • new applications.
  6. Schema evolution.
  7. Entanglement:
    • entangled queries
    • entangled transactions.
  8. Views:
    • query evaluation using views
    • view synthesis.

Workload

  1. Prepare and present a talk based on one or more papers from the current database research literature (I will distribute the papers and help with the presentation).
  2. Prepare a report based on the same material.
  3. Attend all the classes.
  4. Read each presented paper in advance and send in a brief critique before the presentation.
There will also be presentations by the instructor and/or invited speakers.

Prerequisites

Students are expected to have some background in data models, query languages, indexing, query evaluation and optimization, and logic. A graduate course at the level of CSE 562 is sufficient; in other cases, please contact the instructor.

Grading

The seminar is graded S/U and can be taken for 1-3 credits. The requirements are independent of the number of credits.