Dr. Gautam Das


 

CSE 6392 SEC 013

DATA EXPLORATION AND ANALYSIS IN RELATIONAL DATABASES

 
Spring 2006
 
 

Dr. Gautam Das
Office: 302 Nedderman Hall
Phone: 817 272 7595
Email: gdas@cse.uta.edu

Office Hours: Tue-Thu 1:00-2:00pm (or by appt)

 
   

About the Course

Much of the world’s recorded data is locked up in structured sources such as databases, which are often the proprietary information of private corporations and government agencies. Searching for and exploring information within databases is currently very cumbersome - often the data explorer has to know comprehensive query languages (such as SQL), as well as how the data is structured into different tables and columns (the database schema). In recent years, researchers have studied the problem of improving the search and exploration capabilities of relational databases. This work includes adapting probabilistic and approximate querying methods to improve the scalability of query answering, as well as information retrieval techniques such as relevance ranking and keyword search. This class will explore recent research efforts in these important and challenging fields. We will read and discuss the latest research literature from premier conferences in databases and information retrieval. It is hoped that this class will spur students to pursue further research in these areas.

The following is a tentative list of topics which we will attempt to cover:

1. Probabilistic Methods in Databases

            Sampling Methods in Databases: Basics

            Approximate Query Processing

            Processing of Fuzzy/Uncertain Data

2. Unstructured Search in Databases

            Keyword Queries in Databases

            Ranking of Database Query Results 

3. DB and IR integration

            Top-K algorithms

We will cover these topics in breadth, understand the central contributions of these efforts, and try to predict future research directions.
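As a small illustration of the first topic area, the idea behind sampling-based approximate query processing can be sketched as follows. This is only a minimal sketch, not taken from the course reading list: the function name approximate_sum, the synthetic "sales" column, and the 1% sample fraction are all assumptions made for illustration. It estimates a SUM aggregate from a uniform random sample and scales the result up to the full table size.

```python
import random


def approximate_sum(values, sample_fraction, seed=0):
    """Estimate sum(values) from a uniform random sample (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(values)
    # Sample size: at least one row, at most the whole column.
    k = max(1, min(n, int(n * sample_fraction)))
    sample = rng.sample(values, k)  # uniform sampling without replacement
    # Scale the sample total by n / k to estimate the full-table total.
    return sum(sample) * n / k


# Synthetic stand-in for a numeric column (an assumption, not real course data).
data_rng = random.Random(42)
sales = [data_rng.uniform(10, 100) for _ in range(100_000)]

exact = sum(sales)                       # full scan of the column
estimate = approximate_sum(sales, 0.01)  # reads only 1% of the rows
relative_error = abs(estimate - exact) / exact

print(f"exact={exact:.1f} estimate={estimate:.1f} relative error={relative_error:.4f}")
```

The estimate touches only a small fraction of the rows, which is the source of the scalability gain; the price is a sampling error that shrinks as the sample grows.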

Prerequisites

Advanced Algorithms and Database II are the prerequisite courses. However, exceptions will be made on a case-by-case basis, especially if the student has prior exposure to these concepts or demonstrates the initiative to learn them quickly on his/her own.

Presentations

The actual reading list, consisting of recent research papers, will be selected and finalized by the first week of classes. Each student will present one or more papers (depending on the enrollment) during the semester. Students will participate in class discussions during and after each presentation. Attendance is required.

Project

In addition to reading papers, students will have the option of attempting a programming project during the semester. The projects will involve developing portions of information retrieval systems for structured databases, based on the techniques suggested in the papers. The projects will also be tested using real data that the students should get access to. A long-term objective is for the more promising projects to serve as infrastructure/test-beds for students who continue their research in these areas beyond the course.

Evaluation

 The grade will be based on the paper presentations, class attendance and participation, and performance in the projects.

 
