Database System

Course Code: B63005H

Course Name: Database System (B)

Credits: 2.0

Level: Undergraduate

Pre-requisite: Data Structure, Programming, Principles of Computer

Lecture Time: 10 weeks, 2 sessions/week, 2 hours/session

Instructors: Dr. Shimin Chen

Course Description

Database system is the core technology in the age of big data. Student should master the following basic knowledge through the study of this course. (1) Model and theory of relational database; (2) Use of database and design of database application; (3) The classic data structures and algorithms in database system; (4) Characteristics and key technologies of different types of database system. At the same time, we hope that students can use the real system to solve practical problems and cultivate their practical ability through programming experiments, thus having a more profound understanding of class content. This course will lay a solid foundation, when students are engaged in scientific research, application development, system optimization, data analysis in the direction of database system and big data processing.

Topics and Schedule

  1. Description of Database System (2 hrs)

1.1.   Origin and development of database

1.2.   Relationship between database and operating system

1.3.   Relationship between database and file system

1.4.   Relationship between database and data storage

1.5.   Relationship between database and data mining

  1. Relational Model and Theory (10 hrs)

2.1.   E-R model

2.2.   Integrity constraints

2.3.   Functional dependency

2.4.   Normalized paradigm

2.5.   Data view

2.6.   Relational algebra and set-representation

2.7.   Selection, projection and join

2.8.   Relational algebra expression ability

2.9.   Atomicity, consistency, isolation and durability (ACID)

  1. Structured Query Language (6 hrs)

3.1.   Origin and development of structured Query Language

3.2.   Combination relationship model and algebra

3.3.   SQL, DDL, DML, DCL, TCL and Nested SQL

3.4.   UDF and stored procedure

3.5.   Expression ability of SQL

3.6.   Typical table definition, query and update based on TPC benchmarking

3.7.   Database Systems Lab

  1. Using of Database (Lab1) (2 hrs)

4.1.   Install open source database system (PostgreSQL or MySQL)

4.2.   Use TPC-H dbgen to generate TPC-H database and data loading

4.3.   Write SQL according to requirements

4.4.   Run TPC-H query to test performance

4.5.   Database Systems Lab

  1. Application Design of Database (2 hrs)

5.1.   Commonly used database system (Oracle, Microsoft SQL Server, IBM DB2, MySQL and PostgreSQL)

5.2.   Procedure of database application design

5.3.   JDBC/ODBC programming

5.4.   Web server, application server and db server

Lab 2:

Design an application which based on back-end services of database, it is modeled on the Amazon.com, 12306, CMT/Easychair conference system, Linkedin, or lending system, etc. A project team should have three people to carry on it.

  1. Internal Structure of Database System (1 hrs)

6.1.   Front-end and back-end

6.2.   Syntax parsing and system directory

6.3.   Generation of operation plan

6.4.   Query optimization

6.5.   Data storage, index, buffer pool and query execution

  1. Data Storage and Access (7 hrs)

7.1.   Storage mode (file and partition) and space management

7.2.   Internal structure of data page

7.3.   Line and column storage

7.4.   Insert, delete and modification operations

7.5.   Operation of buffer pool

7.6.   Replacement algorithm

7.7.   Index structure, B+/- Tree and Hash index

Lab 3:

Design to implement a database system for data analysis which support value type, type string data and support the function: select from where group by. A project team should have three people to carry on it.

  1. Data Operation (14 hrs)

8.1.   Storage mode (file and partition) and space management

8.2.   Algorithm of selection, group by, aggregation and distrinct

8.3.   Cost analysis

8.4.   Design of operator pull/push

8.5.   Blocking/non-blocking operator

8.6.   Nested loop join and nested loop index join

8.7.   Sort merge join

8.8.   Hash join

  1. Implementation of Analytical Database System (2 hrs)

9.1.   Relationship between modules in a complete database system

9.2.   Database Systems Lab

  1. Query and Optimization (2 hrs)

10.1.      Statistics and cost analysis

10.2. Equivalent transformation

10.3.      Optimization of operation plan and query

  1. Transaction Processing System (4 hrs)

11.1. Transaction management

11.2. 2-phase locking

11.3. Multi-Version Concurrency Control (MVCC)

11.4. Write-ahead logging

11.5. Crash recovery

  1. Data Warehousing System (2 hrs)

12.1.      Transaction management

12.2.      Online Analytical Processing (OLAP)

12.3.      Data cube

12.4.      Materialized view

  1. Main Memory Database System (2 hrs)

13.1.      Optimization of CPU cache

13.2.      Type of main memory database

13.3. Structure of MonetDB

13.4.      Structure of SAP HANA

  1. Initial Processing of Big Data (4 hrs)

14.1.      Characteristics of big data

14.2. MapReduce

14.3. KV Store

Grading

Three experimental assignments and labs will be given. These will be graded by group, and their scores will count for 60% of the total. At the end of the class it will be followed by a final examination which will count for 40%.

Textbook

Raghu Ramakrishnan, Johannes Gehrke. McGraw-Hill. Database Management Systems. 3rd edition (August 14, 2002)

Reference