Advanced SQL

profilepxmptee
CS352_CharlesWilliams_IP3.docx

1

RUNNING HEAD: Normalization

2

Normalization

NORMALIZATION

Charles Williams

CS352 Unit 3 IP

Professor Jeffery Karlberg

1/26/2019

Table of Contents Table of Contents………………………………………………………………………………………………………………………………………………………………………………………………..….2 The Database Models, Languages, and Architecture 3 Database System Development Life Cycle 5 Database Management Systems 6 Advanced SQL 11 Web and Data Warehousing and Mining in the Business World 12 References 13

Database management

It is important that a formal design methodology is used as it provides a mathematical approach to coming up with a reliable database that consolidates all the environments that use the database. A design methodology helps as it provides a way in which the whole designing and development can be done with minimum errors. The design methodology helps in identifying the requirements, the specifications and design levels of the database and data warehouse up for development. The planning stage of the consolidated data base is very important as it involves the coming up with plans that will guide the development of the database (Mabogunje, 2015). The plans help in managing quality, time, risks and other related issues that might affect the design and development of the database and eventually the data warehouse.

The three layers of the 3-level ANSI-SPARC architecture include; a physical schema which is responsible for defining how data is to be stored, a conceptual schema which is responsible for indexing and relating data, and the external schema which is responsible for showing how information was presented. The 3-level ANSI-SPARC type of architecture is designed to guard and guide data change. The primary function of the first layer is to define how data is stored. It is important to note that there can be changes in the physical schema and the changes will not affect how external applications will interact with the stored data (Pokorný, 2018). The second layer’s primary function is to provide a consolidated view of a database. The third layer’s primary function is to define richer APIs and it can do so without necessarily having to change the underlying storage mechanisms in place. The 3-level ANSI-SPARC architecture helps in promoting data independence which in turn helps save time in the long run through the conceptual schema which emphasizes data mapping.

Data administrator and database administrator

A data administrator is an individual whose function is to gather data requirements, analyze data as well as design data and classify data types. They two primary roles of a data administrator include; coming up with data standards that will be applied in databases, and coming up with policies that will dictate on data security, data access, data usage, dataflow well as data authorization in an organization. Other minor duties of data administrator include; playing an assistance role by coming up with data resources and allowing for the sharing of data across applications as well as across departments (Tenopir et al., 2015). Basically, Data administrators are primary deal with data flows and data needs in an organization.

On the other hand, a database administrator is an IT officer like individual whose primary responsibilities include; configuring and installing databases and database management systems, linking databases to information systems to help in application operations, ensuring that the database management systems in place are running optimally, checking and monitoring the database performance. The minor duties of a database administrator include ensuring data integrity and ensuring that appropriate data backups are made. (Yilmaz, O'Connor & Clarke, 2015). Basically, a database administrator is primarily supposed to oversee running and operations of database management systems.

I would recommend that my company hires two people, one as a data administrator and the other as a database administrator. The role of the data administrator will be to analyse and organize the data while the role of the database administrator will be to oversee the operations and functions of the database.

Database System Development Life Cycle

Un-normalized form.

Charity ID

Charity Name

Charity Location

POC Name

POC ID

Tel Extn.

Customer ID

Customer Name

Date

Contribution Started

No of Month

Date Places

Expected Contribution End

The above table is an un-normalized relation. First I will list all the attributes, identify the repeating groups then identify the key attributes.

The attributes are; charity ID, charity name, charity location, POC ID, Tel Extn, customer ID, customer name, date contribution started, no of month, date places and expected contribution end.

The key attributes are the POC ID, the customer ID and the charity ID.

The repeating groups are Date Contribution started, Date Places, No of Month, Charity Location and Expected Contribution End.

1st Normal form

Any relation can be said to be in the first (basic) normal form only if it has no repeating groups. To make a relation to be on the first normal form, first you remove all the repeating groups and then do a partitioning to the un-normalized relation above as shown below;

Charity ID

Charity Name

Charity Location

POC Name

Tel Extn.

Customer Name

Date

Contribution Started

No of Month

Date Places

Expected Contribution End

POC ID

Charity Name

Charity Location

POC Name

Tel Extn.

Customer Name

Date

Contribution Started

No of Month

Date Places

Expected Contribution End

Customer ID

Charity Name

Charity Location

POC Name

Tel Extn.

Customer Name

Date

Contribution Started

No of Month

Date Places

Expected Contribution End

Before moving to the second normal form, one has to understand the dependencies in the relations, the functional, the partial and the transitive dependencies.

2nd Normal form.

A relation can be on the 2nd normal form only if it is in the first normal form. Each and every non-key attribute is dependent on the key attribute.

To achieve the second normal form for a relation, one should remove any partial dependencies that are segregating the relation.

N.B any relation that is in First Normal For and does not have a composite key is definitely in Second Normal Form.

Charity ID

Charity Name

Charity Location

Date

Contribution Started

No of Month

Date Places

Expected Contribution End

POC ID

POC Name

Tel Extn.

Date

Contribution Started

No of Month

Date Places

Expected Contribution End

Customer ID

Customer Name

Date

Contribution Started

No of Month

Date Places

Expected Contribution End

3rd Normal form.

Any relation is said to be on the third normal form if it is in the second normal form and each and every non-key attributes is not dependent on all the other non-key attributes.

To achieve this. One has to remove any non-key attribute that are dependent on other non-key attributes. Any transitive dependencies should be removed (Gaikwad et.al, 2017).

Charity_Details.

Charity ID

Charity Name

Charity Location

POC_Details.

POC ID

POC Name

Tel Extn.

Customer_Details

Customer ID

Customer Name

Date_Details

Date

Contribution Started

No of Month

Date Places

Expected Contribution End

Charity ID

POC ID

Customer ID

Boyce-Codd normal form.

We all know that Boyce-Codd Normal Form is an extension of the Third Normal Form and it states that for each non-trivial functional dependency, Y->A, Y has to be a super-key, for example in the relation below;

Customer ID

Customer Name

In the customer relation, customer ID is the super key in the Customer_Details relation.

In rationalization of the results achieved, the Charity ID, the POC ID and the Customer_ ID are the foreign keys.

Normalization is through and ready for designing of a high-quality database that has a better degree of independence (Eessar, 2016).

Vendor

Vendor

Vendor

Product Version

MySQL

Product Version

Oracle 12c, Enterprise

Product Version

Microsoft SQL Server

Desired Feature

Comments

Rating

Weight

Score

Desired Feature

Comments

Rating

Weight

Score

Desired Feature

Comments

Rating

Weight

Score

Vendor Stability

stable

3.5

0.1

0.3

Vendor Stability

stable

3.6

0.1

0.36

Vendor Stability

stable

4.5

0.1

0.45

Operating System Choice and Requirements

IntelCore

4

0.05

0.2

Operating System Choice and Requirements

Linux

4.5

0.05

0.225

Operating System Choice and Requirements

Linux, Microsoft Windows Server, Microsoft Windows

4

0.05

0.2

Ease of Use/Training Requirements

Easy

4.5

0.2

0.9

Ease of Use/Training Requirements

Hard

4

0.2

0.8

Ease of Use/Training Requirements

Easy

3

0.2

0.6

Ease of Data Model Creation

Easy

3

0.15

0.45

Ease of Data Model Creation

Easy

4.5

0.15

0.675

Ease of Data Model Creation

Easy

4.5

0.15

0.675

Distributed Database Support

No

2

0.1

0.2

Distributed Database Support

Yes

4.5

0.1

0.45

Distributed Database Support

Yes

4.5

0.1

0.45

Administration Ease

Easy

4

0.2

0.8

Administration Ease

Moderate

4

0.2

0.8

Administration Ease

Easy

3

0.2

0.6

Cost

$5,500

4.5

0.2

0.9

Cost

$47,500

3.5

0.2

0.7

Cost

$28,000

3

0.2

0.6

Total

3.75

Total

4.01

Total

3.575

3.75 out of 5

4.01 out of 5

3.575 out of 5

Advanced SQL(TBD)

Web and Data Warehousing and Mining in the Business World(TBD)

References

Eessaar, E. (2016). The database normalization theory and the theory of normalized systems: finding a common ground. Baltic Journal of Modern Computing4(1), 5.

Gaikwad, A. S., Kadri, F. A., Khandagle, S. S., & Tava, N. I. (2017). Review on Automation Tool for ERD Normalization. International Research Journal of Engineering and Technology (IRJET)[Online]4(2), 1323-1325.

Yilmaz, M., O'Connor, R. V., & Clarke, P. (2015). Software development roles: a multi-project empirical investigation. ACM SIGSOFT Software Engineering Notes40(1), 1-5.

Tenopir, C., Hughes, D., Allard, S., Frame, M., Birch, B., Baird, L., ... & Lundeen, A. (2015). Research data services in academic libraries: Data intensive roles for the future?. Journal of eScience Librarianship4(2), 4.

Pokorný, J. (2018, March). Integration of Relational and NoSQL Databases. In Asian Conference on Intelligent Information and Database Systems (pp. 35-45). Springer, Cham.

Mabogunje, A. (2015). The development process: A spatial perspective. Routledge.