Data, Information, and Databases
THE BUSINESS BENEFITS OF HIGH-QUALITY INFORMATION
LO 6.1: Explain the four primary traits that determine the value of information.
Information is powerful. Information can tell an organization how its current operations are performing and help it estimate and strategize about how future operations might perform. The ability to understand, digest, analyze, and filter information is key to growth and success for any professional in any industry. Remember that new perspectives and opportunities can open up when you have the right data that you can turn into information and ultimately business intelligence.
Information is everywhere in an organization. Managers in sales, marketing, human resources, and management need information to run their departments and make daily decisions. When addressing a significant business issue, employees must be able to obtain and analyze all the relevant information so they can make the best decision possible. Information comes at different levels, formats, and granularities. Information granularity refers to the extent of detail within the information (fine and detailed or coarse and abstract). Employees must be able to correlate the different levels, formats, and granularities of information when making decisions. For example, a company might be collecting information from various suppliers to make needed decisions, only to find that the information is in different levels, formats, and granularities. One supplier might send detailed information in a spreadsheet, whereas another supplier might send summary information in a Word document, and still another might send a collection of information from emails. Employees will need to compare these differing types of information for what they commonly reveal to make strategic decisions. The accompanying figure displays the various levels, formats, and granularities of organizational information.
Successfully collecting, compiling, sorting, and finally analyzing information from multiple levels, in varied formats, and exhibiting different granularities can provide tremendous insight into how an organization is performing. Exciting and unexpected results can include potential new markets, new ways of reaching customers, and even new methods of doing business. After understanding the different levels, formats, and granularities of information, managers next want to look at the four primary traits that help determine the value of information (see the accompanying figure).
Information Type: Transactional and Analytical
As discussed previously in the text, the two primary types of information are transactional and analytical. Transactional information encompasses all of the information contained within a single business process or unit of work, and its primary purpose is to support daily operational tasks. Organizations need to capture and store transactional information to perform operational tasks and repetitive decisions such as analyzing daily sales reports and production schedules to determine how much inventory to carry. Consider Walmart, which handles more than 1 million customer transactions every hour, and Facebook, which keeps track of 400 million active users (along with their photos, friends, and web links). In addition, every time a cash register rings up a sale, a deposit or withdrawal is made from an ATM, or a receipt is given at the gas pump, the transactional information must be captured and stored.
Levels, Formats, and Granularities of Organizational Information
Analytical information encompasses all organizational information, and its primary purpose is to support the performance of managerial analysis tasks. Analytical information is useful when making important decisions such as whether the organization should build a new manufacturing plant or hire additional sales personnel. Analytical information makes it possible to do many things that previously were difficult to accomplish, such as spot business trends, prevent diseases, and fight crime. For example, credit card companies crunch through billions of transactional purchase records to identify fraudulent activity. Indicators such as charges in a foreign country or consecutive purchases of gasoline send a red flag highlighting potential fraudulent activity.
Walmart was able to use its massive amount of analytical information to identify many unusual trends, such as a correlation between storms and Pop-Tarts. Yes, Walmart discovered an increase in the demand for Pop-Tarts during the storm season. Armed with that valuable information, the retail chain was able to stock up on Pop-Tarts that were ready for purchase when customers arrived. The accompanying figure displays different types of transactional and analytical information.
The Four Primary Traits of the Value of Information
Timeliness is an aspect of information that depends on the situation. In some firms or industries, information that is a few days or weeks old can be relevant, whereas in others information that is a few minutes old can be almost worthless. Some organizations, such as 911 response centers, stock traders, and banks, require up-to-the-second information. Other organizations, such as insurance and construction companies, require only daily or even weekly information.
Real-time information means immediate, up-to-date information. Real-time systems provide real-time information in response to requests. Many organizations use real-time systems to uncover key corporate transactional information. The growing demand for real-time information stems from organizations’ need to make faster and more effective decisions, keep smaller inventories, operate more efficiently, and track performance more carefully. Information also needs to be timely in the sense that it meets employees’ needs, but no more. If employees can absorb information only on an hourly or daily basis, there is no need to gather real-time information in smaller increments.
Most people request real-time information without understanding one of the biggest pitfalls associated with real-time information—continual change. Imagine the following scenario: Three managers meet at the end of the day to discuss a business problem. Each manager has gathered information at different times during the day to create a picture of the situation. Each manager’s picture may be different because of the time differences. Their views on the business problem may not match because the information they are basing their analysis on is continually changing. This approach may not speed up decision making, and it may actually slow it down. Business decision makers must evaluate the timeliness of the information for every decision. Organizations do not want to find themselves using real-time information to make a bad decision faster.
Business decisions are only as good as the quality of the information used to make them. Data inconsistency occurs when the same data element has different values. Take, for example, the amount of work needed to update the record of a customer who has changed her last name due to marriage. Changing this information in only a few organizational systems will lead to data inconsistencies, causing customer 123456 to be associated with two last names. Data integrity issues occur when a system produces incorrect, inconsistent, or duplicate data. Data integrity issues can cause managers to consider the system reports invalid and to make decisions based on other sources.
Transactional versus Analytical Information
Five Common Characteristics of High-Quality Information
To ensure that your systems do not suffer from data integrity issues, review them for the five characteristics common to high-quality information: accuracy, completeness, consistency, timeliness, and uniqueness. The accompanying figure provides an example of several problems associated with using low-quality information, including:
1. Completeness. The customer’s first name is missing.
2. Another issue with completeness. The street address contains only a number and not a street name.
3. Consistency. There may be a duplication of information since there is a slight difference between the two customers in the spelling of the last name. Similar street addresses and phone numbers make this likely.
Example of Low-Quality Information
APPLY YOUR KNOWLEDGE
BUSINESS DRIVEN MIS
Determining Information Quality Issues
Real People magazine is geared toward working individuals and provides articles and advice on everything from car maintenance to family planning. The magazine is currently experiencing problems with its distribution list. More than 30 percent of the magazines mailed are returned because of incorrect address information, and each month it receives numerous calls from angry customers complaining that they have not yet received their magazines. Below is a sample of Real People’s customer information. Create a report detailing all the issues with the information, potential causes of the information issues, and solutions the company can follow to correct the situation.
4. Accuracy. This may be inaccurate information because the customer’s phone and fax numbers are the same. Some customers might have the same number for phone and fax, but the fact that the customer also has this number in the email address field is suspicious.
5. Another issue with accuracy. There is inaccurate information because a phone number is located in the email address field.
6. Another issue with completeness. The information is incomplete because there is not a valid area code for the phone and fax numbers.
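Checks for problems like these can be partly automated. The following is a minimal sketch, assuming hypothetical customer records and field names (`first_name`, `street`, `email`, `phone`) that do not come from the figure; the rules are simplified illustrations rather than a complete validation scheme.

```python
import re

# Hypothetical customer records; the field names and values are
# illustrative assumptions, not data from the text.
customers = [
    {"id": 1, "first_name": "", "last_name": "Lopez",
     "street": "12 Oak St", "email": "lopez@example.com",
     "phone": "(555) 010-1234"},
    {"id": 2, "first_name": "Ana", "last_name": "Lopes",
     "street": "12 Oak St", "email": "(555) 010-1234",
     "phone": "(555) 010-1234"},
    {"id": 3, "first_name": "Rob", "last_name": "Cross",
     "street": "48", "email": "rob@example.com",
     "phone": "010-9876"},
]

# Assumed phone format: "(AAA) NNN-NNNN", where AAA is the area code.
PHONE_WITH_AREA_CODE = re.compile(r"^\(\d{3}\) \d{3}-\d{4}$")


def find_quality_issues(records):
    """Return (record id, issue) pairs for simple rule violations."""
    issues = []
    seen = {}  # (street, phone) -> id of the first record with that pair
    for rec in records:
        # Completeness: required values must be present.
        if not rec["first_name"].strip():
            issues.append((rec["id"], "completeness: first name missing"))
        if rec["street"].strip().isdigit():
            issues.append((rec["id"], "completeness: street name missing"))
        if not PHONE_WITH_AREA_CODE.match(rec["phone"]):
            issues.append((rec["id"], "completeness: no valid area code"))
        # Accuracy: the email field should hold an email address.
        if "@" not in rec["email"]:
            issues.append((rec["id"], "accuracy: email field is not an email"))
        # Consistency: same street and phone suggests a duplicate customer.
        key = (rec["street"], rec["phone"])
        if key in seen:
            issues.append(
                (rec["id"], f"consistency: possible duplicate of {seen[key]}")
            )
        else:
            seen[key] = rec["id"]
    return issues


issues = find_quality_issues(customers)
for record_id, issue in issues:
    print(record_id, issue)
```

Rules of this kind only flag suspicious records; deciding whether two similar records really describe the same customer still requires human review.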
Nestlé uses 550,000 suppliers to sell more than 100,000 products in 200 countries. However, due to poor information, the company was unable to evaluate its business effectively. After some analysis, it found that it had 9 million records of vendors, customers, and materials, half of which were duplicated, obsolete, inaccurate, or incomplete. The analysis discovered that some records abbreviated vendor names, and other records spelled out the vendor names. This created multiple accounts for the same customer, making it impossible to determine the true value of Nestlé’s customers. Without being able to identify customer profitability, a company runs the risk of alienating its best customers.
Knowing how low-quality information issues typically occur can help a company correct them. Addressing these errors will significantly improve the quality of company information and the value to be extracted from it. The four primary reasons for low-quality information are:
1. Online customers intentionally enter inaccurate information to protect their privacy.
2. Different systems have different information entry standards and formats.
3. Data-entry personnel enter abbreviated information to save time or erroneous information by accident.
4. Third-party and external information contains inconsistencies, inaccuracies, and errors.
Understanding the Costs of Using Low-Quality Information
Using the wrong information can lead managers to make erroneous decisions. Erroneous decisions in turn can cost time, money, reputations, and even jobs. Some of the serious business consequences that occur due to using low-quality information to make decisions are:
Inability to track customers accurately.
Difficulty identifying the organization’s most valuable customers.
Inability to identify selling opportunities.
Lost revenue opportunities from marketing to nonexistent customers.
The cost of sending undeliverable mail.
Difficulty tracking revenue because of inaccurate invoices.
Inability to build strong relationships with customers.
Understanding the Benefits of Using High-Quality Information
High-quality information can significantly improve the chances of making a good decision and directly increase an organization’s bottom line. One company discovered that even with its large number of golf courses, Phoenix, Arizona, is not a good place to sell golf clubs. An analysis revealed that typical golfers in Phoenix are tourists and conventioneers who usually bring their clubs with them. The analysis further revealed that two of the best places to sell golf clubs in the United States are Rochester, New York, and Detroit, Michigan. Equipped with this valuable information, the company was able to place its stores strategically and launch its marketing campaigns.
High-quality information does not automatically guarantee that every decision made is going to be a good one, because people ultimately make decisions and no one is perfect. However, such information ensures that the basis of the decisions is accurate. The success of the organization depends on appreciating and leveraging the true value of timely and high-quality information.
Information is a vital resource, and users need to be educated on what they can and cannot do with it. To ensure that a firm manages its information correctly, it will need special policies and procedures establishing rules on how the information is organized, updated, maintained, and accessed. Every firm, large and small, should create an information policy concerning data governance. Data governance refers to the overall management of the availability, usability, integrity, and security of company data. Master data management (MDM) is the practice of gathering data and ensuring that it is uniform, accurate, consistent, and complete, including such entities as customers, suppliers, products, sales, employees, and other critical entities that are commonly integrated across organizational systems. MDM is commonly included in data governance. A company that supports a data governance program has a defined policy that specifies who is accountable for various portions or aspects of the data, including its accuracy, accessibility, consistency, timeliness, and completeness. The policy should clearly define the processes concerning how to store, archive, back up, and secure the data. In addition, the company should create a set of procedures identifying accessibility levels for employees. Then, the firm should deploy controls and procedures that enforce government regulations and compliance with mandates such as Sarbanes-Oxley.
STORING INFORMATION USING A RELATIONAL DATABASE MANAGEMENT SYSTEM
LO 6.2: Describe a database, a database management system, and the relational database model.
The core components of any system, regardless of size, are a database and a database management system. Broadly defined, a database maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses). A database management system (DBMS) creates, reads, updates, and deletes data in a database while controlling access and security. Managers send requests to the DBMS, and the DBMS performs the actual manipulation of the data in the database. Companies store their information in databases, and managers access these systems to answer operational questions such as how many customers purchased Product A in December or what the average sales were by region. Two primary tools are available for retrieving information from a DBMS. First is a query-by-example (QBE) tool that helps users graphically design the answer to a question against a database. Second is a structured query language (SQL) that asks users to write lines of code to answer questions against a database. Managers typically interact with QBE tools, and MIS professionals have the skills required to code SQL. The accompanying figure displays the relationship between a database, a DBMS, and a user. Some of the more popular examples of DBMS include MySQL, Microsoft Access, SQL Server, FileMaker, Oracle, and FoxPro.
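To make the two sample questions concrete, the following is a minimal sketch of how they might be written in SQL, run here through Python's built-in `sqlite3` module against an in-memory database. The `sales` table, its columns, and its rows are hypothetical and serve only as an illustration.

```python
import sqlite3

# In-memory database; the table name, columns, and rows below are
# hypothetical and exist only to illustrate the two questions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    " customer TEXT, product TEXT, region TEXT,"
    " sale_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        ("Avery", "Product A", "East", "2023-12-03", 100.0),
        ("Blake", "Product A", "West", "2023-12-15", 300.0),
        ("Avery", "Product A", "East", "2023-12-20", 200.0),
        ("Casey", "Product B", "West", "2023-12-21", 500.0),
        ("Drew", "Product A", "East", "2023-11-30", 60.0),
    ],
)

# How many distinct customers purchased Product A in December?
(december_buyers,) = conn.execute(
    "SELECT COUNT(DISTINCT customer) FROM sales"
    " WHERE product = ? AND strftime('%m', sale_date) = '12'",
    ("Product A",),
).fetchone()

# What were the average sales by region?
average_by_region = dict(
    conn.execute(
        "SELECT region, AVG(amount) FROM sales"
        " GROUP BY region ORDER BY region"
    ).fetchall()
)

print(december_buyers)
print(average_by_region)
conn.close()
```

A QBE tool would let a manager build the same two questions by selecting tables, fields, and criteria in a grid instead of typing the statements.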
APPLY YOUR KNOWLEDGE
BUSINESS DRIVEN DEBATE
Excel or Access?
Excel is a great tool with which to perform business analytics. Your friend, John Cross, owns a successful publishing company specializing in Do It Yourself books. John started the business 10 years ago and has slowly grown to 50 employees and $1 million in sales. John has been using Excel to run the majority of his business, tracking book orders, production orders, shipping orders, and billing. John even uses Excel to track employee payroll and vacation dates. To date, Excel has done the job, but as the company continues to grow, the tool is becoming inadequate.
You believe John could benefit from moving from Excel to Access. John is skeptical of the change because Excel has done the job up to now, and his employees are comfortable with the current processes and technology. John has asked you to prepare a presentation explaining the limitations of Excel and the benefits of Access. In a group, prepare the presentation that will help convince John to make the switch.
A data element (or data field) is the smallest or basic unit of information. Data elements can include a customer’s name, address, email, discount rate, preferred shipping method, product name, quantity ordered, and so on. Data models are logical data structures that detail the relationships among data elements by using graphics or pictures.
Metadata provides details about data. For example, metadata for an image could include its size, resolution, and date created. Metadata about a text document could contain document length, date created, author’s name, and summary. Each data element is given a description, such as Customer Name; metadata is provided for the type of data (text, numeric, alphanumeric, date, image, binary value) and descriptions of potential predefined values such as a certain area code; and finally the relationship is defined. A data dictionary compiles all of the metadata about the data elements in the data model. Looking at a data model along with reviewing the data dictionary provides tremendous insight into the database’s functions, purpose, and business rules.
DBMS use three primary data models for organizing information: hierarchical, network, and relational, with the relational database model the most prevalent. A relational database model stores information in the form of logically related two-dimensional tables. A relational database management system allows users to create, read, update, and delete data in a relational database. Although the hierarchical and network models are important, this text focuses only on the relational database model.
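One way to picture a data dictionary is as a lookup table of metadata keyed by data element. The sketch below assumes hypothetical element names and metadata fields; real DBMS products store this information in their own catalog formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementMetadata:
    """Metadata describing one data element (all fields are illustrative)."""
    description: str
    data_type: str               # e.g. text, numeric, date, image
    allowed_values: tuple = ()   # predefined values, if any
    related_to: str = ""         # relationship to another element, if any


# A tiny, hypothetical data dictionary keyed by data element name.
data_dictionary = {
    "CustomerName": ElementMetadata("Full name of the customer", "text"),
    "AreaCode": ElementMetadata(
        "Telephone area code", "numeric", allowed_values=("303", "720")
    ),
    "OrderDate": ElementMetadata(
        "Date the order was placed", "date", related_to="CustomerName"
    ),
}


def is_allowed(element: str, value: str) -> bool:
    """Check a value against the predefined values recorded for an element."""
    allowed = data_dictionary[element].allowed_values
    # Elements without predefined values accept any value.
    return not allowed or value in allowed


print(is_allowed("AreaCode", "303"))
print(is_allowed("AreaCode", "999"))
```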
Relationship of Database, DBMS, and User
Storing Data Elements in Entities and Attributes
For flexibility in supporting business operations, managers need to query or search for the answers to business questions such as which artist sold the most albums during a certain month. The relationships in the relational database model help managers extract this information. The accompanying figure illustrates the primary concepts of the relational database model: entities, attributes, keys, and relationships. An entity (also referred to as a table) stores information about a person, place, thing, transaction, or event. The entities, or tables, of interest in the figure are TRACKS, RECORDINGS, MUSICIANS, and CATEGORIES. Notice that each entity is stored in a different two-dimensional table (with rows and columns).
Attributes (also called columns or fields) are the data elements associated with an entity. In the figure, the attributes for the entity TRACKS are TrackNumber, TrackTitle, TrackLength, and RecordingID. Attributes for the entity MUSICIANS are MusicianID, MusicianName, MusicianPhoto, and MusicianNotes. A record is a collection of related data elements (in the MUSICIANS table, these include “3, Lady Gaga, , Do not bring young kids to live shows”). Each record in an entity occupies one row in its respective table.
Creating Relationships Through Keys
To manage and organize various entities within the relational database model, you use primary keys and foreign keys to create logical relationships. A primary key is a field (or group of fields) that uniquely identifies a given record in a table. In the table RECORDINGS, the primary key is the field RecordingID that uniquely identifies each record in the table. Primary keys are a critical piece of a relational database because they provide a way of distinguishing each record in a table; for instance, imagine you need to find information on a customer named Steve Smith. Simply searching the customer name would not be an ideal way to find the information because there might be 20 customers with the name Steve Smith. This is the reason the relational database model uses primary keys to identify each record uniquely. Using Steve Smith’s unique ID allows a manager to search the database to identify all information associated with this customer.
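The Steve Smith scenario can be sketched with Python's built-in `sqlite3` module. The `customers` table, its columns, and its rows are hypothetical; the point is only that a name search is ambiguous, a primary key search is not, and the DBMS refuses a second record with the same key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# customer_id is the primary key; table and column names are assumptions.
conn.execute(
    "CREATE TABLE customers ("
    " customer_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL, city TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (101, "Steve Smith", "Denver"),
        (102, "Steve Smith", "Boston"),
        (103, "Maria Lopez", "Denver"),
    ],
)

# Searching by name is ambiguous: two different customers match.
by_name = conn.execute(
    "SELECT customer_id, city FROM customers WHERE name = ?",
    ("Steve Smith",),
).fetchall()

# Searching by primary key identifies exactly one record.
by_key = conn.execute(
    "SELECT name, city FROM customers WHERE customer_id = ?", (102,)
).fetchall()

# The DBMS rejects a second record that reuses an existing primary key.
try:
    conn.execute(
        "INSERT INTO customers VALUES (101, 'Someone Else', 'Austin')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(len(by_name), by_key, duplicate_rejected)
conn.close()
```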
Primary Concepts of the Relational Database Model
APPLY YOUR KNOWLEDGE
BUSINESS DRIVEN START-UP
2 Trillion Rows of Data Analyzed Daily—No Problem
eBay is the world’s largest online marketplace, with 97 million global users selling anything to anyone at a yearly total of $62 billion—more than $2,000 every second. Of course with this many sales, eBay is collecting the equivalent of the Library of Congress worth of data every three days that must be analyzed to run the business successfully. Luckily, eBay discovered Tableau!
Tableau started at Stanford when Chris Stolte, a computer scientist; Pat Hanrahan, an Academy Award–winning professor; and Christian Chabot, a savvy business leader, decided to solve the problem of helping ordinary people understand big data. The three created Tableau, which bridged two computer science disciplines: computer graphics and databases. No more need to write code or understand the relational database keys and categories; users simply drag and drop pictures of what they want to analyze. Tableau has become one of the most successful data visualization tools on the market, winning multiple awards, expanding internationally, earning millions in revenue, and spawning multiple new inventions.
Tableau is revolutionizing business analytics, and this is only the beginning. Visit the Tableau website and become familiar with the tool by watching a few of the demos. Once you have a good understanding of the tool, create three questions eBay might be using Tableau to answer, including the analysis of its sales data to find patterns, business insights, and trends.
A foreign key is a primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables. For instance, Black Eyed Peas in the figure is one of the musicians appearing in the MUSICIANS table. Its primary key, MusicianID, is “2.” Notice that MusicianID also appears as an attribute in the RECORDINGS table. By matching these attributes, you create a relationship between the MUSICIANS and RECORDINGS tables that states the Black Eyed Peas (MusicianID 2) have several recordings, including The E.N.D., Monkey Business, and Elephunk. In essence, MusicianID in the RECORDINGS table creates a logical relationship (who was the musician that made the recording) to the MUSICIANS table. Creating the logical relationship between the tables allows managers to search the data and turn it into useful information.
Coca-Cola Relational Database Example
The accompanying figure illustrates the primary concepts of the relational database model for a sample order of soda from Coca-Cola and offers an excellent example of how data is stored in a database. For example, the order number is stored in the ORDER table, and each line item is stored in the ORDER LINE table. Entities include CUSTOMER, ORDER, ORDER LINE, PRODUCT, and DISTRIBUTOR. Attributes for CUSTOMER include Customer ID, Customer Name, Contact Name, and Phone. Attributes for PRODUCT include Product ID, Description, and Price. The columns in the table contain the attributes.
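The MUSICIANS and RECORDINGS relationship can be sketched as follows, using Python's built-in `sqlite3` module. The table names, MusicianID, and the Black Eyed Peas recordings come from the text; the RecordingTitle column, the RecordingID values, and the extra Lady Gaga row are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE MUSICIANS (
        MusicianID   INTEGER PRIMARY KEY,
        MusicianName TEXT NOT NULL
    );
    CREATE TABLE RECORDINGS (
        RecordingID    INTEGER PRIMARY KEY,
        RecordingTitle TEXT NOT NULL,      -- assumed column name
        MusicianID     INTEGER NOT NULL
                       REFERENCES MUSICIANS (MusicianID)  -- foreign key
    );
""")
conn.executemany(
    "INSERT INTO MUSICIANS VALUES (?, ?)",
    [(2, "Black Eyed Peas"), (3, "Lady Gaga")],
)
# RecordingID values are hypothetical.
conn.executemany(
    "INSERT INTO RECORDINGS VALUES (?, ?, ?)",
    [
        (1, "The E.N.D.", 2),
        (2, "Monkey Business", 2),
        (3, "Elephunk", 2),
        (4, "The Fame", 3),
    ],
)

# Matching MusicianID in both tables answers "which recordings did
# this musician make?"
titles = [
    row[0]
    for row in conn.execute(
        "SELECT r.RecordingTitle FROM RECORDINGS AS r"
        " JOIN MUSICIANS AS m ON m.MusicianID = r.MusicianID"
        " WHERE m.MusicianName = ? ORDER BY r.RecordingID",
        ("Black Eyed Peas",),
    )
]

# A recording that points to a nonexistent musician is rejected.
try:
    conn.execute("INSERT INTO RECORDINGS VALUES (5, 'Orphan Album', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(titles, orphan_rejected)
conn.close()
```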
Consider Hawkins Shipping, one of the distributors appearing in the DISTRIBUTOR table. Its primary key, Distributor ID, is DEN8001. Distributor ID also appears as an attribute in the ORDER table. This establishes that Hawkins Shipping (Distributor ID DEN8001) was responsible for delivering orders 34561 and 34562 to the appropriate customer(s). Therefore, Distributor ID in the ORDER table creates a logical relationship (who shipped what order) between ORDER and DISTRIBUTOR.
Potential Relational Database for Coca-Cola Bottling Company of Egypt (TCCBCE)
USING A RELATIONAL DATABASE FOR BUSINESS ADVANTAGES
LO 6.3: Identify the business advantages of a relational database.
Many business managers are familiar with Excel and other spreadsheet programs they can use to store business data. Although spreadsheets are excellent for supporting some data analysis, they offer limited functionality in terms of security, accessibility, and flexibility and can rarely scale to support business growth. From a business perspective, relational databases offer many advantages over using a text document or a spreadsheet, as displayed in the accompanying figure.
Databases tend to mirror business structures, and a database needs to handle changes quickly and easily, just as any business needs to be able to do. Equally important, databases need to provide flexibility in allowing each user to access the information in whatever way best suits his or her needs. The distinction between logical and physical views is important in understanding flexible database user views. The physical view of information deals with the physical storage of information on a storage device. The logical view of information focuses on how individual users logically access information to meet their own particular business needs.
In the music database illustration, for example, one user could perform a query to determine which recordings had a track length of four minutes or more. At the same time, another user could perform an analysis to determine the distribution of recordings as they relate to the different categories. For example, are there more R&B recordings than rock, or are they evenly distributed? This example demonstrates that although a database has only one physical view, it can easily support multiple logical views that provide for flexibility.
Consider another example—a mail-order business. One user might want a report presented in alphabetical format, in which case, the last name should appear before first name. Another user, working with a catalog mailing system, would want customer names appearing as first name and then last name. Both are easily achievable but different logical views of the same physical information.
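The mail-order example can be sketched with SQL views, which are one common way to give different users different logical views of the same stored data. The table, view names, and rows below are hypothetical, and Python's built-in `sqlite3` module is used only to run the statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One physical table holds the customer names (hypothetical data).
    CREATE TABLE customers (first_name TEXT, last_name TEXT);
    INSERT INTO customers VALUES
        ('Maria', 'Lopez'), ('Steve', 'Smith'), ('Ann', 'Baker');

    -- Logical view for the alphabetical report: last name first.
    CREATE VIEW report_names AS
        SELECT last_name || ', ' || first_name AS display_name
        FROM customers;

    -- Logical view for the catalog mailing: first name first.
    CREATE VIEW mailing_names AS
        SELECT first_name || ' ' || last_name AS display_name
        FROM customers;
""")

report = [
    row[0]
    for row in conn.execute(
        "SELECT display_name FROM report_names ORDER BY display_name"
    )
]
mailing = [
    row[0]
    for row in conn.execute(
        "SELECT display_name FROM mailing_names ORDER BY display_name"
    )
]

print(report)
print(mailing)
conn.close()
```

Neither view stores a second copy of the names; both are computed from the single `customers` table each time they are queried.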
Increased Scalability and Performance
In its first year of operation, the official website of the American Family Immigration History Center generated more than 2.5 billion hits. The site offers immigration information about people who entered America through the Port of New York and Ellis Island between 1892 and 1924. The database contains more than 25 million passenger names that are correlated to 3.5 million images of ships’ manifests.
The database had to be scalable to handle the massive volumes of information and the large numbers of users expected for the launch of the website. In addition, the database needed to perform quickly under heavy use. Some organizations must be able to support hundreds or thousands of users, including employees, partners, customers, and suppliers, who all want to access and share the same information. Databases today scale to exceptional levels, allowing all types of users and programs to perform information-processing and information-searching tasks.
Business Advantages of a Relational Database
Reduced Information Redundancy
Information redundancy is the duplication of data, or the storage of the same data in multiple places. Redundant data can cause storage issues along with data integrity issues, making it difficult to determine which values are the most current or most accurate. Employees become confused and frustrated when faced with incorrect information, which disrupts business processes and procedures. One primary goal of a database is to eliminate information redundancy by recording each piece of information in only one place in the database. This saves disk space, makes performing information updates easier, and improves information quality.
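A small sketch shows why storing a value in only one place makes updates safer. The order and customer records below are hypothetical, and plain Python dictionaries stand in for database tables.

```python
# Redundant design: the customer's address is copied into every order row.
flat_orders = [
    {"order_id": 1, "customer_id": 7, "address": "12 Oak St"},
    {"order_id": 2, "customer_id": 7, "address": "12 Oak St"},
]
# An update that reaches only one copy leaves two conflicting values.
flat_orders[0]["address"] = "90 Elm Ave"
addresses_in_flat = {
    order["address"] for order in flat_orders if order["customer_id"] == 7
}

# Single-source design: the address is stored once and referenced by key.
customers = {7: {"address": "12 Oak St"}}
orders = [
    {"order_id": 1, "customer_id": 7},
    {"order_id": 2, "customer_id": 7},
]
customers[7]["address"] = "90 Elm Ave"  # one update, seen by every order
addresses_in_single_source = {
    customers[order["customer_id"]]["address"] for order in orders
}

print(sorted(addresses_in_flat))
print(sorted(addresses_in_single_source))
```

After the partial update, the redundant design holds two different addresses for the same customer, whereas the single-source design can hold only one.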
Increased Information Integrity (Quality)
Information integrity is a measure of the quality of information. Integrity constraints are rules that help ensure the quality of information. The database design needs to consider integrity constraints. The database and the DBMS ensure that users can never violate these constraints. There are two types of integrity constraints: (1) relational and (2) business critical.