FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT
Problems in traditional data management
File organization terms & concepts
Problems with traditional file environment
Data redundancy:
Duplicate data; same data are stored in more than one place
Data inconsistency:
The same data attributes may have different values, e.g. extra large and XL
Program-data dependence:
Changes in programs require changes to the data
Lack of flexibility:
Can't deliver ad hoc reports/respond to unanticipated info requirements in time
Poor security:
Little control/management of data; access to and dissemination of info may be out of control
Lack of data sharing & availability:
Info can't flow freely across different functional areas/different parts of the org
Database management systems
Definition:
Software that permits an organization to centralize data, manage them efficiently, and provide access to the stored data by application programs
Solving problems of traditional file environment
Reduces data redundancy and inconsistency by minimizing isolated files in which the same data are repeated
Eliminates data inconsistency by ensuring every occurrence of redundant data has the same values
Uncouples programs & data, enabling data to stand on their own
Enables org to centrally manage data, their use, and security
Relational DBMS
Represents data as 2D tables (= files), each containing data on an entity and its attributes
Rows = records = tuples
Operations of relational DBMS
Relational database tables can be combined easily to deliver data required by users, provided any two tables share a common data element
Has 3 basic operations:
Select:
Creates subset of rows that meet certain criteria
Join:
Combines relational tables to provide user with more info than is available in individual tables
Project:
Creates a subset consisting of columns in a table, permitting the user to create new tables that contain only the required info
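The three operations above can be sketched in SQL, run here through Python's built-in sqlite3 module. The supplier and part tables, their columns, and the sample rows are made up for illustration; the shared supplier_id column is the common data element that makes the join possible.

```python
import sqlite3

# Hypothetical supplier and part tables sharing a common element, supplier_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE part (part_id INTEGER PRIMARY KEY, part_name TEXT,
                       unit_price REAL, supplier_id INTEGER);
    INSERT INTO supplier VALUES (1, 'Acme', 'Dayton'), (2, 'Bolt Co', 'Surabaya');
    INSERT INTO part VALUES (137, 'Door latch', 22.0, 1),
                            (150, 'Door seal', 6.0, 2),
                            (152, 'Compressor', 54.0, 1);
""")

# Select: a subset of rows that meet a criterion.
selected = conn.execute(
    "SELECT * FROM part WHERE unit_price > 20 ORDER BY part_id"
).fetchall()

# Join: combine the two tables through the shared supplier_id column.
joined = conn.execute("""
    SELECT part.part_id, part.part_name, supplier.name
    FROM part JOIN supplier ON part.supplier_id = supplier.supplier_id
    ORDER BY part.part_id
""").fetchall()

# Project: a subset of columns only.
projected = conn.execute(
    "SELECT part_name, unit_price FROM part ORDER BY part_id"
).fetchall()
```

In practice the three are usually combined in a single query: the WHERE clause selects, the JOIN clause joins, and the column list projects.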
Capabilities of DBMS
Data definition:
Specifies structure of the content of the database; Creates database tables and defines characteristics of the fields in each table
Data dictionary:
Automated/manual file that stores definitions of data elements and their characteristics
Data manipulation language:
Used to add, change, delete, and retrieve data in database; Contains commands to satisfy information requests and develop applications
e.g. Structured Query Language (SQL)
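A minimal sketch of the three capabilities, using SQLite through Python's sqlite3 module. The customer table and its fields are hypothetical. SQLite's PRAGMA table_info stands in for a data dictionary here; a full data dictionary would normally hold more (ownership, usage, security, and so on).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: create a table and define the characteristics of each field.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        shirt_size  TEXT
    )
""")

# Data dictionary: the database's own catalog describes each field.
# PRAGMA table_info rows are (cid, name, type, notnull, default, pk).
dictionary = [(row[1], row[2])
              for row in conn.execute("PRAGMA table_info(customer)")]

# Data manipulation: add, change, delete, and retrieve data.
conn.execute("INSERT INTO customer VALUES (1, 'Ana', 'XL')")
conn.execute("INSERT INTO customer VALUES (2, 'Budi', 'M')")
conn.execute("UPDATE customer SET shirt_size = 'L' WHERE customer_id = 1")
conn.execute("DELETE FROM customer WHERE customer_id = 2")
rows = conn.execute(
    "SELECT customer_id, name, shirt_size FROM customer"
).fetchall()
```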
Designing databases
Conceptual/logical design:
Groups of data are organized, refined, and streamlined until an overall logical view of the relationships among all the data in the database emerges
Physical design:
Shows how database is actually arranged on direct-access storage devices
Referential integrity:
Rules to ensure that relationships between coupled tables remain consistent
Entity-relationship diagram:
Illustrates the relationships between entities; boxes represent entities and the lines connecting the boxes represent relationships
- One-to-one relationship:
Represented by a line that ends in two short marks
- One-to-many relationships:
Represented by a line that ends with a crow's foot topped by a short mark
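Referential integrity on a one-to-many relationship can be sketched with a foreign key. The example assumes a hypothetical supplier table (the "one" side) and part table (the "many" side) in SQLite, which only enforces foreign keys once PRAGMA foreign_keys is switched on. The try_sql helper is made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# One supplier can provide many parts; each part points to exactly one supplier.
conn.execute("CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE part (
        part_id     INTEGER PRIMARY KEY,
        part_name   TEXT,
        supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id)
    )
""")
conn.execute("INSERT INTO supplier VALUES (1, 'Acme')")
conn.execute("INSERT INTO part VALUES (137, 'Door latch', 1)")
conn.execute("INSERT INTO part VALUES (152, 'Compressor', 1)")

def try_sql(statement):
    """Return True if the statement succeeds, False if the DBMS rejects it."""
    try:
        conn.execute(statement)
        return True
    except sqlite3.IntegrityError:
        return False

# A part that refers to a supplier that does not exist is rejected.
orphan_ok = try_sql("INSERT INTO part VALUES (150, 'Door seal', 99)")

# A supplier that still has parts pointing to it cannot be deleted.
delete_ok = try_sql("DELETE FROM supplier WHERE supplier_id = 1")

part_count = conn.execute("SELECT COUNT(*) FROM part").fetchone()[0]
```

Both rejected statements leave the tables unchanged, so the relationship between the coupled tables stays consistent.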
Non-relational databases
Use a more flexible data model and are designed for managing large data sets across many distributed machines and for easily scaling up or down
Useful for accelerating simple queries against large volumes of structured and unstructured data
Cloud databases:
Appeal to web-focused start-ups or SMEs seeking database capabilities at a lower price than in-house database products
Database tools & tech for business
Big data
Definition:
Data sets with volumes so huge, they're beyond the ability of typical DBMS to capture, store, and analyze
Why big data?
Reveals more patterns & relationships than smaller data sets
Potentially provides insights into customer behavior, weather patterns, financial markets
Business intelligence infrastructure
Data warehouse:
Database that stores current and historical data of potential interest to decision makers; makes data available for anyone to access as needed but the data can't be altered
Data mart:
Subset of a data warehouse that summarizes a portion of the data in a separate database for a specific population of users
Hadoop:
Open source software; enables distributed parallel processing of huge amounts of data across inexpensive computers
In-memory computing:
Facilitates big data analysis; relies on RAM for data storage
Analytic platforms:
Hardware-software systems specifically designed for query processing and analytics
Analytical tools
Online analytical processing (OLAP)
supports multidimensional data analysis, enabling users to view the same data in different ways
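"Viewing the same data in different ways" can be sketched in plain Python as totalling one set of facts along different dimensions. The sales facts, the dimension names, and the rollup function are invented for illustration; real OLAP tools work on much larger, precomputed cubes.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, month, units sold).
sales = [
    ("nuts", "East", "Jan", 50),
    ("nuts", "West", "Jan", 30),
    ("bolts", "East", "Jan", 20),
    ("nuts", "East", "Feb", 40),
    ("bolts", "West", "Feb", 60),
]
DIMENSIONS = {"product": 0, "region": 1, "month": 2}

def rollup(facts, *dims):
    """Total the units sold, grouped by the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[DIMENSIONS[d]] for d in dims)
        totals[key] += fact[3]
    return dict(totals)

# Same data, two different views.
by_product = rollup(sales, "product")
by_region_month = rollup(sales, "region", "month")
```

Here by_product answers "how much of each product was sold?", while by_region_month answers "how much was sold in each region per month?" from the same facts.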
Data mining
provides insights by finding hidden patterns and relationships in large databases and inferring rules from them to predict future behavior
Associations:
Occurrences linked to a single event,
e.g. when corn chips are purchased, cola is also purchased 65% of the time, but when there's a promotion, cola is purchased 85% of the time. Conclusion: managers can see how profitable a promotion is
Sequences:
Events are linked over time,
e.g. if a house is purchased, a new fridge will be purchased within 2 weeks 65% of the time, new oven will be bought within a month 45% of the time
Classification:
Recognizes patterns that describe the group to which an item belongs by examining existing classified items and inferring a set of rules,
e.g. identify characteristics of customers likely to leave so management can devise special campaigns for them
Clustering:
Discovers different groups within data,
e.g. discovers groups based on demographics
Forecasting:
Uses series of existing values to forecast what other values will be,
e.g. estimate future value of sales figures
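The percentages in the associations example are what data mining calls the confidence of a rule: of the baskets that contain corn chips, what share also contain cola? A minimal sketch, with made-up baskets and a hypothetical confidence function:

```python
def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

# Hypothetical purchase records, one set of items per transaction.
baskets = [
    {"corn chips", "cola"},
    {"corn chips", "cola", "salsa"},
    {"corn chips"},
    {"cola"},
    {"corn chips", "cola"},
]

chips_to_cola = confidence(baskets, "corn chips", "cola")
```

Four baskets contain corn chips and three of those also contain cola, so chips_to_cola is 0.75. Running the same calculation separately on promotion and non-promotion periods gives the kind of comparison described in the example.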
Text mining:
Extract key elements from unstructured big data sets, discover patterns and relationships, and summarize the info
Sentiment analysis:
Detect favorable and unfavorable opinions about specific subjects
Web mining:
Extracts knowledge from the content of webpages, e.g. Google Trends
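A very rough sketch of sentiment analysis using word lists. The two word lists and the sentiment function are assumptions made for illustration; real tools rely on much larger lexicons or trained models and handle negation, sarcasm, and context.

```python
import string

# Tiny hypothetical opinion lexicons.
POSITIVE = {"good", "great", "love", "fast"}
NEGATIVE = {"bad", "slow", "hate", "broken"}

def sentiment(text):
    """Label text by counting favorable and unfavorable words."""
    words = [w.strip(string.punctuation) for w in text.lower().split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "favorable"
    if score < 0:
        return "unfavorable"
    return "neutral"

labels = [
    sentiment("Great phone, I love it!"),
    sentiment("Slow delivery and a broken screen."),
    sentiment("It arrived on Monday."),
]
```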
Databases and the web
e.g. Customers using the web to place an order or view a product catalog
Advantages
Easier to use than proprietary query tools
Web interface requires few or no changes to the internal database
Essentials for firm data management
Information policy:
Rules on how data are to be organized and maintained, and who's allowed to view/change data
Data quality audit:
Structured survey of the accuracy and level of completeness of data in an IS
Data cleansing/data scrubbing:
Detecting and correcting data in a database that are incorrect, incomplete, improperly formatted, or redundant
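A minimal sketch of data cleansing, reusing the "extra large" vs "XL" inconsistency from the notes on the traditional file environment. The record layout, the SIZE_ALIASES table, and the cleanse function are hypothetical; the idea is to fix formatting, map inconsistent values to one standard value, drop redundant copies, and set incomplete records aside for review.

```python
# Hypothetical mapping from inconsistent size labels to one standard value.
SIZE_ALIASES = {"extra large": "XL", "x-large": "XL", "xl": "XL",
                "large": "L", "l": "L"}

def cleanse(records):
    """Return (cleaned records, records rejected as incomplete or unrecognized)."""
    cleaned, rejected, seen = [], [], set()
    for rec in records:
        name = (rec.get("name") or "").strip().title()   # fix formatting
        size = SIZE_ALIASES.get((rec.get("size") or "").strip().lower())
        if not name or size is None:
            rejected.append(rec)     # incomplete or unknown value: flag for review
            continue
        key = (name, size)
        if key in seen:              # redundant copy of a record already kept
            continue
        seen.add(key)
        cleaned.append({"name": name, "size": size})
    return cleaned, rejected

raw = [
    {"name": "ana putri ", "size": "extra large"},
    {"name": "Ana Putri", "size": "XL"},
    {"name": "Budi", "size": ""},
    {"name": "Citra", "size": "Large"},
]
cleaned, rejected = cleanse(raw)
```

The first two records collapse into one after standardization, and the record with no size is set aside rather than guessed at.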
Fadhila Abidah
09111840000092
Business Information Systems Q