Advanced Querying and Information Retrieval
AdvanceQuerying.ppt (Size: 1.06 MB / Downloads: 20)
Decision Support Systems
Decision-support systems are used to make business decisions often based on data collected by on-line transaction-processing systems.
Examples of business decisions:
what items to stock?
What insurance premium to change?
Who to send advertisements to?
Examples of data used for making decisions
Retail sales transaction details
Customer profiles (income, age, sex, etc.)
Decision-Support Systems: Overview
Data analysis tasks are simplified by specialized tools and SQL extensions
For each product category and each region, what were the total sales in the last quarter and how do they compare with the same quarter last year
As above, for each product category and each customer category
Statistical analysis packages (e.g., : S++) can be interfaced with databases
Statistical analysis is a large field will not study it here
Data mining seeks to discover knowledge automatically in the form of statistical rules and patterns from Large databases.
A data warehouse archives information gathered from multiple sources, and stores it under a unified schema, at a single site.
Important for large businesses which generate data from multiple divisions, possibly at multiple sites
Data may also be purchased externally
Data Analysis and OLAP
Aggregate functions summarize large volumes of data
Online Analytical Processing (OLAP)
Interactive analysis of data, allowing data to be summarized and viewed in different ways in an online fashion (with negligible delay)
Data that can be modeled as dimension attributes and measure attributes are called multidimensional data.
Given a relation used for data analysis, we can identify some of its attributes as measure attributes, since they measure some value, and can be aggregated upon. For instance, the attribute number of the sales relation is a measure attribute, since it measures the number of units sold.
Some of the other attributes of the relation are identified as dimension attributes, since they define the dimensions on which measure attributes, and summaries of measure attributes, are viewed.
Online Analytical Processing
The operation of changing the dimensions used in a cross-tab is called pivoting
Suppose an analyst wishes to see a cross-tab on item-name and color for a fixed value of size, for example, large, instead of the sum across all sizes.
Such an operation is referred to as slicing.
The operation is sometimes called dicing, particularly when values for multiple dimensions are fixed.
The operation of moving from finer-granularity data to a coarser granularity is called a rollup.
The opposite operation - that of moving from coarser-granularity data to finer-granularity data – is called a drill down.