Of Forests and Trees...

Institutional Data Definitions and Decision Support - RMAIR 2011
Christina Drum

centers on the creation and implementation of institutional data definitions
Of Forests and Trees: Institutional Data Definitions and Decision Support
Data Dictionary
Identify the informational elements. 
Christina Drum
Information Architect & Metadata Manager
Office of Institutional Analysis & Planning
University of Nevada, Las Vegas

RMAIR Conference - Albuquerque, NM
October 27, 2011
Context is Everything
Large * Public * Urban
Research University
~28,000 Students
~2,900 Faculty & Staff
220+ Programs

In Nevada * NSHE
Overview
New ERP System - Student Information

Reporting/Data Warehousing lagged behind the implementation

Our IR office tasked with the enterprise DW and BI platform
Our Situation
Our Resources
Collective skills and experience
IR understanding of data needs and institutional reporting priorities
Our Immediate Priorities

Critical reporting needs

Enrollment and Admissions data

Solution for point-in-time snapshots

Tomorrow at 10:15
Mike Ellison
Our Understanding
Accessible, reliable data definitions are fundamental to successful information delivery.
Decision Support
Information Delivery
UNLV Data Warehouse
Campus Collaboration
Provide knowledgeable campus users with access to data and information for decision-making
Central Data Warehouse
Distributed Reporting
Central Data Definitions
How?
Data Warehousing
In the Old World...
Transactional Systems
Data Warehouse
Information Delivery
Data definitions were organized around a physical data structure.
Transactional Context
Informational Context
Informational Elements
and their Implementations...
...as a column in a relational database
...as a dimension in a data model
...as a presentation element in the BI tool
...as an element on a chart or in a report
...etc.
Metadata
Data Mart Development
Metadata Repository Model
Data Governance
Data Cookbook
Wiki software
Consensus
Christina Drum
christina.drum@unlv.edu

http://ir.unlv.edu
Resources
Reflections
Definition Management
(About Us > Professional Presentations)
Staff of 6 + 2 additional positions
"data about data"
Institutional Data Definitions
Technical Database Metadata
(schemas, tables, columns)
Business Process Metadata
(operational processes in functional areas)
Technical Process Metadata
(nightly processing, application feeds)
ETL Metadata
(data warehousing jobs)
BI Platform Metadata
(physical, logical modeling, presentation)
Reports/Dashboards Metadata
(reporting systems, institutional dashboards)
Data Stewardship Metadata
(people responsible for accuracy, integrity)
Lineage Metadata
(ancestor/descendant relationships)
We had several types of metadata that we eventually wanted to bring together...
We knew it would take years to build it all out.

While we started with what was most immediate, we knew that we would want the flexibility to:
   
        1) associate metadata across these different areas
        2) connect institutional data definitions with any type of metadata

So, we designed a numbering system that would ensure any element in any of these areas would have a unique ID.

That way, we could build out relationships in the future, by associating one global ID with another. 
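
The global ID idea can be sketched in a few lines. This is a minimal illustration only, not the UNLV implementation (which lived in a SQL Server database): the class name `MetadataRegistry`, the area labels, the relationship kind, and the example column name are all hypothetical. It shows one shared counter issuing IDs across every metadata area, so that a relationship is simply a pair of global IDs.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class MetadataRegistry:
    """Hypothetical registry: one ID sequence shared by all metadata areas."""

    _next_id: count = field(default_factory=lambda: count(1))
    elements: dict = field(default_factory=dict)        # global ID -> (area, name)
    relationships: list = field(default_factory=list)   # (from ID, to ID, kind)

    def register(self, area: str, name: str) -> int:
        """Record an element from any metadata area and return its global ID."""
        global_id = next(self._next_id)
        self.elements[global_id] = (area, name)
        return global_id

    def relate(self, from_id: int, to_id: int, kind: str) -> None:
        """Associate one global ID with another, whatever areas they belong to."""
        for gid in (from_id, to_id):
            if gid not in self.elements:
                raise KeyError(f"unknown global ID: {gid}")
        self.relationships.append((from_id, to_id, kind))


registry = MetadataRegistry()
# Area labels and the column name below are made up for illustration.
definition_id = registry.register("data definition", "Units taken")
column_id = registry.register("database column", "ENROLLMENT.UNITS_TAKEN")
registry.relate(definition_id, column_id, "implemented as")
```

Because the IDs never collide across areas, a new kind of metadata (lineage, stewardship, and so on) can be added later without changing how relationships are stored.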

"data about the containers of data"
"data about data relationships"
We Built Our Own
SQL Server database
Looked at some products
We knew what we wanted.
Had the relevant skills and tools.
Would have to learn/customize any purchased option.
What we were doing initially was limited in scope.
We started with:
Data Definitions
(informational elements)
 Common ID
*Name
*Description
*Interpretation/Usage Notes
*Potential Values
*Source System
*Source Description

 Group(s)
*Related Definitions

 Definition Comments
 Definition Status
Implementations
(of data definitions)
Relational Database Table Columns
BI Presentation Elements
ETL Jobs
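
As a rough sketch of how the fields above might hang together, the following models a data definition and its implementations as plain records. The field names follow the slide; the class names, the ID value, the "Draft" default status, and the implementation location string are assumptions made for illustration, not the actual repository schema.

```python
from dataclasses import dataclass, field


@dataclass
class Implementation:
    """One place where a data definition is realized."""

    kind: str       # e.g. "Relational Database Table Column", "ETL Job"
    location: str   # hypothetical free-text locator


@dataclass
class DataDefinition:
    """An informational element, with the attributes listed on the slide."""

    common_id: int
    name: str
    description: str
    usage_notes: str = ""
    potential_values: list = field(default_factory=list)
    source_system: str = ""
    source_description: str = ""
    groups: list = field(default_factory=list)
    related_definitions: list = field(default_factory=list)  # common IDs
    comments: list = field(default_factory=list)
    status: str = "Draft"   # assumed starting status; the slides mention "Active"
    implementations: list = field(default_factory=list)


units_taken = DataDefinition(
    common_id=1001,  # made-up ID
    name="Units taken",
    description="The number of student credit hours for the enrollment.",
    groups=["Enrollment"],
)
units_taken.implementations.append(
    Implementation("BI Presentation Element", "Enrollment > Units Taken")
)
```

Keeping implementations as a list on the definition reflects the idea that one informational element can surface in several places at once.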
Examples
Started with data elements in the old system, and what we knew about requirements for enrollment reporting.

Used a Google Docs spreadsheet, initially, for tracking.

Collaborated with staff in Enrollment & Student Services to verify transactional source data and logic.
Write and review the definitions.
Reviewed methods and standards for writing definitions
Developed some "model" definitions
Established a workflow process and divided up the work
Initially assigned to a team of two
Internal review within our office
"Final" data steward review
Active definition

One person managing the process
Metadata Management Application
Front-end web application that presents data definitions to users of the BI tool
Additional Data Marts...
Degrees Conferred
Student Financials
Enrollment Management
Retention & Graduation
Financial Aid
Data Definition/Data Mart Project Phases
involves a team composed of data stewards, data users, and IAP development staff
1. List Informational Needs and Elements.
2. Draft Data Definitions.
3. Review and Refine Data Definitions.
4. Implement Data Definitions in the UNLV Data Warehouse.
5. Construct the Data Mart.
6. Review and Refine the Data Mart.
7. Open the Data Mart for Institutional Use.


a structure for strategic collaboration around the institution's informational assets 
Strategic data elements
Transactional data elements from which informational elements are extracted/derived

Communication around business process changes

A next step: Lineage metadata
Navigate the trees; understand the forest.  
Adjust and manage expectations.
Become more agile.
Make progress before understanding it all.
How our IR office leads a collaborative effort to develop institutional data definitions, while implementing a new enterprise data warehouse and BI platform.
"metacontent"
Definition Methods
Genus and Species
   Active student - A student who has a Career Program with an "active" status. 

Synonym
   Units taken - The number of student credit hours for the enrollment.

By Example
   Term of Enrollment - A four-digit code indicating the semester and year of enrollment, as in:
                 '2108' = Fall 2010, where
                  2xxx = century (2000)
                  x10x = year
                  xxx8 = fall (2/5/8 -> spring/summer/fall)

Complete Enumeration
   Race/Ethnicity - IPEDS Reporting - A code associated with one of the following mutually exclusive IPEDS reporting categories:
                  1) Nonresident alien
                  2) Hispanics of any race
                  3) American Indian or Alaska Native
                  4) Asian
                  5) Black or African American
                  6) Native Hawaiian or Other Pacific Islander
                  7) White
                  8) Two or more races
                  9) Race/ethnicity unknown
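
The "By Example" definition of Term of Enrollment is precise enough to turn into code. The sketch below decodes a term code using only what the example states: a leading 2 means the 2000s, the middle two digits give the year, and a final 2, 5, or 8 means spring, summer, or fall. The function name `decode_term` is hypothetical, and because the slide does not document other century digits, the function rejects them rather than guessing.

```python
from typing import Tuple

# Final digit -> season, as given in the definition (2/5/8 -> spring/summer/fall).
SEASONS = {"2": "Spring", "5": "Summer", "8": "Fall"}


def decode_term(code: str) -> Tuple[str, int]:
    """Decode a four-digit term code such as '2108' into (season, year)."""
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"term code must be four digits: {code!r}")
    if code[0] != "2":
        # Only the century digit 2 (2000) is documented in the example.
        raise ValueError(f"undocumented century digit in {code!r}")
    season = SEASONS.get(code[3])
    if season is None:
        raise ValueError(f"undocumented term digit in {code!r}")
    year = 2000 + int(code[1:3])
    return season, year


print(decode_term("2108"))  # ('Fall', 2010)
```

A definition written this way doubles as a test case: the worked example in the text ('2108' = Fall 2010) is exactly what the decoder should return.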

Definition Standards
Describe essential features concisely, with precision and accuracy

Avoid circularity

Avoid language that is...
   
        Vague - "The person in the room."
        Ambiguous - "The person sitting next to you in this room."
        Obscure - "The hominid with the utmost propinquity to you."
        Metaphorical - "The life of the party."

Not too broad - "Humans are two-legged animals."

Not too narrow - "Humans are religious animals."

Employ principles of classification 
   (e.g., consistent, mutually exclusive, jointly exhaustive)
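
For coded categories, the "mutually exclusive" and "jointly exhaustive" principles can be checked mechanically against sample values. The sketch below is one way to do that; the function name and the enrollment-load categories with their unit thresholds are invented for illustration and are not UNLV definitions. The example categories deliberately overlap at 12 units to show what the check reports.

```python
def check_classification(items, categories):
    """Return (uncovered, overlapping) for a set of category tests.

    uncovered   -> items matching no category (not jointly exhaustive)
    overlapping -> (item, matching names) pairs (not mutually exclusive)
    Only the supplied items are checked, so this is evidence, not proof.
    """
    uncovered, overlapping = [], []
    for item in items:
        matches = [name for name, test in categories.items() if test(item)]
        if not matches:
            uncovered.append(item)
        elif len(matches) > 1:
            overlapping.append((item, matches))
    return uncovered, overlapping


# Hypothetical categories keyed on units taken; thresholds are illustrative.
load_categories = {
    "Less than half-time": lambda units: units < 6,
    "Half-time or more": lambda units: 6 <= units <= 12,
    "Full-time": lambda units: units >= 12,  # overlaps the previous band at 12
}

uncovered, overlapping = check_classification(range(0, 19), load_categories)
print(uncovered)    # []
print(overlapping)  # [(12, ['Half-time or more', 'Full-time'])]
```

Running a check like this while drafting a definition surfaces boundary cases early, before they show up as double-counted or missing rows in a report.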

Attend to details; don't stay stuck in minutiae.
Make better mistakes tomorrow.
Organizations
HEDW - Higher Education Data Warehousing Forum (http://hedw.org)
TDWI - The Data Warehousing Institute (http://tdwi.org)
Examples
Oregon University System - Student Centralized Administrative Reporting File (SCARF) (http://www.ous.edu/dept/ir/scarf)
Cal Poly - Student Administration Business Rules (http://www.polydata.calpoly.edu/business_rules/SA_BusinessRules.html)
Books
Inmon, William, Bonnie O'Neil and Lowell Fryman, _Business Metadata_. Burlington, MA: Elsevier, 2008.
Marco, David, _Building and Managing the Meta Data Repository: A Full Lifecycle Guide_. New York, NY: John Wiley & Sons, 2000.
