MC Press Online

Data Governance Tools

Author: Sunil Soares
Publisher: MC Press Online
Publication Date: 2014
Subject: Computer: Information Technology
Number of Pages: 366

Available as:
Individual E-Version (PDF), ISBN: 978-1-58347-844-8: $32.76 (Reg.: $59.95; You Save: $27.19, 45%)
Library Edition, ISBN: 978-1-58347-844-8: $59.95 (hosting service: 15% annual fee, 55% one-time payment)
Printed Edition: see MC Press Online

About this title
Data governance is the formulation of policy to optimize, secure, and leverage information as an enterprise asset by aligning the objectives of multiple functions. Data governance programs have traditionally focused on people and process. Cost has historically been a key consideration because data governance programs have often started from scratch, with little to no funding. As a result, Microsoft Excel and SharePoint have been the tools of choice to document and share data governance artifacts. While the marginal cost of these tools is zero, they are often missing critical functionality. Meanwhile, vendors have matured their data governance offerings to the extent that organizations need to consider tools as a critical component of their data governance programs.

It is not always clear, however, what the term "data governance tools" really means. In this book, data governance expert Sunil Soares reviews a reference architecture for data governance software tools. He seeks to define the category called "data governance," as well as lay out evaluation criteria for software tools, the vendor landscape, and the alignment with big data.

The book contains five sections:
  • Introduction (to Data Governance and EDM) introduces data governance and the Enterprise Data Management (EDM) reference architecture.
  • Categories of Data Governance Tools discusses key data governance tasks that can be automated by tools for business glossaries, metadata management, data profiling, data quality management, master data management, reference data management, and information policy management.
  • The Integration Between Enterprise Data Management and Data Governance Tools provides an overview of the integration points between EDM tools and data governance. EDM tools relate to data modeling, data integration, analytics and reporting, business process management, data security and privacy, and information lifecycle management.
  • Big Data Governance Tools looks at how data governance tools interact with big data technologies, including Hadoop, NoSQL, stream computing, and text analytics.
  • Evaluation Criteria and the Vendor Landscape discusses evaluation criteria for data governance tools and provides an overview of key vendor platforms, including ASG, Collibra, Global IDs, IBM, Informatica, Orchestra Networks, SAP, and Talend.

About the author
Sunil Soares
Sunil Soares is the founder and managing partner of Information Asset, LLC, a consulting firm that specializes in data governance. Prior to this role, Sunil was director of information governance at IBM, where he worked with clients across six continents and multiple industries. Before joining IBM, Sunil consulted with major financial institutions at the Financial Services Strategy Consulting Practice of Booz Allen & Hamilton in New York. Sunil lives in New Jersey and holds an MBA in Finance and Marketing from the University of Chicago Booth School of Business.

The Chief Data Officer Handbook for Data Governance is Sunil's fifth book about data governance. His first book, The IBM Data Governance Unified Process, details the almost 100 steps to implement a data governance program. This book has been used by several organizations as the blueprint for their data governance programs and has been translated into Chinese. Sunil's second book, Selling Information Governance to the Business, reviews the best practices to approach information governance by industry and function. Sunil's third book, IBM InfoSphere: A Platform for Big Data Governance and Process Data Governance, focuses on IBM's InfoSphere product. Sunil's fourth book, Big Data Governance, addresses the specific issues associated with the governance of big data.

Contents
About the Author
Forewords
Preface

PART I--INTRODUCTION
Chapter 1: An Introduction to Data Governance
Definition
Case Study
The Pillars of Data Governance
Summary

Chapter 2: Enterprise Data Management Reference Architecture
EDM Categories
Big Data
Data Governance Tools
Summary

PART II--CATEGORIES OF DATA GOVERNANCE TOOLS
Chapter 3: The Business Glossary
Bulk-Load Business Terms in Excel, CSV, or XML Format
Create Categories of Business Terms
Facilitate Social Collaboration
Automatically Hyperlink Embedded Business Terms
Add Custom Attributes to Business Terms and Other Data Artifacts
Add Custom Relationships to Business Terms and Other Data Artifacts
Add Custom Roles to Business Terms and Other Data Artifacts
Link Business Terms and Column Names to the Associated Reference Data
Link Business Terms to Technical Metadata
Support the Creation of Custom Asset Types
Flag Critical Data Elements
Provide OOTB and Custom Workflows to Manage Business Terms and Other Data Artifacts
Review the History of Changes to Business Terms and Other Data Artifacts
Allow Business Users to Link to the Glossary Directly from Reporting Tools
Search for Business Terms
Integrate Business Terms with Associated Unstructured Data
Summary

Chapter 4: Metadata Management
Pull Logical Models from Data Modeling Tools
Pull Physical Models from Data Modeling Tools
Ingest Metadata from Relational Databases
Pull in Metadata from Data Warehouse Appliances
Integrate Metadata from Legacy Data Sources
Pull Metadata from ETL Tools
Pull Metadata from Reporting Tools
Reflect Custom Code in the Metadata Tool
Pull Metadata from Analytics Tools
Link Business Terms with Column Names
Pull Metadata from Data Quality Tools
Pull Metadata from Big Data Sources
Provide Detailed Views on Data Lineage
Customize Data Lineage Reporting
Manage Permissions in the Metadata Repository
Support the Search for Assets in the Metadata Repository
Summary

Chapter 5: Data Profiling
Conduct Column Analysis
Discover the Values Distribution of a Column
Discover the Patterns Distribution of a Column
Discover the Length Frequencies of a Column
Discover Hidden Sensitive Data
Discover Values with Similar Sounds in a Column
Agree on the Data Quality Dimensions for the Data Governance Program
Develop Business Rules Relating to the Data Quality Dimensions
Profile Data Relating to the Completeness Dimension of Data Quality
Profile Data Relating to the Conformity Dimension of Data Quality
Profile Data Relating to the Consistency Dimension of Data Quality
Profile Data Relating to the Synchronization Dimension of Data Quality
Profile Data Relating to the Uniqueness Dimension of Data Quality
Profile Data Relating to the Timeliness Dimension of Data Quality
Profile Data Relating to the Accuracy Dimension of Data Quality
Discover Data Overlaps Across Columns
Discover Hidden Relationships Between Columns
Discover Dependencies
Discover Data Transformations
Create Virtual Joins or Logical Data Objects That Can Be Profiled
Summary

Chapter 6: Data Quality Management
Transform Data into a Standardized Format
Improve the Quality of Address Data
Match and Merge Duplicate Records
Create a Data Quality Scorecard
View the Data Quality Scorecard
Highlight the Financial Impact Associated with Poor Data Quality
Conduct Time Series Analysis
Manage Data Quality Exceptions
Summary

Chapter 7: Master Data Management
Define Business Terms Consumed by the MDM Hub
Manage Entity Relationships
Manage Master Data Enrichment Rules
Manage Master Data Validation Rules
Manage Record Matching Rules
Manage Record Consolidation Rules
View a List of Outstanding Data Stewardship Tasks
Manage Duplicates
View the Data Stewardship Dashboard
Manage Hierarchies
Improve the Quality of Master Data
Integrate Social Media with MDM
Manage Master Data Workflows
Compare Snapshots of Master Data
Provide a History of Changes to Master Data
Offload MDM Tasks to Hadoop for Faster Processing
Summary

Chapter 8: Reference Data Management
Build an Inventory of Code Tables
Agree on the Master List of Values for Each Code Table
Build Simple Mappings Between Master Values and Related Code Tables
Build Complex Mappings Between Code Values
Manage Hierarchies of Code Values
Build and Compare Snapshots of Reference Data
Visualize Inter-Temporal Crosswalks Between Reference Data Snapshots
Summary

Chapter 9: Information Policy Management
Manage Information Policies, Standards, and Processes Within the Business Glossary
Manage Business Rules
Leverage Data Governance Tools to Monitor and Report on Compliance
Manage Data Issues
Summary

PART III--THE INTEGRATION BETWEEN ENTERPRISE DATA MANAGEMENT AND DATA GOVERNANCE TOOLS

Chapter 10: Data Modeling
Integrate the Logical and Physical Data Models with the Metadata Repository
Expose Ontologies in the Metadata Repository
Prototype a Unified Schema Across Data Domains Using Data Discovery Tools
Establish a Data Model to Support Master Data Management
Summary

Chapter 11: Data Integration
Deploy Data Quality Jobs in an Integrated Manner with Data Integration
Move Data Between the MDM or Reference Data Hub and the Source Systems
Leverage Reference Data for Use by the Data Integration Tool
Integrate Data Integration Tools into the Metadata Repository
Automate the Production of Data Integration Jobs by Leveraging the Metadata Repository
Summary

Chapter 12: Analytics and Reporting
Export Data Profiling Results to a Reporting Tool for Further Visual Analysis
Export Data Artifacts to a Reporting Tool for the Visualization of Data Governance Metrics
Integrate Analytics and Reporting Tools with the Business Glossary for Semantic Context
Summary

Chapter 13: Business Process Management
Data Governance Workflows Should Leverage BPM Capabilities
Master Data Workflows Should Leverage BPM Capabilities
Data Governance Tools Should Map to BPM Tools
Summary

Chapter 14: Data Security and Privacy
Determine Privacy Obligations
Discover Sensitive Data Using Data Discovery Tools
Flag Sensitive Data in the Metadata Repository
Mask Sensitive Data in Production Environments
Mask Sensitive Data in Non-Production Environments
Monitor Database Access by Privileged Users
Document Information Policies Implemented by Data Masking and Database Monitoring Tools
Create a Complete Business Object Using Data Discovery Tools That Can Be Acted Upon by Data Masking Tools
Summary

Chapter 15: Information Lifecycle Management
Document Information Policies in the Business Glossary That Are Implemented by ILM Tools
Discover Complete Business Objects That Can Be Acted on Efficiently by ILM Tools
Summary

PART IV--BIG DATA GOVERNANCE TOOLS
Chapter 16: Hadoop and NoSQL
Conduct an Inventory of Data in Hadoop
Assign Ownership for Data in Hadoop
Provision a Semantic Layer for Analytics in Hadoop
View the Lineage of Data In and Out of Hadoop
Manage Reference Data for Hadoop
Profile Data Natively in Hadoop
Discover Data Natively in Hadoop
Execute Data Quality Rules Natively in Hadoop
Integrate Hadoop with Master Data Management
Port Data Governance Tools to Hadoop for Improved Performance
Govern Data in NoSQL Databases
Mask Sensitive Data in Hadoop
Summary

Chapter 17: Stream Computing
Use Data Profiling Tools to Understand a Sample Set of Input Data
Govern Reference Data to Be Used by the Stream Computing Application
Govern Business Terms to Be Used by the Stream Computing Application
Summary

Chapter 18: Text Analytics
Big Data Governance to Reduce the Readmission Rate for Patients with Congestive Heart Failure
Leverage Unstructured Data to Improve the Quality of Sparsely Populated Structured Data
Extract Additional Relevant Predictive Variables Not Available in Structured Data
Define Consistent Definitions for Key Business Terms
Ensure Consistency in Patient Master Data Across Facilities
Adhere to Privacy Requirements
Manage Reference Data
Summary

PART V--EVALUATION CRITERIA AND THE VENDOR LANDSCAPE
Chapter 19: The Evaluation Criteria for Data Governance Platforms
The Total Cost of Ownership
Data Stewardship
Approval Workflows
The Hierarchy of Data Artifacts
Data Governance Metrics
The Cloud
Summary

Chapter 20: ASG
ASG-metaGlossary
ASG-Rochade
ASG-becubic

Chapter 21: Collibra
Business Glossary
Reference Data Management
Data Stewardship
Workflows
Metadata
Data Profiling

Chapter 22: Global IDs
Data Profiling
Data Quality
Metadata

Chapter 23: IBM
Metadata
Information Integration
Data Quality
Master Data Management
Data Lifecycle Management
Privacy and Security

Chapter 24: Informatica
Data Profiling and Data Quality
Metadata and Business Glossary
Master Data Management
Information Lifecycle Management
Security and Privacy
Cloud

Chapter 25: Orchestra Networks
Workflows
Data Modeling
Master Data Management
Reference Data Management
Business Glossary

Chapter 26: SAP
An In-Memory Database
Data Quality and Metadata Management
Master Data Management
Content Management
Information Lifecycle Management
Enterprise Modeling
Data Integration

Chapter 27: Talend
The Extended Ecosystem
Big Data
Data Integration
Data Quality
Master Data Management
Enterprise Service Bus (ESB)
Business Process Management (BPM)

Chapter 28: Notable Vendors
Adaptive
BackOffice Associates
Data Advantage Group
Diaku
Embarcadero Technologies
Global Data Excellence
Harte-Hanks Trillium
Oracle
SAS

Appendix A: List of Acronyms
Appendix B: Glossary
Appendix C: Potential Data Governance Tasks to Be Automated with Tools
Index

Related titles
Big Data Governance
IBM InfoSphere
Selling Information Governance to the Business
The Chief Data Officer Handbook for Data Governance
The IBM Data Governance Unified Process