Practical Guidelines for Implementing a Data Mesh

Data Catalog, Data Fabric, Data Products, Data Marketplace

Description

Most companies today store data and run applications in a hybrid, multi-Cloud environment. Analytical systems tend to be centralised and siloed: Data Warehouses and Data Marts for BI, Hadoop or Cloud storage Data Lakes for Data Science, and stand-alone streaming analytical systems for real-time analysis. These centralised systems rely on Data Engineers and Data Scientists working within each silo to ingest data from many different sources, then clean and integrate it for use in a specific analytical system or in Machine Learning models. This centralised, siloed approach has many issues, including multiple tools to prepare and integrate data, reinvention of data integration pipelines in each silo, and centralised data engineering teams that have a poor understanding of source data and cannot keep pace with Business demands for new data. Master Data is also not well managed.

To address these issues, a new approach has emerged that aims to accelerate the creation of data for use in multiple analytical workloads. That approach is Data Mesh.

This 2-day class looks at Data Mesh in detail and examines its strengths and weaknesses, as well as those of the Data Mesh implementation options. Which Architecture is best for implementing it? How do you co-ordinate multiple domain-oriented teams and use common data infrastructure software like Data Fabric to create high-quality, compliant, reusable data products in a Data Mesh? How can you use a data marketplace to share data products? The objective is to shorten time to value while also ensuring that data is correctly governed and engineered in a decentralised environment.

It also looks at the organisational implications of Data Mesh and at how to create sharable data products for Master Data Management and for use in multi-dimensional analysis on a Data Warehouse, Data Science, Graph Analysis and real-time streaming Analytics to drive business value. Technologies discussed include data Catalogs, Data Fabric for collaborative development of the data integration pipelines that create data products, DataOps to speed up the process, Data Orchestration automation, data marketplaces and data governance platforms.

What you will learn

• Strengths and weaknesses of centralised data Architectures used in Analytics
• The problems caused in existing analytical systems by a hybrid, multi-Cloud data  
   landscape
• What is a Data Mesh and how does it differ from a Data Lake and a Data
   Lakehouse?
• What benefits does Data Mesh offer and what are the implementation options?
• What are the principles, requirements, and challenges of implementing these
   approaches?
• How to organise to create data products in a decentralised environment so you
   avoid chaos
• The critical importance of a data Catalog in understanding what data is available
• How business glossaries can help ensure data products are understood and
   semantically linked
• An operating model for effective Federated Data Governance
• What software is required to build, operate and govern a Data Mesh of data
   products for use in a Data Lake, a Data Lakehouse or Data Warehouse?
• What is Data Fabric software, and how does it integrate with data Catalogs and
   connect to data in your data estate?
• An implementation methodology to produce ready-made, trusted, reusable data
   products
• Collaborative domain-oriented development of modular and distributed DataOps
   pipelines to create data products
• How a data Catalog and automation software can be used to generate DataOps
   pipelines
• Managing data quality, privacy, access security, versioning, and the lifecycle of data
   products
• Publishing semantically linked data products in a data marketplace for others to
   consume and use
• Consuming data products in an MDM system
• Consuming and assembling data products in multiple analytical systems like Data
   Warehouses, Lakehouses and graph databases to shorten time to value

Main Topics

• What is a Data Mesh, a Data Lake and a Lakehouse? Why use them?
• Methodologies for creating Data Products
• Using a Business Glossary to define Data Products
• Standardising development and operations in a Data Mesh, Data Lake or Lakehouse
• Building DataOps Pipelines to create Multi-Purpose Data Products
• Implementing Federated Data Governance to produce and use compliant Data
  Products

Speaker

Mike Ferguson

Date

15 - 16 Apr 2024

Timing: from 9.30 am to 5 pm Italian time

Location

Online event