What capabilities does Microsoft Fabric offer?
Microsoft Fabric offers a robust set of features designed to address a wide range of data and analytics needs, empowering organizations to unlock the full potential of their data.
- Data Engineering: Develop systems that help you efficiently organize and analyze large volumes of data.
- Data Science: Leverage AI tools to create extensive workflows, enhance your data, and extract deeper insights.
- Data Warehouse: Scale your computing and storage capabilities independently with superior SQL performance.
- Real-time Intelligence: Quickly explore, analyze, and respond to streaming data, ensuring high performance with minimal latency.
- Business Intelligence: Convert your data into visually compelling, interactive insights and integrate them seamlessly with Microsoft 365.
- Copilot in Microsoft Fabric: Enhance your productivity and creativity using natural-language prompts within notebooks, pipelines, and reports.
What are the key parts of Microsoft Fabric?
Microsoft Fabric integrates several essential components, each contributing to a modern data ecosystem. By using these components, your organization can streamline its data processes and maximize the value of its data assets.
OneLake serves as a central repository for both raw and processed data, functioning much like a blob storage system. It holds the data behind both lakehouses and warehouses for analytics use cases, with processed data becoming data products that can be surfaced externally.
Synapse, Microsoft's encapsulation of Apache Spark, provides a set of compute resources for data engineers. It allows you to process data using languages such as Python or SQL, enabling robust data transformation and analysis.
Data Factory is a cloud-based ETL and data integration service that enables your organization to create data-driven workflows. These workflows orchestrate data movement and transformation at scale, allowing the ingestion of data from disparate sources into the Microsoft Fabric ecosystem.
Power BI enables users to visualize and share insights derived from data within Microsoft Fabric. It showcases the culmination of integration, engineering, and data science efforts, providing interactive and visually compelling reports.
Data Activator offers a no-code experience that automatically takes actions when patterns or conditions are detected in changing data. It monitors data in Power BI reports and Eventstreams, triggering appropriate actions such as alerting users when thresholds are met or patterns are matched.
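Data Activator itself is configured without code, so the following is only an illustration of the kind of threshold rule described above, not its implementation. It is a minimal Python sketch that assumes events arrive as dictionaries with hypothetical `id` and `value` fields, and it raises an alert only when a value crosses above the threshold, so a reading that stays high does not alert repeatedly.

```python
def evaluate_rule(events, threshold):
    """Return one alert message each time the value crosses above the threshold."""
    alerts = []
    above = False  # whether the previous event was already above the threshold
    for event in events:
        now_above = event["value"] > threshold
        # Alert only on the transition from "not above" to "above".
        if now_above and not above:
            alerts.append(
                f"ALERT {event['id']}: value {event['value']} exceeded {threshold}"
            )
        above = now_above
    return alerts


# Hypothetical stream of readings, e.g. freezer temperatures.
events = [
    {"id": "t1", "value": 5},
    {"id": "t2", "value": 12},
    {"id": "t3", "value": 15},
    {"id": "t4", "value": 8},
    {"id": "t5", "value": 11},
]
alerts = evaluate_rule(events, threshold=10)
```

Alerting on the crossing rather than on every high reading is a design choice made for this example; a real rule could equally alert on every matching event.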
Data Science empowers users to complete end-to-end data science workflows, enhancing data and deriving business insights. Activities range from data exploration, preparation, and cleansing to experimentation, modelling, model scoring, and serving predictive insights to BI reporting.
Microsoft Fabric and Purview
Microsoft Purview supports your organization in governing, protecting, and managing its data through a unified platform. It addresses data security, governance, and risk and compliance. Microsoft Purview includes features such as a data catalog for assets within Microsoft Fabric and is available as a standalone solution for external data.
Medallion architecture in Microsoft Fabric
The medallion architecture provides a systematic and effective approach to managing data within a lakehouse. By dividing data into bronze, silver, and gold tiers, organizations can streamline data processes, ensure transparency, and enhance performance. This step-by-step improvement, along with governance, supports advanced analytics and machine learning initiatives.
The medallion architecture consists of three distinct data layers:
Bronze Layer:
- Raw data is collected from various sources such as databases, APIs, and files.
- Data pipelines are used to ingest, validate, and load this information.
- Metadata, including load timestamps and process IDs, is recorded.
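The bronze steps above can be sketched in plain Python. This is a minimal illustration rather than a Fabric pipeline: the function name `ingest_to_bronze`, the `payload` wrapper, and the underscore-prefixed metadata fields are all assumptions made for the example. It validates each raw record, leaves the raw values untouched, and records a load timestamp and process ID.

```python
import uuid
from datetime import datetime, timezone


def ingest_to_bronze(raw_records, source_name, process_id=None):
    """Validate raw records and attach load metadata without altering the payload."""
    process_id = process_id or str(uuid.uuid4())
    load_timestamp = datetime.now(timezone.utc).isoformat()
    bronze_rows, rejected = [], []
    for record in raw_records:
        # Minimal validation: accept only non-empty, dict-shaped records.
        if not isinstance(record, dict) or not record:
            rejected.append(record)
            continue
        bronze_rows.append({
            "payload": dict(record),  # raw values are kept exactly as received
            "_source": source_name,
            "_load_timestamp": load_timestamp,
            "_process_id": process_id,
        })
    return bronze_rows, rejected


# Hypothetical input: one usable record and two that fail validation.
rows, rejected = ingest_to_bronze(
    [{"order_id": "1", "amount": " 19.90 "}, None, {}],
    source_name="orders_api",
    process_id="run-001",
)
```

Keeping rejected records alongside the accepted ones, rather than silently dropping them, makes it possible to audit what a load skipped.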
Silver Layer:
- Some transformations are applied to the data in this layer.
- Data is cleaned, normalized and standardized, ensuring quick and efficient processing.
- The emphasis is on ELT (extract, load, transform) to expedite data movement into the lakehouse.
- The silver layer is where re-usable data products live.
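As a rough illustration of the cleaning and standardizing done at this layer, the Python sketch below assumes bronze rows arrive as dictionaries with a `payload` field holding raw string values. That shape, the function name `to_silver`, and the column names are hypothetical and chosen only for the example. It trims whitespace, casts types, normalizes country codes to upper case, drops rows that cannot be parsed, and removes duplicates.

```python
def to_silver(bronze_rows):
    """Clean, type-cast, and de-duplicate bronze payloads into standardized rows."""
    seen_order_ids = set()
    silver_rows = []
    for row in bronze_rows:
        payload = row["payload"]
        try:
            order_id = int(str(payload["order_id"]).strip())
            amount = round(float(str(payload["amount"]).strip()), 2)
        except (KeyError, ValueError):
            # Skip rows with missing or unparseable required fields.
            continue
        if order_id in seen_order_ids:
            continue  # keep only the first occurrence of each order
        seen_order_ids.add(order_id)
        country = str(payload.get("country", "")).strip().upper()
        silver_rows.append({
            "order_id": order_id,
            "amount": amount,
            "country": country or None,  # empty string becomes None
        })
    return silver_rows


# Hypothetical bronze input: a messy row, its duplicate, and an invalid row.
bronze = [
    {"payload": {"order_id": " 1", "amount": "19.90 ", "country": "gb"}},
    {"payload": {"order_id": "1", "amount": "19.90", "country": "GB"}},
    {"payload": {"order_id": "x", "amount": "5"}},
]
silver = to_silver(bronze)
```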
Gold Layer:
- This layer represents the final, ready-to-use data state.
- The data is typically modelled as a star schema to allow for efficient aggregations and cross-referencing.
- Data scientists and analysts use this layer for reporting and visualization.
- Data in this layer is not designed primarily for re-use; its value lies in delivering analytics and intelligence.
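To show why a star schema suits aggregation, here is a small Python sketch with one fact table and one dimension table. The table names, keys, and figures are invented for illustration. Each fact row carries only a key and a measure; descriptive attributes such as region live in the dimension and are looked up at query time.

```python
from collections import defaultdict

# Hypothetical dimension table, keyed by surrogate customer key.
dim_customer = {
    1: {"name": "Acme", "region": "EMEA"},
    2: {"name": "Globex", "region": "APAC"},
    3: {"name": "Initech", "region": "EMEA"},
}

# Hypothetical fact table: one row per sale, referencing the dimension by key.
fact_sales = [
    {"customer_key": 1, "amount": 100.0},
    {"customer_key": 2, "amount": 75.0},
    {"customer_key": 3, "amount": 50.0},
]


def revenue_by_region(fact_rows, customer_dim):
    """Join each fact row to its customer dimension and total the amounts per region."""
    totals = defaultdict(float)
    for fact in fact_rows:
        region = customer_dim[fact["customer_key"]]["region"]
        totals[region] += fact["amount"]
    return dict(totals)


report = revenue_by_region(fact_sales, dim_customer)
```

In practice this kind of query would usually be expressed in SQL or a Power BI measure against the gold tables; the Python version only makes the join-then-aggregate pattern explicit.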