Data Lake and Warehouse Knowledge Base
Welcome to the Data Lake and Warehouse Knowledge Base. Here, you'll find everything you need to understand and utilise our Data Lake and Warehouse, from its architecture and data schema to access instructions and data refresh schedules. Whether you need guidance on querying data or understanding integration with the Data Lake or Warehouse, this knowledge base provides the answers.
Introduction: What is a Data Lake?
A Data Lake is a centralised repository designed to store vast amounts of raw data, both structured and unstructured. It enables businesses to run analytics, AI, and machine learning workloads on large datasets.
Visual representation of the Data Lake’s components, including data ingestion, storage, processing layers, and integration with other systems.
Explanation of the structure and organisation of data within the Data Lake. Includes details on how data is categorised, key tables, and schema relationships.
Accessing the Data Lake and Warehouse
Instructions on how to connect to and interact with the Data Lake and Warehouse, covering permissions, query methods, and available tools.
Details on when and how often data in the Data Lake is refreshed, so users know how current the available data is.
Information on the expected volume of data, including guidance on how many rows you may need to work with, depending on your use case.