Data Lake and Warehouse Knowledge Base

Welcome to the Data Lake and Warehouse Knowledge Base. Here, you'll find everything you need to understand and utilise our Data Lake and Warehouse, from its architecture and data schema to access instructions and data refresh schedules. Whether you need guidance on querying data or on integrating with the Data Lake or Warehouse, this knowledge base provides the answers.

  • Introduction: What is a Data Lake?

    • A Data Lake is a centralised repository designed to store vast amounts of raw, structured, and unstructured data. It enables businesses to run analytics, AI, and machine learning on large datasets.

  • Architecture Diagram

    • Visual representation of the Data Lake’s components, including data ingestion, storage, processing layers, and integration with other systems.

  • Data Schema

    • Explanation of the structure and organisation of data within the Data Lake. Includes details on how data is categorised, key tables, and schema relationships.

  • Accessing the Data Lake and Warehouse

    • Instructions on how to connect to and interact with the Data Lake and Warehouse, covering permissions, query methods, and available tools.

  • Data Refresh Timing

    • Details on when and how often data in the Data Lake is refreshed, so users know how current the available data is.

  • Data Requirements

    • Information on the expected volume of data, including guidance on how many rows you may need to work with for your use case.