Data Lake and Warehouse


Overview

Learn Amp's Data Lake and Warehouse provides powerful, flexible access to your learning data for advanced reporting, analytics, and integration. Built on enterprise-grade AWS Redshift infrastructure, it's designed to give you the insights you need to drive learning outcomes.

💡 Tip: This feature is available with the Data Lake bolt-on or as part of the Advanced Analytics package. Contact your Customer Success Manager to learn more.


Functionality Breakdown

How It Works

All customers connect to our AWS Redshift database using a PostgreSQL-compatible client. From there, you can either:

  1. Query directly for real-time dashboards and reports (Data Warehouse approach)

  2. Export to your preferred format such as Apache Parquet for advanced analytics, machine learning, or custom data pipelines (Data Lake approach)

| Use Case | Connection | What You Do |
| --- | --- | --- |
| BI Dashboards & Reporting | Connect directly to Redshift | Query the data warehouse in real time for structured reporting |
| Advanced Analytics / ML | Connect to Redshift, then export | Pull data from Redshift and export to Apache Parquet (or other formats) for data science workloads |
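
As a sketch of the direct-query approach, the snippet below connects with psycopg2 (any PostgreSQL driver works, since Redshift speaks the PostgreSQL wire protocol). The database name, schema, and `course_completions` table are illustrative placeholders, not real Learn Amp object names; substitute the credentials and schema your CSM provides.

```python
def redshift_dsn(host, dbname, user, password, port=5439):
    """Build a libpq-style connection string for the Redshift endpoint."""
    return (f"host={host} port={port} dbname={dbname} "
            f"user={user} password={password} sslmode=require")

def fetch_rows(dsn, schema, limit=100):
    """Warehouse approach: query a table in your company-specific schema.

    The table name below is a placeholder; see the Data Schema guide
    for the real table names.
    """
    import psycopg2  # imported lazily so the DSN helper works without the driver
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(
                f'SELECT * FROM "{schema}".course_completions LIMIT %s', (limit,)
            )
            return cur.fetchall()

if __name__ == "__main__":
    # Placeholder credentials: use the ones issued to your Primary Contact.
    dsn = redshift_dsn("bi.la-dl.com", "your_database", "your_user", "your_password")
    print(dsn.replace("your_password", "******"))
```

The same connection string works in GUI tools such as DBeaver; only the driver name changes.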

What You Get

  • Direct Redshift access via secure, company-specific schemas

  • Hourly data refresh to keep insights current

  • Self-service dashboards embedded in Learn Amp

  • Export flexibility - pull data to Apache Parquet, CSV, or any format your tools require


Pre-requisites

Role Requirements

This is a provisioned service—contact your Customer Success Manager to enable access.

| Action | Who Can Help |
| --- | --- |
| Enable Data Lake access | Contact your Customer Success Manager |
| Receive credentials | Provided to the designated Primary Contact |
| Request IP whitelisting | Submit via the Customer Support Portal |

Technical Requirements

  • PostgreSQL-compatible client (Power BI, Tableau, DBeaver, psql)

  • Outbound TCP port 5439 open on your firewall

  • Static IP addresses for whitelisting (up to 5)

  • DNS resolution for bi.la-dl.com
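
Before raising a support ticket, you can pre-flight the last three requirements with nothing but the Python standard library. `check_endpoint` is a hypothetical helper name, not part of any Learn Amp tooling; run it from a machine whose static IP has been whitelisted.

```python
import socket

def check_endpoint(host: str, port: int, timeout: float = 5.0) -> tuple[bool, str]:
    """Verify DNS resolution and outbound TCP reachability for a host/port."""
    try:
        addr = socket.gethostbyname(host)  # DNS resolution check
    except socket.gaierror as exc:
        return False, f"DNS lookup failed: {exc}"
    try:
        # TCP reachability check (firewall / IP whitelist)
        with socket.create_connection((addr, port), timeout=timeout):
            return True, f"Connected to {host} ({addr}) on port {port}"
    except OSError as exc:
        return False, f"TCP connect failed (firewall or whitelist?): {exc}"

if __name__ == "__main__":
    ok, detail = check_endpoint("bi.la-dl.com", 5439)
    print(("OK: " if ok else "BLOCKED: ") + detail)
```

A `BLOCKED` result with a DNS message points to name resolution; a TCP failure usually means the port is closed or your IP is not yet whitelisted.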


Quick Start Guide

  1. Contact your CSM to enable Data Lake access

  2. Follow the setup guide: Getting Started with Data Lake

  3. Review the data schema: Data LakeHouse: Data Schema

  4. Connect your BI tool using the provided Redshift credentials

  5. For advanced analytics: Export your query results to Apache Parquet or your preferred format
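
Step 5 can be as simple as streaming query results to a file. The sketch below exports to CSV using only the standard library, with an in-memory SQLite database standing in for the Redshift connection so it runs anywhere; swap in your real PostgreSQL driver connection, and use a tool such as pandas/pyarrow when you want Apache Parquet instead. The `completions` table and column names are illustrative.

```python
import csv
import sqlite3  # stand-in connection so the sketch is self-contained;
                # replace with your PostgreSQL-compatible Redshift connection

def export_query_to_csv(conn, query: str, path: str) -> int:
    """Run a query and write the results (with a header row) to CSV."""
    cur = conn.cursor()
    cur.execute(query)
    headers = [col[0] for col in cur.description]
    rows = cur.fetchall()
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(headers)
        writer.writerows(rows)
    return len(rows)

if __name__ == "__main__":
    import os, tempfile
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE completions (learner TEXT, course TEXT)")
    conn.execute("INSERT INTO completions VALUES ('a.user', 'Onboarding')")
    out = os.path.join(tempfile.mkdtemp(), "completions.csv")
    print(export_query_to_csv(conn, "SELECT * FROM completions", out), "row(s) exported")
```

For Parquet, the same fetched rows can be handed to `pandas.DataFrame(...).to_parquet(...)` in place of the CSV writer.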


FAQs

How do I connect to the data?

All access is via our AWS Redshift database. Use any PostgreSQL-compatible client (Power BI, Tableau, DBeaver, psql) with the credentials provided by your CSM.

What's the difference between Data Warehouse and Data Lake usage?

Both connect to the same Redshift database. "Data Warehouse" typically means querying directly for BI reports. "Data Lake" refers to exporting that data to formats like Apache Parquet for use in data science, machine learning, or custom analytics pipelines.

How often is data refreshed?

Data refreshes hourly through our ETL pipeline.

What BI tools can I use?

Any PostgreSQL-compatible tool: Power BI, Tableau, Looker, DBeaver, or even command-line psql.

Can I export data to other formats?

Yes! Once connected to Redshift, you can export query results to Apache Parquet, CSV, JSON, or any format your tools support.

Is the Data Lake still in Beta?

Yes, we're actively evolving the platform based on customer feedback.


Troubleshooting

| Issue | Solution |
| --- | --- |
| Cannot connect | Verify port 5439 is open and your IP is whitelisted |
| Credentials not working | Check spelling; passwords are case-sensitive |
| Missing data | Allow up to 1 hour for the ETL refresh |
| Need more than 5 IPs | Submit a request via the Customer Support Portal with justification |

