Data Lake and Warehouse
Overview
Learn Amp's Data Lake and Warehouse provides powerful, flexible access to your learning data for advanced reporting, analytics, and integration. Built on enterprise-grade AWS Redshift infrastructure, it's designed to give you the insights you need to drive learning outcomes.
💡 Tip: This feature is available with the Data Lake bolt-on or as part of the Advanced Analytics package. Contact your Customer Success Manager to learn more.
Functionality Breakdown
How It Works
All customers connect to our AWS Redshift database using a PostgreSQL-compatible client. From there, you can either:
- Query directly for real-time dashboards and reports (Data Warehouse approach)
- Export to your preferred format, such as Apache Parquet, for advanced analytics, machine learning, or custom data pipelines (Data Lake approach)
| Use Case | Connection | What You Do |
|---|---|---|
| BI Dashboards & Reporting | Connect directly to Redshift | Query the data warehouse in real time for structured reporting |
| Advanced Analytics / ML | Connect to Redshift, then export | Pull data from Redshift and export to Apache Parquet (or other formats) for data science workloads |
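To make this concrete, here is a minimal sketch of the Data Warehouse approach in Python using psycopg2. The endpoint and port come from the Technical Requirements below; the database name, credentials, schema, and table are hypothetical placeholders for the details your CSM provides.

```python
# Minimal "Data Warehouse" sketch: query Redshift directly with a
# PostgreSQL-compatible driver. Schema and table names are hypothetical.
import psycopg2

conn = psycopg2.connect(
    host="bi.la-dl.com",       # endpoint from Technical Requirements
    port=5439,                 # standard Redshift port
    dbname="your_database",    # placeholder: supplied with your credentials
    user="your_username",      # placeholder
    password="your_password",  # placeholder
)

with conn.cursor() as cur:
    # Hypothetical query: completions recorded in the last 30 days.
    cur.execute("""
        SELECT user_id, item_id, completed_at
        FROM your_company_schema.completions
        WHERE completed_at >= CURRENT_DATE - 30
    """)
    for row in cur.fetchall():
        print(row)

conn.close()
```

The same connection details work in any PostgreSQL-compatible BI tool; the driver shown here is just one option.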
What You Get
- Direct Redshift access via secure, company-specific schemas
- Hourly data refresh to keep insights current
- Self-service dashboards embedded in Learn Amp
- Export flexibility: pull data to Apache Parquet, CSV, or any format your tools require
Prerequisites
Role Requirements
This is a provisioned service—contact your Customer Success Manager to enable access.
| Action | How |
|---|---|
| Enable Data Lake access | Contact your Customer Success Manager |
| Receive credentials | Provided to the designated Primary Contact |
| Request IP whitelisting | Submit via the Customer Support Portal |
Technical Requirements
- PostgreSQL-compatible client (Power BI, Tableau, DBeaver, psql)
- Outbound TCP port 5439 open on your firewall
- Static IP addresses for whitelisting (up to 5)
- DNS resolution for bi.la-dl.com
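If a connection fails, a quick pre-flight check narrows the cause to DNS or the firewall before you open a support ticket. This sketch assumes only the endpoint and port listed above:

```python
# Pre-flight connectivity check for the requirements above: DNS
# resolution for bi.la-dl.com and outbound TCP on port 5439.
import socket

HOST, PORT = "bi.la-dl.com", 5439

try:
    addr = socket.gethostbyname(HOST)  # verifies DNS resolution
    with socket.create_connection((addr, PORT), timeout=5):
        print(f"Reached {HOST} ({addr}) on port {PORT}")
except socket.gaierror:
    print(f"DNS lookup failed for {HOST}: check DNS resolution")
except OSError as exc:
    print(f"Could not reach {HOST}:{PORT}: check firewall rules and IP whitelisting ({exc})")
```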
Quick Start Guide
1. Contact your CSM to enable Data Lake access
2. Follow the setup guide: Getting Started with Data Lake
3. Review the data schema: Data LakeHouse: Data Schema
4. Connect your BI tool using the provided Redshift credentials
5. For advanced analytics: export your query results to Apache Parquet or your preferred format (see the sketch below)
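Step 5 can be as small as reading a query into a DataFrame and writing it out as Parquet. A minimal sketch, assuming pandas, pyarrow, and psycopg2 are installed; the query and file name are illustrative:

```python
# Minimal "Data Lake" sketch: pull a result set from Redshift into
# pandas, then write it out as Apache Parquet.
import pandas as pd
import psycopg2

conn = psycopg2.connect(
    host="bi.la-dl.com", port=5439,
    dbname="your_database", user="your_username", password="your_password",
)

# Hypothetical table; substitute your company-specific schema.
df = pd.read_sql("SELECT * FROM your_company_schema.completions", conn)
df.to_parquet("completions.parquet", index=False)  # pyarrow handles the write
conn.close()
```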
FAQs
How do I connect to the data?
All access is via our AWS Redshift database. Use any PostgreSQL-compatible client (Power BI, Tableau, DBeaver, psql) with the credentials provided by your CSM.
What's the difference between Data Warehouse and Data Lake usage?
Both connect to the same Redshift database. "Data Warehouse" typically means querying directly for BI reports. "Data Lake" refers to exporting that data to formats like Apache Parquet for use in data science, machine learning, or custom analytics pipelines.
How often is data refreshed?
Data refreshes hourly through our ETL pipeline.
What BI tools can I use?
Any PostgreSQL-compatible tool: Power BI, Tableau, Looker, DBeaver, or even command-line psql.
Can I export data to other formats?
Yes! Once connected to Redshift, you can export query results to Apache Parquet, CSV, JSON, or any format your tools support.
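Continuing the pandas sketch from the Quick Start Guide, other formats are one-liners on the same DataFrame:

```python
# Assuming `df` holds query results, as in the Parquet sketch above.
df.to_csv("completions.csv", index=False)
df.to_json("completions.json", orient="records")
```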
Is the Data Lake still in Beta?
Yes, we're actively evolving the platform based on customer feedback.
Troubleshooting
| Issue | Solution |
|---|---|
| Cannot connect | Verify port 5439 is open and your IP is whitelisted |
| Credentials not working | Check spelling; passwords are case-sensitive |
| Missing data | Allow up to 1 hour for the ETL refresh |
| Need more than 5 IPs | Submit a request via the Customer Support Portal with justification |