Getting Started with Data Lake

Overview

This guide explains how to connect securely to your Learn Amp Data Lake, which is built on Amazon Redshift. Whether you're setting it up for the first time or troubleshooting access, you'll find the technical requirements and step-by-step instructions here.

💡 Looking to enable Data Lake? Contact your Customer Success Manager to explore how this feature can enhance your reporting and data integration capabilities.


Functionality Breakdown

What is Learn Amp Data Lake?

The Learn Amp Data Lake is part of our modern Data LakeHouse architecture, designed to deliver scalable, flexible access to learning data for advanced reporting, analytics, and integration.

Feature | Description
Advanced Analytics | Self-service dashboards and reporting tools embedded in Learn Amp
Direct Data Access | Secure, customer-specific schemas for BI tools, HRIS platforms, and AI pipelines
Custom ETL Pipeline | Optimised for analytics workloads, transforming data into query-ready formats

⚠️ Note: The Data Lake is currently in Beta. We're actively evolving the platform based on customer feedback.

Data Refresh

Our Beta release targets an hourly ETL refresh cycle, ensuring your data stays current throughout the day.


Pre-requisites

Role Requirements

This is a provisioned service—there are no in-app settings to configure.

Action | Who Can Help
Enable Data Lake access | Contact your Customer Success Manager
Receive credentials | Provided to designated Primary Contact
Request IP whitelisting | Submit via Customer Support Portal

Technical Requirements

Before connecting, ensure your network meets these requirements:

Requirement | Details
Outbound TCP Port 5439 | Must be open on your network firewall to allow connections to Redshift
DNS Resolution | Your network must resolve the hostname bi.la-dl.com to a valid AWS IP
IP Whitelisting | The public IP address(es) you provide must be your actual egress IPs, i.e. the addresses external services see after any NAT or proxy
Antivirus/Firewall | Local security software must allow traffic on port 5439 with no SSL interception
Proxy Configuration | If using a proxy, it must permit access to bi.la-dl.com on port 5439
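The DNS and port requirements above can be checked before you request credentials. The following is a minimal sketch using only the Python standard library; the function names resolves and port_reachable are illustrative and not part of any Learn Amp tooling. It tests name resolution and TCP reachability only, not credentials.

```python
import socket

HOST = "bi.la-dl.com"  # hostname given in this guide
PORT = 5439            # port given in this guide


def resolves(host: str) -> bool:
    """Return True if the hostname resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(host, None)) > 0
    except socket.gaierror:
        return False


def port_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example usage (requires network access from the machine that will connect):
#   print("DNS ok: ", resolves(HOST))
#   print("Port ok:", port_reachable(HOST, PORT))
```

Run the check from the same machine or network segment that your BI tool will use, since firewall and proxy rules often differ between hosts.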

Why IP Whitelisting Matters

To safeguard your data, access to the Data Lake is strictly controlled through IP Whitelisting. Only approved public IP addresses can connect—this aligns with the Principle of Least Privilege and minimises security risks.


Quick Start Guide

Step 1: Gather Required Details

Collect the following before submitting your request:

  • Your Learn Amp subdomain

  • Public IP address(es) to be whitelisted (not internal/dynamic IPs)

  • Primary Contact name and email for receiving credentials

Step 2: Submit Your Request

Raise a ticket via the Customer Support Portal including:

  • Your subdomain

  • List of IP addresses (up to 5)

  • Primary Contact details
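Before raising the ticket, it can help to sanity-check the IP list against the two rules stated above: addresses must be public, and the standard limit is 5. This is a rough sketch using Python's ipaddress module; check_ip_list is a hypothetical helper name. It cannot tell whether an address is static or dynamic, so that still needs confirming with your network team.

```python
import ipaddress

MAX_IPS = 5  # standard limit stated in this guide


def check_ip_list(addresses: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if len(addresses) > MAX_IPS:
        problems.append(
            f"{len(addresses)} addresses given; standard limit is {MAX_IPS}"
        )
    for raw in addresses:
        try:
            ip = ipaddress.ip_address(raw.strip())
        except ValueError:
            problems.append(f"{raw!r} is not a valid IP address")
            continue
        # is_global is False for private, loopback, link-local and reserved ranges.
        if not ip.is_global:
            problems.append(f"{ip} is not a public address")
    return problems


# Example usage:
#   for problem in check_ip_list(["10.0.0.1", "8.8.8.8"]):
#       print(problem)
```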

Step 3: Provisioning

We will:

  • Create a read-only Data Lake user for your subdomain

  • Whitelist your provided IP addresses

  • Send connection credentials securely to your Primary Contact

You'll receive: hostname, database name, username, and password.

Step 4: Test Your Connection

Use any PostgreSQL-compatible client to connect:

  • Hostname: bi.la-dl.com

  • Port: 5439

  • Database: Provided in credentials (e.g., prod_eu1)

  • Username: Provided in credentials

  • Password: Provided in credentials

💡 Tip: Store your credentials securely—they'll be needed for all future connections.
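One common way to keep credentials out of scripts is to read them from environment variables and assemble a PostgreSQL connection URI at run time. The sketch below assumes hypothetical variable names (DATALAKE_USER, DATALAKE_PASSWORD, DATALAKE_DB, and optional DATALAKE_HOST and DATALAKE_PORT); they are not defined by Learn Amp, so use whatever naming your own secret store provides.

```python
import os
from urllib.parse import quote


def build_uri(env=os.environ) -> str:
    """Build a PostgreSQL connection URI from environment variables.

    The variable names are illustrative assumptions, not a Learn Amp convention.
    """
    user = env["DATALAKE_USER"]
    password = env["DATALAKE_PASSWORD"]
    database = env["DATALAKE_DB"]
    host = env.get("DATALAKE_HOST", "bi.la-dl.com")  # hostname from this guide
    port = env.get("DATALAKE_PORT", "5439")          # port from this guide
    # Percent-encode each part so characters such as '@', ':' or '/' in a
    # password do not break the URI.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


# Example usage:
#   uri = build_uri()
#   # Pass uri to a PostgreSQL-compatible driver or client of your choice.
```

A URI in this form is accepted by psql and by drivers built on libpq. Avoid printing or logging it, since it contains the password.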


Example Connection

Using psql Command Line

    # Connect using the psql client
    psql -d prod_eu1 -U datalake_access_<subdomain>_db_user -h bi.la-dl.com -p 5439

    -- List all available tables in your schema
    \dt <subdomain>.*

    -- View data in a table
    SELECT * FROM <subdomain>.tags LIMIT 10;

Replace <subdomain> with your actual Learn Amp subdomain.
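The same preview query can be issued from a script. The sketch below works with any Python DB-API 2.0 connection object, so it does not depend on a particular driver; preview_sql and preview are hypothetical helper names. It assumes your schema and table names are plain identifiers (letters, digits, underscores), and it rejects anything else rather than attempting to quote it.

```python
import re

# Plain SQL identifiers only: letters, digits and underscores.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def preview_sql(schema: str, table: str, limit: int = 10) -> str:
    """Build a 'SELECT * ... LIMIT n' statement after validating the names."""
    for name in (schema, table):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"SELECT * FROM {schema}.{table} LIMIT {int(limit)}"


def preview(conn, schema: str, table: str, limit: int = 10) -> list:
    """Run the preview query on a DB-API 2.0 connection and return the rows."""
    cur = conn.cursor()
    try:
        cur.execute(preview_sql(schema, table, limit))
        return cur.fetchall()
    finally:
        cur.close()


# Example usage, given an open connection `conn` from your chosen driver:
#   rows = preview(conn, "your_subdomain", "tags")
```

Identifiers cannot be passed as bound query parameters, which is why the names are validated and interpolated instead.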


FAQs

What if I use a NAT Gateway or proxy?

If your outbound traffic routes through NAT or a proxy, the public IP seen by external services is your NAT/proxy egress point. This is the IP to whitelist—and it must be static. If your proxy performs TLS inspection, configure an exception for TCP 5439 traffic to bi.la-dl.com.
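If you are unsure which address external services see, you can ask a public IP echo service from the machine that will connect. The sketch below uses https://checkip.amazonaws.com, which returns the caller's public IP as plain text; that choice is an assumption made for this example and is not something Learn Amp prescribes. The helper names are illustrative.

```python
import ipaddress
import urllib.request

# Assumption: any plain-text "what is my IP" service would work here.
CHECK_URL = "https://checkip.amazonaws.com"


def parse_ip(body: str) -> str:
    """Validate and normalise the plain-text address returned by the service."""
    return str(ipaddress.ip_address(body.strip()))


def egress_ip(url: str = CHECK_URL, timeout: float = 5.0) -> str:
    """Return the public IP address that external services see for this host."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_ip(resp.read().decode("ascii"))


# Example usage (requires outbound HTTPS access):
#   print("Egress IP:", egress_ip())
```

Running it a few times over several days gives a rough indication of whether the address is stable, though only your network team can confirm that it is static.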

What if I need more than 5 IP addresses?

Our standard policy allows up to 5 IPs per customer. If you need more (e.g., multiple offices or large analyst teams), submit a request via the Customer Support Portal with a brief justification explaining your network architecture.

What BI tools can I use?

Any PostgreSQL-compatible tool: Power BI, Tableau, Looker, DBeaver, or command-line psql.

How often is data refreshed?

Data refreshes hourly. Actual times may vary during Beta as we optimise performance.


Troubleshooting

Issue | Solution
Connection refused | Verify port 5439 is open on your firewall and your IP is whitelisted
Authentication failed | Double-check credentials; usernames and passwords are case-sensitive
Cannot resolve hostname | Check DNS resolution for bi.la-dl.com. Try nslookup bi.la-dl.com
Timeout during connection | Check proxy/firewall settings. Ensure no SSL interception on port 5439
Need more than 5 IPs | Submit a request via the Customer Support Portal with justification
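When scripting connection tests, the network-level rows of the table above map fairly directly onto Python exception types. The following sketch shows one possible mapping; advice_for is a hypothetical helper. Authentication failures are reported by the database driver rather than the socket layer, so they are not covered here.

```python
import socket


def advice_for(exc: BaseException) -> str:
    """Map a network-level exception to the matching troubleshooting hint."""
    if isinstance(exc, socket.gaierror):
        return "Cannot resolve hostname: check DNS resolution for bi.la-dl.com"
    if isinstance(exc, ConnectionRefusedError):
        return "Connection refused: verify port 5439 is open and your IP is whitelisted"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "Timeout: check proxy/firewall settings and SSL interception on port 5439"
    return f"Unrecognised error: {exc!r}"


# Example usage (requires network access):
#   try:
#       socket.create_connection(("bi.la-dl.com", 5439), timeout=5).close()
#       print("TCP connection succeeded")
#   except OSError as exc:
#       print(advice_for(exc))
```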


Still Need Help?

We're here to support you. If you run into issues or have questions, raise a ticket via the Customer Support Portal or contact your Customer Success Manager.


Next Steps