How can I request access to the Data Lake?
Access to the Data Lake is provided to customers who have purchased the Data Lake Bolt-on and to customers on the “Advanced” package. Please contact your Customer Success Manager if you are interested in access to the Data Lake.
This guide illustrates the steps needed to provide you with access to the Data Lake.
Preliminary considerations: Security Best Practices
To ensure the security of your Data Lake, we follow the principle of least privilege. This means we only whitelist the necessary IPs to access the Data Lake. Please ensure that:
Only the specific IPs or IP ranges that will need access to the Data Lake are provided. We recommend whitelisting only a small number of IPs for the most secure access.
If you are unsure which IPs to whitelist, please consult with your IT team to verify the minimal set of IPs needed for access.
Avoid providing large IP ranges, unless absolutely necessary. Whitelisting large ranges increases the attack surface and poses unnecessary security risks.
Redshift Endpoint Access: In addition to whitelisting your IPs, ensure your firewall is configured to allow access to the Redshift endpoint.
IP Whitelisting Policy for Data Lake Access
To enhance security and ensure efficient access control, we enforce a strict IP whitelisting policy:
Minimal Access Approach
Customers must specify only the necessary users or machines requiring access.
Instead of large CIDR blocks, only the exact IPs of authorized users/machines should be provided.
This minimizes security risks and ensures controlled access.
Maximum 5 IPs Per Customer
Each customer can whitelist up to 5 IP addresses for accessing the Data Lake.
Requests exceeding this limit will need to be analysed and approved.
Customers should carefully choose the most essential IPs for access.
This policy gives customers secure, efficient access to the Data Lake while preventing excessive exposure.
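Before submitting a request, it can help to check the IP list against the policy above. The following is a minimal sketch, not an official tool: the function name `check_whitelist_request` is hypothetical, and the check for private or reserved addresses is an assumption based on this guide's references to whitelisted public IPs. The addresses in the demo are placeholders.

```python
import ipaddress

MAX_IPS = 5  # per-customer limit stated in the policy above


def check_whitelist_request(entries):
    """Return a list of problems; an empty list means the request fits the policy."""
    problems = []
    accepted = set()
    for entry in entries:
        text = entry.strip()
        try:
            # strict=False lets "10.0.0.5/24" parse instead of raising on host bits
            network = ipaddress.ip_network(text, strict=False)
        except ValueError:
            problems.append(f"{text!r} is not a valid IP address or CIDR block")
            continue
        if network.num_addresses > 1:
            problems.append(
                f"{text} is a CIDR block covering {network.num_addresses} "
                "addresses; list exact IPs instead"
            )
        elif network.is_private:
            # Assumption: only public egress IPs are useful for whitelisting
            problems.append(f"{text} is a private or reserved address, not a public IP")
        else:
            accepted.add(network)
    if len(accepted) > MAX_IPS:
        problems.append(
            f"{len(accepted)} IPs requested, more than the limit of {MAX_IPS}; "
            "the request will need extra review"
        )
    return problems


if __name__ == "__main__":
    # Placeholder entries, each chosen to trigger one kind of problem
    for problem in check_whitelist_request(["192.168.1.10", "203.0.113.0/24", "not-an-ip"]):
        print(problem)
```

An empty result only means the list matches the rules described here; the request is still reviewed through the normal support process.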
Process to request access
Step 1: Gather the Required Information
Ensure you have the following details prepared:
Your Subdomain: The subdomain associated with your company in our system.
Your IP Addresses: Access to the Data Lake is restricted to whitelisted IPs. Provide every IP address from which authorized users in your organization will access the Data Lake.
Point of Contact: Identify the primary contact person at your organization for this request. This person will receive secure credentials for access.
Step 2: Submit Your Request
To request access:
Create a Support question in our customer portal.
Include the following in your request:
Your company subdomain.
The IP addresses to whitelist.
The name and email of the primary contact person.
Step 3: Access Provisioning
Once your request is received:
A read-only user will be created for your subdomain in the Data Lake.
Connection credentials, including the username and connection details, will be securely shared with you via email.
The IP addresses provided will be whitelisted.
Note: We store the credentials securely in our system, but make sure you also keep your own copy in a secure place.
Step 4: Test Access
After receiving your credentials:
Use the provided credentials to verify access to the schema and tables. For example, with the psql client:
# connect using psql client
psql -d prod_eu1 -U datalake_access_<subdomain>_db_user -h bi.la-dl.com -p 5439
# list all of the available tables of your schema
\dt <subdomain>.*
# view data in a table
select * from <subdomain>.tags limit 10;
If you encounter any issues or have any questions, please comment on your open ticket or raise a new support ticket.
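The psql test can also be scripted. The sketch below only assembles the command line from the values shown in the example above (host, database, port, and the `datalake_access_<subdomain>_db_user` username pattern). The helper names and the `acme` subdomain are hypothetical, and the script assumes the psql client is installed if you choose to run the resulting command.

```python
import shlex


def build_psql_command(subdomain, host="bi.la-dl.com", database="prod_eu1",
                       port=5439, query=None):
    """Build the psql argument list used in the example above."""
    user = f"datalake_access_{subdomain}_db_user"
    argv = ["psql", "-d", database, "-U", user, "-h", host, "-p", str(port)]
    if query is not None:
        # -c runs a single command and exits instead of opening a session
        argv += ["-c", query]
    return argv


def sample_query(subdomain, table="tags", limit=10):
    """Same query as in the guide: a few rows from one table of your schema."""
    return f"select * from {subdomain}.{table} limit {limit};"


if __name__ == "__main__":
    # "acme" is a placeholder; substitute your own subdomain
    argv = build_psql_command("acme", query=sample_query("acme"))
    print(shlex.join(argv))
```

The printed command can be pasted into a terminal. psql will prompt for the password unless it is supplied through the standard `PGPASSWORD` environment variable or a `.pgpass` file.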
Client-Side Requirements for Successful Redshift Access
🔒 Network Configuration
Allow Outbound Access to Redshift Endpoint:
Port: TCP 5439 (default Redshift port)
Protocol: TCP
Destination: Your Redshift cluster endpoint (redshift-cluster.czare90wicld.eu-west-1.redshift.amazonaws.com)
Traffic to the Data Lake must leave your network through the public (egress) IP addresses that were whitelisted
Ensure Actual Public IP Matches Whitelisted IPs:
If your clients sit behind NAT or proxy gateways, ensure outbound traffic comes from the whitelisted public IP(s).
No Proxy Blocking Redshift:
If outbound connections go through a proxy, it must allow:
TCP 5439
Connections to Redshift domain/IPs
Some proxies break SSL or require explicit allowlisting of domains
DNS Must Work:
Redshift is hostname-based. Your DNS must resolve the Redshift hostname to a valid AWS public IP (176.34.231.221). Use nslookup or dig to verify.
Internal DNS or firewall DNS overrides may interfere
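The network checks above (DNS resolution, outbound TCP 5439, and the egress IP) can be run together from the machine that will connect. This is a rough sketch using only the Python standard library. The function names are hypothetical, the hostname and expected IP are copied from this guide, and `https://checkip.amazonaws.com` is an AWS service that returns the caller's public IP as plain text.

```python
import ipaddress
import socket
import urllib.request

HOST = "redshift-cluster.czare90wicld.eu-west-1.redshift.amazonaws.com"  # from this guide
PORT = 5439
EXPECTED_IP = "176.34.231.221"  # public IP quoted in this guide


def resolve(host):
    """Return the sorted IPv4 addresses the local resolver gives for host."""
    infos = socket.getaddrinfo(host, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def public_ip(url="https://checkip.amazonaws.com", timeout=5.0):
    """Ask an external service which public IP this machine's traffic comes from."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("ascii").strip()


def egress_matches(ip, whitelisted):
    """Return True if ip is one of the whitelisted addresses."""
    return ipaddress.ip_address(ip) in {ipaddress.ip_address(w) for w in whitelisted}


def run_preflight(whitelisted, host=HOST, port=PORT):
    """Print one line per check for the given list of whitelisted IPs."""
    try:
        addresses = resolve(host)
        print("DNS resolves to:", ", ".join(addresses))
        print("Matches IP from this guide:", EXPECTED_IP in addresses)
    except OSError as exc:
        print("DNS lookup failed:", exc)
    print(f"TCP {port} reachable:", can_connect(host, port))
    try:
        ip = public_ip()
        print("Egress IP:", ip, "| whitelisted:", egress_matches(ip, whitelisted))
    except (OSError, ValueError) as exc:
        print("Could not determine egress IP:", exc)
```

Call `run_preflight([...])` with the IPs you submitted for whitelisting. A failed TCP check does not say which layer is at fault: a timeout is consistent both with a local firewall or proxy blocking port 5439 and with an egress IP that is not whitelisted.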
🛡️ Firewall / Security Gateway
Permit Egress to AWS Redshift IP Range (Region-Specific):
Ensure your firewall doesn't block outbound traffic to AWS public IPs for Redshift (you can find IP ranges in AWS IP Ranges JSON)
Disable Deep Packet Inspection / SSL Termination:
Redshift uses TLS for secure connections; DPI tools may block it or degrade performance
Whitelist Domain/Service:
Add your Redshift endpoint to the allowlist in any:
Web filters
Security gateways (like Zscaler, Palo Alto, etc.)
Proxy-based controls
Antivirus / Endpoint Protection:
Some corporate antivirus software blocks unknown ports like 5439 or drops packets silently
Redshift must be added to "trusted domains" if needed
📊 Client Tools (e.g. PowerBI, Tableau, DBeaver)
Install Redshift ODBC/JDBC Driver:
Use the latest Amazon Redshift driver from Amazon's official source
Allow the Tool to Connect Unrestricted:
Ensure firewall/antivirus allows PowerBI/Tableau/DBeaver to initiate outbound TCP traffic
Avoid using intermediate network gateways that could alter requests
Use Correct Connection String:
Example:
redshift-cluster.czare90wicld.eu-west-1.redshift.amazonaws.com:5439/prod_eu1
redshift-cluster.czare90wicld.eu-west-1.redshift.amazonaws.com:5439/prod_eu2
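How the connection string is entered depends on the tool. As an illustration, the sketch below builds the `host:port/database` form shown above and the `jdbc:redshift://` URL form used by the Amazon Redshift JDBC driver (for example in DBeaver). The helper names are hypothetical, and other tools may ask for server and database as separate fields, so treat this as a starting point and check your tool's documentation.

```python
ENDPOINT = "redshift-cluster.czare90wicld.eu-west-1.redshift.amazonaws.com"  # from this guide
PORT = 5439
DATABASES = ("prod_eu1", "prod_eu2")  # the two databases listed in this guide


def connection_target(database):
    """Return the host:port/database string in the form shown in this guide."""
    if database not in DATABASES:
        raise ValueError(f"unknown database {database!r}; expected one of {DATABASES}")
    return f"{ENDPOINT}:{PORT}/{database}"


def jdbc_url(database):
    """Return the URL form expected by the Amazon Redshift JDBC driver."""
    return "jdbc:redshift://" + connection_target(database)


if __name__ == "__main__":
    for name in DATABASES:
        print(jdbc_url(name))
```

Which of the two databases applies to your account is given in the connection details you receive with your credentials.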
📋 Summary Checklist
Setting | Required Configuration |
---|---|
Outbound Port 5439 | Must be open |
DNS Resolution | Must resolve to AWS Redshift public IP (176.34.231.221) |
Firewall | Must not block AWS/Redshift traffic |
Proxy | Must allow Redshift endpoint and not intercept TLS |
Public IP | Must match whitelisted IPs |
Antivirus/EDR | Must not block outbound database traffic |
BI Tool Setup | Correct driver + correct connection string |