Overview
The JupyterLab launcher landing page after first boot, with the bundled starter notebook visible in the file browser.
This is a repackaged open source software product wherein additional charges apply for cloudimg support services.
Why This AMI Over Manual Setup
Installing DuckDB, configuring JupyterLab, setting up nginx authentication, and preparing sample data typically takes hours of manual work. This AMI eliminates that effort entirely - launch an instance and run your first analytical query within minutes, not hours. Every component is pre-integrated and secured with per-instance credentials generated automatically on first boot, so there are no shared or default passwords to worry about.
Overview
DuckDB is an open source, in-process analytical database engine designed for fast queries against large columnar datasets. With over 6 million monthly downloads across its community, DuckDB has become a leading choice for OLAP workloads. This image ships DuckDB 1.5 inside a complete analytics environment so you can connect, load data, and run queries within minutes of launch.
Application Stack
- DuckDB CLI installed system-wide on every user's PATH
- JupyterLab notebook server pre-configured with Python 3.12, the DuckDB Python client, pandas, and PyArrow
- nginx on port 80 with HTTP basic authentication fronting JupyterLab
- SSH access for terminal-driven analytics via the DuckDB CLI
Sample Dataset and Starter Notebook
A one-million-row New York City yellow taxi trips parquet file is bundled on a dedicated data disk. A starter notebook opens a persistent DuckDB database against the parquet file and runs three analytical queries so you can see the engine in action before writing any code.
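As a rough illustration of the kind of query the starter notebook runs, the sketch below builds an aggregate over a parquet file and executes it with the DuckDB Python client. The file path and the column names (`passenger_count`, `fare_amount`) are assumptions based on the public NYC yellow taxi schema, not taken from the bundled notebook, and the sketch uses an in-memory connection rather than the persistent database the notebook opens.

```python
def trip_summary_sql(parquet_path: str) -> str:
    """Build an aggregate query over a parquet file (column names are assumed)."""
    escaped = parquet_path.replace("'", "''")  # escape single quotes for the SQL literal
    return (
        "SELECT passenger_count, "
        "COUNT(*) AS trips, "
        "ROUND(AVG(fare_amount), 2) AS avg_fare "
        f"FROM read_parquet('{escaped}') "
        "GROUP BY passenger_count "
        "ORDER BY passenger_count"
    )


def run_trip_summary(parquet_path: str):
    """Run the summary on an in-memory DuckDB connection and return the rows."""
    import duckdb  # imported lazily so the SQL helper works without the client installed

    con = duckdb.connect()  # in-memory database
    try:
        return con.execute(trip_summary_sql(parquet_path)).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    # Hypothetical path; substitute the parquet file's real location on the data disk.
    print(trip_summary_sql("/path/to/yellow_tripdata.parquet"))
```

On the instance, calling `run_trip_summary` with the real path of the bundled parquet file should return one row per passenger count.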
Security and Access Control
On first boot, a one-shot service generates a fresh JupyterLab administrator password unique to that instance, writes it into the nginx HTTP basic authentication store, and stores the plain-text value in a root-only file accessible via SSH. No shared or default credentials ship in the image. The dedicated storage volume can leverage EBS encryption for data at rest. Buyers requiring HTTPS should configure a TLS certificate on the nginx frontend or place the instance behind an AWS Application Load Balancer with TLS termination.
Dedicated Storage Tier
DuckDB databases, notebooks, and sample data live on a separate, independently resizable storage volume kept off the operating system disk. This means your analytics tier can grow without disturbing the rest of the instance - scale storage as your datasets expand without reprovisioning.
Use Cases
- Ad hoc analytics on parquet, CSV, and JSON files
- Local data warehouse and BI prototyping
- Querying data on Amazon S3 directly with DuckDB's httpfs extension
- Embedded analytics inside notebooks and Python applications
- Single-node OLAP for departmental reporting and finance teams
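For the Amazon S3 use case in the list above, a minimal sketch of the statements involved might look like the following. The bucket, key, and region are placeholders, and credential setup is deliberately left out; how the instance obtains AWS credentials is not covered by this listing.

```python
def s3_parquet_statements(bucket: str, key: str, region: str) -> list[str]:
    """Return the SQL statements needed to count rows in a parquet object on S3."""
    def quote(value: str) -> str:
        return value.replace("'", "''")  # escape single quotes for SQL literals

    return [
        "INSTALL httpfs",
        "LOAD httpfs",
        f"SET s3_region = '{quote(region)}'",
        f"SELECT COUNT(*) FROM read_parquet('s3://{quote(bucket)}/{quote(key)}')",
    ]


def count_rows_on_s3(bucket: str, key: str, region: str) -> int:
    """Execute the statements with the DuckDB Python client and return the row count."""
    import duckdb  # imported lazily so the statement builder works without the client

    con = duckdb.connect()
    try:
        *setup, query = s3_parquet_statements(bucket, key, region)
        for statement in setup:
            con.execute(statement)
        return con.execute(query).fetchone()[0]
    finally:
        con.close()


if __name__ == "__main__":
    # Placeholder bucket and key, shown for illustration only.
    for statement in s3_parquet_statements("example-bucket", "trips/2024.parquet", "us-east-1"):
        print(statement + ";")
```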
Getting Started
- Launch the AMI on your preferred EC2 instance type
- Ensure your security group allows inbound traffic on port 80 (HTTP) and port 22 (SSH)
- SSH into the instance and retrieve the generated password from /root/.jupyter_password
- Browse to the instance's public IP address and sign in with username "admin" and the retrieved password
- Open the starter notebook and run the bundled analytical queries
- Use the DuckDB CLI over SSH for terminal-driven workflows
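Once the password has been retrieved, the sign-in step above can also be checked from a script. The sketch below sends a single request with an HTTP basic authentication header using only the Python standard library; the function names are illustrative, and the URL, username, and password must be supplied by the caller.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def fetch_status(base_url: str, username: str, password: str, timeout: float = 10.0) -> int:
    """Request the JupyterLab front page through nginx and return the HTTP status code."""
    request = urllib.request.Request(
        base_url,
        headers={"Authorization": basic_auth_header(username, password)},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call such as `fetch_status("http://<instance-public-ip>/", username, password)` would be expected to return 200 when the credentials are accepted; `urllib` raises `HTTPError` for a 401. Over plain HTTP the header travels unencrypted, which is one reason to configure TLS as described under Security and Access Control.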
cloudimg Support
24/7 technical support by email and live chat. Our engineers provide expert assistance with DuckDB deployment, notebook configuration, dataset loading, performance tuning, and engine upgrades. Critical issues receive a one-hour average response time.
All product and company names are trademarks or registered trademarks of their respective holders. Use of them does not imply any affiliation with or endorsement by them.
Highlights
- Skip hours of manual setup - DuckDB 1.5, JupyterLab with Python 3.12, nginx authentication, and a sample parquet dataset are pre-integrated and launch-ready. Unique per-instance credentials are generated automatically on first boot with no shared or default passwords, giving you stronger security than a default manual installation.
- Run your first analytical query within minutes of launch using the bundled one-million-row NYC taxi dataset and starter notebook on a dedicated, independently resizable storage volume. Scale your analytics data without touching the OS disk - something that requires custom partitioning in a self-managed setup.
- 24/7 expert support from cloudimg with one-hour average response for critical issues. Engineers specialize in DuckDB deployment, notebook configuration, dataset loading, and performance tuning - dedicated expertise you would not get from generic cloud support plans.
Details
Pricing
| Dimension | Description | Cost/hour |
|---|---|---|
| m5.large (Recommended) | m5.large | $0.08 |
| t3.micro | t3.micro instance type | $0.04 |
| t2.micro | t2.micro instance type | $0.04 |
| m8azn.6xlarge | m8azn.6xlarge instance type | $0.24 |
| m7a.metal-48xl | m7a.metal-48xl instance type | $0.24 |
| i4i.24xlarge | i4i.24xlarge instance type | $0.24 |
| c8a.4xlarge | c8a.4xlarge instance type | $0.24 |
| r8id.16xlarge | r8id.16xlarge instance type | $0.24 |
| m8i-flex.large | m8i-flex.large instance type | $0.08 |
| c6a.2xlarge | c6a.2xlarge instance type | $0.24 |
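To turn the hourly rates above into a budget figure, a small calculation along these lines may help. It copies three rates from the table, uses `Decimal` to avoid floating-point rounding, and assumes the caller chooses the number of hours (730 is a common approximation for one month). It covers only the rates listed here; any charges not shown in the table, such as EC2 infrastructure costs, are outside its scope.

```python
from decimal import Decimal

# Hourly rates copied from the pricing table (a subset, for illustration).
HOURLY_RATES = {
    "m5.large": Decimal("0.08"),
    "t3.micro": Decimal("0.04"),
    "c8a.4xlarge": Decimal("0.24"),
}


def estimate_cost(instance_type: str, hours: int) -> Decimal:
    """Return the listed hourly rate multiplied by hours, rounded to cents."""
    return (HOURLY_RATES[instance_type] * hours).quantize(Decimal("0.01"))


if __name__ == "__main__":
    for name in HOURLY_RATES:
        print(name, estimate_cost(name, 730))
```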
Vendor refund policy
Refunds available on request.
Delivery details
64-bit (x86) Amazon Machine Image (AMI)
Amazon Machine Image (AMI)
An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.
Version release notes
Initial release of DuckDB 1.5 in a JupyterLab notebook environment on AWS.
Additional details
Usage instructions
- Connect via SSH on port 22 as the default login user for your operating system variant (the user guide lists it per variant).
- The DuckDB CLI is on the PATH, so 'duckdb /opt/duckdb/samples/main.duckdb' opens a session against the bundled database.
- JupyterLab is served on port 80 behind HTTP basic authentication; browse to http://<instance-public-ip>/ and sign in as 'duckdb'.
- Retrieve the generated password with: sudo cat /stage/scripts/duckdb-credentials.log
- The starter notebook 01-duckdb-quickstart.ipynb opens the parquet sample dataset and runs three example queries.
- Restrict port 80 to trusted networks because JupyterLab can execute arbitrary Python; to enable HTTPS, follow the reverse proxy section of the user guide.
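The bundled database can also be opened from Python rather than the CLI. The sketch below is a minimal example, assuming the DuckDB Python client is available in the environment where it runs; it opens the database path quoted in the usage instructions in read-only mode and lists its tables. The function name is illustrative, and what tables the database contains is not stated in this listing.

```python
from pathlib import Path

# Database path quoted in the usage instructions.
DB_PATH = Path("/opt/duckdb/samples/main.duckdb")


def list_tables(db_path: Path) -> list[str]:
    """Open the database read-only and return its table names."""
    import duckdb  # imported lazily; requires the DuckDB Python client

    con = duckdb.connect(str(db_path), read_only=True)
    try:
        return [row[0] for row in con.execute("SHOW TABLES").fetchall()]
    finally:
        con.close()


if __name__ == "__main__":
    if DB_PATH.exists():
        print(list_tables(DB_PATH))
    else:
        print(f"{DB_PATH} not found; run this on the instance")
```

Opening read-only avoids modifying the sample database by accident.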
Support
Vendor support
cloudimg provides 24/7 technical support for this DuckDB AMI product through two channels:
- Email: support@cloudimg.co.uk
- Live Chat: Available around the clock
Response Times:
- Critical issues: One-hour average response time
What We Help With:
- DuckDB deployment and engine upgrades
- JupyterLab notebook configuration
- Dataset loading and parquet file management
- Performance tuning for your workload
- Troubleshooting connectivity, authentication, and query issues
- Guidance on DuckDB's httpfs extension for querying Amazon S3 data
- Storage volume resizing and management
Getting Started Support: If you need help retrieving your instance credentials, configuring security groups, or connecting to JupyterLab for the first time, our engineers can walk you through the process.
Refunds: For refund requests, contact support@cloudimg.co.uk with your AWS Marketplace order details and we will assist you promptly.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.