How to work properly

Docker

What is a container?

A container is a process on your machine that is isolated from all other processes. This isolation relies on kernel namespaces, a feature that has been in Linux for a long time; Docker simply makes it easy to use. A container:

  • Is a runnable instance of an image
  • Can be run on local machines, virtual machines, or deployed in the cloud
  • Is portable (can be run on any OS)
  • Is isolated from other containers and runs its own software, binaries, and configurations

What is a container image?

When a container runs, it uses an isolated filesystem. This custom filesystem is provided by a container image, which contains everything needed to run an application: dependencies, configurations, scripts, binaries, environment variables, the default command to run, and metadata.

How to containerize an application?

We need to build the container image of an application using a Dockerfile. A Dockerfile is simply a text-based file with no file extension, containing a script of instructions that Docker uses to create a container image.

  1. In the app directory, create a file named Dockerfile
  2. Edit the Dockerfile, using instructions such as:
    1. FROM
    2. WORKDIR
    3. COPY
    4. RUN
    5. CMD
    6. EXPOSE
  3. Build the container image with the following command:

     docker build -t getting-started .
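
A minimal Dockerfile sketch putting those instructions together (the base image, paths, and port are illustrative, assuming a Node.js app like the one in Docker's getting-started tutorial):

   # Build from a small Node.js base image
   FROM node:18-alpine
   # Work in /app inside the image
   WORKDIR /app
   # Copy the application source into the image
   COPY . .
   # Install the dependencies
   RUN yarn install --production
   # Default command when the container starts
   CMD ["node", "src/index.js"]
   # Document the port the app listens on
   EXPOSE 3000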

Start an app container

Now that we have an image, we can run the application in a container using the docker run command.

docker run -dp 3000:3000 getting-started

-d runs the container in detached mode (in the background); -p 3000:3000 maps port 3000 on the host to port 3000 in the container.

In Docker Desktop, go to the Containers tab to see a list of your containers.

One major issue that we have now is that the data is not persistent: when our container is stopped or deleted, the data disappears. To solve this, we need to create a Docker volume and mount it when we launch the application.
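
A sketch of the volume workflow (the volume name is illustrative, and /etc/todos is the path the getting-started app writes to; adjust the target path for your own app):

   docker volume create todo-db
   docker run -dp 3000:3000 --mount type=volume,src=todo-db,target=/etc/todos getting-started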

Note: Docker Desktop is a paid product for larger organizations; Docker Engine on Linux is free.

How to work properly

Use of a Virtual environment

  • Create a folder to work in
  • Create a Python virtual environment with the command shown below
  • Activate the environment with the activation script, also shown below
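
The corresponding commands, as a sketch assuming Windows PowerShell (to match the Activate.ps1 step later in these notes); on Linux/macOS, activate with source venv/bin/activate instead:

   python -m venv venv
   .\venv\Scripts\Activate.ps1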

     

Use Git and GitLab or Github

  • Git init your app
  • Make your first commit
  • I would recommend using the Git extension in Visual Studio Code
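
For example, from the app directory:

   git init
   git add .
   git commit -m "Initial commit"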

Use of Pyenv

  • pyenv makes it easy to manage multiple versions of Python
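
For example, install a version and pin it for the current folder (the version number is illustrative):

   pyenv install 3.11.4
   pyenv local 3.11.4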

Structure of projects

App:

Make the most of Visual Studio Code

  • Explorer : folder and file structures
  • Search
  • Source Control: Git management; make sure to display the changes in the form of a tree, and get ready to make your first commit
  • Install plugins :
    • Dataiku DSS
    • Docker
    • Python
    • Pylance
    • PowerShell
  • Testing: define your testing strategies
  • PowerShell: quick access to Windows PowerShell
  • Dataiku: connect to your DSS project via an API key
  • Docker: manage images and containers
  • Connect Git to a remote location (GitLab, GitHub, etc.)

Technical Stack choices

  • Databases: Neo4j (graph), MySQL, Amazon RDS
  • Frontend: React.js + Vue.js (TypeScript)
  • Backend: Flask API, Django (Python)
  • Versioning + CI/CD: GitLab, GitHub
  • Process manager: Supervisor or Circus (Python)
  • Data updates: Airflow
  • Cloud server: AWS EC2, S3, Docker
  • API keys, users, and passwords as environment variables on the server side (you can use a .env or .secrets file; see the sketch after this list)
  • Authentication: Auth0
  • Payment system: Stripe (publishable key on the front end, secret key on the Flask back end)
    • npm install @stripe/react-stripe-js @stripe/stripe-js
    • pip install stripe
    • Create a payments database in MySQL and write information to it using the Flask API
  • Analytics: Google Analytics
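
A sketch of such a .env file (the variable names and values are illustrative; keep this file out of version control):

   # .env (add this file to .gitignore)
   STRIPE_SECRET_KEY=sk_test_xxx
   DB_USER=app
   DB_PASSWORD=changeme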


Amazon Neptune

Graph traversal: going through a graph to retrieve the data relevant to a specific query.

Amazon Neptune is a fully managed graph database service. The core is a graph database engine.

Neptune supports Apache TinkerPop Gremlin and openCypher (the open query language derived from Neo4j's Cypher).

  1. Create a Neptune DB cluster
  2. Connect to the Neptune graph: note that Snowflake and Neptune live in two different VPCs
  3. Use curl and an SSL certificate to communicate with the Neptune endpoint (see the sketch after this list)
  4. Use the Neptune Bulk Loader to ingest large amounts of data
  5. Use a graph query language to query the data
  6. Use visualization tools: Graph-explorer, Tom Sawyer, Cambridge Intelligence, Graphistry, … or a custom app
  7. Export data using curl
  8. Use AWS Lambda to retrieve data or query the graph on a regular basis
  9. Neptune ML setup and SageMaker: Neptune supports both transductive inference, which returns predictions that were pre-computed at training time based on your graph data at that time, and inductive inference, which applies data processing and model evaluation in real time, based on current data. See the difference between inductive and transductive inference.
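
A sketch of step 3, querying the cluster's Gremlin HTTP endpoint with curl (the endpoint hostname is a placeholder; Neptune listens on port 8182):

   curl -X POST https://your-neptune-endpoint:8182/gremlin \
        -d '{"gremlin": "g.V().limit(1)"}'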

 

Neptune ML can train machine learning models to support five different categories of inference:

  • Node classification
  • Node regression
  • Edge classification
  • Edge regression
  • Link prediction

Max size: 768 GB

Neo4j

A Neo4j graph can be pulled as a Docker image. Loading data into the graph is quite easy and only requires following a naming convention for the node and relationship CSV files.

To start a Neo4j project from scratch we need:

docker-compose.yml


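A minimal docker-compose.yml sketch (the image tag, container name, password, and mounted folders are illustrative; the ports match the Neo4j defaults mentioned later in these notes):

   services:
     neo4j:
       image: neo4j:5
       container_name: neo4j
       ports:
         - "7474:7474"   # HTTP browser UI
         - "7687:7687"   # Bolt driver
       environment:
         - NEO4J_AUTH=neo4j/changeme
       volumes:
         - ./neo4j/data:/data      # database files
         - ./neo4j/import:/import  # CSV files to import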

neo4j

Makefile
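
A Makefile sketch with a few convenience targets (the target names and container name are illustrative):

   # Start the stack in the background
   up:
   	docker compose up -d

   # Stop and remove the stack
   down:
   	docker compose down

   # Open a shell inside the neo4j container
   shell:
   	docker exec -it neo4j bash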

 

Import the data once you are inside the container (use the docker exec -it command):


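A sketch of the import, assuming Neo4j 5 and CSV files in the mounted /import folder (the database name and file names are illustrative; the CSVs must follow the neo4j-admin header conventions):

   docker exec -it neo4j bash
   # inside the container; the target database must not be running during a full import
   neo4j-admin database import full neo4j \
       --nodes=/import/nodes.csv \
       --relationships=/import/relationships.csv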

Run a Docker image
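
For example (the image name is a placeholder):

   docker run <image>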

Enter a Docker container
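
For example (the container name is a placeholder):

   docker exec -it <container> bash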

Stop a Docker container / delete a container or an image
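
The corresponding commands (names are placeholders):

   docker stop <container>
   docker rm <container>
   docker rmi <image>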

Pull an existing image
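
With the image name as a placeholder:

   docker pull <image>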

Display all active containers
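
The standard command:

   docker ps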

Display all existing images
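
The standard command:

   docker images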

System cleaning
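
One common approach (removes stopped containers, unused networks, dangling images, and build cache; it prompts before deleting):

   docker system prune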

Typical project workflow

  • Open the folder in Visual Studio Code
  • Initialize Git
  • Create a virtual environment with venv
  • Activate the environment by running Activate.ps1 in the Scripts folder
  • Create a requirements.txt
  • Add venv to .gitignore
  • Start creating your files
  • Code and debug
  • Containerize (if relevant)
  • Deliver to your server (CI/CD)

Flask runs on port 5000

React runs on port 3000

Neo4j runs on port 7687 (Bolt) and is accessible on port 7474 via the browser

Git management

  1. git clone (or git pull later on): (git pull origin master)
  2. git checkout -b (create a branch)
  3. git checkout "branch name": switch branches
  4. Code on the branch
  5. git add . (stage the files for the commit)
  6. git status
  7. git commit -m "update function"
  8. git push -u origin "branch name" (origin is the name of the remote)
  9. Merge proposal: "create merge request"
  10. git pull

For testing, it is better to fetch the branch before merging it.

Edit the .gitignore file

.gitignore: keys, libraries, data
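
A sketch of a .gitignore along those lines (the entries are illustrative):

   # secrets and keys
   .env
   *.key
   # libraries / virtual environment
   venv/
   __pycache__/
   # data
   data/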