
Prerequisites

Before you begin, make sure you have the following installed:
  • Python 3.11 or higher — required for all development work
  • uv — the recommended package manager for this project
  • git — for version control and submitting pull requests

Setting up your environment

1. Fork and clone the repository

Fork the repository on GitHub, then clone your fork locally:
git clone https://github.com/YOUR_USERNAME/archeo-cluster.git
cd archeo-cluster

2. Install dev dependencies

Install all dependencies including the dev extras:
uv sync --extra dev
The dev extras include pytest, ruff, mypy, pre-commit, and type stubs. See pyproject.toml for the full list.
Alternatively, using pip:
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e ".[dev]"

3. Install pre-commit hooks

Set up the pre-commit hooks so linting and formatting run automatically before each commit:
uv run pre-commit install
The pre-commit configuration runs ruff (lint and format), mypy, and a set of general file checks on every commit.

4. Verify your installation

Run all checks to confirm everything is working:
make check
This runs linting, type checking, and the full test suite.

Project structure

src/archeo_cluster/
├── cli/           # Typer CLI commands
├── core/          # Business logic
│   ├── detection/  # OpenCV object detection
│   ├── clustering/ # K-Means analysis
│   ├── spatial/    # Spatial statistics
│   └── output/     # Session management and storage
├── models/        # Pydantic models
├── utils/         # Shared utilities
└── config/        # Configuration management
tests/
├── unit/          # Unit tests (mirror src structure)
├── integration/   # Integration tests
└── conftest.py    # Shared pytest fixtures

Running tests

uv run pytest
The default pytest configuration in pyproject.toml already enables coverage reporting. Running uv run pytest without extra flags will include a coverage summary in the terminal output.
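
As an illustration of what a unit test under tests/unit/ might look like, here is a minimal pytest-style sketch. The centroid helper and its behavior are hypothetical and defined inline so the snippet runs on its own; they are not taken from the archeo_cluster codebase.

```python
# Hypothetical helper, defined inline so the example is self-contained.
# In the real project the function under test would be imported from archeo_cluster.
def centroid(points: list[tuple[float, float]]) -> tuple[float, float]:
    """Return the arithmetic mean of a list of 2D points."""
    if not points:
        raise ValueError("points must not be empty")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def test_centroid_of_square() -> None:
    # pytest collects functions whose names start with "test_".
    square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
    assert centroid(square) == (1.0, 1.0)


def test_centroid_rejects_empty_input() -> None:
    # A plain try/except keeps the sketch import-free; pytest.raises is the usual idiom.
    try:
        centroid([])
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

Saved in a file whose name starts with test_ under tests/unit/, functions like these would be collected by uv run pytest.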

Linting and formatting

This project uses ruff for linting and formatting, and mypy for static type checking.
make lint

Make targets reference

Target        Description
make install  Install dependencies with uv
make test     Run tests with pytest
make lint     Run ruff and mypy
make format   Format and auto-fix with ruff
make check    Run lint then test
make clean    Remove build artifacts and caches
make build    Build the package

Code style guidelines

Import order

Always follow this import grouping order:
"""Module docstring describing purpose."""

from __future__ import annotations       # Always first

import logging                            # Standard library
from pathlib import Path
from typing import TYPE_CHECKING

import cv2                                # Third-party
import numpy as np

from archeo_cluster.models import Config  # First-party (archeo_cluster.*)

if TYPE_CHECKING:                         # Type-only imports last
    from numpy.typing import NDArray

logger = logging.getLogger(__name__)      # Module-level logger
Always use from __future__ import annotations as the very first import. Use absolute imports (from archeo_cluster.models import X), and avoid star imports.

Type annotations

All functions must have complete type hints:
  • Use str | None instead of Optional[str]
  • Use list[str] instead of List[str] (built-in generics)
  • Use NDArray[np.uint8] for NumPy arrays
  • Use Pydantic models for configuration and data transfer
mypy is configured in strict mode — all unannotated functions will fail the type check.

Google-style docstrings

def process_image(
    image: NDArray[np.uint8],
    config: DetectionConfig,
) -> DetectionResult:
    """Process an image and detect archaeological objects.

    Args:
        image: Input image in BGR format.
        config: Detection configuration parameters.

    Returns:
        DetectionResult containing detected objects and features.

    Raises:
        ValueError: If image is empty or has invalid dimensions.
    """

Naming conventions

Kind                   Convention                Example
Classes                PascalCase                ObjectDetector, DetectionConfig
Functions and methods  snake_case                filter_contours, extract_features
Variables              snake_case                image_path, cluster_count
Constants              UPPER_SNAKE_CASE          CLUSTER_COLORS, SUPPORTED_EXTENSIONS
Private members        Single underscore prefix  _save_intermediate_images
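
A short sketch showing the conventions side by side. The class ImageCatalog and its methods are hypothetical, and the values given to SUPPORTED_EXTENSIONS are assumed; only the naming pattern is the point here.

```python
SUPPORTED_EXTENSIONS = (".jpg", ".png")  # Constant: UPPER_SNAKE_CASE (values assumed)


class ImageCatalog:  # Class: PascalCase (hypothetical class)
    def __init__(self, image_paths: list[str]) -> None:
        self.image_paths = image_paths  # Variable: snake_case

    def count_supported(self) -> int:  # Method: snake_case
        return sum(
            1 for image_path in self.image_paths if self._is_supported(image_path)
        )

    def _is_supported(self, image_path: str) -> bool:  # Private: single underscore
        # str.endswith accepts a tuple of suffixes.
        return image_path.lower().endswith(SUPPORTED_EXTENSIONS)
```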

Path handling

Always use pathlib.Path — never string concatenation for file paths:
# Correct
from pathlib import Path

output_dir = Path("results") / "session_01"
output_dir.mkdir(parents=True, exist_ok=True)

# Incorrect
output_dir = "results/" + "session_01"

Logging

Use the logging module instead of print():
# Correct
import logging

logger = logging.getLogger(__name__)
logger.info("Processing image: %s", image_path)

# Incorrect
print(f"Processing image: {image_path}")
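
For context, a module-level logger is typically used inside functions while logging itself is configured once at the program's entry point. The sketch below assumes a hypothetical load_images function; it is not the project's actual code.

```python
import logging

logger = logging.getLogger(__name__)


def load_images(image_paths: list[str]) -> int:
    """Pretend to load images and log progress (hypothetical function)."""
    for image_path in image_paths:
        # %-style arguments are only formatted if the record is actually emitted.
        logger.debug("Processing image: %s", image_path)
    logger.info("Processed %d images", len(image_paths))
    return len(image_paths)


if __name__ == "__main__":
    # Configure logging once, at the entry point, not inside library modules.
    logging.basicConfig(level=logging.INFO)
    load_images(["a.jpg", "b.jpg"])
```

Passing the value as an argument instead of using an f-string defers string formatting until a handler actually needs the message, which is one practical reason to prefer the logger call over print().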

Pull request guidelines

1. Create a feature branch

git checkout -b feature/your-feature-name

2. Make your changes

Write clear, atomic commits. Follow Conventional Commits format:
Prefix     Use for
feat:      New feature
fix:       Bug fix
docs:      Documentation changes
refactor:  Code refactoring (no functional changes)
test:      Adding or updating tests
chore:     Maintenance tasks
Examples:
feat: add CSV export for clustering results
fix: handle zero-area contours in centroid calculation
docs: update installation instructions for uv
refactor: extract color conversion to utility module
test: add unit tests for elbow method
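
To check a commit subject against the prefixes listed above before committing, something like the following sketch could be used. It is not part of the project's tooling; the function name is_conventional is hypothetical, and the optional "(scope)" and "!" parts follow the general Conventional Commits format rather than anything stated in this guide.

```python
import re

# Prefixes from the table above.
ALLOWED_PREFIXES = ("feat", "fix", "docs", "refactor", "test", "chore")

# "<prefix>: <description>", with an optional "(scope)" and "!" before the colon.
SUBJECT_PATTERN = re.compile(
    r"^(?:" + "|".join(ALLOWED_PREFIXES) + r")(?:\([\w-]+\))?!?: \S.*$"
)


def is_conventional(subject: str) -> bool:
    """Return True if the commit subject line matches the expected format."""
    return SUBJECT_PATTERN.match(subject) is not None
```

For example, is_conventional("feat: add CSV export for clustering results") is true, while a subject such as "added stuff" is rejected.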

3. Ensure all checks pass

make check

4. Push and open a pull request

Push your branch and create a pull request on GitHub. Fill out the PR template completely and address review feedback promptly.

Reporting issues

Bug reports

Include your Python version, operating system, package version (archeo-cluster --version), steps to reproduce, expected vs actual behavior, and any error tracebacks.

Feature requests

Describe the use case, your proposed solution if any, and alternatives you have considered.
Open a Discussion for questions, or an Issue for bugs and feature requests.