All models are importable directly from archeo_cluster.models:
from archeo_cluster.models import (
    # Config
    AppConfig, ClusteringConfig, DetectionConfig, PathConfig,
    # Detection
    BatchDetectionResult, ContourFeatures, DetectedObject, DetectionResult,
    # Clustering
    BatchClusteringResult, ClusterInfo, ClusteringResult, ElbowResult, SilhouetteResult,
    # Performance
    PerformanceSummary, StageMetrics,
)

Config models

DetectionConfig

Pydantic BaseModel. Configuration for object detection parameters.
target_color
str
default:"#A98876"
Target color in hex format. The detector converts this to HSV and applies hue_offset, saturation_offset, and value_offset to produce the color range mask.
min_area
int
default:"50"
Minimum contour area in pixels. Must be ≥ 1.
max_area
int
default:"5000"
Maximum contour area in pixels. Must be ≥ 1.
kernel_size
tuple[int, int]
default:"(5, 5)"
Size of the morphological operation kernel used for closing and opening passes on the mask.
hue_offset
int
default:"10"
Tolerance for the hue channel in HSV color matching. Range 0–90.
saturation_offset
int
default:"50"
Tolerance for the saturation channel. Range 0–127.
value_offset
int
default:"50"
Tolerance for the value channel. Range 0–127.
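The way target_color and the three offsets combine into a color range can be sketched as follows. This is a minimal illustration, not the detector's own code: it assumes OpenCV's 8-bit HSV scale (H in 0–179, S and V in 0–255) and simple clamping at the channel limits, and the function name hsv_bounds is hypothetical.

```python
import colorsys


def hsv_bounds(hex_color, hue_offset=10, saturation_offset=50, value_offset=50):
    """Return (lower, upper) HSV triples around a hex color.

    Assumes OpenCV's 8-bit HSV scale: H in 0-179, S and V in 0-255.
    Bounds are clamped to the channel limits; hue wrap-around is ignored.
    """
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) / 255.0 for i in (0, 2, 4))
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Scale colorsys's 0-1 floats to the assumed OpenCV ranges.
    center = (round(h * 180) % 180, round(s * 255), round(v * 255))
    offsets = (hue_offset, saturation_offset, value_offset)
    limits = (179, 255, 255)
    lower = tuple(max(0, c - o) for c, o in zip(center, offsets))
    upper = tuple(min(m, c + o) for c, o, m in zip(center, offsets, limits))
    return lower, upper


# Bounds for the default target color and default offsets.
lower, upper = hsv_bounds("#A98876")
print(lower, upper)
```

A real detector would also need to decide how to treat hues near 0 or 179, where the range wraps around; this sketch simply clamps.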

ClusteringConfig

Pydantic BaseModel. Configuration for K-Means clustering parameters.
max_k
int
default:"10"
Maximum number of clusters to evaluate in the elbow method. Range 2–50.
random_state
int
default:"42"
Random seed for reproducible K-Means results.
min_samples_per_cluster
int
default:"2"
Minimum samples required to attempt clustering on an image. Must be ≥ 1.
compute_silhouette
bool
default:"True"
When True, silhouette scores are computed as a complementary validation metric alongside the elbow method.

PathConfig

Pydantic BaseModel. Configuration for file system paths.
data_dir
Path
default:"Path('data')"
Base directory for input data.
results_dir
Path
default:"Path('results')"
Directory for output results.
plots_dir
Path
default:"Path('plots')"
Directory for generated plots.
PathConfig also provides ensure_directories() which calls mkdir(parents=True, exist_ok=True) on results_dir and plots_dir.
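The behavior of ensure_directories() can be illustrated with a small stand-in. PathConfigSketch below is a hypothetical dataclass written for this example, not the library class; it only mirrors the three documented fields and the documented mkdir call.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class PathConfigSketch:
    """Hypothetical stand-in for PathConfig (not the library class)."""
    data_dir: Path = Path("data")
    results_dir: Path = Path("results")
    plots_dir: Path = Path("plots")

    def ensure_directories(self) -> None:
        # Only the output directories are created; data_dir is left alone.
        for directory in (self.results_dir, self.plots_dir):
            directory.mkdir(parents=True, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    paths = PathConfigSketch(
        data_dir=base / "data",
        results_dir=base / "out" / "results",
        plots_dir=base / "out" / "plots",
    )
    paths.ensure_directories()
    paths.ensure_directories()  # safe to repeat because exist_ok=True
    created = sorted(p.name for p in (base / "out").iterdir())
    data_dir_exists = paths.data_dir.exists()
```

Because parents=True is passed, the intermediate "out" directory is created as well, and because data_dir is not in the list it must already exist before the pipeline reads from it.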

AppConfig

Pydantic BaseModel. Top-level application configuration that combines all sub-configurations.
from archeo_cluster.models import AppConfig
from pathlib import Path

# Load from YAML
config = AppConfig.from_yaml(Path("config.yaml"))

# Or build in code
config = AppConfig(debug=True, log_level="DEBUG")
config.to_yaml(Path("config.yaml"))
detection
DetectionConfig
default:"DetectionConfig()"
Detection sub-configuration.
clustering
ClusteringConfig
default:"ClusteringConfig()"
Clustering sub-configuration.
paths
PathConfig
default:"PathConfig()"
Path sub-configuration.
debug
bool
default:"False"
Enable debug mode.
log_level
str
default:"INFO"
Logging level string (e.g. "DEBUG", "INFO", "WARNING").
Methods
  • AppConfig.from_yaml(path: Path) -> AppConfig: class method; loads a configuration from a YAML file.
  • config.to_yaml(path: Path) -> None: instance method; saves the configuration to a YAML file.

Detection models

ContourFeatures

Pydantic BaseModel. Features extracted from a single contour. All numeric fields are ≥ 0.
area
float
required
Contour area in pixels.
perimeter
float
required
Contour perimeter in pixels.
centroid_x
int
required
X coordinate of the centroid. Used for spatial analysis.
centroid_y
int
required
Y coordinate of the centroid. Used for spatial analysis.
circularity
float
required
4 * π * area / perimeter². Equals 1.0 for a perfect circle.
aspect_ratio
float
required
Bounding rectangle width divided by height.
solidity
float
required
Contour area divided by convex hull area.
extent
float
required
Contour area divided by bounding rectangle area.
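The four ratio features follow directly from the definitions above. The sketch below computes them from raw measurements; the function name shape_descriptors and the zero-denominator guard are assumptions of this example, not part of the library.

```python
import math


def shape_descriptors(area, perimeter, bbox_width, bbox_height, hull_area):
    """Compute the four ratio features from raw contour measurements.

    Zero denominators yield 0.0 here; that guard is an assumption of this sketch.
    """
    def safe_div(num, den):
        return num / den if den > 0 else 0.0

    return {
        "circularity": safe_div(4 * math.pi * area, perimeter ** 2),
        "aspect_ratio": safe_div(bbox_width, bbox_height),
        "solidity": safe_div(area, hull_area),
        "extent": safe_div(area, bbox_width * bbox_height),
    }


# A circle of radius 10: its convex hull is itself, its bounding box is 20 x 20.
r = 10.0
circle = shape_descriptors(
    area=math.pi * r ** 2,
    perimeter=2 * math.pi * r,
    bbox_width=2 * r,
    bbox_height=2 * r,
    hull_area=math.pi * r ** 2,
)
```

For the ideal circle, circularity, aspect_ratio, and solidity all come out at 1.0, while extent is π/4 because the circle fills only that fraction of its bounding square.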

DetectedObject

Pydantic BaseModel. A detected object with its source metadata and extracted features.
image_name
str
required
Filename of the source image.
contour_index
int
required
Zero-based index of this contour within the image. Must be ≥ 0.
features
ContourFeatures
required
Extracted geometric features.

DetectionResult

Dataclass. Result of detection on a single image.
image_name
str
required
Name of the processed image.
contours
list[NDArray[Any]]
default:"[]"
Filtered OpenCV contour arrays that passed area constraints.
objects
list[DetectedObject]
default:"[]"
Detected objects with extracted features.
processing_steps
dict[str, NDArray[Any]]
default:"{}"
Intermediate processing images keyed by step name. Populated when ObjectDetector.save_intermediate is True.
count
int
Read-only property. len(objects).
DetectionResult.to_feature_rows() returns a list[dict[str, float | int | str]] suitable for building a pandas DataFrame or writing a CSV.
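As a rough illustration of the row format, the sketch below flattens per-contour features into dictionaries and writes them as CSV with the standard library. The column layout (image_name, contour_index, then the feature fields), the helper name feature_rows_sketch, and the file name are assumptions made for this example.

```python
import csv
import io


def feature_rows_sketch(image_name, feature_dicts):
    """Flatten per-contour feature dicts into one row per object.

    The column layout is an assumption of this sketch, not the documented output.
    """
    return [
        {"image_name": image_name, "contour_index": index, **features}
        for index, features in enumerate(feature_dicts)
    ]


rows = feature_rows_sketch(
    "site_01.png",  # hypothetical file name
    [
        {"area": 120.0, "perimeter": 44.5, "centroid_x": 31, "centroid_y": 18},
        {"area": 310.5, "perimeter": 70.2, "centroid_x": 92, "centroid_y": 57},
    ],
)

# Write the rows as CSV; an in-memory buffer stands in for a file on disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

The same list of dictionaries can be passed straight to a pandas DataFrame constructor, which is why a flat, one-row-per-object layout is convenient.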

BatchDetectionResult

Dataclass. Aggregated results from processing a directory of images.
results
list[DetectionResult]
default:"[]"
One DetectionResult per successfully processed image.
total_objects
int
Read-only property. Sum of all detected objects across every image.
image_count
int
Read-only property. Number of images in results.
BatchDetectionResult.to_feature_rows() returns a combined list[dict[str, float | int | str]] from all images.

Clustering models

ClusterInfo

Pydantic BaseModel. Summary statistics for a single cluster.
cluster_id
int
required
Zero-based cluster identifier. Must be ≥ 0.
size
int
required
Number of objects assigned to this cluster. Must be ≥ 0.
centroid_x
float
required
Mean X coordinate of objects in this cluster.
centroid_y
float
required
Mean Y coordinate of objects in this cluster.
mean_area
float
required
Average area (pixels) of objects in this cluster. Must be ≥ 0.
mean_perimeter
float
required
Average perimeter (pixels) of objects in this cluster. Must be ≥ 0.
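The per-cluster statistics can be derived from the label list and the object features. The following is a minimal sketch using plain dictionaries; the function name summarize_clusters and the input layout are assumptions, and the library may compute these values differently.

```python
from statistics import mean


def summarize_clusters(labels, objects):
    """Group objects by cluster label and compute per-cluster summaries.

    `objects` is a list of dicts with centroid_x, centroid_y, area, perimeter;
    `labels[i]` is the cluster assigned to `objects[i]`.
    """
    grouped = {}
    for label, obj in zip(labels, objects):
        grouped.setdefault(label, []).append(obj)

    return [
        {
            "cluster_id": cluster_id,
            "size": len(members),
            "centroid_x": float(mean(m["centroid_x"] for m in members)),
            "centroid_y": float(mean(m["centroid_y"] for m in members)),
            "mean_area": float(mean(m["area"] for m in members)),
            "mean_perimeter": float(mean(m["perimeter"] for m in members)),
        }
        for cluster_id, members in sorted(grouped.items())
    ]


objects = [
    {"centroid_x": 10, "centroid_y": 20, "area": 100.0, "perimeter": 40.0},
    {"centroid_x": 50, "centroid_y": 60, "area": 200.0, "perimeter": 60.0},
    {"centroid_x": 30, "centroid_y": 40, "area": 300.0, "perimeter": 80.0},
]
summaries = summarize_clusters([0, 1, 0], objects)
```

Note that centroid_x and centroid_y become floats at the cluster level even though the per-object centroids are integers, which matches the field types listed above.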

ElbowResult

Pydantic BaseModel. Result of the elbow method K-selection.
k_values
list[int]
required
K values that were evaluated (typically [1, 2, ..., max_k]).
inertias
list[float]
required
Within-cluster sum of squares (WCSS / inertia) for each K value.
optimal_k
int
required
The K at the elbow point. Must be ≥ 1.
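This page does not state how the elbow point is located. One common heuristic, shown here purely as an illustration, picks the K whose (K, inertia) point lies farthest from the straight line joining the first and last points of the curve. The function name find_elbow and the sample inertias are hypothetical.

```python
import math


def find_elbow(k_values, inertias):
    """Pick the K whose point lies farthest from the first-to-last chord."""
    if len(k_values) != len(inertias) or not k_values:
        raise ValueError("k_values and inertias must be non-empty and equal length")
    if len(k_values) < 3:
        return k_values[0]  # too few points to form an elbow

    x1, y1 = k_values[0], inertias[0]
    x2, y2 = k_values[-1], inertias[-1]
    chord_length = math.hypot(x2 - x1, y2 - y1)

    def distance(x, y):
        # Perpendicular distance from (x, y) to the line through both endpoints.
        return abs((y2 - y1) * x - (x2 - x1) * y + x2 * y1 - y2 * x1) / chord_length

    distances = [distance(x, y) for x, y in zip(k_values, inertias)]
    return k_values[distances.index(max(distances))]


k_values = [1, 2, 3, 4, 5]
inertias = [100.0, 40.0, 15.0, 12.0, 10.0]  # made-up values for illustration
optimal_k = find_elbow(k_values, inertias)
```

With these sample values the curve flattens after K=3, and that is the point the heuristic selects.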

SilhouetteResult

Pydantic BaseModel. Result of silhouette score analysis. Silhouette coefficient ranges from -1 (poor) to 1 (perfect).
k_values
list[int]
required
K values that were evaluated.
silhouette_scores
list[float | None]
required
Silhouette coefficient for each K. None for K=1 (undefined) and for any K exceeding the sample count.
optimal_k
int | None
default:"None"
K with the maximum silhouette score. None if computation was not possible. Must be ≥ 2 when set.
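Selecting optimal_k from a score list that may contain None entries can be sketched as below. The helper name best_silhouette_k and the tie-breaking rule (the first maximum wins) are assumptions of this example.

```python
def best_silhouette_k(k_values, silhouette_scores):
    """Return the K with the highest score, skipping None entries.

    Returns None when no K has a usable score.
    """
    scored = [
        (score, k)
        for k, score in zip(k_values, silhouette_scores)
        if score is not None
    ]
    if not scored:
        return None
    # max() keeps the first maximum, so ties resolve to the earliest K listed.
    _, best_k = max(scored, key=lambda pair: pair[0])
    return best_k


best = best_silhouette_k([1, 2, 3, 4], [None, 0.41, 0.58, 0.52])
nothing = best_silhouette_k([1], [None])
```

Filtering out None first means K=1, whose score is undefined, can never be chosen, which is consistent with the constraint that optimal_k is at least 2 when set.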

ClusteringResult

Dataclass. Result of K-Means clustering on a single image.
image_name
str
required
Name of the source image.
optimal_k
int
required
Optimal number of clusters determined by the elbow method.
labels
list[int]
default:"[]"
Cluster assignment for each object row in the input DataFrame.
clusters
list[ClusterInfo]
default:"[]"
Summary statistics for each cluster.
elbow_result
ElbowResult | None
default:"None"
Output from the elbow method analysis.
silhouette_result
SilhouetteResult | None
default:"None"
Output from silhouette score analysis. None when ClusteringConfig.compute_silhouette is False.
cluster_count
int
Read-only property. len(clusters).

BatchClusteringResult

Dataclass. Aggregated results from clustering all images in a features CSV.
results
list[ClusteringResult]
default:"[]"
One ClusteringResult per image that had enough samples to cluster.
image_count
int
Read-only property. Number of images in results.
total_clusters
int
Read-only property. Sum of cluster_count across all results.
BatchClusteringResult.get_result(image_name: str) -> ClusteringResult | None finds the result for a specific image by name.

Performance models

StageMetrics

Pydantic BaseModel. Performance metrics for a single pipeline stage (detection, clustering, or analysis).
stage_name
str
required
Human-readable stage identifier (e.g. "detection").
duration_seconds
float
required
Wall-clock time in seconds with millisecond precision.
memory_peak_mb
float
required
Peak memory delta in megabytes (process RSS).
memory_before_mb
float
default:"0.0"
Process memory before the stage started (MB).
memory_after_mb
float
default:"0.0"
Process memory after the stage completed (MB).
cpu_percent
float
default:"0.0"
Average CPU usage during stage execution (0–100 per core).
timestamp
datetime
default:"datetime.now()"
When the measurement was taken.
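To show how figures like these might be collected, here is a minimal sketch of a timing context manager. It is not the library's profiler: it uses tracemalloc, which tracks Python-level allocations, instead of the process RSS mentioned above, and the name measure_stage is hypothetical.

```python
import time
import tracemalloc
from contextlib import contextmanager
from datetime import datetime


@contextmanager
def measure_stage(stage_name):
    """Yield a dict that is filled with timing and memory figures on exit.

    Uses tracemalloc (Python-level allocations), not process RSS.
    """
    metrics = {"stage_name": stage_name, "timestamp": datetime.now()}
    tracemalloc.start()
    start = time.perf_counter()
    try:
        yield metrics
    finally:
        elapsed = time.perf_counter() - start
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        metrics["duration_seconds"] = round(elapsed, 3)  # millisecond precision
        metrics["memory_peak_mb"] = peak_bytes / (1024 * 1024)


with measure_stage("detection") as detection_metrics:
    workload = [i * i for i in range(100_000)]  # placeholder for real work
```

The resulting dictionary carries the three required StageMetrics fields (stage_name, duration_seconds, memory_peak_mb) plus a timestamp; RSS-based and CPU figures would need a process-inspection library.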

PerformanceSummary

Pydantic BaseModel. Aggregated performance metrics for an entire pipeline run.
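from archeo_cluster.models import PerformanceSummary, StageMetrics

# One StageMetrics per pipeline stage (values are illustrative)
detection_metrics = StageMetrics(stage_name="detection", duration_seconds=8.1, memory_peak_mb=184.0)
clustering_metrics = StageMetrics(stage_name="clustering", duration_seconds=3.2, memory_peak_mb=96.0)
analysis_metrics = StageMetrics(stage_name="analysis", duration_seconds=1.1, memory_peak_mb=64.0)

summary = PerformanceSummary(
    detection=detection_metrics,
    clustering=clustering_metrics,
    analysis=analysis_metrics,
    total_duration_seconds=12.4,
    peak_memory_mb=184.0,
)
print(summary.is_complete())  # True, because all three stages are set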
detection
StageMetrics | None
default:"None"
Metrics for the object detection stage.
clustering
StageMetrics | None
default:"None"
Metrics for the K-Means clustering stage.
analysis
StageMetrics | None
default:"None"
Metrics for the spatial analysis stage.
total_duration_seconds
float
default:"0.0"
Sum of all stage durations in seconds.
peak_memory_mb
float
default:"0.0"
Maximum memory seen across all stages in megabytes.
PerformanceSummary.is_complete() -> bool returns True when all three stage metrics (detection, clustering, analysis) are not None.
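The relationship between the stage metrics and the aggregate fields can be sketched with a small stand-in. StageSketch and summarize are hypothetical names for this example; whether the library computes the totals itself or expects the caller to supply them is not stated on this page.

```python
from dataclasses import dataclass


@dataclass
class StageSketch:
    """Hypothetical stand-in carrying only the fields this example needs."""
    stage_name: str
    duration_seconds: float
    memory_peak_mb: float


def summarize(detection=None, clustering=None, analysis=None):
    """Aggregate whichever stages are present into summary figures."""
    stages = [s for s in (detection, clustering, analysis) if s is not None]
    return {
        "total_duration_seconds": sum(s.duration_seconds for s in stages),
        "peak_memory_mb": max((s.memory_peak_mb for s in stages), default=0.0),
        "is_complete": len(stages) == 3,
    }


detection = StageSketch("detection", 8.0, 184.0)
clustering = StageSketch("clustering", 3.5, 96.0)
analysis = StageSketch("analysis", 1.0, 64.0)

partial = summarize(detection=detection, clustering=clustering)
full = summarize(detection=detection, clustering=clustering, analysis=analysis)
```

A run that stops after clustering still yields usable totals, but is_complete stays False until the analysis stage is recorded.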