The analyze command reads a _clustered.csv file produced by the cluster command and computes the Average Nearest Neighbor (ANN) index for each cluster to characterize how the detected objects are spatially distributed.

Syntax

./run-cli analyze --input <path> [options]

Options

--input
string
required
Path to the clustered CSV file produced by the cluster command. Short alias: -i. This file must contain a cluster column in addition to the feature columns.
--output-dir
string
default:"<input parent>/analysis"
Directory where output plots and statistics are written. Short alias: -o. Defaults to an analysis/ subdirectory next to the input CSV file.
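
The two rules above (the input needs a cluster column, and the output directory defaults to analysis/ next to the input) can be checked before invoking the command. The following is a minimal sketch, not part of the tool itself: the helper names default_output_dir and has_cluster_column are hypothetical, and it assumes the column header is spelled exactly "cluster" in the first row of the CSV.

```python
import csv
from pathlib import Path


def default_output_dir(input_csv):
    """Return <input parent>/analysis, mirroring the documented default."""
    return Path(input_csv).parent / "analysis"


def has_cluster_column(input_csv):
    """Return True if the CSV header row contains a 'cluster' column.

    Assumption: the header is the first row and the name matches exactly.
    """
    with open(input_csv, newline="") as fh:
        header = next(csv.reader(fh), [])
    return "cluster" in header
```

Running has_cluster_column on a file before calling analyze gives an early, explicit failure when a non-clustered CSV is passed by mistake.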

ANN index explained

The Average Nearest Neighbor (ANN) index, reported as the R-index, is the ratio of the observed mean nearest-neighbor distance to the expected mean nearest-neighbor distance under a random distribution. R is interpreted as follows:

R-index value    Interpretation
R < 0.9          Clustered: objects are closer together than expected by chance
0.9 ≤ R ≤ 1.1    Random: no significant spatial pattern
R > 1.1          Dispersed: objects are more spread out than expected by chance

The index is computed independently for each cluster group, so you can compare spatial patterns across different artifact types in the same image.
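
To make the ratio concrete, here is a minimal sketch of how an R-index could be computed for one cluster's object coordinates. It is an illustration under stated assumptions, not the tool's implementation: it uses the common Clark-Evans expectation 0.5 / sqrt(n / A) for a random pattern, takes the bounding box of the points as the study area A when none is given, and applies no edge correction. The function names ann_index and interpret are hypothetical; only the 0.9 and 1.1 thresholds come from the table above.

```python
import math


def ann_index(points, area=None):
    """Ratio of observed to expected mean nearest-neighbor distance.

    points: list of (x, y) tuples for one cluster.
    area:   study area; assumed to be the bounding box if omitted.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")

    # Observed mean nearest-neighbor distance (brute force, O(n^2)).
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nearest) / n

    if area is None:
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    if area <= 0:
        raise ValueError("study area must be positive")

    # Expected mean distance for a random pattern (Clark-Evans assumption).
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected


def interpret(r):
    """Map an R-index to the labels used in the table above."""
    if r < 0.9:
        return "Clustered"
    if r <= 1.1:
        return "Random"
    return "Dispersed"
```

The result depends strongly on the study area: the same points give a smaller R when measured against a larger area, so values are only comparable across clusters when the area is defined consistently.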

Output files

<output-dir>/
├── descriptive_stats.csv        ← mean, std, min, max for area/perimeter per cluster
├── ann_results.csv              ← R-index and interpretation per cluster
├── ann_results.png              ← bar chart of R-index values by cluster
├── spatial_distribution_map.png ← scatter map of objects coloured by cluster
└── boxplot_<feature>.png        ← one boxplot per morphological feature (area, perimeter, etc.)

Examples

# Analyze a single clustered CSV with defaults
./run-cli analyze -i ./results/clusters/site_a/site_a_clustered.csv

# Write plots to a specific directory
./run-cli analyze \
  -i ./results/clusters/site_a/site_a_clustered.csv \
  -o ./results/analysis/site_a

When using the pipeline command, spatial analysis runs automatically on all images and writes results to the analysis/ subdirectory of the session. Use analyze directly only when you need to rerun this stage on existing clustered data.
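
When rerunning this stage for several images at once, the invocations can be generated rather than typed by hand. The sketch below builds one argument list per clustered CSV using only the documented -i and -o options. The helper name build_analyze_commands is hypothetical, and the directory layout (one <name>_clustered.csv per image somewhere under a clusters directory) is an assumption taken from the example paths above.

```python
from pathlib import Path

SUFFIX = "_clustered.csv"


def build_analyze_commands(clusters_root, analysis_root):
    """Return one './run-cli analyze' argument list per clustered CSV."""
    commands = []
    for csv_path in sorted(Path(clusters_root).rglob("*" + SUFFIX)):
        # Strip the suffix so site_a_clustered.csv maps to .../site_a
        name = csv_path.name[: -len(SUFFIX)]
        out_dir = Path(analysis_root) / name
        commands.append(
            ["./run-cli", "analyze", "-i", str(csv_path), "-o", str(out_dir)]
        )
    return commands
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the directory that contains run-cli.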