How HSV color segmentation works
OpenCV’s HSV color space separates hue (color type) from saturation (color intensity) and value (brightness). Working in HSV makes it much easier to write tolerant color ranges than in BGR, because lighting changes mostly affect S and V while H stays stable. The detection pipeline runs these steps in order:
- Convert the image from BGR to HSV with cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
- Build a binary mask with cv2.inRange(hsv, lower_bound, upper_bound)
- Apply MORPH_CLOSE to fill small holes inside artifacts
- Apply MORPH_OPEN to remove small noise specks
- Find external contours with cv2.findContours(..., cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
- Filter contours by pixel area
- Extract geometric features from each surviving contour
Choosing the right target color
The --color flag (or target_color in config) accepts a hex color code. The utility functions in archeo_cluster.utils.color handle the conversion chain from that hex code to HSV bounds.
The generate_color_range function clips all values to valid HSV ranges (H: 0–179, S/V: 0–255).
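To make the conversion and clipping concrete, here is a minimal standard-library sketch. The helper names hex_to_opencv_hsv and hsv_range are hypothetical and are not the functions in archeo_cluster.utils.color; the rounding is an assumption and may differ by one unit from what cv2.cvtColor produces for some colors.

```python
import colorsys


def hex_to_opencv_hsv(hex_color):
    """Convert '#RRGGBB' to (H, S, V) on OpenCV's 8-bit scale."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # OpenCV stores hue as degrees / 2 (0-179); S and V span 0-255.
    return int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255))


def hsv_range(hsv, hue_offset=10, saturation_offset=50, value_offset=50):
    """Build (lower, upper) bounds around hsv, clipped to valid ranges."""
    h, s, v = hsv
    lower = (max(h - hue_offset, 0),
             max(s - saturation_offset, 0),
             max(v - value_offset, 0))
    upper = (min(h + hue_offset, 179),
             min(s + saturation_offset, 255),
             min(v + value_offset, 255))
    return lower, upper


base = hex_to_opencv_hsv("#A98876")
print(base)             # (11, 77, 169)
print(hsv_range(base))  # ((1, 27, 119), (21, 127, 219))
```

Note that clipping is not the same as wrapping: for a red target near hue 0, a clipped range of this kind drops the part of the band that would wrap around to 170–179.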
Detection parameters
target_color (default: "#A98876")
Hex code of the color you want to isolate. The default is a warm terracotta representative of ceramic fragments.
hue_offset (default: 10, range: 0–90)
Expands the mask ±offset around the base hue, measured in OpenCV hue units (0–179, where one unit is 2°). A value of 10 therefore covers about 20° either side of the target hue. Increase it for naturally varying surfaces; decrease it when two artifact classes have similar colors.
saturation_offset (default: 50, range: 0–127)
Expands the mask up and down the saturation axis. Higher values capture both washed-out and vivid instances of the color.
value_offset (default: 50, range: 0–127)
Expands the mask along the brightness axis. Raise it to handle shadows and highlights on 3-D objects.
min_area and max_area (defaults: 50 / 5000, units: pixels²)
Contours whose area falls outside this range are discarded after masking; filter_contours implements the check.
Area values are in pixel² relative to the resolution of the input image. If you resize images before processing, scale these thresholds accordingly.
How morphological operations affect results
Two operations are applied to the mask in sequence, both using the same kernel_size (default (5, 5)):
| Operation | OpenCV call | Effect |
|---|---|---|
| MORPH_CLOSE | cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel) | Fills small dark holes inside bright regions; useful for artifacts with surface texture |
| MORPH_OPEN | cv2.morphologyEx(mask_closed, cv2.MORPH_OPEN, kernel) | Removes small isolated bright specks; reduces detection of soil grains or dust |
Tips for different artifact types
Ceramics and pottery
Ceramics typically have a warm, muted hue (terracotta to buff). Start with hue_offset=10, saturation_offset=50. Use a moderate min_area (50–200 px²) if photographing sherds close-up; increase it to 200–500 px² for wide-angle field shots to filter out soil pebbles.
Stone artifacts
Flint, obsidian, and limestone have low saturation. Increase saturation_offset to 80–100 so the mask captures grayish tones. Widen value_offset to 60–70 to handle the wide brightness range of stone surfaces.
Bone fragments
Bone tends toward pale yellow or cream. Use hue_offset=15 to catch the natural yellowing variation. Bone fragments can be large, so raise max_area to 20 000 px² or higher depending on image resolution.
Feature extraction
For every contour that survives the area filter, extract_features computes eight geometric properties:
| Feature | Formula | Interpretation |
|---|---|---|
| area | cv2.contourArea | Pixel² size of detected object |
| perimeter | cv2.arcLength(..., closed=True) | Boundary length in pixels |
| centroid_x, centroid_y | Image moments m10/m00, m01/m00 | Spatial position (used for spatial analysis, not clustering) |
| circularity | 4π·area / perimeter² | 1.0 = perfect circle; lower = more irregular |
| aspect_ratio | width / height | Elongation of the bounding rectangle |
| solidity | area / convex_hull_area | 1.0 = fully convex; lower = concave or fragmented |
| extent | area / bounding_rect_area | Ratio of object to its bounding box |
Six of these features (area, perimeter, circularity, aspect_ratio, solidity, extent) are the ones used for K-Means clustering in the next stage.