A Practical Guide to Semi-Landmark Alignment Methods for Outline-Based Identification in Biomedical Research

Christian Bailey, Dec 02, 2025

Abstract

This article provides a comprehensive resource for researchers and professionals applying outline-based geometric morphometrics in identification tasks, such as in taxonomic classification or morphological phenotyping. It covers the foundational principles of sliding semi-landmarks, details major methodological approaches including their algorithmic basis, and offers guidance for optimizing parameters and troubleshooting common issues. Furthermore, it presents a framework for the validation and comparative assessment of different methods, synthesizing recent findings to guide robust and reproducible shape analysis in biomedical and evolutionary studies.

Understanding Semi-Landmarks: From Basic Concepts to Solving the Landmark Sparsity Problem

The Challenge of Landmark Sparsity in Complex Biological Structures

The quantitative assessment of morphological variation relies on the ability to locate homologous points, known as landmarks, across biological structures. Gold-standard methods traditionally depend on expert manual placement of landmarks at 'biologically homologous' locations [1]. However, the shape information captured by these anatomical landmarks is inherently limited by their sparse distribution, often resulting in an incomplete representation of complex anatomy. This challenge is particularly acute in regions with smooth surfaces, poorly defined tissue boundaries, or significant morphological variation across specimens, where traditional landmark analysis fails to capture biologically relevant variability [1] [2].

Landmark sparsity presents a fundamental constraint in geometric morphometrics, especially with the increasing availability of high-resolution three-dimensional (3D) imaging data from computed tomography (CT) and surface scanning technologies [2]. These rich datasets contain vast amounts of phenotypic information that sparse landmarks cannot adequately capture. Structures such as cranial vaults, limb bones, and curved surfaces often lack discrete points for reliable landmark identification, leaving significant morphological information unsampled [2]. This limitation becomes increasingly problematic when studying subtle variations within species or major morphological differences across broad taxonomic groups, where the loss of morphological information can hinder evolutionary and developmental analyses.

Semi-Landmarks as a Solution

Conceptual Foundation

Semi-landmarks have been developed to supplement the information provided by traditional manual landmarks by relaxing the requirement for strict biological homology [1]. These points are placed along curves and surfaces between traditional landmarks to capture shape information that would otherwise be inaccessible [1] [2]. While they do not guarantee the biological correspondence of traditional landmarks, semi-landmarks provide a powerful tool for quantifying complex biological forms by densely sampling regions between landmarks.

The methodological spectrum ranges from semi-landmarks, which maintain some biological correspondence through sliding algorithms, to pseudolandmarks, which are placed automatically on image surfaces with no direct relationship to manually placed landmarks [1] [2]. Pseudolandmark methods, such as auto3dgm, transform surface meshes into point clouds subjected to Procrustes superimposition, removing subjectivity in placement and significantly reducing data collection time [2]. However, this approach limits the ability to link patterns of variance to specific biological mechanisms or developmental tissues.

Technical Approaches and Their Trade-offs

Several computational strategies have been developed for semi-landmark placement, each with distinct advantages and limitations. The patch-based approach projects semi-landmarks to a mesh surface from triangular patches constructed from manual landmark points, preserving the geometric relationship between semi-landmarks and manual landmarks [1]. The patch-TPS method generates semi-landmarks on a single template mesh and transfers them to each specimen using a thin-plate spline (TPS) transform followed by projection along template surface normal vectors [1]. Pseudolandmark sampling generates points regularly sampled at arbitrary locations on a template model and projects them to each sample using a TPS transform [1].

Each method presents trade-offs in correspondence of points across images, point spacing regularity, sample coverage, repeatability, and computational time [1]. The patch method demonstrates sensitivity to noise and missing data, potentially resulting in outliers with large deviations in mean shape estimates. In contrast, patch-TPS and pseudolandmark approaches provide more robust performance with noisy or variable datasets [1].

Table 1: Comparison of Semi-Landmark Sampling Strategies

Method	Correspondence	Noise Robustness	Coverage	Computational Demand	Primary Application
Patch-based	High (geometric relationship to manual landmarks)	Low	Dependent on manual landmark placement	Moderate	Single specimen analysis
Patch-TPS	Moderate (template-based)	High	Consistent across samples	High	Multi-specimen datasets
Pseudolandmark	Low (automatic placement)	High	Extensive and uniform	High	Large-scale comparative studies
Template-dependent	Moderate (algorithm-based template)	Moderate	Defined by template	Moderate	Standardized curve analysis

Quantitative Comparison of Method Performance

Evaluation Metrics and Experimental Framework

To evaluate the efficacy of different dense sampling strategies, researchers have implemented comparative studies using standardized metrics. One key approach quantifies the success of a transform between an individual specimen and a population average template by measuring the root mean square error (RMSE) between the transformed mesh and the template, averaged across specimens [1]. This metric assesses how well each semi-landmark set captures the essential shape characteristics while minimizing distortion.
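As a minimal sketch of the error metric described above, the following Python function computes a root mean square error between two point sets. It assumes the transformed mesh and the template share vertex-to-vertex correspondence, which is a simplification made here for illustration; the function name `rmse` and the toy coordinates are hypothetical and do not come from [1].

```python
import numpy as np

def rmse(points_a, points_b):
    """Root mean square error between two (n_points, n_dims) arrays of corresponding points."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("point sets must have the same shape")
    # Squared Euclidean distance per point pair, averaged, then square-rooted.
    sq_dist = np.sum((a - b) ** 2, axis=1)
    return float(np.sqrt(sq_dist.mean()))

# Toy example (assumed data): every vertex of the warped mesh is offset by 0.1 in z.
template = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
warped = template + np.array([0.0, 0.0, 0.1])
error = rmse(warped, template)
```

Averaging this value over all specimens gives one summary number per sampling strategy. Where meshes lack matching vertices, a closest-point distance would be needed instead.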

Studies typically employ datasets with known morphological variation, such as great ape crania from multiple species (Pan troglodytes, Gorilla gorilla, and Pongo pygmaeus), to test methods across significant shape diversity [1]. The landmark sets generated by each method are used to estimate a transform to a template, with performance quantified through shape estimation accuracy. Experimental protocols often include sensitivity analyses testing robustness to noise, missing data, and morphological variability [1].

Performance Outcomes

Research findings indicate that all three major semi-landmark strategies (patch, patch-TPS, and pseudolandmark sampling) can produce shape estimations of population average templates that are comparable to or exceed the accuracy of using manual landmarks alone, while dramatically increasing the density of shape information [1]. The patch-TPS method demonstrates particular strength in handling dataset variability, while the basic patch approach shows greater sensitivity to noise and missing data, sometimes resulting in outliers with large deviations [1].

Table 2: Quantitative Performance of Semi-Landmark Methods

Method	Mean Shape Estimation Accuracy	Sensitivity to Noise	Robustness to Missing Data	Shape Information Capture
Manual Landmarks Only	Baseline	Low	High	Limited (sparse coverage)
Patch-based	Comparable or superior to manual	High	Low	Moderate (landmark-dependent)
Patch-TPS	Comparable or superior to manual	Low	High	High (consistent across samples)
Pseudolandmark	Comparable or superior to manual	Low	Moderate	Very high (dense coverage)
Template-dependent	Comparable to other methods	Moderate	Moderate	High (curve-focused)

Advanced computational approaches have further enhanced semi-landmark methodologies. The CAMPA (Conditional Autoencoder for Multiplexed Pixel Analysis) framework uses deep learning to identify consistent subcellular landmarks across experimental perturbations by learning condition-independent representations of molecular pixel profiles [3]. This approach enables quantitative comparison of subcellular organization despite condition-dependent changes in protein localization, demonstrating the potential of machine learning in addressing landmark consistency challenges.

Detailed Experimental Protocols

Patch-Based Semi-Landmark Protocol

Materials and Software Requirements:

3D Slicer with SlicerMorph extension [1]
High-resolution 3D surface meshes
Pre-placed manual anatomical landmarks

Methodological Workflow:

Patch Definition: Select three manually digitized landmarks to define the boundaries of triangular regions on the specimen surface.
Grid Registration: Register a template triangular grid with user-specified semi-landmark density to the vertices of the bounding triangle using thin-plate spline deformation.
Surface Projection:
- Smooth surface normal vectors using Laplacian smoothing to reduce noise impact
- Estimate model surface orientation underlying each triangular patch by averaging surface normal vectors at the three defining landmarks
- For each sampling point, cast a ray in the direction of the projection vector, limiting the ray length to the average distance between the triangle vertices
- Select final intersection between ray and surface mesh as projected point
Point Set Merging: Identify unique triangle edges, place uniformly sampled lines between endpoints, project to image surface, and merge with manual landmarks.
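The grid step of this workflow can be illustrated with a short Python sketch. It places a regular grid on the flat triangle spanned by three landmarks using barycentric weights. This is a simplification: the protocol registers a template grid with a thin-plate spline and then projects the points onto the mesh, and both of those steps are omitted here. The function name `triangle_grid` and the parameter `n` (subdivisions per edge) are hypothetical.

```python
import numpy as np

def triangle_grid(p0, p1, p2, n):
    """Regular grid of points on the flat triangle (p0, p1, p2), n subdivisions per edge."""
    p0, p1, p2 = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
    points = []
    for i in range(n + 1):
        for j in range(n + 1 - i):
            # Barycentric weights (a, b, c) sum to 1, so each point lies in the triangle.
            a = i / n
            b = j / n
            c = 1.0 - a - b
            points.append(a * p0 + b * p1 + c * p2)
    return np.array(points)

# Three assumed landmark positions; n = 4 gives (n + 1)(n + 2) / 2 grid points,
# including the three landmarks themselves and the points on the triangle edges.
grid = triangle_grid([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], n=4)
```

Raising `n` increases the semi-landmark density of the patch. Each grid point would then be moved onto the specimen surface by the ray-casting step.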

Template-Dependent Semi-Landmark Protocol for Curves

Materials and Software Requirements:

CLIC package or XYOM software [4]
2D or 3D digital images of biological structures
Defined endpoints for curves of interest

Methodological Workflow:

Template Construction: Generate evenly spaced lines perpendicular to an interlandmark connecting line between successive landmarks.
Semilandmark Acquisition: Calculate intersections of perpendicular template lines with the biological curve of interest.
Consensus Template Alignment:
- Perform Procrustes superposition of all points across specimens
- Compute consensus template from aligned specimens
- Project semilandmarks to consensus template using perpendicular projection
Shape Analysis: Conduct principal component analysis and other morphometric analyses on aligned landmark and semilandmark coordinates.
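The template construction and semilandmark acquisition steps can be sketched in Python as follows. The code assumes a 2D curve digitized as a polyline whose first and last points are the fixed landmarks, and it keeps only the first crossing of each perpendicular. It is an illustration of the geometric idea, not the CLIC or XYOM implementation; the function name `perpendicular_semilandmarks` is hypothetical.

```python
import numpy as np

def perpendicular_semilandmarks(curve, n_semi):
    """Semilandmarks where evenly spaced perpendiculars to the chord cross a 2D curve.

    curve: (m, 2) polyline whose first and last points are the fixed landmarks.
    """
    curve = np.asarray(curve, dtype=float)
    start, end = curve[0], curve[-1]
    chord = end - start
    length = np.linalg.norm(chord)
    u = chord / length                       # unit vector along the interlandmark line
    # Position of every curve vertex along the chord (0 at start, `length` at end).
    s = (curve - start) @ u
    # Evenly spaced perpendiculars, excluding the two landmark endpoints.
    targets = np.linspace(0.0, length, n_semi + 2)[1:-1]
    out = []
    for t in targets:
        for i in range(len(curve) - 1):
            s0, s1 = s[i], s[i + 1]
            if (s0 - t) * (s1 - t) <= 0 and s0 != s1:
                w = (t - s0) / (s1 - s0)     # linear interpolation weight on the segment
                out.append(curve[i] + w * (curve[i + 1] - curve[i]))
                break                        # keep the first crossing only
        else:
            raise ValueError("a template line does not cross the curve")
    return np.array(out)

# Assumed test outline: the upper half of a unit circle between landmarks (-1, 0) and (1, 0).
theta = np.linspace(np.pi, 0.0, 201)
outline = np.column_stack([np.cos(theta), np.sin(theta)])
semis = perpendicular_semilandmarks(outline, n_semi=3)
```

Because the perpendiculars are defined by the landmarks alone, the same call on every specimen yields the same number of semilandmarks at comparable relative positions.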

This template-dependent approach has been successfully applied in medical entomology for wing venation analysis in Glossina species (tsetse flies) and Triatominae, as well as for egg shape analysis in Triatominae [4]. The method produces shape distortion comparable to or lower than alternative sliding techniques while providing standardized landmark acquisition.

Research Reagent Solutions

Table 3: Essential Research Reagents and Tools for Semi-Landmark Analysis

Item	Function/Application	Examples/Specifications
3D Slicer with SlicerMorph	Open-source platform for 3D visualization and morphometric analysis	Provides implementations for patch, patch-TPS, and pseudolandmark methods [1]
CLIC Package	Software for geometric morphometrics with template-based semi-landmarks	Enables template-dependent semi-landmark acquisition and alignment [4]
R Packages (Morpho, geomorph)	Statistical analysis of landmark data	Implements sliding algorithms and statistical shape analysis [1]
High-Resolution Imaging Systems	Generation of 3D specimen reconstructions	CT scanners, surface scanners for digital specimen creation [2]
CAMPA Framework	Deep learning for consistent subcellular landmarks	Identifies consistent landmarks across experimental perturbations [3]

Visualization of Method Workflows

[Figure: Semi-Landmark Method Selection Algorithm]

[Figure: Patch-Based Semi-Landmark Implementation Workflow]

The challenge of landmark sparsity in complex biological structures represents a significant constraint in geometric morphometrics that can be effectively addressed through semi-landmark methodologies. By implementing patch-based, patch-TPS, pseudolandmark, or template-dependent approaches, researchers can dramatically increase the density of shape information captured from biological specimens while maintaining reasonable biological correspondence. The quantitative performance of these methods demonstrates their capacity to produce shape estimations comparable or superior to manual landmarks alone, with particular strengths in handling complex curves and surfaces.

As 3D imaging technologies continue to advance and generate increasingly rich morphological datasets, semi-landmark methods will play an essential role in extracting meaningful biological information from complex structures. The integration of machine learning approaches, such as the CAMPA framework, further enhances our ability to identify consistent landmarks across experimental perturbations. By selecting appropriate semi-landmark strategies based on specific research questions, structural complexity, and dataset characteristics, researchers can overcome the limitations of landmark sparsity and advance our understanding of morphological variation across biological systems.

Geometric morphometrics, the statistical analysis of biological shape based on landmark coordinates, has revolutionized the study of phenotypic evolution. However, a significant limitation of traditional landmarks is their sparse distribution and inability to capture information from smooth curves and surfaces lacking discrete anatomical points. Semilandmarks were developed to address this limitation, enabling the quantification of homologous curves and surfaces between traditional landmarks [2] [5]. These points are not landmarks in the strict sense of developmental homology but are essential for capturing biologically meaningful shape variation along outlines and surfaces that would otherwise be missed [6] [2].

The application of sliding semilandmark techniques has become increasingly critical with the proliferation of high-resolution three-dimensional (3D) imaging data from CT and surface scanners. These datasets provide rich morphological information that traditional landmarks cannot fully exploit [2] [1]. By allowing points to slide along curves or surfaces to minimize bending energy or Procrustes distance, semilandmarks facilitate the quantification of complex morphological structures across diverse taxa, from fish fins to primate crania [2] [7]. This protocol details the theoretical foundations and practical application of semilandmarks within the context of outline-based identification research.

Theoretical Foundations

The Concept of Geometrical Homology

In geometric morphometrics, landmarks are defined by their biological homology—the property of representing the same anatomical position across specimens. Semilandmarks operate under a different principle, that of geometric homology, where entire curves or surface patches are considered homologous structures, even if individual points along them are not [8]. They capture the shape of these structures between traditional Type I, II, and III landmarks [4].

The fundamental challenge is that semilandmarks usually outnumber traditional landmarks, and their initial positions along curves or on surfaces contain arbitrary variation in the direction tangent to the shape. The solution is the sliding process, which optimizes their positions to remove this arbitrariness, establishing geometric correspondence that reflects the overall shape of the curve or surface [5] [8].

Sliding Algorithms: Bending Energy vs. Procrustes Distance

Two primary algorithms are used for sliding semilandmarks, each with distinct properties and applications:

Minimization of Bending Energy: This approach slides semilandmarks to minimize the bending energy of the thin-plate spline (TPS) interpolation between specimens. It gives greater weight to landmarks and semilandmarks local to the point being slid and is particularly effective for capturing localized shape changes [7] [8].
Minimization of Procrustes Distance: This method slides semilandmarks to minimize the squared Procrustes distance between specimens. It considers the global configuration, meaning all landmarks and semilandmarks influence the sliding of each point [7] [4].

Table 1: Comparison of Semilandmark Sliding Algorithms

Algorithm	Optimization Criterion	Spatial Influence	Best Application Context
Bending Energy	Energy of TPS deformation	Local	Capturing localized shape variation; studies of modularity
Procrustes Distance	Sum of squared distances between corresponding points	Global	Overall shape correspondence; datasets with globally integrated structures

The choice between algorithms can influence analytical outcomes. While both methods generally produce consistent results for the overall shape of curves and surfaces, studies indicate that the bending energy approach is more sensitive to localized shape differences [7] [4].
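To make the Procrustes distance criterion concrete, the sketch below performs a single sliding step in Python. Each semilandmark moves along an approximate tangent by the amount that minimizes its squared distance to the corresponding reference point. The tangent estimate (the chord between neighbouring points) and the function name `slide_min_procrustes` are assumptions made for illustration. The slid points leave the curve slightly, so in practice they may then be projected back onto it.

```python
import numpy as np

def slide_min_procrustes(curve, reference, slide_idx):
    """One sliding step: move semilandmarks along their tangent toward the reference.

    curve, reference: (k, d) arrays already superimposed in the same frame.
    slide_idx: indices of semilandmarks; each needs a neighbour on both sides.
    """
    curve = np.asarray(curve, dtype=float)
    reference = np.asarray(reference, dtype=float)
    slid = curve.copy()
    for i in slide_idx:
        # Tangent approximated by the chord between the two neighbouring points.
        tangent = curve[i + 1] - curve[i - 1]
        tangent = tangent / np.linalg.norm(tangent)
        # Least-squares step along the tangent: project the residual onto it.
        step = np.dot(reference[i] - curve[i], tangent)
        slid[i] = curve[i] + step * tangent
    return slid

# Assumed toy data: two fixed endpoints (indices 0 and 3) and two semilandmarks.
specimen = np.array([[0.0, 0.0], [1.0, 0.2], [2.0, 0.1], [3.0, 0.0]])
reference = np.array([[0.0, 0.0], [1.3, 0.3], [1.8, 0.2], [3.0, 0.0]])
slid = slide_min_procrustes(specimen, reference, slide_idx=[1, 2])
```

After the step, the remaining difference between each slid point and its reference counterpart is perpendicular to the tangent. Every point is treated with equal weight, which matches the global character of this criterion; the bending energy criterion instead weights nearby points more heavily.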

Protocols for Semilandmark Data Collection

The process of digitizing and analyzing semilandmarks varies significantly between 2D curves and 3D surfaces. The following protocols provide generalized workflows for these two scenarios.

Protocol 1: Capturing 2D Curves and Outlines

This protocol is suitable for analyzing wing venation in insects [9], leaf outlines in plants, or other 2D profile shapes.

Diagram 1: Workflow for 2D curve semilandmark acquisition

Step-by-Step Procedure:

Define Fixed Landmarks: Identify and digitize traditional Type I or II landmarks at biologically homologous positions that define the endpoints of the curve of interest. For example, in a fly wing, these might be landmarks at the junctions of major veins [9] [4].
Digitize the Curve: Manually trace a dense series of points along the curve connecting the fixed landmarks using software such as tpsDig or ImageJ.
Create Template and Place Semilandmarks:
- Between two successive fixed landmarks, the software generates an algorithm-based template consisting of evenly spaced lines perpendicular to the line connecting the landmarks [4].
- The intersections of these perpendicular lines with the digitized curve define the initial positions of the semilandmarks.
Template Application: Apply this template to all specimens in the dataset. The semilandmarks for each specimen are collected at the intersections of the template's perpendiculars with the specimen's curve, ensuring comparable geometric positions across specimens [4].
Sliding Process: In software such as the R package geomorph, run the sliding algorithm (e.g., with the gpagen function, which slides semilandmarks as part of the Procrustes fit) to minimize either bending energy or Procrustes distance. This establishes geometric correspondence [6] [8].
Procrustes Superimposition: Perform Generalized Procrustes Analysis (GPA) to align all specimens—including both fixed landmarks and slid semilandmarks—into a common shape space by removing the effects of position, orientation, and scale [6].
Statistical Analysis: Conduct downstream analyses such as Principal Component Analysis (PCA), discriminant analysis, or regression on the Procrustes coordinates to explore shape variation.
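The superimposition step of this protocol can be sketched in Python as below. The code centers each configuration, scales it to unit centroid size, and iteratively rotates all specimens onto the current mean shape. It performs no sliding and is not the geomorph implementation; the function name `gpa`, its parameters, and the toy specimens are hypothetical.

```python
import numpy as np

def gpa(shapes, n_iter=10, tol=1e-10):
    """Generalized Procrustes Analysis for an (n_specimens, k_points, d) array."""
    X = np.asarray(shapes, dtype=float).copy()
    X -= X.mean(axis=1, keepdims=True)                    # remove position
    X /= np.linalg.norm(X, axis=(1, 2), keepdims=True)    # scale to unit centroid size
    mean = X[0].copy()
    for _ in range(n_iter):
        for i in range(len(X)):
            # Optimal rotation of specimen i onto the current mean (SVD solution).
            U, S, Vt = np.linalg.svd(X[i].T @ mean)
            R = U @ Vt
            if np.linalg.det(R) < 0:                      # forbid reflections
                U[:, -1] *= -1
                R = U @ Vt
            X[i] = X[i] @ R
        new_mean = X.mean(axis=0)
        new_mean /= np.linalg.norm(new_mean)
        converged = np.linalg.norm(new_mean - mean) < tol
        mean = new_mean
        if converged:
            break
    return X, mean

# Assumed toy data: one shape plus rotated, scaled, and translated copies of it.
base = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 3.0]])
angle = np.pi / 5
rot = np.array([[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]])
specimens = np.stack([base, 2.5 * base @ rot + [4.0, -1.0], 0.5 * base @ rot.T])
aligned, consensus = gpa(specimens)
```

Since the three toy specimens differ only in position, orientation, and scale, they coincide after alignment. Real data would retain residual shape differences, which are the input to PCA and other downstream analyses.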

Protocol 2: Placing 3D Surface Semilandmarks

This protocol is essential for quantifying complex 3D structures like mammalian crania, which have extensive smooth surfaces between traditional landmarks [2] [1].

Diagram 2: Workflow for 3D surface semilandmark acquisition using a template

Step-by-Step Procedure:

Template Creation:
- Select a representative specimen with high-quality 3D mesh (e.g., from CT or laser scanning).
- Digitize all traditional landmarks onto this template mesh [1].
Patch Definition and Initial Semilandmark Placement:
- Define triangular patches on the template surface using triplets of fixed landmarks that bound the region of interest [1].
- For each patch, create a grid of points. Project these points onto the template mesh surface using ray-casting algorithms along the average surface normal vector of the bounding landmarks [1].
Template-Based Semilandmark Transfer:
- For each target specimen, perform a thin-plate spline (TPS) warp to deform the template mesh (with its landmarks and semilandmarks) to fit the landmark configuration of the target specimen [1].
- Transfer the semilandmarks from the warped template to the actual surface of the target specimen by projecting them along the normal vectors of the warped template onto the target mesh [1].
Sliding and Alignment: Slide the semilandmarks on the target specimen's surface to minimize bending energy or Procrustes distance, typically performed iteratively during the Procrustes alignment process [8].
Procrustes Superimposition and Analysis: Perform GPA on the combined set of fixed landmarks and slid semilandmarks from all specimens, enabling subsequent statistical analysis of shape variation.
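The template-based transfer step relies on a thin-plate spline fitted to the fixed landmarks. The following Python sketch fits a 3D spline with the kernel U(r) = r and applies it to template semilandmarks. It assumes at least four non-coplanar landmarks and omits the subsequent projection onto the target mesh. The function names `tps_fit` and `tps_apply` and the cube-corner landmarks are hypothetical, and this is not the SlicerMorph implementation.

```python
import numpy as np

def tps_fit(src, dst):
    """Fit a 3D thin-plate spline mapping landmarks src -> dst (both (k, 3))."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    k = len(src)
    # 3D TPS kernel U(r) = r, evaluated for every pair of source landmarks.
    K = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=2)
    P = np.hstack([np.ones((k, 1)), src])          # affine part: 1, x, y, z
    L = np.zeros((k + 4, k + 4))
    L[:k, :k] = K
    L[:k, k:] = P
    L[k:, :k] = P.T
    rhs = np.vstack([dst, np.zeros((4, 3))])
    params = np.linalg.solve(L, rhs)
    return src, params[:k], params[k:]             # landmarks, warp weights, affine terms

def tps_apply(model, points):
    """Warp arbitrary (m, 3) points with a fitted spline."""
    src, W, A = model
    points = np.asarray(points, dtype=float)
    U = np.linalg.norm(points[:, None, :] - src[None, :, :], axis=2)
    return U @ W + np.hstack([np.ones((len(points), 1)), points]) @ A

# Assumed toy data: cube corners as template landmarks, one corner displaced on the target.
corners = np.array([[x, y, z] for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)])
target = corners.copy()
target[7] += [0.2, 0.0, 0.1]
model = tps_fit(corners, target)
template_semis = np.array([[0.5, 0.5, 1.0], [0.25, 0.75, 1.0]])
warped_semis = tps_apply(model, template_semis)
```

The spline maps each template landmark exactly onto its target counterpart and interpolates smoothly in between, so the warped semilandmarks land close to the target surface before projection.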

The Scientist's Toolkit

Table 2: Essential Software and Tools for Semilandmark Analysis

Tool Name	Function	Application Context
R package `geomorph`	GPA, sliding semilandmarks, statistical shape analysis	Primary tool for statistical processing and analysis of landmark & semilandmark data [6]
3D Slicer / SlicerMorph	3D visualization, landmark digitization, patch-based semilandmarking	Collection of 3D landmark data; application of patch-based semilandmarks directly on specimens [1]
Morpho (R package)	Sliding semilandmarks, surface processing, missing data estimation	Alternative R package for processing semilandmarks and working with 3D surfaces [5] [10]
CLIC/XYOM package	Template-based semilandmark collection and alignment	Specialized for outline analysis using template-dependent perpendicular projection methods [4]
TPS series (tpsDig2, tpsRelw)	2D landmark and curve digitization	Digitizing landmarks and outlines from 2D images; preliminary relative warp analysis [10]

Critical Considerations and Methodological Comparisons

Template Selection and Impact on Analysis

The choice of template significantly influences semilandmark placement, especially for 3D surfaces. An ideal template should represent the average shape of the sample or have high geometric similarity to all specimens. Poor template selection can lead to projection errors where semilandmarks are placed on incorrect anatomical features, particularly in morphologically disparate datasets [1] [11].
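One simple way to act on this guidance is to pick the specimen that lies closest to the sample mean shape. The Python sketch below assumes the coordinates have already been Procrustes-aligned and uses the Euclidean norm of the residuals as an approximation of Procrustes distance. The function name `closest_to_mean` and the toy coordinates are hypothetical.

```python
import numpy as np

def closest_to_mean(aligned):
    """Index of the specimen nearest the sample mean shape, plus all distances.

    aligned: (n_specimens, k_points, d) coordinates already Procrustes-aligned.
    """
    aligned = np.asarray(aligned, dtype=float)
    mean_shape = aligned.mean(axis=0)
    # Distance of each specimen to the mean, taken over all points and dimensions.
    dists = np.linalg.norm(aligned - mean_shape, axis=(1, 2))
    return int(np.argmin(dists)), dists

# Assumed toy sample: the middle specimen is exactly the mean of the other two.
base = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
offset = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 0.2]])
sample = np.stack([base - offset, base, base + offset])
best_index, distances = closest_to_mean(sample)
```

A large maximum in `distances` signals a morphologically disparate sample, where a single template is more likely to produce projection errors.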

Comparison of Semilandmarking Approaches

Different methodologies for establishing point correspondences yield varying results, highlighting the importance of method selection based on research goals.

Table 3: Quantitative Comparison of Semilandmarking Approaches Based on Ape Cranial Data [1]

Method	Correspondence Quality	Sensitivity to Noise	Computational Demand	Point Spacing
Patch-based	High (geometrically defined)	High (outliers occur)	Low	Regular within patches
Patch-TPS	High	Low (robust)	Medium	Regular
Pseudo-landmark	Variable (no geometric relation)	Low	Medium-High	Regular across surface
Landmark-free (DAA)	Variable (sample-dependent)	Medium	High	Irregular, density varies

Challenges and Limitations

Despite their utility, semilandmarks present several challenges:

Homology Uncertainty: Semilandmarks represent approximations of geometric correspondence rather than true biological homology, which should be considered when interpreting results [7] [8].
Method-Dependent Results: Different sliding algorithms and placement strategies can produce different statistical outcomes, particularly for studies of modularity and integration [7] [8].
Density Considerations: While higher semilandmark density captures more shape detail, it increases computational demands and risks overparameterization. The optimal density sufficiently captures the curvature of the surface without redundant points [8].
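The density trade-off can be explored by resampling the same digitized outline at several point counts and comparing the results. The Python sketch below resamples a polyline to a chosen number of points equally spaced by arc length. The function name `resample_curve` is hypothetical, and equal arc-length spacing is only one of several possible initial placements.

```python
import numpy as np

def resample_curve(curve, n_points):
    """Resample an (m, d) polyline to n_points equally spaced by arc length."""
    curve = np.asarray(curve, dtype=float)
    seg = np.linalg.norm(np.diff(curve, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg)])      # cumulative arc length per vertex
    targets = np.linspace(0.0, arc[-1], n_points)
    # Interpolate each coordinate against arc length.
    return np.column_stack([np.interp(targets, arc, curve[:, j])
                            for j in range(curve.shape[1])])

# Assumed toy outline: an unevenly digitized straight segment from (0, 0) to (4, 0).
digitized = np.array([[0.0, 0.0], [0.5, 0.0], [0.7, 0.0], [3.0, 0.0], [4.0, 0.0]])
coarse = resample_curve(digitized, 5)
dense = resample_curve(digitized, 41)
```

Comparing analyses run on `coarse` and `dense` versions of the same outlines is one way to check whether additional points change the results or are merely redundant.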

Applications in Identification Research

Outline-based geometric morphometrics using semilandmarks has proven valuable for species identification where traditional morphological characters are limited. For example, analysis of wing cell contours using outline-based methods successfully distinguished three morphologically similar Tabanus species (horse flies), with the first submarginal cell contour providing the highest classification accuracy (86.67%) [9]. This approach is particularly valuable for damaged specimens where only portions of wings remain intact, offering a viable alternative to traditional identification methods.

In medical entomology, semilandmark approaches have been applied to the wings of Glossina (tsetse flies) and Triatominae (kissing bugs), as well as to eggs of Triatominae, enabling precise discrimination of vector species critical for disease control programs [4]. The template-based method ensures consistency and repeatability across studies and operators, enhancing the reliability of identification protocols.

In biological research, homology refers to the similarity between structures due to shared ancestry, where features are derived from a common ancestor regardless of potential differences in their current function or form [12]. This foundational concept provides the basis for comparative biology and taxonomic classification. In modern morphometrics—the quantitative analysis of biological form—the practical application of homology faces significant challenges, particularly when dealing with complex anatomical surfaces that lack discrete, identifiable anatomical points. This has led to an important distinction between two approaches to defining equivalence in biological structures: developmental homology and geometric homology.

Developmental homology is established through historical and embryological continuity, where structures are considered homologous if they originate from the same embryonic precursors or share an evolutionary lineage [12]. In contrast, geometric homology (implemented in the morphometric literature through "semi-landmarks") is defined primarily by spatial correspondence and algorithmic placement on biological surfaces between traditional landmarks, enabling the quantification of shape variation in regions lacking clearly identifiable anatomical landmarks [7] [4]. This application note explores the core principles, methodological approaches, and practical applications of these complementary concepts within the context of semi-landmark alignment methods for outline-based identification research.

Conceptual Foundations: Contrasting Developmental and Geometric Homology

Developmental Homology: The Biological Standard

Developmental homology represents the classical biological understanding of equivalence between structures. The concept was first formally applied in biology by anatomist Richard Owen in 1843, who defined a homologous structure as the "same organ in different animals under every variety of form and function" [12]. This perspective was later explained by Charles Darwin's theory of evolution as structures retained from a common ancestor. A classic example includes the forelimbs of vertebrates, where the wings of bats, arms of primates, and flippers of whales all derive from the same ancestral tetrapod structure despite their divergent functions [12].

Core Principles of Developmental Homology:

Historical Continuity: Features are homologous if they can be traced through evolutionary descent with modification from a corresponding structure in a common ancestor.
Embryological Origin: Structures develop from the same embryonic tissues or primordia across different taxa.
Positional Correspondence: Homologous structures occupy equivalent positions within a body plan and maintain consistent topological relationships with surrounding structures.
Genetic Basis: Shared genetic or developmental genetic mechanisms underlie the formation of homologous structures, even when modified over evolutionary time.

Geometric Homology: The Morphometric Solution

Geometric homology emerged as a practical solution to a fundamental problem in geometric morphometrics: many biologically important surfaces and curves lack sufficient traditional landmarks for comprehensive shape analysis. Semi-landmarks (also called sliding landmarks) are points having poor homology in the developmental sense but essential for capturing the geometry of curves or surfaces where definitive landmarks are sparse [4]. Unlike developmental homologs, these points are not necessarily equivalent in an evolutionary or developmental sense but rather represent mathematically defined correspondences that allow researchers to quantify and compare form across specimens.

Core Principles of Geometric Homology:

Spatial Correspondence: Points are considered equivalent based on their relative positions along curves or surfaces between established landmarks.
Algorithmic Determination: Semi-landmark locations are determined by mathematical procedures that optimize correspondence between specimens.
Template Dependency: Point locations often depend on a reference template or specimen that is warped to match target specimens.
Statistical Equivalence: The primary requirement is that points capture comparable geometric information across specimens for statistical shape analysis.

Table 1: Fundamental Distinctions Between Developmental and Geometric Homology

Aspect	Developmental Homology	Geometric Homology
Basis of Equivalence	Evolutionary descent and embryonic origin	Spatial correspondence and mathematical optimization
Primary Evidence	Fossil records, embryological development, genetic mechanisms	Geometric position relative to landmarks and contours
Point Identifiability	Anatomically defined and readily identifiable	Algorithmically determined between landmarks
Stability	Consistent across evolutionary time	Dependent on landmark configuration and analysis method
Application Scope	Phylogenetic studies, evolutionary biology	Morphometric analyses of complex surfaces

Methodological Approaches: Semi-Landmark Alignment Techniques

Landmark Typology in Morphometrics

In geometric morphometrics, landmarks are classified based on their biological definitiveness [4]:

Type I Landmarks: Discrete anatomical points defined by local tissue composition (e.g., juxtaposition of tissues), representing the strongest form of developmental homology.
Type II Landmarks: Points defined by geometric properties of larger structures (e.g., point of maximum curvature along a boundary).
Type III Landmarks: Points defined relative to the position of other landmarks, having the weakest claim to developmental homology.

Semi-landmarks extend this typology to capture outlines and surfaces, conceptually similar to Type III landmarks but specifically designed to represent curves and surfaces collectively rather than as individual points [4].

Semi-Landmark Alignment Algorithms

Multiple computational approaches have been developed to establish geometric homology through semi-landmark placement. Recent comparative studies have evaluated three primary landmark-driven approaches [7] [8]:

Sliding Thin-Plate Splines (TPS): This method slides semi-landmarks along tangents to curves or surfaces to minimize the bending energy required to deform the reference shape to each target specimen. Bending energy minimization implicitly assumes that the biological transformation between forms occurs as smoothly as possible [13].
Minimum Procrustes Distance (D): This approach slides semi-landmarks along the curve or surface to minimize the Procrustes distance between the reference and target specimens, so that each semi-landmark comes to lie along the direction perpendicular to the curve or surface through its counterpart on the reference [13].
Hybrid Methods (TPS&NICP): This combines thin-plate spline warping with non-rigid iterative closest point (NICP) algorithms, using TPS for initial non-rigid registration followed by NICP to further refine surface correspondence [8].
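The quantity minimized by the sliding TPS method can be computed directly. The Python sketch below builds the bending energy matrix of a 2D reference configuration with the kernel U(r) = r^2 log(r^2) and evaluates the bending energy of a deformation up to a constant factor. It only evaluates the criterion and does not perform the sliding itself. The function names and the five-point configuration are hypothetical.

```python
import numpy as np

def bending_energy_matrix(ref):
    """Bending energy matrix of a 2D reference configuration of shape (k, 2)."""
    ref = np.asarray(ref, dtype=float)
    k = len(ref)
    r2 = np.sum((ref[:, None, :] - ref[None, :, :]) ** 2, axis=2)
    # 2D TPS kernel U(r) = r^2 log(r^2), with U(0) defined as 0.
    K = np.where(r2 > 0, r2 * np.log(np.where(r2 > 0, r2, 1.0)), 0.0)
    P = np.hstack([np.ones((k, 1)), ref])
    L = np.zeros((k + 3, k + 3))
    L[:k, :k] = K
    L[:k, k:] = P
    L[k:, :k] = P.T
    # The upper-left k x k block of the inverse is the bending energy matrix.
    return np.linalg.inv(L)[:k, :k]

def bending_energy(ref, target):
    """Bending energy (up to a constant factor) of the spline mapping ref -> target."""
    B = bending_energy_matrix(ref)
    target = np.asarray(target, dtype=float)
    return float(np.trace(target.T @ B @ target))

# Assumed toy data: square corners plus a central point.
ref = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]])
sheared = ref @ np.array([[1.0, 0.3], [0.0, 1.2]]) + [2.0, -1.0]   # affine change only
bent = ref.copy()
bent[4] = [0.5, 0.8]                                               # local, non-affine change
energy_affine = bending_energy(ref, sheared)
energy_bent = bending_energy(ref, bent)
```

Affine changes such as the shear cost no bending energy, whereas the local displacement of the central point does. Sliding under this criterion therefore favors smooth deformations between reference and target.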

Table 2: Comparison of Semi-Landmark Alignment Methodologies

Method	Theoretical Basis	Advantages	Limitations
Sliding TPS (Bending Energy)	Minimizes energy required for deformation	Produces smooth deformations; biologically plausible transformations	Sensitive to initial reference; may oversmooth sharp features
Minimum Procrustes Distance	Minimizes Euclidean distance between corresponding points	Direct optimization of alignment criterion; mathematically straightforward	All points influence sliding equally, regardless of distance
Template-Based Projection	Projection along perpendiculars to template-defined lines	Consistent digitization; reduces operator bias	Template choice critically affects results; may lose biological correspondence
Hybrid (TPS&NICP)	Combines smooth deformation with local rigidity	Balances global and local correspondence; handles large deformations	Computationally intensive; multiple parameters to optimize

Impact of Method Selection on Analytical Results

The choice of semi-landmark alignment method can influence subsequent morphometric analyses, although not all results are equally affected. Research comparing these approaches has demonstrated that:

Statistical Consistency: Goodall's F-test results and classification accuracy are generally similar across methods, suggesting robustness for group discrimination tasks [13].
Shape Variation Patterns: Estimates of within-group and between-group variation (Foote's measurement) differ between bending energy and Procrustes distance criteria [13].
Principal Component Alignment: Low correlation exists between the first principal component axes obtained by different sliding methods, indicating that the major axes of shape variation are method-dependent [13].
Methodological Convergence: Non-rigid semilandmarking approaches (sliding TPS and TPS&NICP) yield more consistent results with each other than with rigid registration approaches [7].

These findings underscore that while semi-landmarks enable the quantification of shape in landmark-sparse regions, all subsequent statistical analyses are subject to error inherent in the semi-landmarking process, and results should be interpreted with appropriate caution [7].

Experimental Protocols: Implementing Semi-Landmark Methods

Standardized Protocol for Semi-Landmark Data Collection

Purpose: To establish consistent procedures for capturing geometric homology in outline-based identification research.

Materials and Equipment:

High-resolution 3D surface scans or CT data of specimens
Geometric morphometrics software (e.g., Viewbox, tpsDig, R geomorph package)
Template specimen representing average morphology

Procedure:

Landmark Definition: Identify and record Type I and Type II landmarks on all specimens based on clear anatomical criteria.
Template Specification: Select a reference specimen with average morphology and clear anatomical features.
Curve and Surface Definition: Define curves between landmarks and surface patches bounded by landmarks on the template specimen.
Semi-Landmark Distribution: Place semi-landmarks evenly along curves and surfaces:
- For 2D outlines: 20-50 points per outline, depending on complexity
- For 3D surfaces: 100-400 points per surface patch, ensuring even coverage
Template Transfer: Warp template to each target specimen using thin-plate spline interpolation based on fixed landmarks.
Semi-Landmark Projection: Project semi-landmarks from warped template to nearest points on target specimens.
Sliding Alignment: Apply sliding algorithm (bending energy or Procrustes distance minimization) to remove tangential positional noise.
Quality Control: Verify landmark correspondence across all specimens through visual inspection and Procrustes ANOVA.
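
As a minimal sketch of the Semi-Landmark Distribution step for a 2D outline, the following Python function resamples a digitized open curve to n points spaced equally by arc length. The function name resample_outline and the example coordinates are illustrative assumptions, not part of any software named above. The points it returns are only starting positions, which the sliding step would later adjust.

```python
import math

def resample_outline(points, n):
    """Return n points spaced equally by arc length along an open polyline."""
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative arc length at each digitized vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out = []
    seg = 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        x0, y0 = points[seg]
        x1, y1 = points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

# Hypothetical L-shaped curve of total length 4, resampled to 5 points.
curve = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]
semilandmarks = resample_outline(curve, 5)
```

For a closed outline, one option is to append the first vertex to the end of the list before resampling and then drop the duplicated final point.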

Protocol Validation and Reliability Assessment

Purpose: To evaluate the repeatability and reproducibility of semi-landmark placement.

Procedure:

Intra-operator Repeatability: The same operator places landmarks and semi-landmarks twice on a subset of specimens (≥20) with a washout period between sessions.
Inter-operator Reproducibility: Multiple trained operators place landmarks on the same subset of specimens independently.
Statistical Analysis: Calculate Lin's Concordance Correlation Coefficient (CCC) between landmark configurations:
- Excellent reliability: CCC > 0.90
- Good reliability: CCC 0.80-0.90
- Acceptable reliability: CCC 0.70-0.79
Procrustes ANOVA: Partition variance components to assess measurement error relative to biological variation.
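
The Statistical Analysis step can be illustrated with a short Python sketch of Lin's CCC for two paired measurement vectors, for example the flattened coordinates from two digitizing sessions. How the coordinates are pooled before computing the CCC is an assumption here, and the helper names lin_ccc and reliability_label, along with the "below acceptable" label, are hypothetical. The thresholds follow the bands listed above.

```python
def lin_ccc(x, y):
    """Lin's concordance correlation coefficient for two paired sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mx = sum(x) / n
    my = sum(y) / n
    # Lin's definition uses 1/n (population) variances and covariance.
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)

def reliability_label(ccc):
    """Map a CCC value onto the reliability bands used in this protocol."""
    if ccc > 0.90:
        return "excellent"
    if ccc >= 0.80:
        return "good"
    if ccc >= 0.70:
        return "acceptable"
    return "below acceptable"  # label assumed; the protocol defines no lower band

# Hypothetical repeated measurements: session 2 is offset by a constant.
session1 = [1.0, 2.0, 3.0, 4.0]
session2 = [2.0, 3.0, 4.0, 5.0]
ccc = lin_ccc(session1, session2)
```

Unlike Pearson's correlation, which would equal 1 for the example data, the CCC penalizes the constant offset between sessions, which is why it suits agreement studies.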

Table 3: Research Reagent Solutions for Semi-Landmark Studies

Tool/Category	Specific Examples	Function/Purpose
Imaging Modalities	CT scanning, micro-CT, laser surface scanning	Generate 3D digital representations of specimens
Segmentation Software	ITK-SNAP, Amira, Mimics	Extract 3D surface models from volumetric data
Landmarking Software	Viewbox, tpsDig, 3D Slicer	Digitize landmarks and semi-landmarks on 2D images or 3D models
Template Construction Tools	CAD tools in STAR-CCM+, MeshLab	Create and manipulate reference template specimens
Statistical Analysis Packages	R geomorph package, MorphoJ, PAST, EVAN Toolbox	Perform Procrustes superimposition and shape statistics
Visualization Tools	R rgl library, Paraview, MeshLab	Visualize 3D shape variation and deformation

Application in Medical Research: Nasal Cavity Morphometry

Case Study: Personalized Nose-to-Brain Drug Delivery

A recent application in medical research demonstrates the practical utility of geometric homology principles. Researchers performed geometric morphometric analysis on the nasal cavity region of interest (ROI) for 151 unilateral nasal cavities from 78 patients to predict olfactory region accessibility for drug delivery [14].

Methodology:

Landmark Configuration: 10 fixed anatomical landmarks placed on homologous regions
Semi-Landmark Supplement: 200 semi-landmarks distributed across the ROI surface
Template-Based Placement: Semi-landmarks projected from template to individual specimens using thin-plate spline warping with bending energy minimization
Shape Analysis: Generalized Procrustes Analysis followed by Principal Component Analysis
Cluster Identification: Hierarchical Clustering on Principal Components to identify morphological variants

Findings:

Three distinct morphological clusters were identified with significant shape differences
Cluster 1 exhibited broader anterior cavity with shallower turbinate onset, potentially improving olfactory accessibility
31.5% of patients had at least one nasal cavity in this favorable morphology
The method demonstrated good to excellent repeatability (CCC > 0.80)

This application illustrates how geometric homology principles, implemented through semi-landmark methods, can stratify patient populations for personalized medical interventions based on anatomical shape variation.
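
The Shape Analysis step of the methodology (Generalized Procrustes Analysis followed by Principal Component Analysis) can be sketched in Python with NumPy. This is an illustrative implementation under stated assumptions, not the pipeline used in the cited study: shapes are scaled to unit centroid size, reflections are disallowed, the function names gpa and shape_pca are hypothetical, and the clustering step is omitted.

```python
import numpy as np

def align(ref, X):
    """Rotate a centered, unit-size configuration X onto ref (no reflection)."""
    U, _, Vt = np.linalg.svd(X.T @ ref)
    if np.linalg.det(U @ Vt) < 0:
        U[:, -1] *= -1  # flip one axis so the result is a proper rotation
    return X @ (U @ Vt)

def gpa(shapes, tol=1e-10, max_iter=100):
    """Generalized Procrustes Analysis on an (n, k, d) stack of landmarks."""
    X = np.asarray(shapes, dtype=float)
    X = X - X.mean(axis=1, keepdims=True)                   # remove position
    X = X / np.linalg.norm(X, axis=(1, 2), keepdims=True)   # unit centroid size
    mean = X[0]
    for _ in range(max_iter):
        X = np.stack([align(mean, x) for x in X])           # remove orientation
        new_mean = X.mean(axis=0)
        new_mean /= np.linalg.norm(new_mean)
        converged = np.linalg.norm(new_mean - mean) < tol
        mean = new_mean
        if converged:
            break
    return X, mean

def shape_pca(aligned):
    """PCA of aligned coordinates via SVD; returns scores and variance ratios."""
    flat = aligned.reshape(len(aligned), -1)
    flat = flat - flat.mean(axis=0)
    U, s, _ = np.linalg.svd(flat, full_matrices=False)
    variances = s ** 2 / (len(flat) - 1)
    return U * s, variances / variances.sum()
```

The PC scores returned by shape_pca are the kind of input a hierarchical clustering routine would receive in a workflow like the one described above.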

The distinction between developmental and geometric homology represents a fundamental theoretical division with significant practical implications for morphometric research. Developmental homology provides the biological foundation for comparative studies, ensuring that comparisons are evolutionarily meaningful. Geometric homology, implemented through semi-landmark methods, provides the analytical tools to quantify shape variation across entire biological structures, not just at discrete landmark points.

For outline-based identification research, the integration of both concepts is essential:

Developmental Homology guides the initial placement of fixed landmarks, ensuring biological relevance.
Geometric Homology enables comprehensive shape capture between these landmarks, providing statistical power.

Current research indicates that while different semi-landmark approaches yield somewhat different results, non-rigid methods (sliding TPS and TPS&NICP) show the greatest consistency, particularly when landmarks provide good coverage of the morphological structure [8]. All semi-landmarking methods estimate homology with some degree of error, and researchers should acknowledge this limitation in their interpretations.

As geometric morphometrics continues to advance, particularly in medical applications such as personalized drug delivery [14], the thoughtful integration of both developmental and geometric concepts of homology will remain essential for balancing biological meaning with mathematical practicality in outline-based identification research.

In geometric morphometrics, the analysis of biological shapes often extends beyond traditional landmarks to include semi-landmarks: points placed on curves and surfaces to capture the geometry of morphological structures lacking discrete anatomical landmarks [1]. These semi-landmarks require a sliding process to establish geometric correspondence across specimens by minimizing a specific criterion, typically either bending energy or Procrustes distance [15]. This alignment process is fundamental for outline-based identification research across biological and medical disciplines, particularly in pharmaceutical development, where precise morphological characterization can influence drug delivery systems and anatomical targeting [16].

The fundamental challenge addressed by sliding semi-landmarks stems from their initial non-homologous placement. Unlike traditional landmarks identified through biological homology, semi-landmarks are often placed algorithmically between landmarks or along curves and surfaces [15]. The sliding process refines their positions to establish geometric correspondence, thereby enabling meaningful statistical shape analysis. Understanding the distinction between minimizing bending energy and minimizing Procrustes distance is therefore critical when selecting a method for outline-based identification research in scientific and drug development applications.

Theoretical Foundations of Sliding Criteria

Bending Energy Minimization

The minimization of bending energy is rooted in the physics of deforming an infinitely thin metal plate, where bending energy represents the energy required to deform a hypothetical metal plate defined by the landmark configuration [15]. In practical terms, this approach slides semi-landmarks to minimize the bending energy of the thin-plate spline (TPS) interpolation between the reference form and the target specimen. This method emphasizes local shape differences by assigning greater influence to landmarks and semi-landmarks in close spatial proximity [15].

Mathematically, the bending energy of a thin-plate spline is the integral of its squared second derivatives, which reduces to a quadratic form in the target coordinates through a bending energy matrix computed from the reference configuration. When sliding semi-landmarks via bending energy minimization, the algorithm adjusts point positions along tangent directions to the curve or surface until the energy reaches a minimum, typically re-estimating the tangents and repeating until the positions stabilize. This approach is particularly advantageous for capturing localized morphological variation and is generally less influenced by distant landmarks on different anatomical structures [15].
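
To make the quantity concrete, the following Python sketch builds the 2D thin-plate spline bending energy matrix from a reference configuration and evaluates the bending energy of a target. It assumes the common 2D kernel U(r) = r^2 log(r^2) and omits constant scaling factors, which vary between conventions. The function names are illustrative and not taken from any package cited here.

```python
import numpy as np

def bending_energy_matrix(ref):
    """Upper-left k x k block of the inverse TPS system matrix (2D case)."""
    ref = np.asarray(ref, dtype=float)
    k = ref.shape[0]
    d2 = np.sum((ref[:, None, :] - ref[None, :, :]) ** 2, axis=-1)
    # Kernel U(r) = r^2 log(r^2), with U(0) defined as 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        K = np.where(d2 > 0, d2 * np.log(d2), 0.0)
    P = np.hstack([np.ones((k, 1)), ref])   # affine part: 1, x, y
    L = np.zeros((k + 3, k + 3))
    L[:k, :k] = K
    L[:k, k:] = P
    L[k:, :k] = P.T
    return np.linalg.inv(L)[:k, :k]

def bending_energy(ref, target):
    """Bending energy (up to a constant factor) of the TPS from ref to target."""
    Be = bending_energy_matrix(ref)
    Y = np.asarray(target, dtype=float)
    return float(np.trace(Y.T @ Be @ Y))

# Hypothetical reference: four corners of a square plus its center.
ref = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0.5, 0.5]], dtype=float)
affine_target = ref @ np.array([[1.2, 0.1], [0.3, 0.9]]) + np.array([2.0, -1.0])
bent_target = ref.copy()
bent_target[4] += [0.0, 0.2]   # displace only the center point
```

In this example the affine target has essentially zero bending energy, because uniform stretching and shearing do not bend the plate, while displacing the single interior point yields a positive value.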

Procrustes Distance Minimization

In contrast, the Procrustes distance minimization approach slides semi-landmarks to minimize the Procrustes distance between the specimen and a reference form, typically the Procrustes consensus shape [15]. This method considers global shape differences equally across all landmarks in the configuration, as it minimizes the sum of squared distances between corresponding landmarks after Procrustes superimposition.

The Procrustes distance represents the square root of the sum of squared differences between corresponding landmark positions after optimal superimposition via translation, rotation, and scaling. When this criterion guides the sliding process, all landmarks and semi-landmarks contribute equally to the minimization function, regardless of their spatial relationships. This global consideration can be beneficial for capturing overall shape differences but may sometimes overlook localized variations in densely landmarked regions [15].
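
As a worked illustration, the Python sketch below computes a Procrustes distance between two landmark configurations. It makes specific assumptions that the text leaves open: both configurations are scaled to unit centroid size, so the result is the partial Procrustes distance, and reflections are not allowed. The function names are hypothetical.

```python
import numpy as np

def center_scale(X):
    """Remove position and scale: center at the origin, unit centroid size."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)
    return X / np.linalg.norm(X)

def procrustes_distance(A, B):
    """Partial Procrustes distance between two k x d landmark configurations."""
    A0, B0 = center_scale(A), center_scale(B)
    # Optimal rotation of B0 onto A0 from the SVD of the cross-product matrix.
    U, _, Vt = np.linalg.svd(B0.T @ A0)
    if np.linalg.det(U @ Vt) < 0:
        U[:, -1] *= -1  # exclude reflections
    B_rot = B0 @ (U @ Vt)
    # Square root of the summed squared coordinate differences.
    return float(np.linalg.norm(A0 - B_rot))

# Hypothetical shapes: B is A rotated, scaled, and shifted; C has one moved point.
A = np.array([[0, 0], [2, 0], [3, 1], [1.5, 2.5], [-0.5, 1]], dtype=float)
theta = 0.6
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
B = 3.0 * A @ R + np.array([4.0, -2.0])
C = A.copy()
C[2] += [0.3, -0.2]
```

Here procrustes_distance(A, B) is zero up to rounding, since B differs from A only in position, orientation, and scale, whereas procrustes_distance(A, C) is positive.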

Table 1: Comparative Analysis of Sliding Criteria

Parameter	Bending Energy Minimization	Procrustes Distance Minimization
Theoretical Basis	Physics of thin metal plate deformation	Least-squares Procrustes geometry
Spatial Influence	Localized (weighted by proximity)	Global (equal weighting)
Computational Complexity	Generally higher due to TPS calculations	Generally lower
Sensitivity to Landmark Density	Less sensitive to uneven landmark distribution	More sensitive to landmark spacing
Biological Interpretation	Better for localized morphological features	Better for overall shape differences
Recommended Application	Analyses requiring localized shape comparison	Studies focusing on global form variation

Comparative Theoretical Considerations

The choice between these sliding criteria involves important theoretical trade-offs. Bending energy minimization, with its emphasis on local shape changes, may provide more biologically meaningful correspondence in regions with smoothly varying morphology [15]. The localization of influence means that landmarks on separate structures (e.g., different bones) have minimal effect on each other's sliding paths.

Procrustes distance minimization, while computationally simpler in concept, may sometimes introduce artifacts when semi-landmarks are spaced unevenly or when analyzing structures with significant global shape differences [15]. However, it provides a direct connection to the Procrustes superimposition framework that underpins most geometric morphometric analyses.

Recent methodological studies suggest that the practical differences between these approaches may be context-dependent, influenced by factors including the complexity of the anatomical structure, density of semi-landmarks, and degree of shape variation within the sample [15]. In some applications, researchers may employ both methods comparatively to assess the robustness of their findings to sliding criterion selection.

Quantitative Comparison of Sliding Methods

Table 2: Performance Metrics for Sliding Approaches in Morphological Studies

Study Reference	Anatomical System	Sample Size	Sliding Method	Reported Outcome
Davis & Maga (2018) [1]	Great ape crania	51 specimens	Patch-based semi-landmarks	Improved shape estimation over manual landmarks alone
Shui et al. (2023) [15]	Ape crania and human heads	Multiple datasets	Bending Energy vs. Procrustes	Different landmark locations lead to statistical differences
PMC (2025) [16]	Human nasal cavity	151 nasal cavities	Bending energy minimization	Successful identification of morphological clusters
Landmark-free study (2025) [11]	Mammalian crania	322 specimens	Landmark-free vs. traditional	Comparable phylogenetic signal with manual landmarking

Empirical evidence from recent studies demonstrates the practical implications of selecting different sliding criteria. Research on great ape cranial morphology implemented semi-landmark approaches using thin-plate spline deformation for transferring landmarks between templates and target specimens [1]. This bending energy-based approach successfully captured morphological variation across species, though the study noted potential methodological sensitivities to surface noise and missing data.

A comprehensive comparison of semi-landmarking approaches revealed that while different methods (including varying sliding criteria) generally produce congruent patterns of shape variation, notable differences emerge in statistical results [15]. The authors emphasized that analyses employing semi-landmarks must be interpreted with caution, recognizing that all methods introduce some degree of approximation, and the choice of sliding criterion represents one source of methodological variability.

In pharmaceutical applications, researchers applying geometric morphometrics to nasal cavity morphology successfully employed bending energy minimization in their sliding protocol [16]. This approach enabled identification of distinct morphological clusters relevant for optimizing nose-to-brain drug delivery, demonstrating the real-world impact of appropriate sliding criterion selection in drug development contexts.

Experimental Protocols for Semi-Landmark Alignment

Generalized Workflow for Semi-Landmark Sliding

Diagram 1: Semi-landmark sliding workflow. The process begins with raw data, proceeds through Procrustes alignment and criterion selection, and iterates until convergence.

Protocol 1: Bending Energy Minimization

Purpose: To slide semi-landmarks by minimizing the bending energy of the thin-plate spline transformation between each specimen and a reference form.

Materials and Software:

3D surface meshes or 2D outline data
Landmarking and analysis software (e.g., Viewbox, 3D Slicer, MorphoJ, R geomorph package)

Procedure:

1. Initial Configuration: Place fixed landmarks and initial semi-landmarks on curves or surfaces using template-based propagation or manual placement [16].
2. Procrustes Superimposition: Perform Generalized Procrustes Analysis (GPA), estimating the superimposition from the fixed landmarks only, to remove non-shape variation (position, orientation, scale).
3. Reference Selection: Choose an appropriate reference form (often the Procrustes consensus) for the thin-plate spline calculation.
4. Tangent Direction Estimation: For each semi-landmark, compute the tangent vectors to the curve or surface at its current location.
5. Iterative Sliding:
- a. For each specimen, calculate the thin-plate spline mapping from the reference to the specimen using current landmark positions.
- b. Compute the bending energy matrix, which depends only on the reference configuration.
- c. For each semi-landmark, determine the optimal shift along the tangent direction that minimizes bending energy.
- d. Update semi-landmark positions according to these shifts.
6. Convergence Check: Repeat step 5 until the change in semi-landmark positions falls below a predetermined threshold (typically 0.0001 units in Procrustes space).
7. Final Alignment: Perform a final GPA including the slid semi-landmarks to obtain the aligned coordinate data for statistical analysis.

Technical Notes: The bending energy is computed from the thin-plate spline bending energy matrix, which incorporates the spatial relationships between all landmarks. This method gives greater weight to landmarks in close proximity to each semi-landmark being slid [15].
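
A single pass of the iterative sliding step can be sketched in Python as follows. Because bending energy is a quadratic function of the tangential shifts, the shifts for one specimen can be obtained by solving a small linear system. This is a simplified 2D illustration with hypothetical function names. It assumes tangents are supplied by the caller and the kernel U(r) = r^2 log(r^2), and it leaves out the re-projection of slid points onto the curve that is normally applied afterwards.

```python
import numpy as np

def bending_energy_matrix(ref):
    """Upper-left k x k block of the inverse 2D TPS system matrix."""
    ref = np.asarray(ref, dtype=float)
    k = ref.shape[0]
    d2 = np.sum((ref[:, None, :] - ref[None, :, :]) ** 2, axis=-1)
    with np.errstate(divide="ignore", invalid="ignore"):
        K = np.where(d2 > 0, d2 * np.log(d2), 0.0)
    P = np.hstack([np.ones((k, 1)), ref])
    L = np.zeros((k + 3, k + 3))
    L[:k, :k], L[:k, k:], L[k:, :k] = K, P, P.T
    return np.linalg.inv(L)[:k, :k]

def slide_bending_energy(ref, spec, semi_idx, tangents):
    """Slide the semi-landmarks of one specimen along their tangents."""
    Be = bending_energy_matrix(ref)
    spec = np.asarray(spec, dtype=float)
    k, m = spec.shape[0], len(semi_idx)
    # Ux and Uy hold the x and y components of each unit tangent.
    Ux, Uy = np.zeros((k, m)), np.zeros((k, m))
    for j, (i, t) in enumerate(zip(semi_idx, tangents)):
        t = np.asarray(t, dtype=float)
        Ux[i, j], Uy[i, j] = t / np.linalg.norm(t)
    # Setting the gradient of the quadratic energy to zero gives lhs @ s = rhs.
    lhs = Ux.T @ Be @ Ux + Uy.T @ Be @ Uy
    rhs = -(Ux.T @ Be @ spec[:, 0] + Uy.T @ Be @ spec[:, 1])
    shifts = np.linalg.solve(lhs, rhs)
    slid = spec.copy()
    slid[:, 0] += Ux @ shifts
    slid[:, 1] += Uy @ shifts
    return slid

# Hypothetical data: points 1-3 are semi-landmarks on a straight curve (y = 0).
ref = np.array([[0, 0], [1, 0], [2, 0], [3, 0], [4, 0], [2, 2], [2, -2]], float)
spec = ref.copy()
spec[[1, 2, 3], 0] += [0.3, 0.2, 0.4]   # uneven spacing along the curve
slid = slide_bending_energy(ref, spec, [1, 2, 3], [(1, 0)] * 3)
```

In this constructed example the specimen differs from the reference only in how the semi-landmarks are spaced along the curve, so sliding returns them to the reference spacing and the bending energy drops to zero.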

Protocol 2: Procrustes Distance Minimization

Purpose: To slide semi-landmarks by minimizing the squared Procrustes distance between each specimen and a reference form.

Materials and Software:

3D surface meshes or 2D outline data
Landmarking and analysis software (e.g., Viewbox, 3D Slicer, MorphoJ, R geomorph package)

Procedure:

1. Initial Configuration: Place fixed landmarks and initial semi-landmarks as in Protocol 1, step 1.
2. Procrustes Superimposition: Perform GPA using fixed landmarks only to establish initial alignment.
3. Reference Selection: Compute the Procrustes consensus shape from all specimens using current landmark positions.
4. Tangent Direction Estimation: Compute tangent vectors for each semi-landmark as in Protocol 1, step 4.
5. Iterative Sliding:
- a. For each specimen, calculate the Procrustes distance to the reference consensus.
- b. For each semi-landmark, determine the optimal shift along the tangent direction that minimizes the Procrustes distance.
- c. Update semi-landmark positions according to these shifts.
- d. Update the Procrustes consensus using the new semi-landmark positions.
6. Convergence Check: Repeat step 5 until the change in semi-landmark positions falls below the predetermined threshold.
7. Final Alignment: Perform a final GPA including the slid semi-landmarks to obtain the aligned coordinate data for statistical analysis.

Technical Notes: This approach minimizes the sum of squared Euclidean distances between corresponding landmarks after optimal superimposition, giving equal weight to all landmarks regardless of spatial distribution [15]. The algorithm typically converges efficiently but may require more iterations when analyzing highly variable datasets.
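
For the Procrustes distance criterion, the optimal shift of each semi-landmark has a simple form once specimen and reference are superimposed: the component of its deviation from the reference that lies along the tangent is removed, leaving only the perpendicular component. The Python sketch below illustrates one such pass for a 2D open curve. The function names are hypothetical, the specimen is assumed to be already superimposed on the reference, and tangents are approximated from the two neighboring points.

```python
import math

def unit_tangent(prev_pt, next_pt):
    """Approximate the tangent at a point from its two neighbors on the curve."""
    dx, dy = next_pt[0] - prev_pt[0], next_pt[1] - prev_pt[1]
    norm = math.hypot(dx, dy)
    return dx / norm, dy / norm

def slide_procrustes(ref, spec, semi_idx):
    """Slide interior semi-landmarks to minimize squared distance to ref."""
    slid = list(spec)
    for i in semi_idx:
        tx, ty = unit_tangent(spec[i - 1], spec[i + 1])
        # Residual of the semi-landmark from its counterpart on the reference.
        rx, ry = spec[i][0] - ref[i][0], spec[i][1] - ref[i][1]
        along = rx * tx + ry * ty          # tangential part of the residual
        slid[i] = (spec[i][0] - along * tx, spec[i][1] - along * ty)
    return slid

# Hypothetical superimposed data: semi-landmarks 1-3 lie on a straight curve.
ref = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
spec = [(0.0, 0.1), (1.3, 0.1), (2.1, 0.1), (3.4, 0.1), (4.0, 0.1)]
slid = slide_procrustes(ref, spec, [1, 2, 3])
```

After this pass each slid point sits directly above its reference counterpart, at x = 1, 2, and 3, which is the perpendicular arrangement the criterion aims for. In a full implementation the consensus would then be recomputed and the pass repeated, as in steps 5 and 6.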

Research Reagent Solutions for Semi-Landmark Studies

Table 3: Essential Tools and Software for Semi-Landmark Research

Tool/Software	Primary Function	Application Context	Access
Viewbox 4.0 [16]	Landmark digitization	Precise placement of fixed and semi-landmarks	Commercial
3D Slicer with SlicerMorph [1]	3D visualization and morphometrics	Medical image analysis, template projection	Open source
R geomorph package [16]	Statistical shape analysis	Procrustes ANOVA, PCA, phylogenetic comparisons	Open source
ITK-SNAP [16]	Medical image segmentation	Semi-automatic segmentation of 3D structures	Open source
FactoMineR [16]	Multivariate analysis	Principal component analysis, clustering	Open source
Morpho [1]	Geometric morphometrics	Sliding semi-landmarks, surface sampling	Open source

The implementation of semi-landmark alignment methods requires specialized software tools for data acquisition, landmark placement, and statistical analysis. Commercial software like Viewbox 4.0 provides integrated environments for precise landmark digitization and management of semi-landmark templates [16]. For pharmaceutical and medical applications, ITK-SNAP enables segmentation of anatomical structures from CT and MRI data, creating 3D meshes for subsequent landmarking [16].

Open-source solutions increasingly dominate methodological research in geometric morphometrics. The R statistical environment, particularly with the geomorph package, provides comprehensive implementations of both bending energy and Procrustes distance minimization approaches to sliding semi-landmarks [16]. The SlicerMorph extension for 3D Slicer offers specialized tools for high-density morphometric analysis, including patch-based semi-landmarking and template propagation methods [1].

When establishing a research pipeline for outline-based identification, researchers should consider interoperability between these tools, typically using standardized file formats (PLY, STL, LAND) to transfer landmark data between visualization software and statistical analysis environments.

Applications in Pharmaceutical and Biomedical Research

The sliding of semi-landmarks finds practical application across multiple domains of biomedical research, particularly in pharmaceutical development where precise anatomical characterization influences product design and efficacy. Geometric morphometric approaches employing semi-landmark sliding have been successfully implemented in nasal cavity morphology studies to optimize nose-to-brain drug delivery systems [16]. These studies identified distinct morphological clusters with differential accessibility to the olfactory region, enabling stratified approaches to drug device design.

In cranial morphology research, semi-landmark protocols have enabled large-scale comparative analyses across diverse taxa, facilitating evolutionary studies and morphological disparity assessments [11]. The ability to capture complex surface morphology through sliding semi-landmarks has proven particularly valuable for analyzing anatomical structures with limited discrete landmarks, such as neurocranial surfaces and dental crowns.

Recent methodological advances aim to extend these approaches to landmark-free morphometric methods, which use dense surface correspondence algorithms as alternatives to traditional landmark-based approaches [11]. While these methods show promise for analyzing highly disparate forms, they still face challenges in establishing biologically meaningful correspondences compared to landmark-guided approaches.

Diagram 2: Research applications and impacts. Semi-landmark methods support diverse applications from pharmaceutical development to evolutionary biology.

Methodological Considerations and Best Practices

When implementing semi-landmark sliding protocols, researchers should address several methodological considerations to ensure biologically meaningful results. The density of semi-landmarks represents a critical parameter, with insufficient sampling failing to capture morphological complexity and oversampling potentially introducing redundancy and computational burden [15]. Studies recommend conducting sensitivity analyses to determine optimal landmark density for specific research questions.

The choice between bending energy versus Procrustes distance minimization should align with research objectives and anatomical context. Bending energy minimization is generally preferred when analyzing localized morphological features or when landmarks are distributed across functionally distinct modules [15]. Procrustes distance minimization may be more appropriate for capturing overall shape differences or when analyzing structures with globally integrated morphology.

Template selection significantly influences results in semi-landmark studies [11]. Researchers should select templates representing the median morphology of their sample rather than extreme forms. For studies encompassing substantial morphological variation, iterative template selection or multiple template approaches may be necessary to minimize bias.

Recent methodological developments highlight the importance of validation and repeatability assessments in semi-landmark studies [16]. Researchers should quantify both intra- and inter-operator error through repeated landmarking procedures and report these metrics to establish methodological robustness. As landmark-free approaches continue to develop, they may offer complementary perspectives for analyzing highly disparate forms where homology assessment remains challenging [11].

In the field of geometric morphometrics, the quantitative analysis of shape variation has been transformed by methods that go beyond traditional landmarks. While anatomical landmarks provide crucial points of biological homology, they are often limited in number and cannot densely capture the information from curves or surfaces [2]. To address this, several advanced methodologies have been developed, primarily falling into three categories: semilandmarks, pseudolandmarks, and landmark-free methods. These approaches enable researchers to capture rich shape descriptions from complex biological structures, facilitating more comprehensive analyses of morphological variation in evolutionary biology, ecology, and related fields [1] [2]. This overview provides a comparative analysis of these methods, focusing on their theoretical foundations, practical applications, and implementation protocols for outline-based identification research.

Conceptual Definitions and Theoretical Foundations

Semilandmarks

Semilandmarks are points used to capture the shape of curves and surfaces where traditional landmarks are insufficient. They relax the requirement for strict biological homology while maintaining correspondence through geometric algorithms [1]. There are two primary sliding criteria for optimizing semilandmark placement:

Bending Energy (BE) Minimization: Assumes the contour on a specimen results from the smoothest possible deformation of the reference contour [13].
Procrustes Distance (D) Minimization: Aligns semilandmarks so they lie along lines perpendicular to the curve passing through corresponding points on the reference form [13].

Semilandmarks are typically applied using a template specimen, where they are transferred to target specimens via thin-plate spline (TPS) transformation followed by projection and sliding [1] [2].

Pseudolandmarks

Pseudolandmarks are points placed automatically on an image surface with no direct relationship to manually placed landmarks [1]. These methods transform surface meshes into clouds of points subjected to Procrustes superimposition, removing subjectivity in placement and significantly reducing processing time [2]. However, they do not ensure points are positioned in anatomically equivalent locations, limiting biological interpretability for region-specific analyses [2].

Landmark-Free Methods

Landmark-free approaches completely bypass the need for manual landmark identification, instead using algorithmic registration to compare shapes. These include:

Iterative Closest Point (ICP) Algorithms: Fit a template surface to each target via rigid registration [15] [7].
Deterministic Atlas Analysis (DAA): Uses Large Deformation Diffeomorphic Metric Mapping (LDDMM) to compute deformations between a mean atlas shape and each specimen [11].
Conformal Geometry Methods: Map surfaces conformally onto a sphere or unit disk to establish point correspondences [7].

These methods excel in efficiency for large datasets but may produce mappings with uncertain biological homology [15] [11].

Table 1: Core Characteristics of Dense Sampling Approaches in Geometric Morphometrics

Method	Basis of Homology	Required Input	Automation Level	Biological Interpretability
Semilandmarks	Geometric homology guided by landmarks	Manual landmarks + template	Semi-automated	High for defined regions
Pseudolandmarks	Spatial distribution on surface	Surface mesh only	Fully automated	Limited to overall shape
Landmark-Free	Algorithmic registration	Surface mesh only	Fully automated	Variable, requires validation

Methodological Protocols

Protocol 1: Patch-Based Semilandmark Placement

This approach generates semilandmarks within triangular patches defined by manual landmarks on each specimen independently [1].

Materials and Software:

3D surface meshes of specimens
3D Slicer with SlicerMorph extension [1]
Pre-placed anatomical landmarks

Procedure:

Define Patches: Select three manual landmarks to define the boundaries of each triangular region of interest.
Create Sampling Grid: Generate a template triangular grid with user-specified density within each patch.
Grid Registration: Register the sampling grid to the bounding triangle using thin-plate spline deformation.
Project to Surface:
- Smooth surface normal vectors using Laplacian smoothing
- Estimate patch orientation by averaging normal vectors at the three defining landmarks
- Cast rays in the projection vector direction to find intersections with the mesh surface
- If no intersection is found, reverse the ray direction or select the closest mesh point
Merge Patches: Combine all projected points, removing overlaps and adding manual landmarks to the final set.
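
The Create Sampling Grid step can be illustrated with a small Python sketch that places a regular grid of points inside the triangle spanned by three landmarks using barycentric coordinates. This is a simplification under stated assumptions: the function name triangle_grid and the density parameter n are hypothetical, and the points lie in the plane of the triangle, so the projection onto the mesh surface described in the Project to Surface step is not included.

```python
def triangle_grid(a, b, c, n):
    """Regular grid of (n + 1)(n + 2) / 2 points on the triangle a, b, c.

    a, b, c are landmark coordinates (2D or 3D tuples); n is the number of
    subdivisions per edge and controls the sampling density.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    points = []
    for i in range(n + 1):
        for j in range(n + 1 - i):
            # Barycentric weights; they are non-negative and sum to 1.
            u, v, w = i / n, j / n, (n - i - j) / n
            points.append(tuple(u * pa + v * pb + w * pc
                                for pa, pb, pc in zip(a, b, c)))
    return points

# Hypothetical patch defined by three landmarks in 3D.
grid = triangle_grid((0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 1.0), 4)
```

With n = 4 the grid contains 15 points, including the three bounding landmarks themselves, which would be treated as duplicates when patches are merged.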

Applications: Suitable for analyses where each specimen must be processed independently without a population template [1].

Protocol 2: Template-Based Semilandmark Transfer (Patch-TPS)

This method applies semilandmarks from a single template to all specimens in a dataset [1].

Materials and Software:

Template specimen (representative of population)
3D Slicer with SlicerMorph or MorphoJ software [1] [17]
Complete set of manual landmarks on all specimens

Procedure:

Template Preparation: Place semilandmarks on the template mesh using the patch-based method or manual digitization.
Warp Specimens: For each specimen, compute a thin-plate spline transformation based on manual landmark correspondences between the template and specimen.
Transfer Semilandmarks:
- For each semilandmark point on the template, cast a ray in the direction of the normal vector
- Find the intersection with the warped subject mesh
- If no intersection, reverse the normal vector or select the closest mesh point
Sliding Optimization: Slide semilandmarks along the surface to minimize bending energy or Procrustes distance relative to the template.

Applications: Ideal for consistent sampling across multiple specimens and population-level analyses [1] [2].
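
The warp and transfer steps of this protocol can be sketched in Python for the 2D case. The code fits a thin-plate spline to the manual landmark correspondences, applies it to the template semilandmarks, and snaps each warped point to the nearest sampled point of the specimen outline. Several choices here are assumptions made for illustration: the 2D kernel U(r) = r^2 log(r^2), warping the template points into specimen space, and nearest-point projection in place of the ray casting along normals described above. All function names are hypothetical.

```python
import numpy as np

def tps_kernel(d2):
    """2D thin-plate spline kernel U(r) = r^2 log(r^2), with U(0) = 0."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(d2 > 0, d2 * np.log(d2), 0.0)

def fit_tps(src, dst):
    """Fit a TPS mapping the src landmarks exactly onto the dst landmarks."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    k = len(src)
    d2 = ((src[:, None] - src[None]) ** 2).sum(-1)
    P = np.hstack([np.ones((k, 1)), src])
    L = np.zeros((k + 3, k + 3))
    L[:k, :k], L[:k, k:], L[k:, :k] = tps_kernel(d2), P, P.T
    params = np.linalg.solve(L, np.vstack([dst, np.zeros((3, 2))]))
    return src, params[:k], params[k:]     # landmarks, warp weights, affine part

def apply_tps(model, pts):
    """Apply a fitted TPS to arbitrary points, such as template semilandmarks."""
    src, w, a = model
    pts = np.asarray(pts, float)
    d2 = ((pts[:, None] - src[None]) ** 2).sum(-1)
    return tps_kernel(d2) @ w + np.hstack([np.ones((len(pts), 1)), pts]) @ a

def project_to_nearest(pts, surface_pts):
    """Replace each point by the closest sampled point of the target outline."""
    pts, surface_pts = np.asarray(pts, float), np.asarray(surface_pts, float)
    d2 = ((pts[:, None] - surface_pts[None]) ** 2).sum(-1)
    return surface_pts[d2.argmin(axis=1)]

# Hypothetical template and specimen landmarks plus two template semilandmarks.
template_lm = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0.5, 0.5]], float)
specimen_lm = template_lm.copy()
specimen_lm[4] += [0.1, 0.2]
model = fit_tps(template_lm, specimen_lm)
warped = apply_tps(model, [[0.25, 0.25], [0.75, 0.5]])
```

The projected points would then serve as the starting positions for the sliding optimization in the final step of the protocol.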

Protocol 3: Landmark-Free Analysis Using DAA

This protocol uses Deterministic Atlas Analysis for completely automated shape comparison [11].

Materials and Software:

Deformetrica software [11]
Surface meshes of all specimens (recommended: Poisson surface reconstruction for watertight meshes)
High-performance computing resources for large datasets

Procedure:

Mesh Standardization: Apply Poisson surface reconstruction to create watertight, closed surfaces for all specimens, particularly important with mixed imaging modalities.
Atlas Generation:
- Select an initial template (specimen with median shape recommended)
- Iteratively estimate the optimal atlas shape by minimizing total deformation energy across the dataset
Control Point Placement: Generate evenly distributed control points around the atlas, with density controlled by kernel width parameter.
Momentum Calculation: For each specimen, compute momentum vectors representing deformation trajectories from atlas to specimen.
Shape Comparison: Apply kernel Principal Component Analysis (kPCA) to momentum vectors to visualize and analyze shape variation.
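
The shape comparison step can be illustrated with a minimal kernel PCA sketch. It assumes the momentum vectors of each specimen have already been flattened into one row of an array; the Gaussian kernel, the `sigma` value, and the synthetic two-group data are assumptions for illustration and do not reproduce Deformetrica's own output format or settings.

```python
import numpy as np

def kernel_pca(momenta, sigma, n_components=2):
    """Gaussian-kernel PCA on an (n_specimens, n_features) array."""
    sq_dist = np.sum((momenta[:, None, :] - momenta[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dist / (2.0 * sigma ** 2))
    n = K.shape[0]
    J = np.eye(n) - np.full((n, n), 1.0 / n)   # centering matrix
    Kc = J @ K @ J                             # kernel centered in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = np.clip(eigvals[order], 0.0, None)
    scores = eigvecs[:, order] * np.sqrt(eigvals)  # specimen scores per component
    return scores, eigvals

# Hypothetical data: two groups of five specimens, six momentum values each.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.1, size=(5, 6))
group_b = rng.normal(1.0, 0.1, size=(5, 6))
scores, eigvals = kernel_pca(np.vstack([group_a, group_b]), sigma=2.0)
```

In this toy case the first component separates the two groups; with real data the kernel width should be chosen with the same care as the deformation kernel width.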

Applications: Large-scale studies across disparate taxa where manual landmarking is impractical [11].

Comparative Analysis of Method Performance

Quantitative Comparison

Table 2: Performance Metrics of Different Dense Sampling Methods Based on Empirical Studies

Method	Sensitivity to Noise	Handling of Disparate Shapes	Computational Demand	Classification Accuracy
Patch-Based Semilandmarks	High sensitivity [1]	Moderate (requires manual patch definition)	Medium	Comparable to manual landmarks [1]
Template-Based Semilandmarks	Robust with proper template [1]	Good within defined regions	Medium to High	High for intraspecific variation [2]
Pseudolandmarks (auto3dgm)	Moderate [15]	Poor with large shape differences [15]	Low to Medium	Varies with disparity [15]
DAA (Landmark-Free)	Low with proper mesh processing [11]	Good across diverse taxa [11]	High	Comparable to landmarks with appropriate parameters [11]

Method Selection Guidelines

Choosing an appropriate method depends on research goals, dataset characteristics, and biological questions:

Region-Specific Developmental Questions: Template-based semilandmarks allow partitioning of biological structures into developmentally meaningful regions [2].
Overall Shape Analysis for Classification: Pseudolandmark or landmark-free methods provide efficient solutions, particularly for identification purposes [9] [18].
Macroevolutionary Studies Across Disparate Taxa: Landmark-free approaches like DAA overcome homology limitations when comparing phylogenetically distant forms [11].
Intraspecific Population Studies: Semilandmark methods capture subtle variations while maintaining biological interpretability [13].

The Scientist's Toolkit: Essential Research Reagents and Software

Table 3: Key Software Tools for Implementing Dense Sampling Methods

Software Tool	Primary Function	Compatible Methods	Access
3D Slicer with SlicerMorph [1]	Visualization and analysis	Semilandmarks (Patch and Patch-TPS)	Open source
Morpho [1]	Statistical shape analysis	Semilandmark sliding and analysis	R package
Geomorph [1]	GM analysis	Landmark and semilandmark analysis	R package
TPS Series [17]	Digitization and relative warps	Landmark and curve semilandmarks	Freeware
Deformetrica [11]	Diffeomorphic registration	Landmark-free (DAA)	Open source
auto3dgm [15]	Automated correspondence	Pseudolandmarks	Open source

Workflow Visualization

Method Selection Workflow for Dense Sampling Approaches

The choice between semilandmarks, pseudolandmarks, and landmark-free methods represents a trade-off between biological interpretability, efficiency, and applicability across morphologically disparate forms. Semilandmarks maintain a connection to biological homology through guidance by manual landmarks, making them suitable for studies of evolutionary development and modularity [2]. Pseudolandmarks offer increased automation but sacrifice anatomical correspondence [1] [2]. Landmark-free methods show promise for large-scale macroevolutionary studies but require careful validation against traditional approaches [11]. Future methodological development should focus on improving the biological relevance of automated methods while maintaining their efficiency advantages. As geometric morphometrics continues to evolve, researchers should select methods based on explicit consideration of their assumptions and limitations relative to specific biological questions.

A Practical Guide to Major Semilandmarking Algorithms and Their Implementation

In geometric morphometrics, the accurate quantification of biological shape is often limited when structures lack sufficient homologous anatomical landmarks. This is particularly relevant for outline-based identification research, where smooth curves and surfaces host the biologically significant shape variation. The sliding semilandmarks method was developed to address this challenge by allowing the quantification of homologous curves and surfaces between traditional landmarks [15] [19]. This protocol details the standard approach using Thin-Plate Spline (TPS) and Procrustes optimization, which has become foundational for analyzing complex biological shapes in evolutionary biology, comparative anatomy, and related fields [8] [20].

The core principle involves placing points along curves or surfaces to capture geometric form, then algorithmically "sliding" them to minimize either bending energy or Procrustes distance. This process establishes geometric correspondence across specimens, enabling statistically robust comparisons of shape variation [19]. This document provides detailed methodologies and practical implementations for researchers applying these techniques to outline-based identification studies.

Key Concepts and Definitions

Semilandmarks: Points placed on curves or surfaces to capture geometric form between traditional anatomical landmarks. They lack true biological homology but establish geometric correspondence across specimens [15].
Thin-Plate Spline (TPS): A mathematical function used for spatial interpolation that models the deformation of an infinitely thin metal plate. In morphometrics, it defines the warp between landmark configurations [19].
Procrustes Distance: The square root of the sum of squared differences between the positions of corresponding landmarks after Procrustes superimposition. Minimizing this distance optimizes landmark correspondence [15].
Bending Energy: A measure of the complexity of the deformation required to warp one landmark configuration to another. Lower bending energy indicates a smoother, more biologically plausible transformation [19].
Procrustes Superimposition: A procedure that removes differences in position, scale, and orientation from landmark configurations to isolate shape variation [20].
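
As a minimal sketch of the last two definitions, the following code superimposes one configuration on another and returns the resulting distance. It assumes two configurations with the same number of corresponding landmarks, computes the partial Procrustes distance (both configurations scaled to unit centroid size), and uses function names that are illustrative rather than taken from any package.

```python
import numpy as np

def center_and_scale(X):
    """Remove position and scale: centroid at origin, unit centroid size."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc)

def rotate_onto(X, ref):
    """Least-squares rotation of X onto ref (reflections excluded)."""
    U, _, Vt = np.linalg.svd(X.T @ ref)
    D = np.eye(X.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
    return X @ (U @ D @ Vt)

def procrustes_distance(X, Y):
    """Root of summed squared landmark differences after superimposition."""
    A, B = center_and_scale(X), center_and_scale(Y)
    return float(np.sqrt(np.sum((rotate_onto(A, B) - B) ** 2)))

square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
moved = 3.0 * square @ R + np.array([5.0, -2.0])   # same shape, new pose
rectangle = square * np.array([2.0, 1.0])          # different shape

d_same = procrustes_distance(square, moved)
d_diff = procrustes_distance(square, rectangle)
```

The moved square has essentially zero distance to the original because only position, scale, and orientation differ, whereas the rectangle retains a non-zero shape difference.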

Theoretical Foundation

The Mathematical Workflow of Sliding Semilandmarks

The standard sliding semilandmarks protocol involves a sequence of coordinated steps to establish geometric correspondence.

[Figure: workflow of the sliding semilandmarks protocol and the logical relationships between its steps]

Optimization Criteria in Semilandmark Sliding

The sliding process can be guided by two primary optimization criteria, each with distinct mathematical properties and biological interpretations:

Bending Energy Minimization: This approach slides semilandmarks to minimize the bending energy of the TPS required to deform the reference configuration to each specimen's configuration. It emphasizes local shape changes and is particularly effective for modeling smooth biological transformations [19]. The bending energy is calculated from the integral of the squared second derivatives of the TPS interpolation function.
Procrustes Distance Minimization: This method slides semilandmarks to minimize the Procrustes distance between specimens. It provides a global optimization of point correspondence and is computationally efficient for large datasets [15]. The Procrustes distance represents the square root of the sum of squared differences between corresponding landmark positions after superimposition.
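
The bending energy criterion can be made concrete with a short 2D sketch. It assumes the classical kernel U(r) = r² log r² and returns the energy only up to a constant factor; the function names and the 3 x 3 test grid are illustrative assumptions.

```python
import numpy as np

def bending_energy_matrix(ref):
    """Upper-left k x k block of the inverse TPS system matrix (2D)."""
    k = ref.shape[0]
    d2 = np.sum((ref[:, None, :] - ref[None, :, :]) ** 2, axis=-1)
    with np.errstate(divide="ignore", invalid="ignore"):
        K = np.where(d2 > 0, d2 * np.log(d2), 0.0)   # U(r) = r^2 log r^2
    P = np.hstack([np.ones((k, 1)), ref])            # affine part: 1, x, y
    L = np.zeros((k + 3, k + 3))
    L[:k, :k] = K
    L[:k, k:] = P
    L[k:, :k] = P.T
    return np.linalg.inv(L)[:k, :k]

def bending_energy(ref, target):
    """Bending energy (up to a constant) of the TPS mapping ref to target."""
    B = bending_energy_matrix(ref)
    return float(np.trace(target.T @ B @ target))

ref = np.array([[x, y] for x in range(3) for y in range(3)], dtype=float)
affine = ref @ np.array([[1.2, 0.3], [-0.1, 0.8]]) + np.array([2.0, 1.0])
bent = ref.copy()
bent[4] += np.array([0.3, 0.0])      # displace only the central landmark

e_affine = bending_energy(ref, affine)
e_bent = bending_energy(ref, bent)
```

A purely affine change costs no bending energy, whereas the local displacement of a single point does, which is why this criterion is described as emphasizing local shape change.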

Experimental Protocols

Comprehensive Protocol for Sliding Semilandmarks

Specimen Preparation and Initial Landmarking

Data Acquisition: Obtain high-resolution 3D surface scans or images of all specimens. Ensure consistent orientation and scale across the dataset [20].
Anatomical Landmark Placement: Identify and digitize traditional anatomical landmarks that represent biologically homologous points across all specimens. These serve as fixed reference points for the subsequent semilandmarking process [21].
Template Construction: Select a representative specimen as a template. On this template, define curves and surfaces between anatomical landmarks where additional shape information is needed [1].

Semilandmark Placement and Sliding

Initial Semilandmark Placement: Place semilandmarks equidistantly along curves or in a grid pattern on surfaces between the fixed anatomical landmarks. The density should capture the morphological detail required for the research question [20].
Semilandmark Transfer: Project the template's semilandmarks onto each target specimen using TPS transformation based on the fixed anatomical landmarks [1].
Iterative Sliding Procedure:
- Perform Generalized Procrustes Analysis (GPA) on the complete landmark set (fixed landmarks and semilandmarks) to compute a consensus configuration.
- Slide each semilandmark along tangents to the curve or surface to minimize either bending energy or Procrustes distance.
- Recompute the consensus and repeat the sliding process iteratively until convergence is achieved [19].
Final GPA: Conduct a final Procrustes superimposition on the optimized landmark configurations to obtain shape coordinates for statistical analysis.
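
A single sliding step under the Procrustes distance criterion can be sketched as follows. The sketch assumes the specimen is already superimposed on the consensus, approximates each tangent by the chord between the neighboring points, and uses a hypothetical (before, slider, after) index triplet to describe each semilandmark. Reprojection of slid points onto the actual curve, which a full implementation would perform, is omitted.

```python
import numpy as np

def slide_to_reference(spec, ref, sliders):
    """Move each semilandmark along its tangent toward the reference.

    sliders: iterable of (before, index, after) landmark indices.
    """
    slid = spec.copy()
    for before, idx, after in sliders:
        tangent = spec[after] - spec[before]
        tangent = tangent / np.linalg.norm(tangent)
        # Signed step that minimizes the distance to ref[idx] along the tangent.
        step = np.dot(ref[idx] - spec[idx], tangent)
        slid[idx] = spec[idx] + step * tangent
    return slid

# Quarter-circle outline: evenly spaced reference, unevenly spaced specimen.
ref_angles = np.linspace(0.0, np.pi / 2, 5)
ref = np.column_stack([np.cos(ref_angles), np.sin(ref_angles)])
spec_angles = np.array([0.0, 0.25, 0.9, 1.3, np.pi / 2])
spec = np.column_stack([np.cos(spec_angles), np.sin(spec_angles)])

sliders = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]   # endpoints are fixed landmarks
slid = slide_to_reference(spec, ref, sliders)

before_ss = float(np.sum((spec - ref) ** 2))
after_ss = float(np.sum((slid - ref) ** 2))
```

In the full procedure this step alternates with recomputing the consensus until the configurations stop changing.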

Critical Parameters and Optimization

Table 1: Key Parameters in Sliding Semilandmarks Protocol

Parameter	Considerations	Recommended Values
Semilandmark Density	Trade-off between shape capture and statistical power; too few points miss biological information, too many reduce statistical power and increase processing time [20]	Varies by structure complexity; 8-16 points per curve segment; surface grids spaced 1-5 mm apart
Number of Iterations	Higher iterations increase processing time but do not necessarily improve accuracy; optimal number exists where sliding becomes optimally relaxed [19]	12 iterations recommended for facial analysis; convergence should be monitored for different datasets
Optimization Criterion	Bending energy emphasizes local shape changes; Procrustes distance provides global optimization [15]	Choice depends on research question; bending energy preferred for modeling smooth biological transformations

Research on 3D human facial images demonstrated that classification accuracy is affected by the number of sliding iterations, but not in a progressive pattern. Accuracy stabilized at 12 relaxation states, where it reached its highest value of 96.43%, with no further gain after this point [19]. This indicates that there is a specific number of iterations at which the semilandmarks are adequately relaxed, beyond which no significant improvement occurs.

The Scientist's Toolkit

Essential Software and Research Reagents

Table 2: Essential Research Reagent Solutions for Sliding Semilandmarks

Tool Category	Specific Software/Package	Function	Application Context
Comprehensive Morphometrics Platforms	Viewbox [19]	Integrated environment for digitizing, sliding semilandmarks, and visualization	All-in-one solution for end-to-end geometric morphometric analysis
	EVAN Toolbox [19]	Open-source platform for semilandmark placement and sliding	Accessible option for academic research
R Packages	geomorph [1] [21]	Sliding semilandmarks, statistical shape analysis, visualization	Primary tool for statistical analysis of landmark data
	Morpho [19]	Sliding semilandmarks, Procrustes analysis, and mesh processing	Alternative R package with comprehensive functionality
Digitization Tools	StereoMorph [21]	Digitize landmarks and curves with Bezier curve fitting	Streamlined initial landmark placement, especially for curves
	3D Slicer / SlicerMorph [1]	Place patches of semilandmarks on 3D surfaces	Flexible semilandmarking for complex biological structures

Implementation Considerations for Outline-Based Identification

For outline-based identification research, specific considerations enhance the effectiveness of the sliding semilandmarks approach:

Curve Definition: Carefully define outline curves using sufficient fixed landmarks at biologically meaningful points to anchor the semilandmarks [21].
Template Selection: Choose a template specimen that represents the average morphology of the dataset or has particularly clear anatomical features [1].
Sliding Method Selection: For outline analysis, Procrustes distance minimization often provides more consistent results when comparing highly variable forms [15].
Validation: Implement cross-validation procedures to ensure that the sliding process does not introduce artifacts and accurately captures biologically relevant shape variation [19].
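
For the curve definition step, initial semilandmarks are usually spaced evenly along the digitized outline before sliding. The sketch below assumes the outline is available as an ordered, densely digitized polyline between two fixed landmarks and resamples it by linear interpolation over cumulative chord length; the function name is illustrative.

```python
import numpy as np

def resample_curve(points, n):
    """Return n points equally spaced by arc length along an open polyline."""
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])      # cumulative length
    targets = np.linspace(0.0, s[-1], n)
    cols = [np.interp(targets, s, points[:, d]) for d in range(points.shape[1])]
    return np.column_stack(cols)

# Unevenly digitized quarter circle (points crowd near the start).
t = np.linspace(0.0, 1.0, 200) ** 2 * (np.pi / 2)
outline = np.column_stack([np.cos(t), np.sin(t)])

semis = resample_curve(outline, 9)
spacing = np.linalg.norm(np.diff(semis, axis=0), axis=1)
```

The first and last resampled points coincide with the curve endpoints, so they can be replaced by, or treated as, the fixed landmarks that anchor the curve.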

The standard approach to sliding semilandmarks using TPS and Procrustes optimization provides a powerful method for quantifying shape variation in outline-based identification research. By implementing the detailed protocols outlined in this document and utilizing the appropriate software tools, researchers can consistently capture and analyze complex biological forms. The method's strength lies in its ability to establish geometric correspondence across specimens, enabling rigorous statistical analysis of shape variation. As with any methodological approach, careful consideration of parameters such as semilandmark density and iteration number is essential for generating biologically meaningful results. When properly implemented, this technique significantly enhances our ability to investigate subtle patterns of morphological variation in evolutionary and taxonomic studies.

In geometric morphometrics (GM), the analysis of biological form often relies on landmarks—discrete, homologous points that can be reliably identified across specimens. However, many biological structures are characterized by extensive smooth curves and surfaces lacking such discrete points [15]. Template-based workflows, utilizing patch sampling and Thin-Plate Spline (TPS) warping, address this limitation by generating dense correspondences of semi-landmarks across specimens. These methods are essential for increasing the density of shape information and enabling rigorous statistical analyses of outline-based morphological variation [1]. Within outline-based identification research, such as distinguishing between closely related species or classifying nutritional status from body shapes, these workflows provide a reproducible framework for capturing and comparing complex morphologies, thereby improving the accuracy and biological relevance of the findings [9] [18].

Theoretical Foundation

From Landmarks to Semi-Landmarks

The core challenge in geometric morphometrics is establishing point correspondences across different specimens. While anatomical landmarks represent biologically homologous points, their sparse distribution inadequately captures the shape of entire surfaces or outlines [1]. Semi-landmarks relax the strict requirement of homology; they are points that are matched algorithmically based on their relative position along a curve or on a surface between traditional landmarks [15]. Their placement is guided by the principle of sliding, which iteratively adjusts their positions to minimize a specific energy function (either bending energy or Procrustes distance) relative to a sample mean, thus reducing the artifactual variance introduced by the initial arbitrary placement [1] [15].

Thin-Plate Spline (TPS) Theory

The Thin-Plate Spline is a mathematical tool fundamental to many semi-landmarking workflows. It is a spline-based interpolation function that provides a seamless and smooth mapping from one set of landmark points to another [22]. Conceptually, it defines a transformation that minimizes the bending energy required to warp a template configuration into a target configuration. This property makes it ideal for biological shape modeling, as it mimics the smooth, continuous deformations observed in nature. In template-based workflows, the TPS transform derived from a few corresponding anatomical landmarks is used to transfer a dense cloud of semi-landmarks from a template specimen onto a target specimen, ensuring consistent and comparable sampling [1].
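
The transfer of template semi-landmarks through a TPS can be sketched in a few lines. This is a minimal 3D example under stated assumptions: the kernel is taken as U(r) = r, the landmark coordinates are invented, and the function names `fit_tps` and `apply_tps` are illustrative rather than the API of any package named in this article.

```python
import numpy as np

def _kernel(a, b):
    """3D thin-plate spline kernel U(r) = r between two point sets."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

def fit_tps(src, dst):
    """Solve for non-affine weights W and affine coefficients A."""
    k = src.shape[0]
    P = np.hstack([np.ones((k, 1)), src])
    L = np.zeros((k + 4, k + 4))
    L[:k, :k] = _kernel(src, src)
    L[:k, k:] = P
    L[k:, :k] = P.T
    rhs = np.vstack([dst, np.zeros((4, 3))])
    params = np.linalg.solve(L, rhs)
    return params[:k], params[k:]

def apply_tps(points, src, W, A):
    """Map arbitrary points (e.g. semi-landmarks) through the fitted spline."""
    P = np.hstack([np.ones((points.shape[0], 1)), points])
    return _kernel(points, src) @ W + P @ A

template_lm = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0],
                        [0, 0, 1], [1, 1, 0], [1, 0, 1]], dtype=float)
M = np.array([[1.1, 0.1, 0.0], [0.0, 0.9, 0.2], [0.1, 0.0, 1.0]])
shift = np.array([0.5, -0.3, 0.2])
target_lm = template_lm @ M + shift            # hypothetical target landmarks

W, A = fit_tps(template_lm, target_lm)
template_semis = np.array([[0.3, 0.4, 0.2], [0.6, 0.1, 0.5]])
warped_semis = apply_tps(template_semis, template_lm, W, A)
```

The spline reproduces the landmark correspondences exactly and carries the semi-landmarks along with them; in practice the warped points still need to be projected onto the target surface.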

Implemented Methodologies

This section details the core protocols for implementing patch sampling and TPS warping, outlining two distinct strategies for semi-landmark generation.

Patch-Based Semi-Landmarking

The patch-based method generates semi-landmarks directly on each specimen without requiring a prior template, preserving a direct geometric relationship with the manually placed landmarks [1].

Experimental Protocol: Direct Patch Sampling

Landmark Definition: Manually place anatomical landmarks on the specimen's 3D surface mesh using software such as 3D Slicer with the SlicerMorph extension [1].
Patch Delineation: Define triangular regions of interest on the surface by selecting sets of three manual landmarks that bound the area to be sampled.
Grid Generation: For each triangular patch, create a template grid with a user-specified density of semi-landmark points.
TPS Registration & Projection: Register the 2D template grid to the 3D vertices of the bounding triangle using a TPS deformation. Project the grid points onto the specimen's surface mesh:
- Smooth the surface mesh using Laplacian smoothing to mitigate noise.
- Calculate a projection vector as the average of the surface normal vectors at the three bounding landmarks.
- Cast a ray from each grid point in the direction of the projection vector to find its intersection with the surface mesh.
Merging and Completion: Merge the semi-landmarks from all individual patches, ensuring no overlap at the boundaries. Add the original manual landmarks to the final set.
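
The projection step can be sketched with a plain ray-triangle test. The code below uses the Möller-Trumbore intersection test, tries the reversed direction when the forward ray misses, and finally falls back to the closest mesh vertex; the brute-force loop over faces and the tiny two-triangle mesh are simplifications assumed for illustration, not the SlicerMorph implementation.

```python
import numpy as np

def projection_vector(normals):
    """Average of the surface normals at the bounding landmarks."""
    v = np.mean(normals, axis=0)
    return v / np.linalg.norm(v)

def ray_triangle(origin, direction, tri, eps=1e-12):
    """Return the ray parameter t of the hit, or None if there is no hit."""
    v0, v1, v2 = tri
    e1, e2 = v1 - v0, v2 - v0
    h = np.cross(direction, e2)
    a = np.dot(e1, h)
    if abs(a) < eps:                 # ray parallel to the triangle plane
        return None
    f = 1.0 / a
    s = origin - v0
    u = f * np.dot(s, h)
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = f * np.dot(direction, q)
    if v < 0.0 or u + v > 1.0:
        return None
    t = f * np.dot(e2, q)
    return t if t > eps else None

def project_point(origin, direction, vertices, faces):
    """Forward ray, then reversed ray, then closest vertex as a last resort."""
    for d in (direction, -direction):
        hits = []
        for face in faces:
            t = ray_triangle(origin, d, vertices[face])
            if t is not None:
                hits.append(t)
        if hits:
            return origin + min(hits) * d
    return vertices[np.argmin(np.linalg.norm(vertices - origin, axis=1))]

# Toy mesh: a unit square at height z = 1 made of two triangles.
vertices = np.array([[0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1]], dtype=float)
faces = np.array([[0, 1, 2], [0, 2, 3]])
direction = projection_vector(np.array([[0, 0, 1.0], [0, 0, 1.0], [0, 0, 1.0]]))

hit_forward = project_point(np.array([0.5, 0.25, 0.0]), direction, vertices, faces)
hit_reversed = project_point(np.array([0.5, 0.25, 2.0]), direction, vertices, faces)
hit_fallback = project_point(np.array([5.0, 5.0, 0.0]), direction, vertices, faces)
```

Taking the nearest hit along the ray is what makes this approach vulnerable to noisy or folded surfaces: the first intersection is not always the anatomically intended one.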

Table 1: Strengths and Limitations of Patch-Based Sampling

Aspect	Description
Strengths	Does not require a pre-defined template; each specimen is processed independently. The geometric relationship of each semi-landmark to its bounding landmarks is known.
Limitations	Coverage is dependent on the placement of manual landmarks. Sensitive to surface noise and sharp curvatures, which can lead to projection errors (e.g., sampling an interior surface). The process can be computationally expensive for large datasets.

Template-Based Workflow (Patch-TPS)

The Patch-TPS method leverages a single template specimen to generate semi-landmarks, which are then propagated to all other specimens in a dataset. This approach enhances robustness and consistency [1].

Experimental Protocol: Template-Based Semi-Landmarking

Template Selection: Choose a representative specimen (or a synthetic average template) from the dataset to serve as the reference.
Template Landmarking: Apply the direct patch-based semi-landmarking method (the Direct Patch Sampling protocol above) exclusively to the template to generate a comprehensive master set of semi-landmarks.
Target-to-Template Warping: For each target specimen, calculate a TPS transformation based on the correspondence between its manual landmarks and those of the template.
Semi-Landmark Transfer: Apply the TPS transform to warp the entire set of template semi-landmarks onto the target specimen's space.
Surface Projection: Refine the position of the warped semi-landmarks by projecting them onto the target specimen's actual surface along the template's surface normal vectors.

Table 2: Comparison of Semi-Landmarking Strategies

Method	Correspondence	Robustness to Noise	Required Input
Patch-Based	Geometric relationship to manual landmarks on each specimen	Lower	Manual landmarks on every specimen
Patch-TPS	Defined by the template and TPS transform	Higher	Manual landmarks on every specimen + a pre-marked template
Pseudo-Landmark Sampling	Arbitrary, no biological relationship	High in tested scenarios [1]	Manual landmarks for TPS + a template mesh

The following workflow diagram illustrates the key steps and decision points in the Patch-TPS protocol:

Diagram 1: Template-Based Semi-Landmark Workflow. This diagram outlines the process of using a single, well-landmarked template to generate consistent semi-landmarks across multiple target specimens via TPS warping and surface projection.

Application Notes for Outline-Based Identification

The consistent application of template-based workflows is critical for the reliability of outline-based identification research, such as distinguishing between morphologically similar species or classifying patient nutritional status [9] [18].

Template Choice is Critical: The selection of the template specimen profoundly influences the final results. The template should have the greatest overall geometric similarity to the members of the study sample and must be free of deformations or artifacts. An inappropriate template can lead to large mapping errors, where semi-landmarks are projected onto incorrect anatomical features in target specimens, especially when shape differences are substantial [15].
Managing Sliding and Alignment: After the initial placement via TPS warping, semi-landmarks are typically slid to minimize either bending energy or Procrustes distance to the sample mean. This step reduces the artifactual variance introduced by the initial placement. For classification tasks with out-of-sample data, the Procrustes alignment of a new specimen must be performed relative to the original training sample's mean shape to ensure the new shape coordinates are in the correct space for classification [18].
Handling Out-of-Sample Data: A common challenge in applied morphometrics is classifying new specimens that were not part of the original study sample. The classification rule is built in the shape space of the training sample. Therefore, the raw landmark (and semi-landmark) coordinates of a new specimen must be registered to that specific shape space before classification. This is achieved by using the study sample's mean shape or a representative template as the target for TPS-based registration of the new individual's data [18].
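
The out-of-sample step can be sketched as follows: a training sample is superimposed by Generalized Procrustes Analysis, and a new specimen is then aligned to the training mean rather than being included in a new superimposition. The shapes, the random poses, and the nearest-group-mean rule are assumptions chosen to keep the example small; any classifier built in the training shape space could take its place.

```python
import numpy as np

def preshape(X):
    """Center a configuration and scale it to unit centroid size."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc)

def rotate_onto(X, ref):
    """Least-squares rotation of X onto ref (reflections excluded)."""
    U, _, Vt = np.linalg.svd(X.T @ ref)
    D = np.eye(X.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
    return X @ (U @ D @ Vt)

def gpa(specimens, n_iter=10):
    """Iteratively align all specimens to their evolving mean shape."""
    shapes = [preshape(X) for X in specimens]
    mean = shapes[0]
    for _ in range(n_iter):
        shapes = [rotate_onto(X, mean) for X in shapes]
        mean = preshape(np.mean(shapes, axis=0))
    shapes = [rotate_onto(X, mean) for X in shapes]   # final pass to the mean
    return np.array(shapes), mean

def register_new(X_new, train_mean):
    """Place a new specimen in the training shape space."""
    return rotate_onto(preshape(X_new), train_mean)

def random_pose(X, rng):
    """Apply an arbitrary rotation, scale, and translation (2D)."""
    a = rng.uniform(0.0, 2.0 * np.pi)
    R = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
    return rng.uniform(0.5, 2.0) * X @ R + rng.normal(size=2)

rng = np.random.default_rng(1)
shape_a = np.array([[0, 0], [2, 0], [2, 1], [0, 1]], dtype=float)
shape_b = np.array([[0, 0], [2, 0], [1.5, 1], [0.5, 1]], dtype=float)
train = ([random_pose(shape_a, rng) for _ in range(4)]
         + [random_pose(shape_b, rng) for _ in range(4)])
labels = np.array([0] * 4 + [1] * 4)

aligned, mean_shape = gpa(train)
group_means = np.array([aligned[labels == g].mean(axis=0) for g in (0, 1)])

new_aligned = register_new(random_pose(shape_a, rng), mean_shape)
dists = np.linalg.norm((group_means - new_aligned).reshape(2, -1), axis=1)
predicted = int(np.argmin(dists))
```

The training mean is left untouched by the new specimen, so the classification rule and the coordinates it is applied to share the same space.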

The Scientist's Toolkit

Table 3: Essential Research Reagents and Software Solutions

Item Name	Type	Function in Workflow
3D Slicer with SlicerMorph	Software Extension	An open-source platform for visualization and analysis of 3D medical images; the SlicerMorph extension provides specific tools for GM, including the patch and TPS semi-landmarking methods described here [1].
R with Geomorph/Morpho	Software Package / Statistical Environment	The R programming language, with packages `geomorph` and `Morpho`, is the standard for performing statistical shape analysis, including Generalized Procrustes Analysis (GPA), sliding semi-landmarks, and multivariate statistics [1].
Thin-Plate Spline (TPS)	Algorithm	The core mathematical algorithm for non-rigidly warping a template configuration of landmarks to fit a target configuration, minimizing bending energy [22] [1].
Manual Landmark Set	Data	A set of biologically homologous, manually identified points that serve as the fixed foundational correspondence between all specimens and guide the placement of semi-landmarks [1] [15].
Template Specimen	Data / Reference Model	A representative 3D model (surface mesh) of the structure under study, which is densely sampled with semi-landmarks to create a master set for propagation to all other target specimens [1].

Performance and Validation

The success of a semi-landmarking method can be quantified by evaluating how well the transformed mesh of a specimen estimates a known target, such as the population average template. The average root mean square error between the transformed mesh and the template is a key metric for this purpose [1].
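
One way to compute such an error is sketched below. It assumes the transformed mesh and the template have vertices in one-to-one correspondence, which may differ from how the cited study defined its metric, and the coordinates are invented for the example.

```python
import numpy as np

def rmse(vertices_a, vertices_b):
    """Root mean square of per-vertex Euclidean distances."""
    sq = np.sum((vertices_a - vertices_b) ** 2, axis=1)
    return float(np.sqrt(np.mean(sq)))

template = np.zeros((4, 3))
transformed = template + np.array([3.0, 4.0, 0.0])   # every vertex off by 5 units
error = rmse(transformed, template)
```

Averaging this value over all specimens in a dataset gives a single summary figure per method.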

Table 4: Quantitative Performance Comparison

Method	Performance Characteristics (vs. Manual Landmarks)
Manual Landmarks Alone	Baseline. Provides limited shape information from sparse points [1].
Patch-Based Semi-Landmarking	Can produce shape estimates comparable to manual landmarks but demonstrates high sensitivity to noise and missing data, potentially leading to outliers with large deviations [1].
Patch-TPS Semi-Landmarking	Provides robust performance in the presence of noise and dataset variability. Generally offers a favorable balance between accuracy and robustness [1].
Pseudo-Landmark Sampling	Shows high robustness to noise. Offers an alternative when dense, regular sampling is prioritized over a direct relationship to manual landmarks [1].

Validation studies have confirmed the practical utility of these methods. For instance, warping a template MRI to an individual's head shape using TPS has been shown to enable MEG source localization accuracy comparable to that obtained with the subject's real MRI, demonstrating its viability for applications where sub-centimeter spatial accuracy is sufficient [22]. Furthermore, in outline-based species identification, the shape of wing cell contours analyzed through geometric morphometrics achieved a classification accuracy of 86.67%, highlighting the power of these methods for discrimination tasks [9].

In geometric morphometrics and outline-based identification research, the analysis of biological forms often involves structures that lack sufficient traditional landmarks for comprehensive comparison. Landmark-free methods have emerged as essential tools for establishing dense point correspondences across specimens, enabling the quantitative study of shape variation in otherwise challenging anatomical structures. These methods are particularly valuable for analyzing outlines of arthropods, certain bone structures, and other biological forms where defined homologous points are scarce or absent [7] [23]. The shift toward automated landmark-free approaches represents a significant methodological advancement, allowing researchers to study a wider range of organisms and anatomical features that were previously inaccessible to traditional landmark-based geometric morphometrics.

This document provides detailed application notes and experimental protocols for three prominent automated landmark-free methods: Iterative Closest Point (ICP), Non-rigid Iterative Closest Point (NICP), and Deterministic Atlas Analysis (DAA). These methods enable the quantification of shape variation through different computational strategies for establishing point correspondences without relying on pre-defined anatomical landmarks. ICP performs rigid alignment to minimize distances between point clouds, NICP extends this approach to accommodate non-rigid deformations, and DAA utilizes a standardized reference framework for spatial normalization and comparison. Together, these methods form a critical toolkit for researchers investigating outline-based identification in evolutionary biology, taxonomy, medical entomology, and related fields where traditional landmarks are insufficient for comprehensive shape analysis.

Comparative Analysis of Methods

The following table provides a systematic comparison of the three automated landmark-free methods, highlighting their core principles, advantages, limitations, and primary applications in biological research.

Table 1: Comparative characteristics of ICP, NICP, and DAA methods

Characteristic	ICP (Iterative Closest Point)	NICP (Non-rigid ICP)	DAA (Deterministic Atlas Analysis)
Core Principle	Rigid alignment via iterative minimization of point-to-point distances	Non-rigid deformation using regularization to preserve mesh properties	Registration to standardized atlas framework with predefined coordinates
Mathematical Foundation	Least-squares optimization of rotation/translation matrices	Regularized optimization with stiffness constraints	Linear and nonlinear spatial normalization algorithms
Primary Advantages	Computationally efficient; simple implementation; guaranteed convergence to a local minimum	Handles elastic deformations; better for biological shape variation	Standardized comparison across populations; intuitive anatomical interpretation
Key Limitations	Only rigid transformations; sensitive to initial alignment; poor for flexible structures	Computationally intensive; parameter sensitivity; potential over-deformation	Atlas selection bias; limited individual variation capture; template dependency
Optimal Use Cases	Alignment of rigid structures (e.g., bones, teeth); preliminary registration	Comparing structures with elastic deformation (e.g., soft tissues, growth series)	Population-level studies; clinical applications; multi-site data integration
Computational Complexity	Low to moderate (O(n log n) with k-d trees)	High (O(kn log n) with multiple iterations)	Moderate (depends on registration algorithm complexity)
Landmark Requirements	None required; can incorporate if available	None required; can incorporate if available	Dependent on atlas landmark schema
Output	Rigid transformation matrix; registered point cloud	Deformed point cloud; correspondence mapping	Normalized coordinates; quantitative atlas measurements
Error Metrics	Mean squared error; Hausdorff distance	Distance after deformation; bending energy	Distance to atlas norm; z-scores for populations

Each method offers distinct advantages for particular research scenarios. ICP provides a computationally efficient approach for rigid alignment but fails to capture the non-rigid deformations common in biological specimens [24]. NICP addresses this limitation by allowing elastic transformations, making it suitable for comparing structures with natural shape variations, though at higher computational cost [7]. DAA facilitates standardized comparisons across populations and studies but introduces template dependency that may constrain the capture of individual variation [25]. The choice among these methods should be guided by research objectives, specimen characteristics, and computational resources.

Experimental Protocols

Protocol 1: Iterative Closest Point (ICP) Alignment

Application Context: Rigid registration of 3D point clouds from biological specimens with minimal elastic deformation (e.g., insect wings, bone fragments, arthropod exoskeletons).

Materials and Reagents:

Specimens: 3D surface scans or point clouds of biological structures
Software: MeshLab, CloudCompare, or custom Python scripts with Open3D library
Computational Resources: Standard workstation with adequate RAM for point cloud processing

Experimental Procedure:

Data Preprocessing:
- Load source point cloud (specimen to align) and target point cloud (reference specimen)
- Apply initial coarse alignment using principal component analysis (PCA) to align major axes
- Optionally apply uniform sampling to reduce point density while preserving shape characteristics

ICP Configuration:
- Set distance threshold to reject point pairs exceeding biological plausibility
- Configure convergence criteria (typical values: iteration limit=50, relative change threshold=1e-6)
- Select point-to-point or point-to-plane distance metric based on surface characteristics
Iterative Alignment:
- For each iteration until convergence:
  - Establish point correspondences using nearest neighbor search (k-d trees)
  - Estimate optimal rigid transformation (rotation and translation) using singular value decomposition (SVD)
  - Apply transformation to source point cloud
  - Calculate mean squared error between corresponding points
  - Check convergence criteria
Validation:
- Visualize alignment quality using color-coded distance maps
- Calculate Hausdorff distance between registered point clouds
- Compare with ground truth alignment if available
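
The iterative alignment and validation steps can be sketched with NumPy alone. This is a point-to-point version that assumes a reasonable initial alignment and uses a brute-force nearest-neighbor search in place of a k-d tree, so it only suits small point clouds; the function names and the synthetic grid are illustrative.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src_i + t ~ dst_i."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    D = np.eye(3)
    D[-1, -1] = np.sign(np.linalg.det(Vt.T @ U.T))   # exclude reflections
    R = Vt.T @ D @ U.T
    return R, c_dst - R @ c_src

def icp(source, target, max_iter=50, tol=1e-6, max_dist=np.inf):
    """Point-to-point ICP; returns the aligned source and the final MSE."""
    src = source.copy()
    prev_mse = np.inf
    for _ in range(max_iter):
        d = np.linalg.norm(src[:, None, :] - target[None, :, :], axis=-1)
        nearest = d.argmin(axis=1)
        keep = d[np.arange(len(src)), nearest] <= max_dist  # reject far pairs
        R, t = best_rigid_transform(src[keep], target[nearest[keep]])
        src = src @ R.T + t
        mse = np.mean(np.sum((src[keep] - target[nearest[keep]]) ** 2, axis=1))
        if np.isfinite(prev_mse) and abs(prev_mse - mse) <= tol * max(prev_mse, 1e-12):
            break
        prev_mse = mse
    return src, float(mse)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))

# Synthetic check: a 4 x 4 x 4 grid, slightly rotated and shifted.
grid = np.arange(4.0)
target = np.array([[x, y, z] for x in grid for y in grid for z in grid])
angle = np.deg2rad(3.0)
Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle),  np.cos(angle), 0.0],
               [0.0, 0.0, 1.0]])
source = target @ Rz.T + np.array([0.1, -0.05, 0.08])

aligned, mse = icp(source, target)
h_dist = hausdorff(aligned, target)
```

The example recovers the pose because the initial misalignment is small relative to the point spacing; with a poor starting pose the same code can settle in a wrong local minimum, which is the sensitivity described in the technical notes.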

Technical Notes: ICP performance is highly sensitive to initial alignment. For biological specimens with pronounced shape variation, consider landmark-guided coarse registration prior to ICP. The method assumes largely congruent shapes with minimal elastic deformation, making it suitable for rigid structures but limited for soft tissues or structures with significant individual variation [7].

Protocol 2: Non-rigid Iterative Closest Point (NICP) Registration

Application Context: Establishing dense correspondences between biological specimens with elastic deformations (e.g., comparing human faces, soft tissue structures, or specimens with growth-based shape changes).

Materials and Reagents:

Specimens: 3D surface meshes with consistent topology
Software: Custom implementation using C++ or Python with Trimesh/NumPy/SciPy libraries
Computational Resources: High-performance workstation with significant RAM (16GB+) for large meshes

Experimental Procedure:

Mesh Preparation:
- Ensure watertight meshes with consistent vertex ordering
- Apply Laplacian smoothing to reduce noise while preserving features
- Calculate vertex normals for point-to-plane distance computation

NICP Parameters:
- Set stiffness weight (typical range: 1e-3 to 1e-1) to control deformation flexibility
- Configure correspondence threshold (2-3 times average edge length)
- Select appropriate number of iterations (typically 20-100 based on complexity)
Non-rigid Registration:
- For each iteration until convergence:
  - Find closest points between source and target meshes
  - Build linear system combining data term (distance minimization) and regularization term (stiffness constraint)
  - Solve for optimal vertex displacements using sparse linear solver (e.g., LU decomposition)
  - Apply displacements to source mesh vertices
  - Update stiffness weight (optionally decrease for finer adjustments in later iterations)
Correspondence Transfer:
- Transfer semantic labels from template to target using established correspondences
- Extract displacement fields for shape analysis
- Calculate strain tensors for local deformation analysis

Technical Notes: NICP is computationally intensive, with processing times ranging from minutes to hours per specimen pair depending on mesh complexity [24]. The stiffness weight strongly affects results: higher values preserve global shape but limit how much local deformation can be captured. For biological applications, choose a stiffness that balances precise fitting against keeping deformations biologically plausible. Recent benchmarks show that NICP achieves high correlation (>0.90) with true error when annotated landmarks are available [24].
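
To make the interplay between the data term and the stiffness term concrete, the following sketch performs one simplified non-rigid update. It is deliberately reduced: each vertex receives a free displacement that is regularized by a graph Laplacian over the mesh edges, whereas full NICP formulations typically estimate a local affine transform per vertex. The function names `edge_laplacian` and `nonrigid_step` and the argument names are assumptions made for this illustration.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import splu
from scipy.spatial import cKDTree


def edge_laplacian(n_vertices, edges):
    """Graph Laplacian L = M^T M built from an (m, 2) array of vertex-index pairs."""
    edges = np.asarray(edges)
    m = len(edges)
    rows = np.repeat(np.arange(m), 2)
    vals = np.tile([1.0, -1.0], m)              # +1 / -1 per edge endpoint
    M = sparse.csr_matrix((vals, (rows, edges.ravel())), shape=(m, n_vertices))
    return (M.T @ M).tocsc()


def nonrigid_step(vertices, edges, target_points, stiffness, max_dist=np.inf):
    """One update: closest points (data term) plus smooth displacements (stiffness)."""
    dist, idx = cKDTree(target_points).query(vertices)
    w = (dist <= max_dist).astype(float)        # drop correspondences that are too far
    W = sparse.diags(w)
    # Normal equations of  sum_i w_i |v_i + d_i - c_i|^2 + stiffness * sum_edges |d_i - d_j|^2
    A = (W + stiffness * edge_laplacian(len(vertices), edges)).tocsc()
    b = W @ (target_points[idx] - vertices)
    displacement = splu(A).solve(b)             # sparse LU solve, one column per axis
    return vertices + displacement
```

In use, the step would be repeated with a decreasing stiffness schedule, for example `for s in (1e-1, 1e-2, 1e-3): verts = nonrigid_step(verts, edges, target, s)`, so that early iterations move the mesh almost rigidly and later ones allow finer local adjustment.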

Protocol 3: Deterministic Atlas Analysis (DAA)

Application Context: Population-level shape analysis using a standardized coordinate system, particularly valuable for white matter pathway segmentation in neuroimaging and cross-species morphological comparisons.

Materials and Reagents:

Reference Atlas: Population-based template with standardized coordinate system
Specimens: 3D image data or surface meshes for registration
Software: FSL, ANTs, SPM, or specialized tractography tools for neuroimaging applications
Computational Resources: Workstation with adequate GPU acceleration for image registration

Experimental Procedure:

Atlas Selection and Customization:
- Select appropriate atlas template matching specimen characteristics (species, age, sex)
- Customize atlas coordinates based on study population if necessary
- Define region of interest (ROI) masks for focused analysis

Spatial Normalization:
- Perform affine registration to account for global size and position differences
- Apply nonlinear deformation using diffeomorphic registration (e.g., SyN algorithm)
- Validate registration quality using overlay visualization and similarity metrics
Coordinate Transformation:
- Apply computed transformation to specimen data
- Resample specimen into atlas space using appropriate interpolation
- Extract values from atlas-defined regions for quantitative analysis
Statistical Analysis:
- Calculate z-scores relative to atlas population norms
- Perform voxel-based or surface-based morphometry as needed
- Conduct cross-group comparisons using atlas coordinates

Technical Notes: DAA effectiveness depends heavily on atlas appropriateness for the target population. Methods for white matter segmentation demonstrate that clear anatomical definitions in protocols are essential for reproducible results [25]. Template selection should prioritize representative population characteristics rather than single exemplars. For emerging research areas without established atlases, consider creating study-specific templates from representative specimens.
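
The z-score step of the statistical analysis reduces to simple arithmetic once regional values have been extracted in atlas space. The sketch below assumes one value per atlas region for the subject, together with the atlas population mean and standard deviation for the same regions; the function name `regional_z_scores` is hypothetical.

```python
import numpy as np


def regional_z_scores(subject_values, atlas_mean, atlas_sd):
    """z = (subject - atlas mean) / atlas SD per region; NaN where SD is zero."""
    subject_values = np.asarray(subject_values, dtype=float)
    atlas_mean = np.asarray(atlas_mean, dtype=float)
    atlas_sd = np.asarray(atlas_sd, dtype=float)
    z = np.full(subject_values.shape, np.nan)
    valid = atlas_sd > 0                        # avoid division by zero
    z[valid] = (subject_values[valid] - atlas_mean[valid]) / atlas_sd[valid]
    return z
```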

Methodological Visualizations

ICP Algorithm Workflow

Diagram 1: ICP algorithm iterative workflow

NICP Deformation Process

Diagram 2: NICP deformation and correspondence establishment

DAA Registration Pipeline

Diagram 3: DAA spatial normalization pipeline

Research Reagent Solutions

Table 2: Essential computational tools and resources for landmark-free analysis

Tool/Resource	Type	Primary Function	Application Context
CloudCompare	Open-source Software	3D Point Cloud Processing	ICP alignment and comparison of surface scans
MeshLab	Open-source Software	Mesh Processing and Editing	Mesh preprocessing and visualization for NICP
Open3D	Python Library	3D Data Processing	Implementation of ICP and basic NICP variants
ANTs (Advanced Normalization Tools)	Software Library	Image Registration	DAA spatial normalization and atlas construction
FSL (FMRIB Software Library)	Software Suite	Brain Image Analysis	White matter DAA and tractography
auto3dgm	Algorithm Package	Landmark-free Geometric Morphometrics	ICP-based semilandmark placement on biological forms
Trimesh	Python Library	Mesh Operations	Mesh manipulation and basic processing for NICP
VTK (Visualization Toolkit)	Software Library	3D Visualization	Visualization of registration results and deformations

Critical Methodological Considerations

Performance and Error Assessment

Recent benchmarking studies provide critical insights into the performance characteristics of these automated methods. A modular evaluation of 3D face reconstruction methods revealed that ICP-based estimators can significantly alter the true ranking of top-performing reconstruction algorithms, with correlation to true geometric error as low as 0.41 in some configurations [24]. This highlights the importance of validation against ground truth where possible. In contrast, NICP approaches demonstrated substantially improved performance, achieving correlations greater than 0.90 with true error, particularly when guided by annotated landmarks [24].

Computational efficiency varies substantially between methods. ICP implementations typically process specimens in seconds to minutes, while NICP requires minutes to hours per specimen pair depending on mesh complexity [24]. This computational overhead must be considered when designing large-scale studies. For DAA, the initial atlas construction requires significant investment, but subsequent analyses benefit from standardized processing pipelines.

Biological Interpretation and Homology

A fundamental consideration in landmark-free methods is the relationship between algorithmic point correspondences and biological homology. Landmarks in traditional morphometrics represent points considered equivalent based on developmental or evolutionary criteria [7]. In contrast, semilandmarks generated by ICP, NICP, and DAA are defined by algorithmic optimization rather than biological criteria [7]. This distinction has important implications for biological interpretation.

While these methods efficiently capture overall shape variation, the resulting correspondences may not reflect true biological homology. As noted in comparative studies, "point correspondences identified without paying attention to homology have an uncertain relationship with the underlying processes responsible for differences in form" [7]. Researchers should therefore exercise caution when interpreting results in evolutionary or developmental contexts, particularly with landmark-free methods that prioritize spatial registration over biological correspondence.

Reproducibility and Protocol Standardization

Reproducibility remains a significant challenge in landmark-free analyses. Studies of white matter protocol reproducibility highlight that even detailed protocols can produce varying levels of intra-rater and inter-rater reproducibility [25]. Similar issues affect ICP, NICP, and DAA applications, where parameter selection, template choice, and implementation details can significantly impact results.

To enhance reproducibility, researchers should:

Document all parameters and preprocessing steps in detail
Provide access to custom code and implementation details
Report validation metrics against ground truth where available
Use established reference datasets for method comparison
Conduct sensitivity analyses for critical parameters

These practices are particularly important in outline-based identification research, where methodological differences can complicate cross-study comparisons and meta-analyses.

In geometric morphometrics, the analysis of biological shape has evolved from relying solely on manual anatomical landmarks to incorporating semi-landmarks: points that capture the geometry of curves and surfaces between traditional landmarks [1]. This shift is crucial for outline-based identification research, as it enables the quantification of subtle, yet biologically significant, shape variations that sparse manual landmarks cannot capture. The transition from manual digitization in established tools like TpsDig to automated pipelines in R and 3D Slicer represents a paradigm shift, enhancing reproducibility, scale, and statistical power. This article details practical protocols and application notes for implementing these modern, semi-automated semi-landmark alignment workflows within the 3D Slicer environment, supported by the SlicerMorph extension [1].

Key Research Reagent Solutions

The following table details the essential software tools and modules required to implement the semi-landmarking workflows described in this application note.

Table 1: Essential Software Tools for Geometric Morphometrics Pipelines

Item Name	Function/Application
3D Slicer	A free, open-source software platform for visualization, processing, segmentation, registration, and analysis of medical and biomedical 3D images and meshes [26] [27].
SlicerMorph Extension	An extension for 3D Slicer specifically designed for 3D geometric morphometrics, providing the modules necessary for landmarking and shape analysis [1].
R Statistical Environment	An open-source programming language and environment for statistical computing and graphics, essential for downstream statistical shape analysis [1].
Morpho & geomorph R Packages	R packages (e.g., `Morpho`, `geomorph`) used for statistical analysis of landmark data, including Generalized Procrustes Analysis (GPA) and other geometric morphometric operations [1].

Comparative Analysis of Semi-Landmarking Strategies

We implemented and evaluated three distinct dense sampling strategies for semi-landmark placement on 3D surface data of great ape crania, using the open-source platform 3D Slicer [1]. The goal was to quantify the trade-offs between different methods for capturing rich shape information. The performance of each method was evaluated by its ability to estimate a transform between an individual specimen and the population average template, with the average mean root squared error (MRSE) between the transformed mesh and the template serving as the performance metric.

Table 2: Performance Comparison of Semi-Landmarking Methods

Method	Key Principle	Advantages	Disadvantages/Limitations
Patch-based	Projects semi-landmarks from triangular patches constructed from manual landmarks onto the specimen's mesh surface [1].	Does not require a prior template; each specimen is processed independently with a known geometric relationship to manual landmarks [1].	Sensitive to noise and missing data; can result in outliers and potential misplacement on sharp edges or complex curvatures [1].
Patch-TPS	Applies semi-landmarks from a single template mesh to all specimens using a Thin-Plate Spline (TPS) transform and projection along template normals [1].	More robust to noise and dataset variability than the basic patch method; provides consistent landmark correspondence across specimens [1].	Dependent on the quality and representativeness of the chosen template; requires a complete and accurate template specimen [1].
Pseudo-landmark	Generates a dense set of points regularly sampled on a template model, then projects them to each specimen via TPS and normal projection [1].	Provides uniform sample coverage and consistent point spacing; robust performance and less sensitive to template choice [1].	Points have no biological or geometric relationship to original manual landmarks; homology is statistically inferred rather than geometrically defined [1].

Experimental Protocols

Protocol 1: Specimen Preparation and Image Processing in 3D Slicer

This protocol covers the initial steps of data preparation, from raw image stacks to 3D models ready for landmarking.

Data Import: Import DICOM stacks or other 3D image formats (e.g., .nrrd, .nii) directly into 3D Slicer. The software offers seamless DICOM standard interoperability [26] [27].
Volume Rendering and Segmentation: Use the "Segment Editor" module to create a 3D model of the anatomical structure of interest.
- Select a thresholding or region-growing algorithm to isolate the structure from the background.
- Manually clean the segmentation using tools like "Islands," "Erase," and "Smooth" to ensure a continuous, high-quality surface.
Model Generation and Smoothing: Generate a 3D surface model from the segmentation.
- Apply Laplacian smoothing to the mesh to reduce noise and artifacts while preserving the overall shape. This step is critical for improving the accuracy of subsequent semi-landmark projection [1].
Data Export: Export the final smoothed 3D model in a format suitable for landmarking (e.g., .ply, .vtk, .stl).

Protocol 2: Patch-Based Semi-Landmarking in SlicerMorph

This protocol details the steps for applying the patch-based semi-landmarking method to a single specimen.

Manual Landmarking: Load the 3D model into 3D Slicer. Using the "Markups" module, place all required anatomical (manual) landmarks on the model surface [26].
Define Patches: For each region of interest between manual landmarks, define a triangular patch by selecting three manual landmarks that bound the area.
Configure and Generate Grid:
- Specify the desired density (number of points) for the triangular sampling grid within the SlicerMorph module.
- Execute the patch generation. The algorithm will register the template grid to the bounding triangle using a thin-plate spline deformation and project the grid vertices onto the specimen's surface [1].
Merge and Export Landmarks: Once all patches are generated, merge the semi-landmarks from all grids with the original manual landmarks into a single set. Export the complete landmark set for statistical analysis.

Protocol 3: Template-Based Semi-Landmarking (Patch-TPS and Pseudo-Landmarks)

This protocol describes the workflow for applying semi-landmarks to a dataset using a template specimen, which enables consistent correspondence across samples.

Template Selection: Choose a representative specimen from your dataset to serve as the template. This specimen should be complete and of high quality.
Template Landmarking:
- For the Patch-TPS method, place all manual landmarks and then generate patch-based semi-landmarks on the template model following Protocol 2.
- For the Pseudo-landmark method, use the SlicerMorph module to generate a dense, regularly sampled set of points on the template model's surface, ensuring spherical topology and a user-specified minimum distance between points [1].
Landmark Transfer to New Specimen:
- Place only the manual landmarks on a new (target) specimen.
- Compute a Thin-Plate Spline transform that warps the template specimen to the target specimen based on their shared manual landmarks.
- Apply this transform to the template's semi-landmarks (from step 2).
- Project the transformed semi-landmarks onto the surface of the target specimen along the direction of the template's surface normal vectors [1].
Batch Processing and Export: Repeat step 3 for all specimens in the dataset. Export the complete landmark sets for Procrustes alignment and statistical analysis in R.
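
The landmark transfer in step 3 can be illustrated outside 3D Slicer with a small NumPy sketch. It fits a 3D thin-plate spline (using the radial basis U(r) = r) to the shared manual landmarks and applies it to the template's semi-landmarks. This is not the SlicerMorph implementation: the function names are assumptions, and the final projection along template normals is replaced here by a much cruder stand-in that snaps each point to the nearest target mesh vertex.

```python
import numpy as np


def _kernel(a, b):
    """Pairwise distances, i.e. the 3D thin-plate spline basis U(r) = r."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)


def fit_tps(src_landmarks, dst_landmarks):
    """Solve for TPS weights mapping src landmarks exactly onto dst landmarks."""
    n = len(src_landmarks)
    P = np.hstack([np.ones((n, 1)), src_landmarks])   # affine part [1, x, y, z]
    A = np.zeros((n + 4, n + 4))
    A[:n, :n] = _kernel(src_landmarks, src_landmarks)
    A[:n, n:] = P
    A[n:, :n] = P.T
    rhs = np.zeros((n + 4, 3))
    rhs[:n] = dst_landmarks
    return np.linalg.solve(A, rhs)                    # rows: n warp weights + 4 affine


def apply_tps(params, src_landmarks, points):
    """Warp arbitrary points (e.g. template semi-landmarks) with a fitted TPS."""
    n = len(src_landmarks)
    P = np.hstack([np.ones((len(points), 1)), points])
    return _kernel(points, src_landmarks) @ params[:n] + P @ params[n:]


def snap_to_vertices(points, mesh_vertices):
    """Stand-in for normal projection: move each point to its nearest mesh vertex."""
    d = _kernel(points, mesh_vertices)
    return mesh_vertices[d.argmin(axis=1)]
```

A typical call sequence would be `params = fit_tps(template_lm, target_lm)`, then `warped = apply_tps(params, template_lm, template_semilandmarks)`, followed by `snap_to_vertices(warped, target_vertices)`. At least four non-coplanar manual landmarks are needed for the linear system to be solvable.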

Workflow Visualization

The following diagram illustrates the logical decision process and steps involved in selecting and executing the appropriate semi-landmarking protocol, from data input to final analysis.

Semi-landmark Method Selection Workflow

The migration of geometric morphometrics workflows from manual digitization in TpsDig to automated, scalable pipelines within 3D Slicer and R marks a significant advancement for outline-based identification research. The SlicerMorph ecosystem provides robust, freely available tools for implementing sophisticated semi-landmark alignment methods, such as patch-based and pseudo-landmark sampling. By leveraging these open-source platforms, researchers can achieve higher throughput, improve reproducibility, and capture more comprehensive shape descriptions. This, in turn, supports more powerful statistical analyses in R, ultimately driving deeper insights into morphological variation and classification in evolutionary biology, biomedicine, and beyond.

Application Note AN-001: Semi-Landmark Analysis of Great Ape Cranial Morphology

Experimental Background and Objectives

The analysis of cranial morphology in evolutionary biology has been revolutionized by the application of geometric morphometrics. Traditional landmark-based approaches often provide insufficient shape information due to the limited number of biologically homologous points that can be reliably identified across specimens [1]. This limitation is particularly pronounced when analyzing complex curved surfaces such as cranial vaults. Semi-landmark methods address this limitation by supplementing traditional landmarks with additional points that capture the geometry of curves and surfaces, thereby enabling more comprehensive quantification of morphological variation [1] [15]. This application note details the implementation and comparison of three semi-landmarking strategies for analyzing cranial morphology across three species of great apes: Pan troglodytes, Gorilla gorilla, and Pongo pygmaeus [1].

Quantitative Results and Performance Metrics

The performance of each semi-landmarking strategy was evaluated by quantifying how well the transformed mesh of an individual specimen matched the population average template. The metric used was the average mean root squared error (MRSE) between the transformed mesh and the template [1].

Table 1: Performance Comparison of Semi-Landmarking Methods for Ape Cranial Analysis

Method	Shape Estimation Accuracy	Robustness to Noise	Computational Demand	Key Advantages
Patch-Based	Comparable to manual landmarks	Low (sensitive to noise and missing data)	Moderate	Independent of template; direct geometric relationship to manual landmarks
Patch-TPS	Comparable or exceeds manual landmark accuracy	High	Moderate	Improved robustness; consistent coverage
Pseudo-Landmark	Comparable or exceeds manual landmark accuracy	High	High	Template-based; extensive coverage without manual landmark dependency

Detailed Experimental Protocol

Protocol 1.1: Patch-Based Semi-Landmarking for Cranial Morphology

Objective: To generate semi-landmarks directly on a specimen surface using triangular patches defined by manual landmarks.
Materials and Software: 3D Slicer with SlicerMorph extension [1]; 3D surface meshes of specimens.
Procedure:
- Patch Definition: For each region of interest, specify three manual landmarks that define the boundaries of a triangular patch on the specimen surface [1].
- Grid Generation: For each patch, register a template triangular grid with a user-specified number of semi-landmark points to the vertices of the bounding triangle using a thin-plate-spline (TPS) deformation [1].
- Surface Projection: Project the vertices of the triangular sampling grid to the specimen surface using a ray-casting algorithm. The projection vector direction is determined by averaging the surface normal vectors at the three manual landmarks defining the patch [1].
- Merge Patches: Combine all projected triangular grids into a single landmark set, ensuring no overlap between adjacent patches and adding the original manual landmarks to the final set [1].
Critical Steps:
- Laplacian smoothing of the surface is recommended prior to projection to mitigate the impact of surface noise [1].
- The ray length for projection should be constrained by the average distance between the vertices of the bounding triangle [1].
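
The geometric ingredients of the patch definition, grid generation, and projection set-up can be sketched as follows. Barycentric interpolation is used here as a simple stand-in for registering the template grid to the bounding triangle, and the actual ray casting against the mesh is omitted. All function names are assumptions introduced for illustration and do not correspond to SlicerMorph functions.

```python
import numpy as np


def triangle_grid(a, b, c, n_per_edge):
    """Regular grid of points inside triangle (a, b, c), edges and corners included."""
    if n_per_edge < 2:
        raise ValueError("n_per_edge must be at least 2")
    m = n_per_edge - 1
    points = []
    for i in range(m + 1):
        for j in range(m + 1 - i):
            k = m - i - j                       # barycentric weights i/m, j/m, k/m
            points.append((i * a + j * b + k * c) / m)
    return np.array(points)


def patch_projection_direction(normal_a, normal_b, normal_c):
    """Unit projection vector: average of the surface normals at the three landmarks."""
    n = normal_a + normal_b + normal_c
    return n / np.linalg.norm(n)


def max_ray_length(a, b, c):
    """Ray-length bound: mean distance between the vertices of the bounding triangle."""
    edges = [np.linalg.norm(a - b), np.linalg.norm(b - c), np.linalg.norm(c - a)]
    return float(np.mean(edges))
```

With `n_per_edge` points along each side, the grid contains `n_per_edge * (n_per_edge + 1) / 2` points, so the sampling density per patch grows quadratically with this setting.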

Protocol 1.2: Patch-TPS Semi-Landmarking

Objective: To transfer a single, consistently defined set of semi-landmarks from a template specimen to all specimens in a dataset.
Materials and Software: 3D Slicer with SlicerMorph extension [1]; 3D surface meshes; a single template specimen (synthetic or representative).
Procedure:
- Template Landmarking: Apply the patch-based method to a single template image to generate a comprehensive set of semi-landmarks [1].
- TPS Transformation: Warp each subject specimen to the template using a thin-plate-spline transformation defined by the manual landmark points shared between the template and specimen [1].
- Landmark Transfer: For each semi-landmark point on the template, cast a ray along the direction of the template's surface normal vector onto the warped subject mesh to identify the corresponding point [1].
- Point Selection: The final intersection with the warped subject mesh is selected as the corresponding semi-landmark. If no intersection is found, the closest mesh point is selected [1].
Critical Steps:
- The choice of template can influence results; a specimen with average morphology is often ideal [1].

Diagram 1: Workflow for the Patch-TPS Semi-Landmarking Method

Application Note AN-002: Semi-Landmark Analysis of Feather Shape for Age Discrimination

Experimental Background and Objectives

The quantitative analysis of feather shape serves as a powerful tool in ornithology for tasks such as discriminating between age classes within a species [28]. Many birds exhibit subtle, age-related changes in feather morphology that can be challenging to quantify through traditional measurement alone. This application note outlines a methodology for applying semi-landmark approaches to capture information from feather outlines, specifically for the purpose of classifying ovenbird (Seiurus aurocapilla) rectrices (tail feathers) into different age categories [28]. The approach compares the performance of different outline measurement and alignment methods in a Canonical Variates Analysis (CVA) to optimize classification rates.

Feather Morphology and Pattern Basis

The foundation of this analysis lies in the standardized organization of feathers. Contour feathers are arranged in overlapping rows, growing out from the body and curving back toward the tail [29]. The visible color patterns on a bird's plumage, such as streaks and spots, are created by specific markings on individual feathers. Streaks are formed when a dark marking extends to the tip of the feather, creating a continuous line as feathers overlap. Spots are created when the dark marking does not reach the tip, leaving a pale margin that isolates it from the marking on the feather behind it [29]. This structured arrangement means that the overall outline of a feather and its internal markings can be systematically captured using outline-based morphometric methods.

Quantitative Results and Method Comparison

The study evaluated different semi-landmark alignment methods and dimensionality reduction techniques to achieve optimal classification rates in a CVA, with performance assessed via cross-validation to minimize overfitting [28].

Table 2: Performance of Outline-Based Methods for Age Classification in Ovenbird Feathers

Method	Classification Rate (Cross-Validation)	Key Characteristics	Recommended Dimensionality Reduction
Semi-Landmark (Bending Energy Alignment)	Roughly equal to other semi-landmark and Fourier methods	Minimizes bending energy of TPS during sliding	Variable Number of PC Axes (optimized for cross-validation)
Semi-Landmark (Perpendicular Projection)	Roughly equal to other semi-landmark and Fourier methods	Projects points perpendicular to a baseline	Variable Number of PC Axes (optimized for cross-validation)
Elliptical Fourier Analysis	Roughly equal to semi-landmark methods	Represents outline as Fourier harmonics	Variable Number of PC Axes (optimized for cross-validation)
Extended Eigenshape Analysis	Roughly equal to semi-landmark methods	Captures outline shape using eigenvectors	Variable Number of PC Axes (optimized for cross-validation)

Detailed Experimental Protocol

Protocol 2.1: Outline Digitization and Semi-Landmark Analysis for Feathers

Objective: To capture feather outline shape using semi-landmarks and perform age-class discrimination using Canonical Variates Analysis.
Materials: High-resolution images of rectrices; image processing software (e.g., tpsDig2, R); statistical software with geometric morphometrics capabilities (e.g., R packages geomorph or Morpho) [1] [28].
Procedure:
- Image Acquisition: Obtain standardized, high-resolution images of feathers placed on a contrasting background [28].
- Outline Digitization:
  - Option A (Template-based): Digitize points at the intersections of predefined radii (e.g., from a circle) with the feather outline [28].
  - Option B (Manual Tracing): Manually trace the feather outline, selecting points by eye to define the curve [28].
- Semi-Landmark Alignment: Align the digitized points using a semi-landmark method:
  - Bending Energy Minimization (BEM): Slide the semi-landmarks to minimize the bending energy of the thin-plate spline that deforms the specimen to a consensus shape [28].
  - Perpendicular Projection (PP): Slide each semi-landmark along the tangent direction of the curve to minimize its Procrustes distance from the mean shape, so that the remaining difference from the mean is perpendicular to the curve [28].
- Dimensionality Reduction: Perform Principal Component Analysis (PCA) on the aligned coordinate data. Use a cross-validation procedure to determine the optimal number of PC axes that maximizes the cross-validation rate of correct assignment in the subsequent CVA [28].
- Classification: Conduct a Canonical Variates Analysis (CVA) on the retained PC scores to derive functions that best discriminate between pre-defined age classes [28].
Critical Steps:
- The number of points used to represent the curve was not a highly critical factor for classification success [28].
- Using cross-validation to determine the number of PC axes, rather than a fixed number, produced higher cross-validation assignment rates and is strongly recommended [28].
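
The recommendation to choose the number of PC axes by cross-validation can be illustrated with a short NumPy sketch. It assumes a matrix `X` of aligned coordinates (one row per feather) and a vector of age-class labels. Specimens are assigned to the group with the smallest Mahalanobis distance under a pooled within-group covariance, which is used here as a stand-in for CVA-based assignment, and leave-one-out resampling is an assumed choice of cross-validation scheme. The PCA is computed once on all specimens, following the order of the steps above. Function names are hypothetical.

```python
import numpy as np


def pca_scores(X):
    """PC scores of the rows of X, columns ordered by decreasing variance."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U * s


def loo_assignment_rate(scores, labels):
    """Leave-one-out rate of correct assignment using pooled Mahalanobis distance."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    n = len(labels)
    correct = 0
    for i in range(n):
        train = np.arange(n) != i               # hold out specimen i
        Xtr, ytr = scores[train], labels[train]
        means = np.array([Xtr[ytr == c].mean(axis=0) for c in classes])
        resid = np.vstack([Xtr[ytr == c] - means[k] for k, c in enumerate(classes)])
        pooled = resid.T @ resid / (len(Xtr) - len(classes))
        precision = np.linalg.pinv(pooled)
        diff = scores[i] - means
        d2 = np.einsum("kp,pq,kq->k", diff, precision, diff)
        correct += int(classes[np.argmin(d2)] == labels[i])
    return correct / n


def choose_n_pcs(X, labels, max_pcs):
    """Return the number of PC axes with the best cross-validated rate, plus all rates."""
    scores = pca_scores(X)
    rates = [loo_assignment_rate(scores[:, :k], labels) for k in range(1, max_pcs + 1)]
    return int(np.argmax(rates)) + 1, rates
```

Ties are resolved in favour of the smaller number of axes, which keeps the discriminant model as simple as the data allow.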

Diagram 2: Workflow for Feather Outline Analysis and Age Classification

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Software and Tools for Semi-Landmark Based Research

Tool/Reagent	Type	Primary Function	Application Context
3D Slicer with SlicerMorph	Software Platform	Open-source visualization and analysis; core platform for implementing semi-landmarking protocols [1].	3D cranial morphology (Apes, Infant sutures) [1] [30]
R Package `Morpho`	Software Library	Geometric morphometric analysis; includes algorithms for sliding semi-landmarks and statistical shape analysis [1].	General shape analysis, including cranial and feather outlines
R Package `geomorph`	Software Library	Geometric morphometric analysis of landmark configurations; provides tools for Procrustes analysis and visualization [1].	General shape analysis, including cranial and feather outlines
Thin-Plate Spline (TPS)	Mathematical Algorithm	Interpolation and deformation function used to define transformations between landmark configurations and project points [1].	Patch-TPS method, surface warping, and area calculation [1] [30]
Gaborized Outlines	Stimulus Type	Outlines composed of Gabor elements used to study contour integration and figure-ground segmentation in perception [31].	Investigating role of object familiarity in perceptual grouping [31]
Cross-Validation	Statistical Protocol	Resampling method to assess how the results of a statistical analysis will generalize to an independent dataset [28].	Optimizing classification rates in CVA for feather analysis [28]

Optimizing Your Analysis: Parameter Selection, Error Avoidance, and Best Practices

In outline-based identification research, the quantification of shape using geometric morphometrics has been revolutionized by the use of semilandmarks. These points allow for the capture of shape along curves and surfaces where traditional, homologous landmarks are sparse or absent [2]. However, a central challenge in applying these methods is selecting an appropriate density of semilandmarks. This choice is critical, as it directly influences the balance between capturing meaningful biological signal and introducing statistical noise. Under-sampling a structure risks omitting morphologically significant information, leading to an inability to detect genuine shape differences [20]. Conversely, over-sampling not only increases data collection time and reduces computational efficiency but also diminishes statistical power by introducing extraneous information [20]. This application note provides a structured protocol, framed within a thesis on semilandmark alignment methods, to guide researchers in determining an optimal semilandmark density for their specific outline-based studies.

Theoretical Framework: The Semilandmark Dilemma

Semilandmarks are points that quantify the geometry of curves and surfaces between traditional landmarks [2]. Unlike landmarks, which are considered homologous across specimens, the initial placement of semilandmarks is often arbitrary, guided by algorithms rather than strict biological homology [15]. Their final positions are established through a "sliding" process that minimizes either bending energy or Procrustes distance, thereby establishing geometric correspondence across specimens [2] [4].
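
As a rough illustration of the Procrustes-distance variant of sliding, the sketch below moves each interior point of an open curve along its local tangent until the remaining offset from the reference is perpendicular to that tangent. It is a single sliding pass under simplifying assumptions: the two end points are treated as fixed landmarks, tangents are estimated from neighbouring points, and the usual re-projection onto the curve and iteration with Procrustes superimposition are omitted. The function name is hypothetical.

```python
import numpy as np


def slide_min_procrustes(curve, reference):
    """Slide interior points of an open curve along tangents toward the reference."""
    curve = np.asarray(curve, dtype=float)
    reference = np.asarray(reference, dtype=float)
    slid = curve.copy()
    for i in range(1, len(curve) - 1):          # end points stay fixed
        tangent = curve[i + 1] - curve[i - 1]   # chord through the two neighbours
        tangent = tangent / np.linalg.norm(tangent)
        # Remove the tangential part of the offset between specimen and reference.
        slid[i] = curve[i] + ((reference[i] - curve[i]) @ tangent) * tangent
    return slid
```

Because only the tangential component of each offset is removed, the summed squared distance to the reference can never increase in this step.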

The fundamental dilemma in using semilandmarks lies in their density. A configuration with too few points will fail to represent the complexity of the biological form, while an excessively dense configuration will capture non-biological noise and reduce the degrees of freedom in subsequent analyses [4] [20]. The optimal density is therefore one that is sufficient to capture the relevant morphological variation for a given research question without being wasteful or detrimental to statistical inference [20].

Quantitative Assessment of Coordinate Density

Watanabe's Landmark Sampling Criterion

A quantitative method for estimating the optimal number of points is Watanabe's Landmark Sampling Criterion, which assesses the impact of point number on the accuracy of shape representation [20]. The procedure involves initially digitizing a subset of specimens with a very high density of points, effectively creating an "over-sampled" template. This template is then sub-sampled to create configurations with progressively fewer points. For each level of sub-sampling, a Procrustes ANOVA is performed, and the resulting Procrustes variances are compared to that of the over-sampled configuration.

The optimal density is identified as the point where the Procrustes variance plateaus, indicating that adding more points no longer captures meaningful additional shape variation [20]. An example of this approach, applied to the human os coxae, is summarized in Table 1.

Table 1: Results of Coordinate Density Estimation for the Human Os Coxae (adapted from [20])

Number of Coordinate Points	Procrustes Variance (Relative to Over-sampled Template)	Interpretation
609 (Preliminary Template)	1.00 (Baseline)	Over-sampled reference
400	~1.00	Plateau reached
300	~1.00	Plateau maintained
200	>1.00	Initial signs of signal loss

This study concluded that for the os coxae, a template of 200 points was sufficient to capture major shape differences, while a density of 300 points was recommended for detecting more subtle morphological patterns [20].
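
The sub-sampling logic can be sketched in NumPy as follows. The example computes Procrustes variance directly after a basic generalized Procrustes superimposition rather than through a full Procrustes ANOVA, and it sub-samples by taking evenly spaced indices, which assumes the points are ordered (for example along an outline). Input is an array of shape (specimens, points, dimensions). The function names are assumptions, and the sketch is not the procedure of [20].

```python
import numpy as np


def _align(shape, ref):
    """Rotate a centred shape onto ref (orthogonal Procrustes, no reflection)."""
    U, _, Vt = np.linalg.svd(shape.T @ ref)
    if np.linalg.det(U @ Vt) < 0:
        U[:, -1] *= -1                          # flip the weakest axis to stay a rotation
    return shape @ (U @ Vt)


def gpa(configs, n_iter=10):
    """Basic GPA: centre, scale to unit centroid size, rotate to an iterated mean."""
    X = configs - configs.mean(axis=1, keepdims=True)
    X = X / np.linalg.norm(X, axis=(1, 2), keepdims=True)
    mean = X[0]
    for _ in range(n_iter):
        X = np.array([_align(s, mean) for s in X])
        mean = X.mean(axis=0)
        mean = mean / np.linalg.norm(mean)
    return X


def procrustes_variance(configs):
    """Summed squared deviation from the mean shape, divided by (n - 1)."""
    aligned = gpa(np.asarray(configs, dtype=float))
    dev = aligned - aligned.mean(axis=0)
    return np.sum(dev ** 2) / (len(aligned) - 1)


def density_curve(configs, point_counts):
    """Procrustes variance at each sub-sampled density, relative to the full template."""
    configs = np.asarray(configs, dtype=float)
    full = procrustes_variance(configs)
    curve = {}
    for k in point_counts:
        idx = np.unique(np.linspace(0, configs.shape[1] - 1, k).round().astype(int))
        curve[k] = procrustes_variance(configs[:, idx]) / full
    return curve
```

Plotting the returned ratios against the number of points gives a curve on which a plateau can be looked for, in the spirit of Table 1.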

Consequences of Density Choices on Statistical Outcomes

The choice of semilandmark density can influence downstream morphometric analyses. Different semilandmarking approaches, which inherently produce points at varying densities and locations, can lead to differences in statistical results [15]. While non-rigid semilandmarking methods tend to be more consistent with each other, any analysis utilizing semilandmarks should be interpreted with the understanding that the results are an approximation of biological reality, and the chosen methodology contributes a source of potential error [15].

Experimental Protocol for Determining Semilandmark Density

This protocol provides a step-by-step guide for determining the optimal semilandmark density for a new morphological structure, using Watanabe's criterion.

The following diagram illustrates the key stages of the protocol from initial template creation to the final determination of the optimal semilandmark density.

Materials and Equipment

Table 2: Research Reagent Solutions and Essential Materials

Item Name	Function/Application	Example Specifications
3D Surface Scanner	To create high-resolution digital models of specimens for digitization.	Structured-light scanner (e.g., Artec Eva) [20].
Digitization Software	To place landmarks and semilandmarks on 3D mesh models.	Viewbox 4, geomorph R package [6] [20].
Statistical Computing Environment	To perform Procrustes superimposition, ANOVA, and data analysis.	R statistical environment [20].
High-Performance Workstation	To handle computational load of processing high-density 3D meshes and data.	Adequate RAM and GPU for large geometric datasets.

Step-by-Step Procedure

Create an Over-sampled Template: Design a preliminary digitization template that deliberately over-samples the morphological structure of interest. This template should include all traditional landmarks as well as a high density of curve and surface semilandmarks, based on prior research or pilot observations [20]. For the human os coxae, an initial template contained 609 points (25 landmarks, 159 curve semilandmarks, 425 surface semilandmarks) [20].
Apply Template to a Representative Subset: Apply this over-sampled template to a randomly selected subset of specimens from your study population. A subset of 5-10 specimens is often sufficient for this initial calibration [20].
Generate Sub-sampled Configurations: Systematically create lower-density configurations from your over-sampled data. This can be achieved by deleting every nth point from the curves and surfaces in the template, creating a series of datasets with progressively fewer points [20].
Perform Procrustes Alignment and ANOVA: For each of the sub-sampled datasets, perform a full Procrustes alignment. Then, conduct a Procrustes ANOVA for each density level to calculate the Procrustes variance [20].
Identify the Variance Plateau: Plot the Procrustes variance against the number of coordinate points for all density levels. The optimal density is identified at the point where the Procrustes variance curve begins to plateau, meaning that adding more points does not meaningfully increase the captured shape variance [20].
Validate with Full Dataset: Once an optimal density is determined, create a new, final template with this number of points and apply it to the entire study sample for subsequent analysis.
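The sub-sampling and plateau steps above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in [20]: the function names, the 1% tolerance, and the variance values in the usage note are hypothetical, and the sketch assumes that fixed landmarks are kept aside so that only semilandmarks are thinned.

```python
def drop_every_nth(points, n):
    """Return a copy of `points` with every nth point deleted (n >= 2).

    `points` is assumed to hold semilandmarks only; fixed landmarks
    should be stored separately so that they are never thinned out.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    return [p for i, p in enumerate(points, start=1) if i % n != 0]


def smallest_plateau_density(variance_by_density, rel_tol=0.01):
    """Smallest point count whose variance stays on the plateau.

    `variance_by_density` maps number of points -> Procrustes variance.
    The densest configuration is the baseline; we walk towards sparser
    configurations and stop at the first one whose variance departs
    from the baseline by more than `rel_tol` (a hypothetical cut-off).
    """
    densities = sorted(variance_by_density, reverse=True)
    baseline = variance_by_density[densities[0]]
    best = densities[0]
    for density in densities[1:]:
        if abs(variance_by_density[density] - baseline) / baseline > rel_tol:
            break
        best = density
    return best
```

For example, with invented relative variances `{609: 1.00, 400: 1.00, 300: 1.00, 200: 1.04}`, `smallest_plateau_density` returns 300. In practice the variances would come from the Procrustes ANOVA at each density level, and the tolerance should be chosen to suit the research question.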

Decision Framework for Application

The following flowchart guides the researcher in making key decisions during the density selection process, particularly when a formal quantitative assessment is not feasible.

Concluding Recommendations

Selecting a semilandmark density is a critical step that balances biological insight with statistical and computational practicality. There is no universal optimal number; density must be empirically justified for each research context [20]. The following key recommendations should guide researchers in outline-based identification:

Justify Your Choice: The chosen density should be based on prior literature for similar structures or on a pilot study using the quantitative protocol outlined above.
Context is Key: The required density depends on the morphological complexity of the structure and the specific research question. Detecting subtle shape changes requires a higher density than characterizing gross morphological differences.
Acknowledge the Approximation: Recognize that all semilandmark configurations, regardless of density, provide an approximation of form. Results should be interpreted with an understanding of the methodological choices involved [15].

By adopting this rigorous and systematic approach to selecting semilandmark density, researchers can enhance the reliability, reproducibility, and biological validity of their geometric morphometric studies.

In the field of outline-based identification research, particularly in biology and drug development, geometric morphometrics (GM) has emerged as a crucial technique for the quantitative analysis of shape variation [17]. The application of semi-landmarks is fundamental to this process, enabling the quantification of shapes along smooth curves and surfaces where traditional landmarks are insufficient [15] [32]. The process of sliding semi-landmarks to their optimal positions—the sliding relaxation state—is iterative in nature and critical for minimizing bias and ensuring biological relevance in the resulting shape data [15] [32]. This document details application notes and protocols for determining this optimal state, framed within a broader thesis on semi-landmark alignment methods.

The core challenge addressed herein is that the final configuration of semi-landmarks is not inherent to the specimen but is algorithmically determined [15]. Different sliding approaches and iteration parameters produce different point locations, which subsequently influence statistical results and biological interpretations [15]. Therefore, establishing a standardized, yet flexible, experimental protocol is essential for obtaining reproducible and meaningful results in identification research.

Theoretical Background: Semi-Landmarks and Relaxation

The Role of Semi-Landmarks in Morphometrics

In geometric morphometrics, landmarks are matched points that define a map of point equivalences across samples [15]. Type I landmarks are points of clear biological or anatomical significance, such as the junction between bones, which can be precisely and consistently identified [17]. However, many biological structures, such as the mouse baculum or complex cranial vaults, possess smooth surfaces with few such discrete points [15] [32]. To capture the shape of these outlines and surfaces, semi-landmarks are utilized.

Semi-landmarks are points that are not precisely located at anatomically well-defined locations but are placed along curves or surfaces to capture additional shape information [15]. They are "slid" to minimize their deviation from a mean shape, thus establishing geometric correspondence across specimens [32]. This process is vital for outlining complex shapes where fixed landmarks are insufficient [17].

The Iterative Relaxation Process

The sliding of semi-landmarks is an iterative optimization process. The goal is to minimize a bending energy function or the Procrustes distance between the specimen and a reference (often the sample mean) [15]. In this process, the semi-landmarks are allowed to slide along tangent directions to the curve or surface, thereby removing the positional noise that arises from their arbitrary initial placement while preserving the essential geometric information of the form [32].

Two primary algorithmic approaches guide this relaxation:

Minimization of Bending Energy: This approach uses the thin-plate spline (TPS) interpolation function. It effectively gives greater weight to landmarks and semi-landmarks that are local to the point being slid, providing a more localized correction [15].
Minimization of Procrustes Distance: This method aims to minimize the overall squared differences between the specimen and the reference after Procrustes superimposition. It can be influenced by all landmarks in the configuration, even those distant from the semi-landmark being adjusted [15].

The choice between these methods and the number of iterations performed are critical parameters that define the optimal sliding relaxation state—the configuration that best represents the biological shape of the specimen without being confounded by arbitrary placement error.
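To make the sliding step concrete, the following Python sketch performs one pass under the Procrustes-distance criterion for an open 2D curve. It is a simplified illustration under several assumptions: the function name is hypothetical, the tangent at each semi-landmark is approximated by the chord between its two neighbours, the curve endpoints are treated as fixed landmarks, and the specimen and reference are assumed to be superimposed already. Minimizing the squared distance to the reference point along the tangent line gives a closed-form step, namely the projection of the offset onto the tangent.

```python
import math


def slide_procrustes(curve, reference):
    """One sliding pass along tangents for an open 2D curve.

    curve, reference: equal-length lists of (x, y) tuples.
    The first and last points are treated as fixed landmarks.
    """
    slid = [curve[0]]
    for i in range(1, len(curve) - 1):
        (x0, y0), (x, y), (x1, y1) = curve[i - 1], curve[i], curve[i + 1]
        # Approximate the tangent by the chord between the two neighbours.
        tx, ty = x1 - x0, y1 - y0
        norm = math.hypot(tx, ty)
        if norm == 0.0:
            slid.append((x, y))  # degenerate tangent: leave the point in place
            continue
        tx, ty = tx / norm, ty / norm
        # The step that minimizes |p + s*t - r|^2 is s = (r - p) . t
        rx, ry = reference[i]
        step = (rx - x) * tx + (ry - y) * ty
        slid.append((x + step * tx, y + step * ty))
    slid.append(curve[-1])
    return slid
```

Because the points move along straight tangent lines, they can drift off a strongly curved outline, so implementations commonly project them back onto the curve after each pass. The bending-energy criterion follows the same pattern but replaces the squared distance with a quadratic form derived from the thin-plate spline, which is not shown here.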

Comparative Analysis of Sliding Approaches

A comparative study assessed the performance of three landmark-driven semi-landmarking approaches using two different surface mesh datasets: ape crania and human heads [15]. The findings revealed that while different approaches produced different semilandmark locations, which in turn led to differences in statistical results, the non-rigid semilandmarking approaches showed greater consistency with each other [15].

Table 1: Comparison of Semilandmarking Approaches

Approach	Key Principle	Control Points	Consistency with Landmark-Based Analyses	Primary Use-Case
Bending Energy Minimization	Minimizes the thin-plate spline bending energy between specimen and reference.	Landmarks	High	Surfaces and curves with clear landmark guides; localized shape change analysis.
Procrustes Distance Minimization	Minimizes the Procrustes distance between specimen and reference.	Landmarks	Moderate	Overall shape analysis where global differences are of primary interest.
Iterative Closest Point (ICP)	Rigidly registers a template to a target by iteratively minimizing point-to-point distances.	None (Landmark-free)	Low	Rapid, automated processing of surfaces with high geometric similarity and low shape variation [15].
Conformal Mapping	Establishes point correspondences by conformally mapping 3D meshes to a 2D domain (sphere or disk).	None (Landmark-free)	Variable (Sensitive to surface quality)	Complex topologies and genus-zero surfaces where non-rigid matching is required [15].

The study concluded that morphometric analyses using semilandmarks must be interpreted with caution, recognizing that error is inevitable and that results are approximations of reality [15]. The optimal sliding relaxation state is therefore not a single universal truth but a configuration that is fit-for-purpose based on the biological question, the morphology of the structure, and the required precision.

Experimental Protocol: Achieving Optimal Relaxation

This protocol provides a detailed methodology for conducting sliding relaxation experiments on biological specimens, using a pipeline adapted from studies on the mouse baculum and fish morphology [32] [17].

Specimen Preparation and Imaging

Materials:

Biological specimens (e.g., mouse bacula, fish)
MicroCT scanner (e.g., Scanco Medical AG uCT50) or high-resolution digital camera
Solid-colored, non-reflective background
Software: Python with dicom module; R with rgl package [32]

Procedure:

Specimen Preparation: Prepare specimens to ensure integrity and remove obscuring soft tissue. For internal structures, this may involve dissection and careful cleaning [32].
Image Acquisition: Capture high-resolution images or scans.
- For 3D microCT scanning: Embed specimens in florist foam to minimize interference. Use settings appropriate for the specimen size (e.g., 90 kVp, 155 µA, 15.5 µm voxel size for mouse bacula) [32].
- For 2D photography: Fix the camera perpendicular to the ground. Position the specimen horizontally on a solid-colored background. Use macro mode and save images in JPEG format (2-10 MB file size is often appropriate) [17].
Image Processing: Convert image stacks into point clouds.
- For microCT .DCM stacks, use a Python script to convert slices into a single .xyz file, applying a pixel threshold (e.g., >3000 for bone) [32].
- Use an R script (02_segment_dicoms.r) to segment and label individual specimens from the combined .xyz file, saving a separate file for each [32].
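The thresholding step can be illustrated with a short Python sketch. It is not the script from [32]: the function names are hypothetical, the slices are assumed to be already loaded as nested lists of intensity values (in a real pipeline they would be read with a DICOM reader), and the default threshold and voxel size simply reuse the example values quoted above.

```python
def slices_to_xyz(slices, threshold=3000, voxel_mm=0.0155):
    """Convert a stack of 2D intensity slices into a list of (x, y, z) points.

    slices: list of slices, each a list of rows, each a list of intensities.
    Only voxels whose intensity exceeds `threshold` are kept, and voxel
    indices are scaled to millimetres using the (assumed isotropic) voxel size.
    """
    points = []
    for z, image in enumerate(slices):
        for y, row in enumerate(image):
            for x, value in enumerate(row):
                if value > threshold:
                    points.append((x * voxel_mm, y * voxel_mm, z * voxel_mm))
    return points


def format_xyz(points):
    """Render points as whitespace-separated lines, one point per line."""
    return "\n".join(f"{x:.4f} {y:.4f} {z:.4f}" for x, y, z in points)
```

A call such as `slices_to_xyz(stack)` returns the coordinates of all above-threshold voxels, and `format_xyz` turns them into text that can be written to a single .xyz file. For large scans an array library would be far faster than nested Python loops.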

Initial Landmark and Semi-Landmark Digitization

Materials:

Software: tpsDig2 [17]

Procedure:

Define Type I Landmarks: In tpsDig2, manually place a set of conserved Type I anatomical landmarks (e.g., tip of the nose, corner of the eye) on all specimens in the dataset [17].
Place Semi-Landmarks: Define curves and surfaces between the fixed landmarks. For 2D outlines, place semi-landmarks at equidistant points along the outline between fixed landmarks. For 3D surfaces, place a grid of semi-landmarks across the surface of interest [32] [17].
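Equidistant placement along a 2D outline amounts to resampling the digitized curve by arc length. The sketch below shows one way to do this in Python for an open curve segment between two fixed landmarks. The function name is hypothetical, and the digitized outline is assumed to be an ordered list of (x, y) vertices.

```python
import math


def resample_equidistant(outline, n_points):
    """Place n_points equally spaced by arc length along an open polyline.

    The first and last vertices (the bounding landmarks) are always kept.
    """
    if n_points < 2 or len(outline) < 2:
        raise ValueError("need at least two vertices and two output points")
    seg = [math.dist(a, b) for a, b in zip(outline, outline[1:])]
    spacing = sum(seg) / (n_points - 1)
    result = [outline[0]]
    i, walked = 0, 0.0  # walked = arc length at the start of segment i
    for k in range(1, n_points - 1):
        target = k * spacing
        # Advance to the segment that contains the target arc length.
        while walked + seg[i] < target:
            walked += seg[i]
            i += 1
        t = (target - walked) / seg[i]
        (x0, y0), (x1, y1) = outline[i], outline[i + 1]
        result.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    result.append(outline[-1])
    return result
```

For instance, resampling the L-shaped polyline `[(0, 0), (4, 0), (4, 4)]` to five points places a point every two units of arc length. These equidistant positions are only starting values, since the subsequent sliding step repositions the semi-landmarks.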

The Iterative Sliding Process

Materials:

Software: MorphoJ, R with geomorph package [17]

Procedure:

Select a Sliding Algorithm: Choose an optimization criterion, typically either Procrustes distance or bending energy minimization [15].
Set Convergence Parameters: Define a convergence criterion (e.g., a tolerance level such as 0.0001) and a maximum number of iterations (e.g., 100) so that the procedure terminates even if convergence is slow.
Execute the Sliding Algorithm: Run the Generalized Procrustes Analysis (GPA) with the sliding semi-landmarks option enabled in your chosen software (e.g., gpagen in the geomorph R package). The software will iteratively:
- a. Perform a Procrustes superimposition of all specimens using the current landmark and semi-landmark configuration.
- b. Compute the consensus (mean) shape.
- c. "Slide" each semi-landmark along its tangent direction to minimize the chosen criterion (bending energy or Procrustes distance) relative to the consensus.
- d. Check for convergence. If the change in the criterion value is below the tolerance, the process stops. Otherwise, it repeats from step a.
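The iterative structure described above can be sketched for 2D data in plain Python, using complex numbers to represent points. This is an illustrative outline rather than a reimplementation of gpagen: the function names are hypothetical, all configurations are scaled to unit centroid size, and the sliding step itself is omitted and only marked by a comment.

```python
import math


def center_and_scale(shape):
    """Translate a 2D configuration (list of complex) to the origin and
    scale it to unit centroid size."""
    centroid = sum(shape) / len(shape)
    centered = [z - centroid for z in shape]
    size = math.sqrt(sum(abs(z) ** 2 for z in centered))
    return [z / size for z in centered]


def rotate_onto(shape, target):
    """Rotate `shape` about the origin to minimize squared distance to `target`."""
    c = sum(z * t.conjugate() for z, t in zip(shape, target))
    rotation = c.conjugate() / abs(c)
    return [z * rotation for z in shape]


def gpa(shapes, tol=1e-4, max_iter=100):
    """Iterative superimposition with a tolerance and an iteration cap."""
    aligned = [center_and_scale(s) for s in shapes]
    mean = aligned[0]
    previous = float("inf")
    for iteration in range(1, max_iter + 1):
        # a. superimpose every specimen on the current consensus
        aligned = [rotate_onto(s, mean) for s in aligned]
        # b. recompute the consensus shape
        mean = center_and_scale([sum(col) / len(aligned) for col in zip(*aligned)])
        # c. (semi-landmarks would be slid along their tangents here)
        # d. convergence check on the summed squared distances to the consensus
        criterion = sum(abs(z - m) ** 2 for s in aligned for z, m in zip(s, mean))
        if abs(previous - criterion) < tol:
            break
        previous = criterion
    return aligned, mean, criterion, iteration
```

Calling `gpa` on a triangle and a rotated, scaled, and translated copy of it drives the criterion to essentially zero within two iterations. The iteration cap guarantees termination even when the tolerance is never reached.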

Determining the Optimal Relaxation State

Materials:

Software: R or MorphoJ for statistical analysis [17]

Procedure:

Track Criterion Value: For each iteration, record the value of the minimized criterion (bending energy or Procrustes distance).
Analyze Shape Variance: Calculate the Procrustes variance (the sum of squared Procrustes distances from each specimen to the mean, usually divided by the number of specimens) for the dataset at different iteration checkpoints (e.g., after 1, 5, 10, 50, and 100 iterations).
Plot Convergence: Create a line plot of the minimized criterion value and Procrustes variance against the iteration number. The optimal relaxation state is identified at the point where these curves plateau, indicating that further iterations yield negligible improvement.
Validate Biologically: Use the landmark configuration from the optimal state in a subsequent biological analysis (e.g., a Principal Component Analysis to distinguish known groups). The configuration that produces the most biologically interpretable and statistically robust results is considered functionally optimal.
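The tracking and plateau steps can be sketched as follows. The function names and the relative tolerance are hypothetical, the specimens are assumed to be superimposed 2D configurations, and the Procrustes variance is scaled by the number of specimens, which is one common convention.

```python
def procrustes_variance(aligned):
    """Mean squared distance of superimposed specimens from their mean shape.

    aligned: list of specimens, each a list of (x, y) tuples.
    """
    n = len(aligned)
    k = len(aligned[0])
    mean = [
        tuple(sum(s[j][d] for s in aligned) / n for d in range(2))
        for j in range(k)
    ]
    total = sum(
        (s[j][d] - mean[j][d]) ** 2
        for s in aligned for j in range(k) for d in range(2)
    )
    return total / n


def plateau_iteration(trace, rel_tol=0.001):
    """First checkpoint after which the tracked value stops changing.

    trace: dict mapping iteration number -> criterion or variance value.
    Returns the earlier checkpoint of the first pair whose relative change
    is within `rel_tol`, or None if no plateau is reached.
    """
    checkpoints = sorted(trace)
    for prev, cur in zip(checkpoints, checkpoints[1:]):
        if abs(trace[cur] - trace[prev]) <= rel_tol * abs(trace[prev]):
            return prev
    return None
```

With an invented trace such as `{1: 0.050, 5: 0.041, 10: 0.040, 50: 0.040, 100: 0.040}`, `plateau_iteration` returns 10, because moving from 10 to 50 iterations no longer changes the value. Plotting the trace remains useful, since a numerical rule can miss slow drifts that are obvious by eye.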

Research Reagent Solutions

The following table details the essential digital tools and materials required for the outlined experimental protocol.

Table 2: Essential Research Reagents and Software Toolkit

Item Name	Function/Application	Specification/Example
MicroCT Scanner	High-resolution 3D imaging of internal or external structures.	Scanco Medical AG uCT50; settings: 90 kVp, 155 µA, 15.5 µm voxel size [32].
High-Resolution Camera	2D image acquisition for lateral or en-face views.	Capable of macro mode, producing 2-10 MB JPEG files on a solid-colored background [17].
TPS Software Suite	Digitization of landmarks and semi-landmarks, file management, and relative warp analysis.	`tpsDig2` (digitizing), `tpsUtil` (file management), `tpsRelw` (sliding semi-landmarks) [17].
R Statistical Environment	Comprehensive platform for geometric morphometric analysis, visualization, and statistical testing.	R packages: `geomorph` (GPA & sliding), `Morpho`, `Momocs` (outline analysis) [17].
MorphoJ	Integrated software for GM, performing GPA, PCA, discriminant analysis, and visualization.	Version 1.08.01; user-friendly GUI for common GM analyses [17].
ImageJ	Open-source image processing for format conversion, scaling, and background removal.	Version 1.54i; used with AI-based background remover tools for image extraction [17].

Workflow Visualization

The following diagram illustrates the integrated experimental and analytical workflow for achieving the optimal sliding relaxation state.

Workflow for Optimal Sliding Relaxation

The process of sliding semi-landmarks to an optimal relaxation state is a critical, iteration-dependent step in geometric morphometrics. The choice of sliding algorithm and the determination of convergence directly influence the resulting shape data and all subsequent biological inferences [15]. The protocols and analyses provided here offer a structured framework for researchers to systematically approach this problem, balancing computational efficiency with biological fidelity. By adhering to such standardized methodologies, the field of outline-based identification can enhance the reproducibility and robustness of its findings, ultimately advancing research in areas ranging from evolutionary biology to pharmaceutical development.

Initial Template Selection and Its Impact on Results

In geometric morphometrics, the analysis of biological shape often relies on landmarks and semilandmarks to quantify form. Initial template selection is a critical step in studies utilizing sliding semilandmark approaches, particularly when applying surface semilandmarks through a template-based registration process [2]. This choice establishes the reference configuration onto which all target specimens are mapped, thereby influencing the quantification of shape variation across the entire dataset [33]. The template serves as the foundation for establishing geometric homology—the point-to-point correspondence between specimens—which is essential for meaningful biological comparisons [15].

Within the context of outline-based identification research, the impact of template selection extends beyond technical reproducibility to influence downstream biological interpretations. When a single template is used to capture highly disparate forms, the registration algorithm must reconcile potentially substantial shape differences, which can result in reduced accuracy for specimens morphologically distant from the template [33]. This limitation becomes particularly pronounced in evolutionary studies encompassing broad taxonomic spans, where morphological variation can be extreme [11]. Recent investigations have demonstrated that template choice can introduce systematic biases in subsequent morphospace occupation, potentially affecting estimates of morphological disparity and evolutionary rates [11].

Experimental Design and Comparative Approaches

Single-Template versus Multi-Template Strategies

Conventional automated landmarking methods often rely on a single-template approach, where one specimen serves as the reference for registering all others in the dataset [33]. While computationally efficient, this method faces significant limitations when applied to morphologically variable samples. The accuracy of registration depends on how well the algorithm can optimize the cost function while accommodating local shape differences, a task that becomes increasingly challenging as the morphological gap between template and target widens [33].

Multi-template methods offer a robust alternative by utilizing several templates that collectively represent the morphological diversity within the sample. The MALPACA pipeline exemplifies this approach, where multiple templates are used to landmark each target specimen independently [33]. For each landmark, the final coordinate is determined by taking the median estimate across all templates, thereby reducing the bias inherent in relying on a single reference specimen [33]. This approach has demonstrated significantly improved performance over single-template methods, particularly for datasets encompassing multiple species with substantial shape variation [33].
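The median-combination idea can be shown with a few lines of Python. This sketch is not taken from the MALPACA code base: the function name is hypothetical, and each template is assumed to have already produced a full set of 3D landmark estimates for the same target specimen.

```python
from statistics import median


def median_landmarks(estimates):
    """Combine per-template estimates into one landmark set.

    estimates: list with one entry per template; each entry is a list of
    (x, y, z) tuples giving that template's estimate of every landmark.
    Returns the coordinate-wise median for each landmark.
    """
    n_landmarks = len(estimates[0])
    return [
        tuple(median(est[j][d] for est in estimates) for d in range(3))
        for j in range(n_landmarks)
    ]
```

If two templates agree on a landmark near (1, 2, 3) and a third places it at (9, 9, 9), the median stays with the two consistent estimates, which illustrates why the median is less sensitive to a single poorly matching template than the mean would be.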

Template Selection Methodologies

The strategy for selecting templates significantly influences the effectiveness of multi-template approaches. When prior knowledge of morphological variation is available, researchers can manually select templates to represent major morphological groups or extremes [33]. However, for exploratory studies without such prior information, algorithmic selection methods provide valuable alternatives.

The K-means clustering approach offers a data-driven solution for template selection without requiring a priori morphological knowledge [33]. This method involves:

Performing Generalized Procrustes Analysis (GPA) on sparse point clouds representing all specimens
Conducting Principal Component Analysis (PCA) on the Procrustes-aligned coordinates
Applying K-means clustering to the PC scores representing overall shape variation
Selecting specimens closest to cluster centroids as templates [33]

This method ensures that selected templates comprehensively capture the major axes of shape variation within the dataset, providing a representative basis for subsequent landmarking.
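The clustering and template-picking steps can be sketched in plain Python, assuming the PC scores from the preceding GPA and PCA are already available. The function name is hypothetical, and the deterministic farthest-point initialization is a simplification chosen to keep the example reproducible. Library implementations typically use randomized initialization with several restarts.

```python
import math


def kmeans_templates(scores, k, n_iter=50):
    """Pick k template specimens from PC scores.

    scores: list of PC-score tuples, one per specimen.
    Returns the sorted indices of the specimens closest to each centroid.
    """
    # Farthest-point initialization (an assumption made for reproducibility).
    centroids = [scores[0]]
    while len(centroids) < k:
        centroids.append(
            max(scores, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(n_iter):
        # Assign every specimen to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in scores:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members.
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    # The template for each cluster is the specimen nearest its centroid.
    return sorted(
        min(range(len(scores)), key=lambda i: math.dist(scores[i], c))
        for c in centroids
    )
```

Given two well-separated groups of three specimens each, `kmeans_templates(scores, 2)` returns the index of the most central specimen in each group. Those specimens would then be fully landmarked and used as templates.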

Table 1: Comparison of Template Selection Strategies

Strategy	Methodology	Advantages	Limitations	Best Applications
Single Template	One specimen landmarks all targets	Computational efficiency, simplicity	Poor performance with high variation, introduces bias	Intraspecific studies, low-disparity samples
Multi-Template (Manual)	Researcher selects templates based on prior knowledge	Incorporates expert knowledge, biologically informed	Requires prior knowledge of variation, potentially subjective	Well-studied groups, distinct morphological clusters
Multi-Template (K-means)	Algorithm selects templates to capture variance	Data-driven, objective, comprehensive coverage	Requires initial landmarking for clustering, computational overhead	Exploratory studies, highly variable datasets
Deterministic Atlas	Iteratively estimates optimal mean shape	No fixed template, adapts to sample	Computationally intensive, complex implementation	Large-scale macroevolutionary studies [11]

Quantitative Assessment of Template Impact

Performance Metrics in Morphometric Studies

The influence of initial template selection can be quantified using several performance metrics that compare automated landmarking results to manually placed "gold standard" landmarks [33]. The Root Mean Square Error (RMSE), the square root of the mean squared distance between estimated and manual landmark positions, provides a global assessment of landmarking accuracy [33]. Additionally, examining errors at individual landmark locations helps identify regional patterns of inaccuracy that may arise from poor template-target correspondence [33].
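Both metrics are simple to compute once the estimated and manual landmarks are stored in the same order. The following Python sketch uses hypothetical function names and assumes both landmark sets are expressed in the same coordinate system and units.

```python
import math


def per_landmark_error(estimated, manual):
    """Euclidean distance between each estimated landmark and its manual match."""
    return [math.dist(e, m) for e, m in zip(estimated, manual)]


def landmark_rmse(estimated, manual):
    """Root mean square of the per-landmark distances for one specimen."""
    errors = per_landmark_error(estimated, manual)
    return math.sqrt(sum(d ** 2 for d in errors) / len(errors))
```

For a specimen with one perfectly placed landmark and one landmark that is 5 units off, the per-landmark errors are 0 and 5, and the RMSE is about 3.54. Because the errors are squared before averaging, RMSE weights large misplacements more heavily than a plain mean distance would.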

Beyond direct coordinate comparison, researchers should evaluate how template choice affects downstream morphometric analyses. This includes assessing correlations in Procrustes distances between specimens, preservation of morphospace structure derived from Principal Component Analysis, and consistency in centroid size calculations [33]. These measures ensure that template selection does not fundamentally alter biological interpretations of shape variation and relationships.

Empirical Evidence of Template Effects

Recent studies have provided quantitative evidence of template impact across diverse taxonomic groups. In a study of mouse crania (61 specimens) and great ape crania (52 specimens), the multi-template approach (MALPACA) significantly outperformed single-template methods, reducing RMSE, measured against manually placed landmarks, by up to 30% [33]. The K-means template selection method consistently avoided the worst-performing template combinations when compared to random selection, demonstrating its value for optimizing landmarking accuracy [33].

In macroevolutionary studies of 322 mammalian crania, the Deterministic Atlas Analysis (DAA) approach—which iteratively estimates an optimal mean shape rather than relying on a fixed template—showed strong correlation with traditional landmarking methods after addressing issues of mesh modality [11]. However, systematic biases emerged for certain taxonomic groups (Primates and Cetacea), highlighting how template-driven methods can differentially capture shape across disparate morphologies [11].

Table 2: Template Selection Impact on Landmarking Accuracy

Study System	Template Approach	Performance Outcome	Key Findings
Mouse crania (61 specimens) [33]	Single-template vs. Multi-template (MALPACA)	Multi-template significantly reduced RMSE	K-means template selection outperformed random selection
Great ape crania (52 specimens) [33]	Single-template vs. Multi-template (MALPACA)	Multi-template more accurate for interspecific variation	Species-specific templates further improved accuracy
Mammalian crania (322 specimens) [11]	Deterministic Atlas vs. Manual landmarks	Strong correlation after mesh standardization	Systematic biases for certain taxa (Primates, Cetacea)
Great ape crania (3 species) [1]	Patch, Patch-TPS, Pseudo-landmark	All methods comparable to manual landmarks	Patch method sensitive to noise and missing data

Practical Protocols for Template Selection and Evaluation

K-means Multi-Template Selection Protocol

For researchers implementing the K-means template selection method, the following detailed protocol ensures optimal results:

Preliminary Data Preparation: Generate 3D surface models (PLY, STL, or equivalent formats) for all specimens in the dataset. Ensure consistent mesh quality and resolution across specimens [33].
Sparse Landmarking: Manually place a minimal set of biologically homologous landmarks on all specimens. These should capture major anatomical features and provide a basic correspondence for initial alignment [33].
Generalized Procrustes Analysis: Perform GPA alignment on the sparse landmark configurations using standard software (e.g., Morpho, geomorph R packages) to remove non-shape variation [33].
Shape Space Characterization: Conduct PCA on the Procrustes-aligned coordinates to represent shape variation. Retain sufficient PCs to capture at least 95% of cumulative variance [33].
K-means Clustering: Apply K-means clustering to the PC scores. The optimal number of clusters (k) can be determined using the elbow method or gap statistic. As a practical guideline, 3-5 templates often suffice for moderately variable datasets, while highly disparate samples may require more [33].
Template Identification: For each cluster, identify the specimen closest to the centroid. These centroid specimens serve as optimal templates representing the major morphological modes in the dataset [33].
Template Landmarking: Manually place the complete landmark set (including semilandmarks) on each selected template specimen. These fully landmarked templates are then used in the multi-template automated landmarking pipeline [33].

Template Transfer and Landmarking Workflow

The process of transferring landmarks from templates to target specimens follows a standardized workflow:

Diagram Title: Multi-Template Landmark Estimation Workflow

For each template-target pair, the registration process typically employs non-rigid iterative closest point (ICP) algorithms or thin-plate spline (TPS) warping to align the template to the target specimen [1]. Following initial placement, semilandmarks undergo sliding procedures to minimize either bending energy or Procrustes distance, optimizing their positions along curves or surfaces to better capture biological shape rather than arbitrary spacing [2] [6].

Post-hoc Quality Assessment Protocol

Implementing rigorous quality checks after automated landmarking is essential for validating template selection and identifying potential errors:

Template Convergence Analysis: Compare landmark estimates derived from different templates for the same target specimen. Large discrepancies between templates indicate regions where the registration may be unreliable [33].
Landmark-wise Variance Examination: Calculate the variance of each landmark position across template-derived estimates. High variance suggests landmarks located in morphologically variable regions or areas sensitive to registration errors [33].
Outlier Detection and Removal: Identify landmarks with consistently poor convergence across multiple specimens. These problematic landmarks can be removed or manually corrected to improve overall dataset quality [33].
Biological Plausibility Check: Visually inspect the landmarked specimens to ensure anatomical correspondence, particularly for semilandmarks placed on complex surfaces [15].
Downstream Analysis Comparison: Compare results of preliminary morphometric analyses (e.g., PCA, Procrustes ANOVA) using different template strategies to assess impact on biological conclusions [11].
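The landmark-wise variance check can be sketched as follows. The function names and the flagging threshold are hypothetical, and the input is assumed to be the set of estimates that several templates produced for one target specimen.

```python
import math


def landmark_spread(estimates):
    """Per-landmark variance across template-derived estimates.

    estimates: list with one entry per template; each entry is a list of
    (x, y, z) tuples for the same target specimen.
    Returns, for each landmark, the mean squared distance of the template
    estimates from their average position.
    """
    n = len(estimates)
    spreads = []
    for j in range(len(estimates[0])):
        points = [est[j] for est in estimates]
        mean = tuple(sum(coord) / n for coord in zip(*points))
        spreads.append(sum(math.dist(p, mean) ** 2 for p in points) / n)
    return spreads


def flag_unstable(spreads, threshold):
    """Indices of landmarks whose spread exceeds a user-chosen threshold."""
    return [j for j, s in enumerate(spreads) if s > threshold]
```

Landmarks that are flagged repeatedly across many specimens are candidates for manual correction or removal. A sensible threshold depends on specimen size and scan resolution, so it has to be set per dataset.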

Table 3: Research Reagent Solutions for Template-Based Morphometrics

Resource Category	Specific Tools/Solutions	Function/Purpose	Implementation Considerations
Software Platforms	3D Slicer with SlicerMorph extension [33] [1]	Open-source platform for 3D visualization, landmarking, and analysis	Cross-platform compatibility; modular architecture
	R packages: geomorph [6], Morpho [1]	Statistical analysis of shape; sliding semilandmarks	Extensive documentation; active developer support
Template Selection Algorithms	K-means clustering [33]	Data-driven template selection without prior knowledge	Requires initial sparse landmarking
	Deterministic Atlas Analysis (DAA) [11]	Landmark-free approach using iterative atlas estimation	Computationally intensive for large datasets
Registration Methods	Thin-Plate Spline (TPS) warping [1]	Non-rigid transformation for landmark transfer	Sensitive to landmark correspondence
	Iterative Closest Point (ICP) variants [15] [1]	Surface-based registration without landmarks	Performance varies with surface complexity
Quality Assessment Tools	Procrustes ANOVA [1]	Quantifying landmarking error and biological signal	Requires repeated measurements
	RMSE calculation against gold standard [33]	Quantitative accuracy assessment	Dependent on reliable manual landmarks

Initial template selection represents a fundamental methodological decision in studies utilizing sliding semilandmark approaches for outline-based identification. The evidence consistently demonstrates that multi-template strategies outperform single-template approaches, particularly for morphologically variable datasets spanning broad taxonomic ranges [33]. The development of data-driven template selection methods, such as K-means clustering, provides researchers with objective approaches for optimizing landmarking accuracy without requiring extensive prior knowledge of morphological variation [33].

Future methodological developments should focus on enhancing the automation of template selection while maintaining biological interpretability. Incorporating landmark-free approaches like Deterministic Atlas Analysis shows promise for large-scale macroevolutionary studies, though challenges remain in ensuring consistent performance across highly disparate morphologies [11]. Regardless of the specific method chosen, researchers should implement robust post-hoc quality assessments to evaluate the influence of template selection on their specific biological conclusions, recognizing that all semilandmark approaches represent approximations of biological reality that require cautious interpretation [15].

Addressing Challenges with Noise, Missing Data, and Mixed Modality Datasets

In outline-based identification research, robust semi-landmark alignment is paramount for quantifying morphological shape. This process is frequently compromised by three pervasive data challenges: noise, missing data, and the complexity of mixed modality datasets. Noise, introduced via imaging artifacts or specimen damage, obscures true biological signals. Missing data, whether from sporadic collection or extensive data loss, can bias statistical estimates and reduce analytical power. Furthermore, integrating mixed modalities—such as combining 2D outlines with 3D scans or genomic data—presents significant hurdles in data fusion and analysis. This application note provides a structured framework and detailed protocols to address these challenges, ensuring the reliability and validity of semi-landmark-based morphological analyses.

Data Quality Assessment and Cleaning Protocols

Handling Missing Data

A critical first step in data preprocessing is the diagnosis and handling of missing values. The mechanism of missingness—Missing Completely at Random (MCAR), Missing at Random (MAR), or Missing Not at Random (MNAR)—informs the appropriate correction strategy [34]. In quantitative research, improper handling can significantly affect data quality, leading to biased model parameter estimates [34].

Protocol 2.1: Diagnosis and Imputation of Missing Data

Objective: To identify patterns of missing data and apply appropriate imputation techniques to preserve dataset integrity and statistical power.
Materials: Dataset with missing values, statistical software (e.g., R, Python).
Procedure:
- Diagnosis: Generate a missingness map to visualize the distribution and patterns of missing data. Determine if the pattern is sporadic, block-based, or mixed [35].
- Mechanism Assessment: Evaluate the likely mechanism of missingness (MCAR, MAR, MNAR) based on data collection procedures [36].
- Imputation Method Selection:
  - For sporadic missingness at low rates (<5%), simple imputation (e.g., mean, median) may be sufficient.
  - For higher rates or complex patterns, use machine learning imputation. The K-Nearest Neighbors (KNN) and Random Forest (RF) algorithms have demonstrated robust performance, even with missing rates up to 30% [35].
  - For longitudinal data, leverage mixed-effects models in multiple imputation to account for within-subject correlations [36].
- Implementation: Execute imputation, creating multiple complete datasets if using multiple imputation.
- Validation: Compare the distributions of the original and imputed data to check for conservation of variance and covariance structures.
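The KNN option in the procedure above can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: the function name knn_impute, the use of None for missing coordinates, and the root-mean-square distance over shared coordinates are all assumptions made for the example.

```python
import math

def knn_impute(specimens, k=3):
    """Fill None entries with the mean of the k most similar specimens.

    Similarity is the root-mean-square difference over the coordinates
    that both specimens have observed (an assumption for this sketch).
    """
    def distance(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    completed = [list(row) for row in specimens]
    for i, row in enumerate(specimens):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are the other specimens that have coordinate j observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(specimens)
                if m != i and other[j] is not None
            )
            nearest = [v for _, v in donors[:k]]
            if nearest:  # leave the gap if no donor exists
                completed[i][j] = sum(nearest) / len(nearest)
    return completed

# Hypothetical flattened coordinates; the second specimen lacks its last value.
coords = [
    [0.0, 0.0, 1.0, 0.0],
    [0.1, 0.0, 1.1, None],
    [0.0, 0.1, 0.9, 0.2],
    [5.0, 5.0, 6.0, 5.0],
]
filled = knn_impute(coords, k=2)
```

Comparing the variance of each column before and after imputation is a quick way to carry out the validation step.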

Table 1: Comparison of Common Imputation Methods for Morphometric Data

Method	Principle	Best Suited For	Advantages	Limitations
Mean/Median Imputation	Replaces missing values with the variable's mean or median.	MCAR data, very low (<5%) missingness.	Simple and fast to implement.	Severely underestimates variance; distorts covariances.
k-Nearest Neighbors (KNN)	Uses the mean value from 'k' most similar specimens.	MAR data, sporadic or mixed patterns.	Non-parametric; preserves data structure.	Computational cost rises with dataset size.
Random Forest (RF)	Uses an ensemble of decision trees to predict missing values.	MAR/MNAR data, complex patterns, high dimensionality.	Handles non-linear relationships; high accuracy.	Computationally intensive; risk of overfitting.
Multiple Imputation	Creates several complete datasets and combines results.	MAR data, final analysis for statistical inference.	Accounts for uncertainty in the imputation model.	Complex to implement and analyze.

Identifying and Mitigating Noise

Noise in morphometric datasets can arise from various sources, including scanning artifacts, specimen preparation damage, or errors in initial landmark digitization. This noise can mask true biological shape variation.

Protocol 2.2: Noise Reduction for 3D Surface Meshes

Objective: To smooth 3D mesh surfaces without losing biologically relevant morphological features.
Materials: 3D mesh models (e.g., in PLY, STL format), mesh processing software (e.g., 3D Slicer, MeshLab).
Procedure:
- Visual Inspection: Rotate and inspect the 3D model for obvious spikes, holes, or surface roughness.
- Laplacian Smoothing: Apply a Laplacian smoothing algorithm to the mesh. Each iteration moves every vertex toward the centroid of its neighboring vertices, effectively "relaxing" the mesh and reducing high-frequency noise [1].
- Parameter Tuning: Use a low iteration count (e.g., 2-5) and a conservative relaxation factor to prevent over-smoothing and loss of important morphological detail.
- Validation: Compare the smoothed mesh to the original, ensuring that homologous anatomical features remain sharp and identifiable. Use landmark-based geometric morphometrics to confirm that smoothing has not introduced bias in inter-landmark distances.
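To make the roles of the iteration count and the relaxation factor concrete, the smoothing step can be sketched as follows. The function laplacian_smooth and its neighbor-list input are hypothetical; tools such as MeshLab or 3D Slicer provide their own implementations, which may differ in detail.

```python
def laplacian_smooth(vertices, neighbors, iterations=3, relaxation=0.5):
    """Move each vertex part of the way toward the centroid of its neighbors.

    vertices   : list of (x, y, z) tuples
    neighbors  : neighbors[i] lists the indices of vertices adjacent to i
    relaxation : 0 keeps the vertex in place, 1 moves it onto the centroid
    """
    current = [tuple(v) for v in vertices]
    for _ in range(iterations):
        updated = []
        for i, vertex in enumerate(current):
            ring = [current[j] for j in neighbors[i]]
            if not ring:  # isolated vertex: nothing to average
                updated.append(vertex)
                continue
            centroid = [sum(axis) / len(ring) for axis in zip(*ring)]
            updated.append(tuple(
                x + relaxation * (c - x) for x, c in zip(vertex, centroid)
            ))
        current = updated  # all vertices are updated simultaneously
    return current

# A flat square with one central vertex raised as a noise spike.
verts = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0), (0.5, 0.5, 1.0)]
adjacency = [[1, 3, 4], [0, 2, 4], [1, 3, 4], [0, 2, 4], [0, 1, 2, 3]]
smoothed = laplacian_smooth(verts, adjacency, iterations=1, relaxation=0.5)
```

With one iteration and a relaxation factor of 0.5, the spike is pulled halfway toward the plane of its neighbors, which illustrates why low iteration counts and conservative factors limit the loss of genuine detail.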

Advanced Semi-Landmarking Strategies for Complex Datasets

Semi-landmarking methods allow for the dense sampling of shapes, but different approaches present trade-offs in correspondence, repeatability, and robustness to noise and missing data [1] [15]. The following protocols detail three advanced strategies.

Protocol 3.1: Patch-Based Semi-Landmarking

Objective: To generate dense semi-landmarks directly on a specimen's surface using manually defined patches.
Materials: Specimen 3D mesh, manually placed anatomical landmarks, software like 3D Slicer with SlicerMorph extension [1].
Procedure:
- Define Patches: Select three manual landmarks that form a triangle over a region of biological interest (e.g., the frontal bone).
- Generate Grid: Create a template triangular grid with a user-specified density within the boundaries of the three landmarks.
- Project to Surface: Warp the grid onto the specimen's surface using a Thin-Plate Spline (TPS) deformation. Project grid points onto the mesh along the average surface normal vector of the three bounding landmarks [1].
- Handle Errors: For points that fail to project, either reverse the projection direction or assign the closest mesh point.
Considerations: This method is sensitive to surface noise and can produce outliers on complex surfaces with sharp edges [1]. It is best for datasets with good initial landmark coverage and low noise.

Protocol 3.2: Patch-Based Semi-Landmarks with Thin-Plate Spline Transfer (Patch-TPS)

Objective: To improve robustness by generating semi-landmarks on a template specimen and propagating them to all other specimens.
Materials: A high-quality template specimen, all target specimens, manual landmarks on all specimens.
Procedure:
- Template Landmarking: Use Protocol 3.1 to place a dense set of semi-landmarks on the template specimen.
- Calculate TPS Transform: For each target specimen, compute a TPS transformation based on the correspondence between its manual landmarks and those of the template.
- Warp Landmarks: Use the TPS transform to warp the template's semi-landmarks onto the target specimen's surface.
- Project and Refine: Project the warped points onto the target's mesh along the template's normal vectors [1].
Considerations: This method is more robust to noise and missing data than the direct patch method. It ensures consistent coverage and correspondence across all specimens but is dependent on the choice of a good template [1].

Protocol 3.3: Pseudo-Landmark Sampling

Objective: To generate a dense, regularly spaced set of points on a template surface with no direct geometric relationship to manual landmarks.
Materials: A template specimen 3D mesh.
Procedure:
- Generate Points: On the template mesh, generate a dense cloud of points regularly sampled across the surface, often by assuming spherical topology.
- Apply Spatial Filter: Remove points that lie closer than a chosen minimum distance to an already retained point, ensuring even coverage.
- Transfer to Targets: Use a TPS transform (as in Protocol 3.2) to propagate these pseudo-landmarks to all specimens in the dataset [1].
Considerations: This method provides excellent coverage and consistent spacing. However, point correspondence is mathematically, rather than biologically, defined. Analyses using pseudo-landmarks should be interpreted as approximations of overall form [15].
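The spatial filter in the procedure above can be illustrated with a simple greedy pass. This is a sketch under the assumption that a first-come, first-kept rule is acceptable; the function name enforce_min_distance is hypothetical, and the filter used by any particular software package may work differently.

```python
import math

def enforce_min_distance(points, min_dist):
    """Keep a point only if it is at least min_dist from every kept point."""
    kept = []
    for p in points:
        if all(math.dist(p, q) >= min_dist for q in kept):
            kept.append(p)
    return kept

# Hypothetical candidate points with two near-duplicates.
candidates = [(0, 0, 0), (0.05, 0, 0), (1, 0, 0), (1, 0.02, 0), (2, 0, 0)]
pseudo_landmarks = enforce_min_distance(candidates, min_dist=0.1)
```

The pairwise check is quadratic in the number of points, so for very dense clouds a spatial index would be a more practical choice.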

Table 2: Comparison of Semi-Landmarking Strategies

Method	Correspondence	Robustness to Noise	Coverage	Computational Cost	Best for Datasets With
Patch-Based	Geometric (patch-based)	Low	Dependent on manual landmarks	Low	High-quality surfaces, good landmark coverage.
Patch-TPS	Template-driven	Medium-High	Consistent and comprehensive	Medium	Significant shape variation and moderate noise.
Pseudo-Landmark	Mathematical (surface-based)	High	Uniform and dense	High (for initial sampling)	Complex surfaces with few reliable landmarks.

The following workflow diagram illustrates the decision path for selecting and applying the appropriate semi-landmarking strategy.

Semi-landmark Strategy Selection Workflow

Integrating Mixed Modality Data

The fusion of outline or landmark data with other modalities, such as genomic or clinical data, enables a more holistic analysis. Multimodal Artificial Intelligence (MMAI) provides frameworks for this integration [37] [38].

Protocol 4.1: Late Integration for Multimodal Analysis

Objective: To combine predictions from models trained on separate data modalities (e.g., shape and genomics) for a unified outcome.
Materials: Processed datasets from each modality (e.g., Procrustes shape coordinates from landmarks, genomic variant data).
Procedure:
- Train Separate Models: Develop a predictive model (e.g., for species classification or disease subtyping) independently for each data modality.
- Generate Predictions: Use each model to generate a set of predictions or probability scores on the same set of specimens.
- Fuse Predictions: Combine the predictions in a final ensemble model. This can be a weighted average, a stacking algorithm, or another meta-learner [39].
Advantages: This approach is computationally efficient and handles missing modalities well, as models can be trained independently. It is robust to the different scales and structures of each data type [39].
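The weighted-average variant of the fusion step might look like the sketch below. The modality names, class labels, and weights are invented for illustration, and the function late_fusion is hypothetical; a stacking meta-learner would replace the fixed weights with a trained model.

```python
def late_fusion(predictions, weights):
    """Weighted average of per-modality class probabilities.

    predictions : {modality: {class_label: probability}} or None when a
                  modality is missing for this specimen
    weights     : {modality: non-negative weight}
    Weights are renormalized over the modalities that are present.
    """
    available = {m: p for m, p in predictions.items() if p is not None}
    if not available:
        raise ValueError("no modality available for this specimen")
    total_weight = sum(weights[m] for m in available)
    classes = set().union(*(p.keys() for p in available.values()))
    return {
        c: sum(weights[m] * p.get(c, 0.0) for m, p in available.items())
           / total_weight
        for c in classes
    }

weights = {"shape": 0.5, "genomics": 0.5}
both = late_fusion(
    {"shape": {"A": 0.8, "B": 0.2}, "genomics": {"A": 0.4, "B": 0.6}}, weights
)
shape_only = late_fusion(
    {"shape": {"A": 0.8, "B": 0.2}, "genomics": None}, weights
)
```

Renormalizing the weights over the modalities that are present is one simple way to obtain the tolerance to missing modalities described above.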

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Software and Analytical Tools

Tool Name	Type/Function	Application in Semi-Landmark Research	Access
3D Slicer / SlicerMorph	Open-source visualization & analysis platform	Core platform for 3D model handling, manual landmarking, and executing patch-based & pseudo-landmarking protocols [1].	Public GitHub Repository
R packages: Morpho & geomorph	Statistical shape analysis in R	Sliding semilandmarks, Procrustes analysis, and statistical testing of shape differences [1] [15].	CRAN
Auto3dgm	Landmark-free correspondence algorithm	Provides an alternative, template-based method for establishing dense point correspondences without manual landmarks [15].	Public Package
Urban Institute R Theme (urbnthemes)	R graphics formatting package	Ensures publication-ready, standardized visualizations of morphometric data and results [40].	GitHub
MICE (Multivariate Imputation by Chained Equations)	Multiple imputation software in R	Handles missing data in multivariate datasets under the MAR assumption, suitable for mixed data types [36].	R Package

Ensuring Repeatability and Minimizing Operator Bias in Landmark Placement

In outline-based geometric morphometrics (GM), the precise placement of landmarks and semilandmarks is fundamental for quantifying shape accurately. Operator bias—the variation introduced by different individuals collecting landmark data—is a significant source of measurement error that can compromise the repeatability and validity of scientific research [41]. This challenge is particularly acute in studies relying on outline-based identification, where homologous points are scarce and the quantification of curves and surfaces is essential [2]. The increasing use of large, collaborative datasets and high-resolution 3D reconstructions underscores the need for robust protocols to minimize this bias [2] [41]. This document provides detailed application notes and protocols to ensure repeatability in landmark placement, framed within a thesis on semi-landmark alignment methods for outline-based identification research.

The Problem: Quantifying Operator Bias in Morphometrics

Measurement error in geometric morphometrics can be partitioned into different components, with inter-operator bias often being the most substantial [41]. A study on 3D landmarks from MRI images found that differences among operators accounted for up to 30% of the total sample shape variation—a magnitude that surpassed the effect of sex differences in a large sample of hundreds of individuals [41]. This highlights that even precise landmarks do not guarantee negligible errors in shape data.

Table 1: Impact of Inter-Operator Bias on Shape and Size (from MRI Landmark Data)

Measurement	Effect of Inter-Operator Bias
Bone Landmark Shape	Operator differences account for up to 30% of total sample shape variance [41]
Nasal Soft-Tissue Size	Relatively larger errors in size estimates [41]
Nasal Soft-Tissue Shape	Higher reproducibility compared to bone landmarks [41]
General Shape vs. Size	Shape is often more affected by bias than size [41]

Strategies for Minimizing Bias and Ensuring Repeatability

Standardization Through Devices and Templates

The use of a Marker Alignment Device (MAD) can significantly improve intra- and inter-rater reliability. One study demonstrated that such a device, which helps subjects reproduce the same posture and helps operators relocate anatomical landmarks from previous trials, reduced errors in gait kinematics, particularly for out-of-plane hip and knee movements [42]. For surface sliding semilandmarks, a template-based approach is recommended. Instead of manually placing points on each specimen, a single template of landmarks and curves delimiting boundaries is created and then warped onto subsequent specimens in a semi-automated process [2]. This ensures anatomical correspondence across specimens and minimizes subjective placement.

Strategic Landmark Selection and Types

Traditional Landmarks: Rely on points of clear biological homology (e.g., sutures, processes). These are precise but often insufficient for capturing entire shapes [2].
Sliding Semilandmarks: Used to quantify curves and surfaces where traditional landmarks are lacking. They "slide" along outlines or surfaces during Procrustes alignment to minimize bending energy or Procrustes distance, establishing geometric homology [2] [6]. They are essential for capturing morphological information in outline-based research [2] [9].

Robust Training and Workflow Protocols

Implementing a standardized training protocol for all operators is critical. Furthermore, the entire workflow—from digitization to analysis—should be designed to minimize and account for potential bias. The diagram below outlines a protocol for a repeatable landmarking process.

Diagram 1: A standardized workflow for landmark data collection and processing to minimize operator bias.

Detailed Experimental Protocol: Implementing Sliding Semilandmarks in R

This protocol details the implementation of sliding semilandmarks for an outline-based study, using the geomorph package in R [6].

Objective: To quantify and compare outline shapes of wings using curve semilandmarks.
Materials: 2D or 3D digital images of specimens (e.g., insect wings).

Step-by-Step Procedure:

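Data Import: Read landmark data stored in the TPS format (e.g., with geomorph::readland.tps).
Define Sliding Semilandmarks: Create a matrix specifying the landmarks that form the curves, with one row per sliding point giving the landmark before it, the sliding landmark itself, and the landmark after it. For instance, for a curve running through landmarks 2, 3, 5, and 6 with 2 and 6 as fixed endpoints, the rows are (2, 3, 5) and (3, 5, 6).
Generalized Procrustes Analysis (GPA): Perform Procrustes alignment with geomorph::gpagen, passing the sliders matrix through the curves argument. Setting ProcD = TRUE slides the semilandmarks to minimize Procrustes distance; in current geomorph releases the default (ProcD = FALSE) minimizes bending energy.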
Visualization: Plot the aligned landmarks to visualize the consensus configuration and individual variation.
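The structure of the sliders matrix described in the steps above can be sketched language-independently. The helper below is hypothetical and written in Python for illustration only; it assumes the three-column (before, slide, after) layout and an open curve whose first and last landmarks stay fixed.

```python
def define_sliders(curve):
    """Build (before, slide, after) rows for an open curve.

    curve : ordered landmark numbers along the curve; the first and last
            entries are treated as fixed endpoints and do not slide.
    """
    return [
        (curve[i - 1], curve[i], curve[i + 1])
        for i in range(1, len(curve) - 1)
    ]

sliders = define_sliders([2, 3, 5, 6])
print(sliders)  # → [(2, 3, 5), (3, 5, 6)]
```

Each row tells the alignment routine which two neighbors define the tangent direction along which the middle landmark is allowed to slide.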

Assessment of Measurement Error:

A Procrustes ANOVA on repeated digitizations of the same specimens can be used to partition shape variance into components attributable to individual subjects (biological signal) and to operator error.

The goal is for the variance explained by biological signal to be significantly larger than that explained by operator bias.
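The idea behind this partition can be sketched with a one-way decomposition of sums of squares. This is a deliberate simplification and not the permutation-based procedure implemented in geomorph: it assumes configurations are already Procrustes-aligned and flattened, and it pools every difference among repeated digitizations of one individual into a single error term. The function name and data layout are hypothetical.

```python
def partition_shape_variance(replicates):
    """Split total shape variation into among-individual and error fractions.

    replicates : {individual: [flattened aligned configuration, ...]} with
                 one configuration per operator or digitizing session
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def mean_config(configs):
        return [sum(axis) / len(configs) for axis in zip(*configs)]

    everything = [c for reps in replicates.values() for c in reps]
    grand_mean = mean_config(everything)
    ss_total = sum(squared_distance(c, grand_mean) for c in everything)
    ss_individual = sum(
        len(reps) * squared_distance(mean_config(reps), grand_mean)
        for reps in replicates.values()
    )
    ss_error = ss_total - ss_individual
    return {
        "individual": ss_individual / ss_total,
        "error": ss_error / ss_total,
    }

# Two hypothetical individuals, each digitized twice.
fractions = partition_shape_variance({
    "specimen_a": [[0.0, 0.0], [0.0, 0.2]],
    "specimen_b": [[1.0, 0.0], [1.0, 0.2]],
})
```

In this toy example about 96% of the variation lies among individuals, which is the kind of ratio one hopes to see when operator error is small relative to biological signal.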

The Scientist's Toolkit: Essential Reagents and Materials

Table 2: Key Research Reagent Solutions for Outline-Based Geometric Morphometrics

Item	Function/Application
Geomorph R Package	A comprehensive tool for performing geometric morphometric analyses, including GPA and sliding semilandmarks [6].
Marker Alignment Device (MAD)	A physical device to standardize subject posture and landmark recreation across repeated tests, reducing palpation artifact [42].
High-Resolution Scanner (CT/MRI)	Generates high-quality 3D specimen reconstructions for detailed landmark and semilandmark digitization [2].
Standardized Template	A predefined set of landmarks and curves applied to all specimens to ensure anatomical correspondence and reduce subjectivity [2].
WebAIM Color Contrast Checker	A tool to ensure sufficient color contrast in data visualization diagrams, aiding accessibility and clarity for all readers [43] [44].
Sliding Semilandmarks	A method for quantifying curves and surfaces in shapes lacking discrete homologous points, crucial for outline-based studies [2] [6].

Operator bias is a pervasive challenge in geometric morphometrics that can be effectively managed through a combination of technological aids, standardized protocols, and robust statistical practices. By adopting the strategies outlined—such as using alignment devices, template-based semilandmarks, and rigorous error assessment—researchers can significantly enhance the repeatability and reliability of their landmark data. This is especially critical for outline-based identification research and the growing field of semi-landmark alignment methods, where the integrity of the data is paramount for generating valid scientific insights.

Validating and Comparing Methods: Performance Metrics and Biological Interpretation

Within the broader thesis on advancing semi-landmark alignment methods for outline-based identification research, a critical methodological challenge is the selection of an appropriate sliding criterion. This protocol establishes a comparative framework to assess the consistency of morphological outcomes when semi-landmarks are aligned using Bending Energy (BE) minimization versus Procrustes Distance (PD) minimization. These two predominant criteria offer different philosophical and computational approaches to the same problem: optimizing the placement of semi-landmarks along a curve or surface to remove arbitrary effects of their initial placement while preserving biological shape information [1].

The core of this framework is a quantitative, repeatable experiment designed to determine whether the choice of sliding method significantly influences the final geometric morphometric results. This is paramount for ensuring the reliability and comparability of findings across studies in fields like taxonomic identification, where outline-based methods are frequently applied [9]. This document provides detailed application notes and step-by-step protocols for conducting this comparative assessment.

Experimental Design and Comparative Framework

Conceptual Basis of Sliding Criteria

Semi-landmarks are crucial for capturing the shape of homologous curves and surfaces where defining discrete anatomical landmarks is challenging. However, their initial placement does not constitute a homologous correspondence across specimens. Sliding algorithms are therefore employed to establish a more biologically meaningful correspondence. The choice of criterion for this optimization imposes different constraints on the resulting shape configuration [1].

Bending Energy (BE) Minimization: This criterion is based on the metaphor of a thin metal plate. It minimizes the energy required to bend one shape into another, favoring a deformation that is as smooth as possible. Because uniform (affine) differences contribute no bending energy, only the non-affine, localized part of the deformation is penalized. The BE-minimized alignment of semi-landmarks corresponds to the thin-plate spline interpolation between specimens [1].
Procrustes Distance (PD) Minimization: This criterion seeks a more direct geometric fit. It minimizes the sum of squared distances between corresponding semi-landmarks after Procrustes superposition. This approach aims to find the configuration that requires the least movement of points to achieve alignment [1].
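For the PD criterion, a single sliding step has a simple closed form: each semi-landmark moves along its tangent to the point closest to the corresponding reference landmark. The sketch below shows one such step in 2D under stated assumptions: tangents are approximated by the chord between the two neighboring points, indices are zero-based, and the function name is hypothetical. In practice the slid points are projected back onto the curve and the step is repeated within the Procrustes iterations. The BE criterion follows the same logic but weights displacements by the bending energy matrix, which is not shown here.

```python
import math

def slide_min_procrustes(points, reference, sliders):
    """One tangent-sliding step that minimizes squared distance to a reference.

    points, reference : lists of (x, y) in the same superimposed space
    sliders           : (before, slide, after) zero-based index triplets
    """
    slid = list(points)
    for before, slide, after in sliders:
        # Approximate the tangent by the chord between the two neighbors.
        tx = points[after][0] - points[before][0]
        ty = points[after][1] - points[before][1]
        norm = math.hypot(tx, ty)
        if norm == 0:
            continue
        tx, ty = tx / norm, ty / norm
        # Project the offset to the reference point onto the tangent.
        dx = reference[slide][0] - points[slide][0]
        dy = reference[slide][1] - points[slide][1]
        step = dx * tx + dy * ty
        slid[slide] = (points[slide][0] + step * tx,
                       points[slide][1] + step * ty)
    return slid

specimen = [(0.0, 0.0), (1.3, 0.5), (2.0, 0.0)]
consensus = [(0.0, 0.0), (1.0, 0.8), (2.0, 0.0)]
after_slide = slide_min_procrustes(specimen, consensus, [(0, 1, 2)])
```

Here the middle point moves only along the horizontal tangent, so the component of the mismatch that runs along the curve is removed while the component perpendicular to it is preserved as shape information.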

The fundamental question this framework addresses is whether the different theoretical foundations of BE and PD lead to statistically and practically significant differences in the final shape data used for downstream analysis (e.g., Procrustes ANOVA, PCA, discriminant analysis).

The comparison proceeds in three stages, from data preparation (Protocol 1) through parallel sliding (Protocol 2) to statistical evaluation (Protocol 3).

Detailed Experimental Protocols

Protocol 1: Data Preparation and Initial Landmarking

Objective: To generate a baseline dataset of raw landmark and semi-landmark coordinates from a set of biological outline images (e.g., insect wings, leaf outlines, cranial sutures).

Materials:

Image Set: High-resolution 2D outline images or 3D surface models. Sample size (N) should be sufficient for planned statistical power.
Digitization Software: A platform capable of placing landmarks and semi-landmarks (e.g., 3D Slicer with SlicerMorph extension [1], tpsDig2, MorphoJ).
File Format: Data should be exportable in a standard format (e.g., .TPS, .NTS) for analysis in R/Python.

Procedure:

Define Homologous Landmarks: Identify and place a set of Type I or Type II anatomical landmarks that are biologically homologous across all specimens. These will serve as fixed anchors.
Place Semi-Landmarks: For outline data, define a homologous curve between two fixed landmarks. Place n semi-landmarks along this curve using an equidistant sampling scheme. For surface data, use a method like patch-based sampling or pseudo-landmark sampling to generate a dense cloud of points [1].
Create Template: Designate one specimen as a template. The semi-landmark configuration on this template will be used to initialize the sliding process for all other specimens.
Data Export: Export the raw coordinates of all fixed landmarks and semi-landmarks for all specimens. This is the "un-slid" dataset that will serve as input for Protocol 2.
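The equidistant sampling scheme for outline data can be sketched as an arc-length resampling of a digitized polyline. The function name resample_equidistant is hypothetical, and the sketch assumes an open curve whose two ends are the fixed landmarks, so the interior points are the semi-landmarks.

```python
import math

def resample_equidistant(outline, n):
    """Return n points spaced equally by arc length along an open polyline.

    outline : ordered (x, y) or (x, y, z) points tracing the curve
    n       : number of output points, including both endpoints (n >= 2)
    """
    segments = [math.dist(a, b) for a, b in zip(outline, outline[1:])]
    total = sum(segments)
    result = []
    index, walked = 0, 0.0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while index < len(segments) - 1 and walked + segments[index] < target:
            walked += segments[index]
            index += 1
        length = segments[index]
        fraction = (target - walked) / length if length > 0 else 0.0
        fraction = min(max(fraction, 0.0), 1.0)
        a, b = outline[index], outline[index + 1]
        result.append(tuple(p + fraction * (q - p) for p, q in zip(a, b)))
    return result

# An L-shaped trace resampled to two endpoints plus three semi-landmarks.
curve_points = resample_equidistant([(0, 0), (1, 0), (1, 1)], n=5)
```

Using the same n for every specimen yields configurations with matching point counts, which is a prerequisite for the sliding step in Protocol 2.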

Protocol 2: Parallel Sliding of Semi-Landmarks

Objective: To generate two aligned shape datasets by sliding the semi-landmarks according to the BE and PD minimization criteria.

Materials:

Software: A statistical environment with geometric morphometrics packages (e.g., R with the geomorph [1] and Morpho [1] packages).
Input Data: The "un-slid" dataset from Protocol 1.

Procedure:

Load Data: Import the raw landmark data into R using the reader that matches the export format (e.g., geomorph::readland.tps for .TPS files or geomorph::readland.nts for .NTS files).
Sliding for BE:
- Use the geomorph::gpagen function, passing the sliders matrix through the curves argument and specifying ProcD = FALSE.
- This function will slide the semi-landmarks by minimizing the bending energy of the thin-plate spline transformation between each specimen and the consensus shape.
- Save the resulting Procrustes-aligned coordinates as a new object (e.g., GPA.BE).
Sliding for PD:
- Use the geomorph::gpagen function again on the same raw data and sliders matrix, but specify ProcD = TRUE.
- This function will slide the semi-landmarks by minimizing the squared Procrustes distance between each specimen and the consensus.
- Save the resulting Procrustes-aligned coordinates as a new object (e.g., GPA.PD).
Data Storage: Ensure both GPA.BE and GPA.PD objects contain the Procrustes coordinates, which are now directly comparable for shape analysis.

Protocol 3: Comparative Consistency Analysis

Objective: To quantitatively compare the two shape datasets (GPA.BE and GPA.PD) and assess the consistency of results across sliding criteria.

Materials:

Input Data: The GPA.BE and GPA.PD objects from Protocol 2.
Software: R with geomorph, vegan, and Morpho packages.

Procedure:

Procrustes Distance Correlation:
- Calculate the pairwise Procrustes distance matrix from the GPA.BE coordinates and the GPA.PD coordinates.
- Perform a Mantel test (e.g., using vegan::mantel) to compute the correlation (r) between the two distance matrices. A high correlation (e.g., r > 0.95) suggests the relative differences among specimens are consistent across methods.
Principal Component Analysis (PCA) Comparison:
- Perform a PCA on the Procrustes coordinates from the BE-aligned data (gm.prcomp(GPA.BE$coords)).
- Perform a PCA on the Procrustes coordinates from the PD-aligned data (gm.prcomp(GPA.PD$coords)).
- Compare the variance explained by the leading principal components (PCs) and the specimen distribution in the morphospace defined by the first few PCs.
Statistical Testing of Group Differences:
- If the study involves groups (e.g., species, treatments), run a Procrustes ANOVA (e.g., geomorph::procD.lm) on the Procrustes coordinates (or, equivalently, the full set of PC scores) from both the BE and PD datasets to test for group effects.
- Compare the resulting p-values and effect sizes (e.g., Z-score) for the group factor between the two criteria.
Landmark Displacement Visualization:
- Calculate the mean shape for the BE and PD datasets.
- Visualize the vector displacement between the mean shapes using a deformation grid (e.g., plotRefToTarget in geomorph) to identify localized regions where the two criteria produce systematically different results.
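The matrix correlation step can be illustrated with a small permutation-based Mantel test. This is a sketch of the general technique, not a reproduction of vegan::mantel; the function names are hypothetical, and the two inputs are assumed to be symmetric pairwise distance matrices with specimens in the same order.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Correlate two distance matrices and estimate a permutation p-value."""
    n = len(dist_a)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    upper_a = [dist_a[i][j] for i, j in pairs]

    def correlation(order):
        # Relabel specimens of the second matrix according to `order`.
        upper_b = [dist_b[order[i]][order[j]] for i, j in pairs]
        return pearson(upper_a, upper_b)

    order = list(range(n))
    observed = correlation(order)
    rng = random.Random(seed)
    at_least_as_large = 1  # the observed arrangement counts once
    for _ in range(permutations):
        rng.shuffle(order)
        if correlation(order) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)

# Hypothetical distances: the second matrix is a rescaled copy of the first.
positions = [0.0, 1.0, 3.0, 7.0, 12.0]
d_be = [[abs(a - b) for b in positions] for a in positions]
d_pd = [[2.0 * abs(a - b) for b in positions] for a in positions]
r, p_value = mantel(d_be, d_pd)
```

Because the second matrix is just a rescaled copy of the first, r is 1 up to rounding; with real data, the observed r would be compared against the consistency threshold chosen for the study.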

Data Presentation and Interpretation

Quantitative Comparison of Sliding Criteria

Table 1: Key metrics for assessing consistency between Bending Energy (BE) and Procrustes Distance (PD) sliding criteria. This template should be populated with results from a real analysis.

Comparison Metric	Bending Energy (BE)	Procrustes Distance (PD)	Measure of Agreement	Interpretation
Matrix Correlation (Mantel Test `r`)	N/A	N/A	e.g., 0.98	High correlation indicates the relative shape differences among specimens are preserved.
Variance on PC1 (%)	e.g., 45.2%	e.g., 44.8%	Δ = 0.4%	Minimal difference in major axis of variation.
Group Separation `p-value`	e.g., 0.003	e.g., 0.005	Δ = 0.002	Consistent statistical conclusion regarding group effect.
Mean Procrustes Distance to Consensus	e.g., 0.048	e.g., 0.047	Δ = 0.001	Nearly identical overall dispersion.

Interpretation Guidelines

High Consistency: If the Mantel correlation is high (e.g., >0.95), the leading PCs are similar, and statistical inferences (e.g., group differences) are congruent, then the choice of sliding criterion has a negligible impact on the biological conclusions. In such cases, either method is valid.
Low Consistency: If the correlation is low and statistical inferences differ, the sliding criterion is a major source of variation. Investigate further by:
- Inspecting the displacement visualization to see if differences are localized to specific anatomical regions with low landmark density or high curvature.
- Increasing the density of semi-landmarks in problematic regions to see if results converge.
- Reporting both analyses and discussing the sensitivity of the findings to the methodological choice.

The Scientist's Toolkit

Table 2: Essential software and R packages for implementing the comparative framework for semi-landmark analysis.

Tool / Reagent	Type	Primary Function	Application in Protocol
3D Slicer / SlicerMorph [1]	Software Platform	3D visualization, image data processing, and landmark digitization.	Protocol 1: Digitizing fixed landmarks and semi-landmarks on 3D surface models.
R Statistical Environment	Software Platform	Core computing environment for statistical analysis and visualization.	Protocols 2 & 3: All data manipulation, sliding algorithms, and statistical comparisons.
`geomorph` R Package [1]	Software Library	Comprehensive toolset for geometric morphometrics.	Protocol 2: Sliding semi-landmarks (`gpagen`). Protocol 3: PCA and Procrustes ANOVA (`procD.lm`).
`Morpho` R Package [1]	Software Library	Complementary tools for shape analysis and processing.	Protocol 2: Alternative sliding algorithms. Protocol 3: Visualization (e.g., deformation grids).
`vegan` R Package	Software Library	Multivariate statistical methods.	Protocol 3: Performing the Mantel test to compare distance matrices.
Thin-Plate Spline (TPS) Transform [1]	Mathematical Model	A smooth interpolation function for mapping points from one shape to another.	Underpins the Bending Energy sliding criterion and is used in template-based landmarking.

Evaluating Classification Accuracy and Allometric Scaling Across Methods

Application Notes

Core Concepts and Relevance

This document provides Application Notes and Protocols for evaluating classification accuracy within the specific context of semi-landmark alignment methods, with additional consideration of allometric scaling for analyzing size-shape relationships. These methodologies are central to modern geometric morphometrics (GM), a discipline focused on the statistical analysis of form based on Cartesian landmark coordinates [19] [45].

In outline-based identification research, the use of a limited number of traditional anatomical landmarks often proves insufficient for capturing the full complexity of biological shapes. Semi-landmarks are essential for quantifying these outlines and surfaces [19]. The process of "sliding" these semi-landmarks to establish anatomical correspondence across specimens is a critical step, and its parameters can directly influence downstream analytical outcomes, including the accuracy of group classifications [19]. Furthermore, allometric scaling—the study of how shape changes with size—is a fundamental aspect of morphological analysis, providing insights into developmental and evolutionary patterns [46] [47].

Key Experimental Findings

Recent empirical investigations have quantified the impact of methodological choices on classification accuracy. A pivotal study on 3D human facial images analyzed the effect of iteration count during the sliding process for 484 semi-landmarks. The findings demonstrate that classification accuracy is not a simple function of iteration count and that more iterations do not universally lead to better results. Instead, an optimal threshold exists, after which performance can degrade [19].

Table 1: Effect of Sliding Iterations on Gender Classification Accuracy in 3D Facial Analysis

Number of Sliding Iterations	Processing Time (Seconds)	Peak Classification Accuracy (%)
1	95	92.86
6	125	94.05
12	165	96.43
24	245	94.05
30	305	94.05

Data adapted from a study on 3D human facial images (n=80 subjects) [19].
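Selecting the operating point from such a table is simple to automate. The sketch below uses the values from Table 1 and a hypothetical helper, select_iteration_count, that picks the highest accuracy and breaks ties in favor of the shorter processing time; that tie-breaking rule is an assumption made for the example.

```python
# (iterations, processing time in seconds, peak accuracy in %) from Table 1
SLIDING_RESULTS = [
    (1, 95, 92.86),
    (6, 125, 94.05),
    (12, 165, 96.43),
    (24, 245, 94.05),
    (30, 305, 94.05),
]

def select_iteration_count(results):
    """Return the row with the highest accuracy; ties go to the faster run."""
    return max(results, key=lambda row: (row[2], -row[1]))

best_iterations, best_time, best_accuracy = select_iteration_count(SLIDING_RESULTS)
print(best_iterations, best_time, best_accuracy)  # → 12 165 96.43
```

The result reflects this one dataset; the optimum should be re-estimated for each new study rather than carried over.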

Concurrently, the application of allometric scaling in pharmacological and morphological research highlights the importance of selecting appropriate scaling exponents. The theoretical exponent of 0.75, derived from interspecies metabolic rate scaling, is often applied but remains a subject of debate. Evidence suggests that a single universal exponent is unlikely and that the value can vary based on drug properties, physiological characteristics, and the specific biological structures under investigation [46] [47].

Table 2: Allometric Scaling Exponents and Their Applications

Scaling Exponent (b)	Theoretical Basis	Common Application Context	Key Considerations
0.75	Kleiber's Law; West, Brown, and Enquist (WBE) theory	Interspecies scaling of basal metabolic rate; often extrapolated to drug clearance [46]	Highly disputed; multiple WBE assumptions are challenged; empirical merit in pediatrics (age >5 years) requires validation [46]
0.67	Surface Area Law	Historical basis for metabolic scaling	Lacks universal support due to assumptions like constant skin temperature [46]
1.0	Linear (Isometric) Scaling	Simple mg/kg dose extrapolation	Can overdose large animals and underdose small animals; only for drugs with a wide therapeutic index [47]
Variable / Empirical	Drug-specific or patient-specific factors	Physiologically-based pharmacokinetic modeling	Accounts for variability driven by the interplay of drug properties and physiology; no presumption of universality [46]
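The effect of the exponent choice is easiest to see in a worked calculation of the power-law form, value = reference value × (weight / reference weight)^b. The sketch below uses hypothetical numbers (an adult clearance of 10 L/h at 70 kg, scaled to 20 kg); none of these values come from the cited studies.

```python
def allometric_scale(reference_value, reference_weight, target_weight,
                     exponent=0.75):
    """Scale a quantity between body weights with a power-law exponent."""
    return reference_value * (target_weight / reference_weight) ** exponent

# Hypothetical inputs for illustration only.
adult_clearance, adult_weight, child_weight = 10.0, 70.0, 20.0

for b in (0.67, 0.75, 1.0):
    scaled = allometric_scale(adult_clearance, adult_weight, child_weight, b)
    print(f"b = {b:.2f}: scaled clearance = {scaled:.2f} L/h")
```

For a target lighter than the reference, exponents below 1 give a larger scaled value than linear scaling does, which shows how strongly the extrapolation depends on the exponent.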

Experimental Protocols

Protocol 1: Evaluating Semi-Landmark Sliding Parameters for Classification

This protocol outlines a procedure to determine the optimal number of iterations for sliding semi-landmarks to maximize classification accuracy in an outline-based identification study.

2.1.1 Workflow Diagram

2.1.2 Step-by-Step Procedure

Data Acquisition and Template Construction: Obtain 3D scans (e.g., in .obj format) of the specimens under study. Select one specimen as a template and manually digitize a set of fixed anatomical landmarks that are homologous across all individuals. The number and placement should be sufficient to capture the overall morphology [19] [45].
Semi-Landmark Placement: On the template specimen, automatically generate a dense set of semi-landmarks along the outlines and surfaces of interest. These points will be projected onto target specimens in the next step. The number of semi-landmarks (e.g., hundreds to thousands) should be determined by the complexity of the morphology [19].
Parameter Setting and Sliding: Define a series of iteration counts (e.g., 1, 6, 12, 24, 30) for the sliding process. For each target specimen, project the semi-landmarks from the template using Thin-Plate Spline (TPS) warping. Then "slide" the semi-landmarks along tangents to the curves and surfaces for the predefined number of iterations to minimize the bending energy between the template and the target configuration. This removes the arbitrary component of the initial placement and establishes geometric correspondence across specimens [19].
Shape Variable Extraction: Perform a Generalized Procrustes Analysis (GPA) on the combined set of fixed landmarks and slid semi-landmarks. This step removes the effects of position, scale, and orientation, leaving only the shape information for subsequent analysis [19] [45].
Dimensionality Reduction and Classification: Use Principal Component Analysis (PCA) on the Procrustes-aligned coordinates to reduce dimensionality and extract major axes of shape variation. Use the principal component scores as input for a classification algorithm, such as Linear Discriminant Analysis (LDA), to predict group membership (e.g., species, sex) [19] [45].
Accuracy Evaluation: Calculate the classification accuracy for each set of results corresponding to a different sliding iteration count. Identify the iteration count that yields the highest predictive accuracy, noting that this optimum may vary between datasets [19].
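
The Generalized Procrustes Analysis step in this protocol can be sketched as follows. This is a minimal illustration assuming NumPy and configurations stored as arrays of shape (points, dimensions). The function names are hypothetical, and the sketch performs ordinary GPA only; it does not slide semi-landmarks and is not the implementation used by any of the packages cited here.

```python
import numpy as np


def center_and_scale(config):
    """Remove position and scale: center on the centroid, unit centroid size."""
    centered = config - config.mean(axis=0)
    return centered / np.sqrt((centered ** 2).sum())


def rotate_onto(config, reference):
    """Least-squares rotation of config onto reference (reflections excluded)."""
    u, _, vt = np.linalg.svd(config.T @ reference)
    if np.linalg.det(u @ vt) < 0:
        u[:, -1] *= -1  # flip one axis so the result is a proper rotation
    return config @ (u @ vt)


def gpa(configs, n_iter=10, tol=1e-10):
    """Iteratively align all configurations to their evolving mean shape."""
    aligned = np.array([center_and_scale(np.asarray(c, dtype=float)) for c in configs])
    mean = aligned[0]
    for _ in range(n_iter):
        aligned = np.array([rotate_onto(c, mean) for c in aligned])
        new_mean = center_and_scale(aligned.mean(axis=0))
        converged = np.sum((new_mean - mean) ** 2) < tol
        mean = new_mean
        if converged:
            break
    return aligned, mean
```

The returned array holds the Procrustes shape coordinates that feed the PCA and classification steps. In a sliding workflow, a routine like this would typically be interleaved with the sliding step rather than run once at the end.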

Protocol 2: Integrating Allometric Scaling in Morphological Analysis

This protocol describes how to analyze and account for allometric patterns (shape change due to size) in a dataset of landmark and semi-landmark coordinates.

2.2.1 Workflow Diagram

2.2.2 Step-by-Step Procedure

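Size Calculation: Compute the centroid size of each specimen from its landmark coordinates before they are scaled. After Procrustes superimposition every configuration has been scaled to unit centroid size, so the aligned coordinates no longer carry size information; most software returns centroid size as a by-product of the superimposition. Centroid size is defined as the square root of the sum of squared distances of all landmarks from their centroid, and it is the standard size metric in geometric morphometrics [45].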
Allometric Model Fitting: Regress the Procrustes shape coordinates (or the principal component scores derived from them) against the log-transformed centroid size. This is typically done using multivariate regression (e.g., procD.lm in the R package geomorph) to test the null hypothesis that shape is independent of size [45].
Significance Testing: Evaluate the statistical significance of the regression model using a permutation-based procedure. A significant p-value indicates the presence of a non-random allometric relationship, where shape changes predictably with size [45].
Visualization and Interpretation: If allometry is significant, visualize the predicted shape changes along the size gradient. This is often done by computing the regression vector and projecting it back into the tangent space, allowing for graphical depictions of the shape associated with small versus large sizes using wireframe graphs [45].
Data Adjustment (if required): If the research question demands the analysis of size-independent shape variation, calculate the residuals from the multivariate regression of shape on size. These residuals represent the allometry-free shape data and can be used in subsequent analyses [45].
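
The steps of this protocol can be sketched in a few functions. This is a simplified illustration assuming NumPy, with hypothetical function names. It uses the proportion of shape variance explained by log centroid size as the permutation test statistic and is a stand-in for, not a reimplementation of, procD.lm.

```python
import numpy as np


def centroid_size(config):
    """Square root of the summed squared distances of points from their centroid."""
    centered = config - config.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum()))


def allometry_fit(shapes, log_cs):
    """Multivariate least-squares regression of shape on log centroid size.

    shapes: array (specimens, points, dims) of aligned coordinates.
    Returns coefficients, residuals, and the proportion of variance explained.
    """
    n = shapes.shape[0]
    y = shapes.reshape(n, -1)
    x = np.column_stack([np.ones(n), log_cs])
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    fitted = x @ coef
    ss_total = ((y - y.mean(axis=0)) ** 2).sum()
    ss_model = ((fitted - y.mean(axis=0)) ** 2).sum()
    return coef, y - fitted, ss_model / ss_total


def permutation_p(shapes, log_cs, n_perm=999, seed=0):
    """P-value from shuffling size across specimens (observed value counted once)."""
    rng = np.random.default_rng(seed)
    observed = allometry_fit(shapes, log_cs)[2]
    count = 1
    for _ in range(n_perm):
        if allometry_fit(shapes, rng.permutation(log_cs))[2] >= observed:
            count += 1
    return count / (n_perm + 1)
```

The residuals returned by `allometry_fit` correspond to the size-adjusted shape data described in the final step. They are deviations around zero, so the mean shape is usually added back before visualization.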

The Scientist's Toolkit: Research Reagent Solutions

This section details essential software, data, and methodological tools for conducting research in semi-landmark alignment and allometric scaling.

Table 3: Essential Research Tools and Resources

Tool / Resource Name	Type	Primary Function in Research	Application Note
Viewbox [19]	Software	Digitizing landmarks, semi-landmark placement, sliding, and general geometric morphometric analysis.	A commercial software with a user-friendly graphical interface, suitable for complete workflow management from data collection to analysis.
MorphoJ [45]	Software	Comprehensive toolkit for statistical analysis of shape data, including PCA, regression, and allometry analysis.	Widely used in evolutionary and developmental biology; excellent for conducting and visualizing allometric regressions and other standard GM analyses.
R Package `geomorph` [19]	Software	A powerful R-based platform for geometric morphometric analysis.	Highly flexible and reproducible; allows for customized analyses, permutation tests, and integration with other statistical methods in R.
Stirling/ESRC 3D Face Database [19]	Reference Data	A publicly available dataset of 3D facial scans.	Serves as a benchmark dataset for methodological development and testing in 3D shape analysis, particularly for human facial morphology.
Thin-Plate Spline (TPS) Theory [19]	Methodological Framework	Provides the mathematical basis for the interpolation of deformation between landmark configurations.	The underlying algorithm for sliding semi-landmarks and for visualizing shape differences as continuous deformations.
Generalized Procrustes Analysis (GPA) [19] [45]	Methodological Framework	A procedure to remove differences in translation, rotation, and scale from landmark configurations.	A foundational step that produces "Procrustes shape coordinates," the starting point for almost all subsequent shape analyses.
Allometric Scaling Equation (Y = aW^b) [46] [47]	Mathematical Model	Describes the relationship between a physiological or morphological variable (Y) and body weight or size (W).	The exponent `b` is the focus of research; its value (e.g., 0.75 vs. 0.67) and variability are central to debates in scaling theory [46].

The process of semi-landmark alignment is a critical step in outline-based geometric morphometrics (GM), serving as a bridge between raw morphological data acquisition and sophisticated evolutionary biological analyses. Within the context of phylogenetic and evolutionary studies, the methods used to place, slide, and align semi-landmarks are not merely technical preliminaries; they fundamentally influence the quantification of morphological variation and the subsequent estimation of key evolutionary parameters. When semi-landmarks are placed using different algorithms or densities, they can yield different maps of point correspondences across specimens, which in turn can alter statistical results concerning patterns of variation and covariation [15]. This Application Note details how methodological choices in semi-landmark processing directly impact the assessment of phylogenetic signal, morphological disparity, and evolutionary rates, providing validated protocols to ensure analytical robustness in evolutionary morphology studies.

The Impact of Semi-Landmarking Choices on Evolutionary Metrics

Methodological Consistency in Phylogenetic Signal Analysis

The phylogenetic signal quantifies the extent to which closely related species resemble each other, a cornerstone metric in evolutionary biology. The choice of semi-landmarking approach can significantly influence this measurement.

Template Choice Effects: Methods that transfer semi-landmarks from a single template specimen (e.g., patch-TPS) are sensitive to the chosen template. A template with poor overall geometric similarity to the sample can result in semilandmarks being projected onto different anatomical features in target specimens, thereby introducing error and obscuring the true phylogenetic structure [1] [15].
Algorithmic Homology vs. Topographic Correspondence: Landmark-free algorithms, such as Iterative Closest Point (ICP), establish point correspondences based on surface topography and initial alignment rather than postulated evolutionary homologies. This can lead to mappings that differ substantially from those based on biological homology, potentially weakening or distorting the inferred phylogenetic signal [15].

Quantifying and Interpreting Morphological Disparity

Morphological disparity, which measures the volume of morphospace occupied by a group of taxa, is highly sensitive to the density and placement of semi-landmarks.

Increased Shape Information: The primary motivation for using semi-landmarks is to increase the density of shape information in regions lacking discrete landmarks. Studies confirm that all semi-landmarking strategies can produce shape estimations that are comparable to using manual landmarks alone while greatly enriching the data [1].
Risk of Artifactual Inflation: The locations of semi-landmarks depend on the investigator's choice of algorithm and their density. Different approaches can produce different semilandmark locations, which can lead to differences in statistical results, including disparity metrics [15]. Therefore, disparity comparisons between studies using different semi-landmarking protocols should be made with extreme caution.

Influence on Estimates of Evolutionary Rates

Evolutionary rates describe the tempo of phenotypic change across a phylogeny. New phylogenetic comparative methods (PCMs) are explicitly designed to test hypotheses about factors influencing these rates [48].

Model-Based Framework: PCMs often use Brownian Motion (BM) models, where the diffusion rate parameter (\( \sigma^2 \)) represents the evolutionary rate. Methods now exist to model this rate as a function of another evolving trait (e.g., testing if brain size influences beak evolution in birds) [48].
Data Quality Dependency: The power to detect these effects is inherently low, because even strong causal influences explain only a small fraction of the variance in disparity. In this context, noise introduced by inconsistent or non-homologous semi-landmark placement can further reduce statistical power, potentially masking genuine evolutionary relationships [48].

Table 1: Impact of Semi-Landmarking Approaches on Key Evolutionary Analyses

Analytical Metric	Primary Influence of Semi-landmarking	Key Consideration for Robustness
Phylogenetic Signal	Correspondence quality affects trait covariance estimation.	Use a biologically representative template for landmark transfer; prefer homology-driven placement.
Morphological Disparity	Sampling density and coverage affect morphospace volume.	Keep semi-landmark density and algorithm consistent across compared groups.
Evolutionary Rates	Measurement error in traits can obscure rate heterogeneity.	Employ models that account for estimation error in trait values [48].
General Statistical Power	Inconsistent point locations increase unexplained variance.	Use sliding algorithms to minimize arbitrary placement artifacts [1].

Essential Workflow for Evolutionarily Informative Semi-Landmarking

The following workflow diagram outlines the critical steps for preparing semi-landmark data for downstream phylogenetic and evolutionary analysis, highlighting points where methodological choices have the greatest impact.

Diagram 1: Semi-landmarking workflow for evolutionary analysis.

Protocol: Patch-TPS Semi-Landmarking for Phylogenetic Comparative Data

This protocol is designed to maximize homology and consistency across a phylogenetic sample, which is crucial for meaningful comparative analysis [1].

Template Selection:
- Select a single template specimen that is geometrically representative of the entire dataset (e.g., the specimen with the smallest mean Procrustes distance to all others). This minimizes extreme deformations during warping.
Patch Definition on Template:
- On the template mesh, define triangular regions of interest using three Type I or Type II landmarks as vertices.
- For each patch, generate a template triangular grid with a user-specified density of semi-landmark points.
- Register this grid to the vertices of the bounding triangle using a thin-plate-spline (TPS) deformation.
- Project the vertices of the triangular grid onto the surface of the template specimen using a ray-casting algorithm along the averaged surface normal vector of the three bounding landmarks [1].
Transfer to Target Specimens:
- For each target specimen, compute a TPS transformation based on the correspondence between its manual landmarks and those of the template.
- Warp the entire set of template semi-landmarks to the target specimen using this TPS transformation.
- For each warped semi-landmark, cast a ray along the direction of the template's surface normal vector and find its intersection with the target specimen's mesh. Use this intersection point as the corresponding semi-landmark, ensuring it lies on the anatomical surface [1].
Sliding Semi-Landmarks:
- Slide all semi-landmarks to minimize the bending energy of the TPS relative to the sample's Procrustes consensus shape. This step minimizes the artifacts introduced by the initial arbitrary placement and is critical for reducing error in subsequent analyses [1] [15].
- Perform Generalized Procrustes Analysis (GPA) on the combined set of fixed landmarks and slid semi-landmarks to obtain aligned shape coordinates for analysis.
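
The template selection criterion in the first step (smallest mean Procrustes distance to all other specimens) can be sketched as below. The sketch assumes NumPy and that the manual landmarks have already been Procrustes-aligned. It approximates Procrustes distance by the Euclidean distance between aligned configurations, and the function name is hypothetical.

```python
import numpy as np


def select_template(aligned):
    """Return the index of the specimen closest, on average, to all others.

    aligned: array (specimens, landmarks, dims) of Procrustes-aligned landmarks.
    """
    n = aligned.shape[0]
    flat = aligned.reshape(n, -1)
    diffs = flat[:, None, :] - flat[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))  # pairwise distance matrix
    mean_dist = dist.sum(axis=1) / (n - 1)    # self-distance is zero
    return int(np.argmin(mean_dist)), mean_dist
```

Inspecting `mean_dist` alongside the chosen index is a quick way to see whether the sample contains specimens that are far from any reasonable template.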

Protocol: Testing for Phylogenetic Signal and Evolutionary Rates

Once an aligned morphometric dataset is prepared, it can be integrated with phylogenetic trees to test evolutionary hypotheses.

Data Integration:
- Import the Procrustes shape coordinates (the dependent variable) and a time-calibrated phylogenetic tree of the studied taxa into a phylogenetic comparative methods software environment (e.g., R with phytools or geomorph).
Phylogenetic Signal Assessment:
- Calculate Pagel's λ or Blomberg's K to quantify phylogenetic signal in the shape data.
- A λ value not significantly different from 1 indicates that the trait evolution is consistent with a Brownian motion model along the tree. A value of 0 suggests no phylogenetic structure.
Modeling Evolutionary Rates:
- Use models that allow for the estimation of the Brownian motion diffusion rate (\( \sigma^2 \)) in different parts of the phylogeny or for different traits [48].
- To test hypotheses about factors influencing rates of evolution (e.g., does a binary ecological trait influence the rate of beak shape evolution?), fit models where the rate of evolution for your shape trait is a function of the predictor variable. The significance of the predictor can be assessed using likelihood ratio tests [48].
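
For intuition about the rate parameter, the sketch below estimates the Brownian motion rate for a single trait (for example, one principal component score) from a phylogenetic covariance matrix whose entries are the shared branch lengths between tips. It assumes NumPy, a univariate trait, and a single rate across the whole tree. The function name is hypothetical, and the rate-as-a-function-of-predictor models of [48] are not implemented here.

```python
import numpy as np


def brownian_rate(trait, phylo_cov):
    """Estimate the root state and BM rate for one trait.

    trait: length-n vector of tip values.
    phylo_cov: n x n matrix of shared branch lengths between tips.
    Uses the generalized least-squares root estimate and an (n - 1) denominator.
    """
    x = np.asarray(trait, dtype=float)
    c = np.asarray(phylo_cov, dtype=float)
    n = x.shape[0]
    ones = np.ones(n)
    root = (ones @ np.linalg.solve(c, x)) / (ones @ np.linalg.solve(c, ones))
    resid = x - root
    sigma2 = (resid @ np.linalg.solve(c, resid)) / (n - 1)
    return root, sigma2
```

With an identity covariance matrix (a star phylogeny with unit branch lengths) the estimate reduces to the ordinary sample variance, which is a useful sanity check before applying it to a real tree.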

Table 2: Key Reagents and Software for Evolutionary Morphometrics

Tool Name	Type	Primary Function in Analysis
3D Slicer / SlicerMorph	Software Module	Platform for 3D visualization, manual landmarking, and implementing semi-landmarking protocols like Patch and Patch-TPS [1].
Morpho	R Package	Statistical shape analysis; includes functions for sliding semi-landmarks and performing Procrustes-based analyses [1].
geomorph	R Package	Comprehensive tool for GM; integrates Procrustes ANOVA with phylogenetic comparative methods to analyze shape variation and evolutionary rates [1].
PhyloNetworks	Julia Package	Extends phylogenetic comparative methods to phylogenetic networks, allowing for the modeling of trait evolution (including shifts at hybridization events) beyond pure tree models [49].
TPS Series (tpsDig2, tpsRelw)	Standalone Software	Digitizing landmarks and semi-landmarks on 2D images, and performing relative warps analysis [17].

The path from digitized biological forms to evolutionary insights is paved with methodological decisions that directly influence scientific conclusions. The protocols outlined herein provide a robust framework for semi-landmark alignment that safeguards the integrity of downstream analyses of phylogenetic signal, disparity, and evolutionary rates. By adhering to a consistent, homology-aware workflow—from careful template selection and semi-landmark sliding to the use of appropriate comparative models—researchers can minimize methodological artifacts and strengthen the biological validity of their findings in evolutionary morphology.

In geometric morphometrics, the analysis of biological shapes often relies on landmarks and semilandmarks to quantify complex forms. However, the raw coordinates of individual semilandmarks are biologically meaningless without undergoing specific computational procedures to establish geometric homology across specimens. This application note elucidates the mathematical and biological rationale behind this limitation, provides detailed protocols for proper semilandmark implementation, and presents standardized workflows for researchers applying outline-based identification methods in evolutionary biology, paleontology, and drug development contexts.

The Fundamental Problem of Isolated Semilandmarks

Semilandmarks are points placed on curves and surfaces to quantify morphological features lacking discrete anatomical landmarks [2]. Unlike traditional landmarks that represent biologically homologous points (e.g., sutures, foramina), semilandmarks do not possess inherent biological correspondence across specimens in their initial placement.

Table 1: Key Differences Between Landmarks and Semilandmarks

Characteristic	Traditional Landmarks	Semilandmarks
Basis of Homology	Established biological correspondence	Geometrical correspondence after processing
Placement Method	Manual identification of homologous structures	Algorithmic placement along curves and surfaces
Initial Biological Meaning	Yes	No
Dependence on Processing	Minimal	Critical for biological interpretation
Information Content	Meaningful as individual points	Meaningful only as part of configured set

A single semilandmark coordinate lacks biological meaning because its position is initially determined by algorithmic placement rather than biological homology [15]. The raw coordinates represent arbitrary points along a curve or surface until they are "slid" to establish geometrical correspondence across specimens. This process minimizes the bias introduced by initial arbitrary placement and establishes equivalent anatomical positions throughout a dataset [2].

Mathematical Foundation of Semilandmark Processing

The Sliding Procedure

Semilandmarks require sliding procedures to optimize their positions along tangent directions to curves or surfaces. This sliding is typically achieved through one of two criteria:

Bending Energy Minimization: Positions semilandmarks by minimizing the thin-plate spline bending energy between specimens [2]
Procrustes Distance Minimization: Slides semilandmarks to minimize the Procrustes distance between specimens [15]

Both approaches effectively remove the arbitrary component of semilandmark placement while preserving the geometrical information of the biological structure.
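
To make the bending energy criterion concrete, the sketch below computes the thin-plate spline bending energy of the deformation from a 2D reference configuration to a target. It assumes NumPy and the kernel U(r) = r² log r²; the result is proportional to the physical bending energy up to a constant factor. The function names are hypothetical.

```python
import numpy as np


def tps_bending_matrix(ref):
    """Bending energy matrix of a 2D reference configuration (k x 2 array)."""
    k = ref.shape[0]
    d2 = ((ref[:, None, :] - ref[None, :, :]) ** 2).sum(axis=2)
    with np.errstate(divide="ignore", invalid="ignore"):
        kernel = np.where(d2 > 0, d2 * np.log(d2), 0.0)  # U(r) = r^2 log r^2
    p = np.hstack([np.ones((k, 1)), ref])
    system = np.zeros((k + 3, k + 3))
    system[:k, :k] = kernel
    system[:k, k:] = p
    system[k:, :k] = p.T
    # The upper-left k x k block of the inverse is the bending energy matrix.
    return np.linalg.inv(system)[:k, :k]


def bending_energy(ref, target):
    """Bending energy (up to a constant) of the TPS mapping ref onto target."""
    be = tps_bending_matrix(np.asarray(ref, dtype=float))
    target = np.asarray(target, dtype=float)
    return float(np.trace(target.T @ be @ target))
```

Affine transformations of the reference (translation, rotation, scaling, shear) give zero bending energy, so the criterion penalizes only localized, non-uniform deformation. This is consistent with the greater weight it gives to nearby points.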

Quantitative Consequences of Improper Interpretation

Table 2: Statistical Implications of Treating Semilandmarks as Landmarks

Analysis Type	With Proper Sliding	Without Proper Sliding
Procrustes Variance	Biologically meaningful	Inflated by arbitrary placement
PCA Results	Reflects true shape variation	Confounds biological and placement variance
Modularity Tests	Accurate covariance structure	Spurious covariance patterns
Allometric Analyses	Valid size-shape relationships	Biased regression coefficients
Classification Accuracy	High discriminant power	Reduced significantly

Analyses using unslid semilandmarks incorporate substantial error variance from the arbitrary initial placement, potentially leading to incorrect biological interpretations [15]. Studies comparing semilandmarking approaches have demonstrated that different sliding methods can yield different statistical results, emphasizing the need for methodological consistency within studies.

Experimental Protocols for Semilandmark Implementation

Comprehensive Semilandmark Placement and Sliding Protocol

Materials and Software Requirements

3D digitization equipment (CT scanner, surface scanner)
Geometric morphometrics software (MorphoJ, EVAN Toolbox, R geomorph)
Template specimen with established landmark configuration

Step-by-Step Procedure

Landmark Definition
- Identify and digitize traditional anatomical landmarks across all specimens
- Ensure landmarks represent biologically homologous points
- Record 3D coordinates for all traditional landmarks
Curve Semilandmark Placement
- Define curves between established landmarks using template specimen
- Place semilandmarks equidistantly along curves
- Ensure adequate sampling density to capture morphological complexity
Surface Semilandmark Placement
- Select template specimen representing average morphology
- Define surface patches bounded by landmarks and curves
- Place semilandmarks across surfaces using grid or evenly spaced pattern
Template Warping and Semilandmark Transfer
- Warp template to each target specimen using thin-plate spline interpolation
- Project semilandmarks from warped template to target specimens
- Alternatively, use nearest-point matching for semilandmark transfer
Sliding Procedure
- Choose sliding criterion (bending energy or Procrustes distance)
- Implement iterative sliding algorithm:
  - a. Perform Procrustes superimposition of all specimens
  - b. Compute reference (consensus) configuration
  - c. Slide each semilandmark along tangent direction to minimize criterion
  - d. Iterate until convergence (minimal change in landmark positions)
Validation and Quality Control
- Visualize slid semilandmarks to ensure biological plausibility
- Check for outliers and irregular sliding patterns
- Verify that semilandmarks maintain relative positions along curves/surfaces
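
Step c of the sliding procedure can be illustrated for the Procrustes distance criterion on a single open curve. The sketch assumes NumPy, fixed landmarks at both ends of the curve, and tangents estimated from each point's two neighbours. The function name is hypothetical. Full implementations typically re-project the slid points onto the curve and repeat the step together with the superimposition until convergence.

```python
import numpy as np


def slide_once(curve, reference):
    """Slide interior semilandmarks along their tangents toward a reference.

    curve, reference: arrays (points, dims) in the same superimposed space.
    The first and last points are treated as fixed landmarks.
    """
    curve = np.asarray(curve, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Unit tangent at each interior point from its two neighbours.
    tangents = curve[2:] - curve[:-2]
    tangents /= np.linalg.norm(tangents, axis=1, keepdims=True)
    # Move each point by the tangential component of its offset to the reference.
    offsets = reference[1:-1] - curve[1:-1]
    step = (offsets * tangents).sum(axis=1, keepdims=True)
    slid = curve.copy()
    slid[1:-1] = curve[1:-1] + step * tangents
    return slid, tangents
```

After this step the remaining difference between each semilandmark and its reference position is perpendicular to the tangent, which means only variation across the curve, not along it, contributes to the shape difference.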

Protocol for Outline-Based Identification Studies

For 2D outline analyses common in identification research:

Image Acquisition
- Standardize imaging conditions (resolution, orientation, scale)
- Ensure consistent background contrast
- Calibrate spatial measurements using scale bars
Outline Digitization
- Convert images to binary format
- Extract outline coordinates using edge detection algorithms
- Resample outlines to equal number of points
Semilandmark Configuration
- Define start/end points as fixed landmarks
- Place semilandmarks along outline between fixed landmarks
- Implement sliding based on minimum bending energy criterion
Data Analysis
- Perform Generalized Procrustes Analysis on combined landmarks/semilandmarks
- Conduct statistical analyses on Procrustes coordinates
- Validate using cross-validation techniques
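
The resampling step (equal number of points per outline) can be sketched as arc-length interpolation. The sketch assumes NumPy and an open outline segment given as an ordered array of coordinates with no repeated consecutive points. The function name is hypothetical.

```python
import numpy as np


def resample_curve(points, n_points):
    """Resample an ordered polyline to n_points equally spaced along its length."""
    points = np.asarray(points, dtype=float)
    segment_lengths = np.linalg.norm(np.diff(points, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(segment_lengths)])
    targets = np.linspace(0.0, arc[-1], n_points)
    # Interpolate each coordinate separately against cumulative arc length.
    columns = [np.interp(targets, arc, points[:, j]) for j in range(points.shape[1])]
    return np.column_stack(columns)
```

The first and last resampled points coincide with the original end points, so they can serve as the fixed landmarks while the interior points are treated as semilandmarks to be slid.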

Visualization Framework

Diagram 1: Semilandmark Processing Workflow

Diagram 2: Mathematical Basis of Semilandmark Meaning

The Scientist's Toolkit: Essential Research Reagents and Materials

Table 3: Critical Research Reagents for Semilandmark Studies

Reagent/Material	Function	Application Notes
High-Resolution CT Scanner	3D specimen digitization	Minimum 50 μm resolution for detailed morphology
Geometric Morphometrics Software	Data processing and analysis	MorphoJ, EVAN Toolbox, R geomorph package
Landmark Template	Standardized landmark protocol	Must be validated for specific research question
Sliding Algorithm Scripts	Semilandmark processing	Customizable for bending energy or Procrustes criteria
Validation Dataset	Methodological verification	Specimens with known morphological relationships
Digital Specimen Archive	Raw data repository	Maintain original scans and landmark files

Applications in Outline-Based Identification Research

In outline-based identification studies, such as carnivore tooth mark analysis or pharmaceutical compound morphological screening, semilandmarks enable quantification of shape features lacking discrete landmarks [50]. Proper implementation requires:

Standardized Outline Acquisition: Consistent imaging protocols to minimize technical variance
Appropriate Semilandmark Density: Sufficient points to capture shape, but avoiding oversampling
Consistent Sliding Methodology: Uniform application of sliding criteria across all specimens
Validation Against Known Groups: Testing method discriminative power with specimens of known origin
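
Validation against known groups is commonly reported as a cross-validated classification rate. The sketch below uses leave-one-out cross-validation with a nearest-group-mean rule on shape scores. This is a deliberately simple stand-in for discriminant analysis, assuming NumPy, and the function name is hypothetical. For a strict estimate, the superimposition and any dimensionality reduction would also be recomputed within each fold.

```python
import numpy as np


def loo_nearest_mean_accuracy(scores, labels):
    """Leave-one-out accuracy of a nearest-group-mean classifier.

    scores: array (specimens, variables), e.g. principal component scores.
    labels: known group membership for each specimen.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    n = scores.shape[0]
    correct = 0
    for i in range(n):
        keep = np.arange(n) != i  # hold out specimen i
        train_x, train_y = scores[keep], labels[keep]
        groups = np.unique(train_y)
        means = np.array([train_x[train_y == g].mean(axis=0) for g in groups])
        nearest = groups[np.argmin(np.linalg.norm(means - scores[i], axis=1))]
        correct += int(nearest == labels[i])
    return correct / n
```

Running the same function on scores produced with different semilandmark densities or sliding criteria gives a like-for-like comparison of discriminative power.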

Studies demonstrate that 3D semilandmark approaches outperform 2D outline methods in classification accuracy, though both require proper sliding procedures to yield biologically meaningful results [50].

Single semilandmark coordinates possess no inherent biological meaning without undergoing sliding procedures that establish geometric correspondence across specimens. Researchers must recognize that the analytical validity of semilandmark-based studies depends entirely on proper implementation of these computational methods. The protocols and visualization frameworks presented here provide standardized approaches for ensuring biological relevance in outline-based identification research across evolutionary, anthropological, and pharmaceutical contexts.

In the field of geometric morphometrics, the analysis of biological shapes often relies on the precise quantification of homologous points known as landmarks. However, many biological structures, such as curves and outlines, lack sufficient traditional landmarks for comprehensive shape analysis. Semi-landmarks were developed to address this limitation by allowing researchers to sample points along curves and surfaces between traditional landmarks [7]. These mathematical points are not homologous in the developmental or evolutionary sense but are positioned algorithmically to capture the geometry of morphological features lacking discrete anatomical landmarks [8]. The fundamental challenge in semi-landmark analysis lies in the fact that different alignment methods can produce different point locations, which subsequently influences statistical results and biological interpretations [7] [8]. Understanding when these methods converge (produce similar results) or diverge (produce meaningfully different results) is therefore critical for ensuring robust morphological analyses across various research domains, including taxonomy, ecology, evolutionary biology, and biomedical applications [51] [18] [52].

Table 1: Categories of Landmarks Used in Geometric Morphometrics

Landmark Type	Definition	Basis for Homology	Examples
Type I (Anatomical)	Discrete anatomical points	Developmental/evolutionary homology	Junction between bones, tip of nose [17]
Type II (Mathematical)	Points defined by geometric properties	Geometric properties	Point of maximum curvature, deepest point in a notch [17]
Type III (Constructed)	Points defined by relative position	Geometric relationship to other landmarks	Midpoint between two landmarks, evenly spaced points along a curve [17]
Semi-landmarks	Algorithmically placed points	Mathematical mapping between landmarks	Points along curves or surfaces between Type I-III landmarks [7] [8]

Comparative Performance of Semi-Landmarking Approaches

Common Semi-Landmarking Algorithms and Their Mechanisms

Multiple algorithmic approaches have been developed to place semi-landmarks on biological structures, each with distinct theoretical foundations and operational mechanisms. The sliding semi-landmarks approach remains the most widely used method in biological research. This technique involves initially placing points along curves or surfaces between traditional landmarks, then "sliding" them to minimize either the bending energy of thin-plate splines (TPS) or the Procrustes distance among specimens [7] [8]. The bending energy minimization approach gives greater weight to landmarks and semi-landmarks that are local to the points being slid, while Procrustes distance minimization considers all points in the configuration equally [7].

Alternative approaches have been adapted from computer vision applications. Rigid registration methods, such as those based on the Iterative Closest Point (ICP) algorithm, involve rigidly aligning a template specimen to each target specimen by iteratively minimizing distances between corresponding points [8]. The non-rigid ICP (NICP) method represents a hybrid approach that first uses TPS for initial non-rigid registration, then applies NICP to further warp the template surface to each specimen [8]. Unlike sliding methods, these registration-based approaches transfer semi-landmarks from a template specimen to target specimens based on surface correspondence rather than biological homology [7].

Quantitative Comparisons of Methodological Performance

Recent empirical studies have systematically evaluated how different semi-landmarking approaches affect analytical outcomes in geometric morphometrics. Research comparing sliding TPS, hybrid rigid registration (LS&ICP), and non-rigid registration (TPS&NICP) approaches has revealed important patterns of convergence and divergence [8].

Table 2: Performance Comparison of Semi-Landmarking Approaches

Methodological Aspect	Sliding TPS	LS&ICP (Rigid)	TPS&NICP (Non-rigid)
Theoretical Basis	Minimization of bending energy or Procrustes distance	Rigid registration using landmark matching and ICP	Initial TPS deformation followed by non-rigid ICP
Biological Homology Consideration	High (guided by landmarks)	Low (based on surface matching)	Moderate (initial landmark guidance)
Consistency with Increasing Density	High	Variable	High
Similarity to Other Methods	Most similar to TPS&NICP	Diverges from sliding methods	Most similar to sliding TPS
Sensitivity to Landmark Coverage	Lower when landmarks are dense	Higher regardless of landmark density	Lower when landmarks are dense
Recommended Application Context	Evolutionary and developmental studies	Classification and discrimination tasks	When dense surface correspondence is needed

Studies analyzing both human head and ape cranial datasets have found that sliding TPS and TPS&NICP approaches yield results that are more similar to each other than those derived from rigid registration methods (LS&ICP) [8]. This convergence between sliding and non-rigid registration methods is particularly strong when traditional landmarks provide adequate coverage of the morphological structure being analyzed. The consistency of these methods also improves with increasing semi-landmark density when landmarks are well-distributed [8].

Conversely, rigid registration methods often diverge from both sliding and non-rigid approaches, particularly when analyzing complex morphological structures with high shape variation [8]. This divergence appears most pronounced in regions distant from traditional landmarks, where the mathematical mapping of semi-landmarks depends more heavily on the specific algorithm employed [7]. The practical implication is that rigid registration methods may be less suitable for studies aiming to describe biological transformations in an evolutionary or developmental context, where homology assumptions are critical.

Experimental Protocols for Methodological Comparison

Protocol 1: Evaluating Methodological Convergence in Specific Morphological Contexts

Objective: To determine whether different semi-landmarking approaches produce convergent results for a specific biological structure and research question.

Materials and Specimens:

High-resolution 2D or 3D digital images of biological specimens
Software for geometric morphometrics (e.g., TPS series, MorphoJ, R with geomorph or Momocs packages) [17]
Template specimen with defined landmark and semi-landmark configuration

Procedure:

Image Preparation and Landmarking
- Acquire standardized images of specimens using consistent orientation and scale [17] [18]
- Identify and digitize Type I, II, and III landmarks on all specimens using software such as tpsDig2 [17]
- Create a template configuration with desired semi-landmark density using tpsUtil [17]

Apply Multiple Semi-Landmarking Approaches
- Process specimens using sliding semi-landmarks with bending energy minimization
- Process the same specimens using rigid registration (LS&ICP) approach
- Process specimens using non-rigid registration (TPS&NICP) approach
- For all approaches, use the same template configuration and semi-landmark density
Statistical Comparison of Results
- Perform Generalized Procrustes Analysis (GPA) separately for each methodological output [17] [52]
- Calculate Procrustes distances between mean shapes derived from different methods
- Compare principal component analyses (PCA) from each method by correlating PC scores [8]
- Perform multivariate analysis of variance (MANOVA) to test for significant differences in mean shape between methods
Biological Interpretation Assessment
- Conduct discriminant function analysis (DFA) for group classification using each methodological output [17] [18]
- Compare classification success rates between methods
- Visualize shape deformations using thin-plate splines for each method [17]
- Assess whether biological conclusions about group differences are consistent across methods
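
The statistical comparison step above can be illustrated with a minimal NumPy sketch. It is not the implementation used by tpsRelw, MorphoJ, or geomorph; it assumes 2D configurations, a partial Procrustes fit (unit centroid size, rotation only), and synthetic stand-in data. The helper names (center_scale, rotate_onto, gpa, procrustes_distance) and the arrays method_a and method_b are hypothetical, and the sketch further assumes that both methods return the same number of corresponding points.

```python
import numpy as np

def center_scale(shape):
    """Center a (k, 2) configuration at the origin and scale it to unit centroid size."""
    centered = shape - shape.mean(axis=0)
    return centered / np.linalg.norm(centered)

def rotate_onto(shape, reference):
    """Least-squares rotation of `shape` onto `reference` (proper rotation only)."""
    u, _, vt = np.linalg.svd(shape.T @ reference)
    if np.linalg.det(u @ vt) < 0:  # flip one axis to avoid a reflection
        u[:, -1] *= -1
    return shape @ (u @ vt)

def gpa(shapes, max_iter=100, tol=1e-10):
    """Partial GPA: returns aligned specimens (n, k, 2) and their mean shape."""
    aligned = np.array([center_scale(s) for s in shapes])
    mean = aligned[0]
    for _ in range(max_iter):
        aligned = np.array([rotate_onto(s, mean) for s in aligned])
        new_mean = center_scale(aligned.mean(axis=0))
        converged = np.linalg.norm(new_mean - mean) < tol
        mean = new_mean
        if converged:
            break
    return aligned, mean

def procrustes_distance(a, b):
    """Partial Procrustes distance between two configurations with matching points."""
    a, b = center_scale(a), center_scale(b)
    return np.linalg.norm(rotate_onto(a, b) - b)

# Synthetic stand-in data: 15 noisy elliptical outlines with 20 points each.
rng = np.random.default_rng(0)
angles = np.linspace(0, 2 * np.pi, 20, endpoint=False)
template = np.column_stack([2.0 * np.cos(angles), np.sin(angles)])
method_a = template + rng.normal(scale=0.02, size=(15, 20, 2))
# Hypothetical output of a second method: same specimens, slightly shifted points.
method_b = method_a + rng.normal(scale=0.01, size=(15, 20, 2))

_, mean_a = gpa(method_a)
_, mean_b = gpa(method_b)
print(f"Procrustes distance between method means: {procrustes_distance(mean_a, mean_b):.5f}")
```

A distance near zero indicates that the two methods recover nearly the same mean shape. In a real comparison, this value is easier to interpret relative to the Procrustes distances among specimens or groups within a single method.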

Figure 1: Workflow for comparing semi-landmark method performance on biological data.

Protocol 2: Assessing the Impact of Semi-Landmark Density on Analytical Outcomes

Objective: To determine how the number of semi-landmarks influences results and whether convergence between methods is density-dependent.

Materials:

A subset of specimens (≥20) representing morphological variation in the dataset
R statistical software with custom scripts for density manipulation
Mesh comparison software (e.g., MeshLab)

Procedure:

Create Multi-Density Template Configurations
- Generate template configurations with different semi-landmark densities (low, medium, high)
- Maintain identical traditional landmark positions across all density levels

Process Specimens Across Density Levels
- Apply each semi-landmarking method (sliding TPS, LS&ICP, TPS&NICP) at each density level
- Perform Generalized Procrustes Analysis for each method-density combination

Quantify Density Effects
- Calculate Procrustes variance for each method at each density level
- Compare the scatter of specimens in principal component space across density levels (e.g., using Procrustes distances between PC score configurations)
- Assess allometric patterns by regressing shape on size at each density level [8]

Evaluate Surface Reconstruction Accuracy
- Warp template surfaces to mean shapes derived from each method-density combination [8]
- Compute surface-to-surface distances between reconstructions from different methods
- Identify regions where methods diverge most significantly in surface estimation
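
The first step of this protocol, generating templates at several densities while keeping traditional landmarks fixed, can be sketched as follows. This is a minimal illustration, assuming a 2D open curve traced between two traditional landmarks and semi-landmarks placed at equal arc-length intervals before any sliding. The function name resample_curve, the density values, and the half-circle test curve are hypothetical choices for the example, not part of any cited software.

```python
import numpy as np

def resample_curve(points, n_semilandmarks):
    """Place equally spaced semi-landmarks along an open 2D curve.

    The first and last traced points are treated as fixed traditional
    landmarks and are returned unchanged at every density.
    """
    points = np.asarray(points, dtype=float)
    segment_lengths = np.linalg.norm(np.diff(points, axis=0), axis=1)
    arc_length = np.concatenate([[0.0], np.cumsum(segment_lengths)])
    # n semi-landmarks plus the two endpoint landmarks
    targets = np.linspace(0.0, arc_length[-1], n_semilandmarks + 2)
    x = np.interp(targets, arc_length, points[:, 0])
    y = np.interp(targets, arc_length, points[:, 1])
    return np.column_stack([x, y])

# Densely traced example curve (half circle) standing in for a digitized outline.
t = np.linspace(0.0, np.pi, 500)
traced = np.column_stack([np.cos(t), np.sin(t)])

# Hypothetical density levels; appropriate values depend on the structure studied.
densities = {"low": 5, "medium": 15, "high": 45}
templates = {name: resample_curve(traced, n) for name, n in densities.items()}

for name, config in templates.items():
    print(name, config.shape)
```

Because the endpoints are identical at every density, differences between density levels in downstream analyses can be attributed to the semi-landmarks rather than to shifts in the traditional landmarks.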

Essential Research Tools and Reagents for Semi-Landmark Studies

Table 3: Essential Research Toolkit for Semi-Landmark Alignment Studies

Tool Category	Specific Tools	Primary Function	Application Context
Digitization Software	tpsDig2, ImageJ	Landmark digitization on 2D images	Initial landmark collection [17]
Data Management	tpsUtil	Create, edit, and manage landmark files	Template creation and data organization [17]
Sliding Semi-Landmarks	tpsRelw, R (geomorph)	Slide semi-landmarks to minimize bending energy	Traditional sliding approaches [17]
Registration Approaches	auto3dgm, ALPACA	Rigid and non-rigid registration	Alternative semi-landmark placement [7] [8]
Statistical Analysis	R (geomorph, Momocs)	Multivariate shape statistics	Shape analysis and visualization [17] [18]
Visualization	MorphoJ, EVAN Toolbox	Shape deformation visualization	Thin-plate spline visualization [17]

Interpretation Framework for Methodological Convergence

The decision to use specific semi-landmarking approaches should be guided by both methodological performance and research objectives. Studies with evolutionary or developmental questions requiring biological homology should prioritize sliding semi-landmark methods, which explicitly incorporate landmark guidance in semi-landmark placement [7] [8]. For applications focused primarily on discrimination or classification (e.g., taxonomic identification, clinical diagnosis), multiple approaches may be acceptable, particularly if they produce convergent results for the specific morphological structures under investigation [18] [8].

When different methods produce divergent results, researchers should carefully consider the potential causes. Strong divergence may indicate that morphological patterns are sensitive to specific analytical decisions, suggesting that conclusions should be tempered with appropriate caution [7]. In such cases, it may be prudent to focus on morphological patterns that are robust across multiple methodological approaches rather than relying on results from a single method.

The finding that sliding TPS and non-rigid registration approaches often converge is methodologically reassuring, suggesting that these methods can provide consistent descriptions of biological shape variation when traditional landmarks provide adequate coverage [8]. However, the divergence of rigid registration methods highlights that algorithm choice matters, particularly for structures with complex topography or sparse landmark coverage [7] [8].
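
One way to put a number on convergence is to correlate specimens' principal component scores between two methods, as suggested in Protocol 1. The sketch below is illustrative only: it assumes each method's coordinates have already been Procrustes-aligned, uses synthetic data driven by a shared latent shape factor, and relies on hypothetical helper names (pc_scores, pc_correlation, simulate). The two methods may use different numbers of semi-landmarks, because only the specimen scores are compared.

```python
import numpy as np

def pc_scores(aligned, n_components=2):
    """PCA of aligned coordinates (n, k, 2) via SVD; returns scores on the leading PCs."""
    flat = aligned.reshape(len(aligned), -1)
    centered = flat - flat.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def pc_correlation(scores_a, scores_b):
    """Absolute Pearson correlation per PC (the sign of a PC axis is arbitrary)."""
    n_pcs = min(scores_a.shape[1], scores_b.shape[1])
    return np.array([
        abs(np.corrcoef(scores_a[:, i], scores_b[:, i])[0, 1])
        for i in range(n_pcs)
    ])

# Synthetic stand-in data: 30 specimens varying along one shared shape factor.
rng = np.random.default_rng(1)
n_specimens = 30
latent = rng.normal(size=n_specimens)

def simulate(n_points, noise):
    """Hypothetical method output: base shape + latent factor * direction + noise."""
    base = rng.normal(size=(n_points, 2))
    direction = rng.normal(size=(n_points, 2))
    direction /= np.linalg.norm(direction)
    signal = latent[:, None, None] * 0.1 * direction
    return base + signal + rng.normal(scale=noise, size=(n_specimens, n_points, 2))

method_a = simulate(n_points=20, noise=0.002)  # e.g., a sliding approach
method_b = simulate(n_points=35, noise=0.002)  # e.g., a registration approach

correlations = pc_correlation(pc_scores(method_a), pc_scores(method_b))
print("Absolute PC score correlations:", np.round(correlations, 3))
```

High correlations on the leading components suggest that both methods order the specimens similarly along the main axes of variation. Components with similar eigenvalues can swap order between methods, so a low correlation on a given PC is worth checking against neighboring PCs before concluding that the methods diverge.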

Figure 2: Decision framework for interpreting methodological convergence in different research contexts.

The convergence of different semi-landmarking methods depends on several factors: the algorithms being compared, the density and coverage of traditional landmarks, the complexity of the morphological structure, and the research question. Sliding TPS and non-rigid registration methods typically show the strongest convergence, particularly with adequate landmark coverage and appropriate semi-landmark density [8]. Rigid registration methods often diverge from both sliding and non-rigid approaches, suggesting they may be less suitable for studies requiring biological homology assumptions [7] [8]. Researchers should therefore run methodological comparisons on their own morphological systems to determine whether their biological conclusions are robust to alternative semi-landmarking approaches. This practice will enhance the reliability and interpretability of geometric morphometric analyses across biological and biomedical research domains.

Conclusion

Semi-landmark methods are indispensable for capturing comprehensive shape data in outline-based identification, but they introduce methodological choices that directly influence analytical outcomes. There is no single 'best' method; the choice among sliding semi-landmarks, pseudolandmarks, and landmark-free approaches must be guided by the research question, the degree of morphological disparity in the dataset, and the need for biological interpretability. Future directions point toward increased automation and the integration of AI-assisted landmarking to enhance reproducibility, while a critical, cautious interpretation of results remains paramount. For biomedical research, this means that although these methods powerfully quantify subtle shape variations (potentially relevant for distinguishing pathological phenotypes or developmental patterns), their findings should be viewed as robust approximations, always validated by biological knowledge.