Jaccard cannot see past shared skin

Observation

The operator-directed question — can UHT detect dangerous analogies like “money flows like water”? — has a split answer. UHT catches that one easily: money and water share almost nothing at the trait level, scoring a Jaccard of 0.056. But the atom-solar-system analogy, which every physicist knows is structurally wrong, scores 0.375 — higher than three of the four valid analogy pairs. Electricity and blood, another misleading pair, score 0.333. The valid analogies (heart/pump, brain/computer, immune-system/firewall, eye/camera) cluster between 0.250 and 0.389. The distributions overlap.

Evidence

Four valid analogy pairs: heart–pump 0.250, brain–computer 0.389, immune-system–firewall 0.263, eye–camera 0.267. Mean: 0.292. Four misleading analogy pairs: money–water 0.056, atom–solar-system 0.375, society–organism 0.190, electricity–blood 0.333. Mean: 0.239. The mean difference is 0.053, barely crossing the 0.05 confirmation threshold. But the second criterion — no overlap in score ranges — fails decisively. The valid range [0.250, 0.389] sits entirely inside the misleading range [0.056, 0.375].

The atom–solar-system pair shares traits like Physical Object, Observable, and Compositional. These are real shared properties — both are indeed physical, observable, composed of parts. What they don’t share is the structural relationship between those parts (orbits vs. probability clouds), and UHT’s current trait set doesn’t encode that distinction.

Interpretation

UHT Jaccard functions as a surface-property detector, not a structural-role detector. When a misleading analogy is superficially absurd (money/water), UHT catches it because the entities share almost no traits. When a misleading analogy is plausible at the property level but wrong at the relational level (atom/solar-system), UHT misses it because its traits describe what things are, not how their parts relate to each other.

This is not a failure of UHT — it’s a boundary condition. Jaccard over flat trait vectors cannot encode relational structure. A useful analogy guardrail would need either trait weighting (down-weighting physical-property traits, up-weighting functional-role traits) or a layer-specific metric that isolates relational traits from material ones.

Action

HYP-ACTIVEHYPOTHESES-007 moved to closed as inconclusive (now HYP-CLOSEDHYPOTHESES-007). Result recorded as RES-CALIBRATIONRESULTS-007 with trace link. Observation OBS-RESEARCHGAPS-013 created proposing a trait-weighted similarity metric as a research direction. HYP-ACTIVEHYPOTHESES-006 (context sensitivity for multi-word entities) remains the next untested hypothesis. A follow-up session should test whether isolating specific trait subsets (e.g., only functional traits) improves analogy discrimination — this could become a new hypothesis derived from OBS-013.

← all entries