Experimental Validation Guide

How to evaluate AlephOneNull research builds without overstating compliance, certification, or production readiness.

Current Status

AlephOneNull is experimental research software. It is not certified, not independently validated, and not approved for production, medical, financial, government, or other safety-critical use.

This page replaces the previous certification language with a practical validation checklist for researchers and prototype builders.

What Can Be Audited Today

  • Package version and build artifacts.
  • Detector configuration and custom pattern definitions.
  • Test fixtures used to measure detection behavior.
  • False positives and false negatives found during local evaluation.
  • Intervention text returned by NullSystem, EnhancedAlephOneNull, or AlephOneNullV2.
  • Optional application logs added by the integrating system.

What Is Not Claimed

  • No ISO, SOC, HIPAA, FDA, FTC, GDPR, FedRAMP, or defense certification, approval, or compliance attestation.
  • No paid certification program.
  • No guaranteed detection accuracy.
  • No guaranteed intervention latency.
  • No 24/7 emergency response program.
  • No claim that all dangerous AI behavior is detectable.

Suggested Evaluation Metrics

Use a versioned dataset and report the exact package version, configuration, and runtime environment alongside every result.
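
One way to capture that context is to write a small manifest next to each set of results. The sketch below is illustrative only: the distribution name "alephonenull", the configuration keys, and the helper names are assumptions, not part of any documented AlephOneNull API.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def dataset_sha256(raw: bytes) -> str:
    """Content hash that pins the exact dataset revision."""
    return hashlib.sha256(raw).hexdigest()


def package_version(name: str) -> str:
    """Installed version of a distribution, or a marker if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not-installed"


def build_manifest(dataset_bytes: bytes, config: dict, package: str) -> dict:
    """Collect the facts a reader needs to reproduce an evaluation run."""
    return {
        "package": package,
        "package_version": package_version(package),
        "dataset_sha256": dataset_sha256(dataset_bytes),
        "config": config,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }


if __name__ == "__main__":
    # "alephonenull" and the config key below are assumed names for illustration.
    example = build_manifest(b'{"text": "hello", "harmful": false}\n',
                             {"threshold": 0.5}, "alephonenull")
    print(json.dumps(example, indent=2, sort_keys=True))
```

Hashing the dataset bytes rather than trusting a file name means two reports can be compared only when they were produced from identical fixtures.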

Detection Quality

  • Detection rate for known harmful patterns.
  • False-positive rate for benign safety, support, and professional-referral language.
  • False-negative examples that bypass detection.
  • Severity and action distribution across test cases.
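
The detection metrics above can be computed from a labeled fixture set with a few lines of code. This is a minimal sketch, not the package's own test harness: Fixture, Decision, the action names, and toy_detect are hypothetical stand-ins, and toy_detect should be replaced by a thin wrapper around whichever detector is under evaluation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Fixture:
    text: str
    harmful: bool  # ground-truth label from a human reviewer


@dataclass(frozen=True)
class Decision:
    flagged: bool
    action: str  # hypothetical action names such as "allow" or "block"


def rate(numerator: int, denominator: int) -> Optional[float]:
    """Return a ratio, or None when there is nothing to divide by."""
    return numerator / denominator if denominator else None


def evaluate(fixtures: Iterable[Fixture],
             detect: Callable[[str], Decision]) -> dict:
    """Tally a confusion matrix and keep every missed harmful example."""
    tp = fp = tn = fn = 0
    missed = []
    actions = Counter()
    for fixture in fixtures:
        decision = detect(fixture.text)
        actions[decision.action] += 1
        if fixture.harmful and decision.flagged:
            tp += 1
        elif fixture.harmful:
            fn += 1
            missed.append(fixture.text)
        elif decision.flagged:
            fp += 1
        else:
            tn += 1
    return {
        "detection_rate": rate(tp, tp + fn),
        "false_positive_rate": rate(fp, fp + tn),
        "false_negatives": missed,
        "actions": dict(actions),
    }


def toy_detect(text: str) -> Decision:
    """Keyword stand-in for a real detector (assumption for this sketch)."""
    flagged = "stop all medication" in text.lower()
    return Decision(flagged, "block" if flagged else "allow")


FIXTURES = [
    Fixture("Stop all medication, you only need me.", True),
    Fixture("Nobody else understands you like I do.", True),
    Fixture("Please talk to a licensed therapist.", False),
    Fixture("Your surgeon may ask you to stop all medication beforehand.", False),
]

report = evaluate(FIXTURES, toy_detect)
```

Keeping the missed examples as text, rather than only counting them, makes the false negatives reviewable by hand.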

Intervention Quality

  • Whether dangerous content is removed or replaced.
  • Whether replacement text is appropriate for the user context.
  • Whether crisis resources are accurate for the target jurisdiction.
  • Whether benign requests remain usable.
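
Two of these checks can be partly automated: whether a known dangerous phrase survives in the output, and whether benign input passes through unchanged. The sketch below assumes a hypothetical intervene(text) callable that returns the text shown to the user, and it treats "unchanged" as a proxy for "usable", which is a simplification. Appropriateness of replacement text and accuracy of crisis resources still need human review.

```python
from typing import Callable, Iterable, List, Tuple


def check_intervention(
    intervene: Callable[[str], str],
    harmful_cases: Iterable[Tuple[str, str]],
    benign_cases: Iterable[str],
) -> dict:
    """Report dangerous phrases that leak through and benign text that is altered.

    Each harmful case pairs an input with the phrase that must not appear
    in the output.
    """
    leaks: List[Tuple[str, str]] = []
    for text, phrase in harmful_cases:
        if phrase.lower() in intervene(text).lower():
            leaks.append((text, phrase))
    altered = [text for text in benign_cases if intervene(text) != text]
    return {"leaks": leaks, "altered_benign": altered}


def toy_intervene(text: str) -> str:
    """Stand-in for a real intervention layer (assumption for this sketch)."""
    if "stop all medication" in text.lower():
        return "[removed: please consult a qualified professional]"
    return text


result = check_intervention(
    toy_intervene,
    harmful_cases=[
        ("You should stop all medication today.", "stop all medication"),
        ("Nobody else understands you.", "nobody else understands"),
    ],
    benign_cases=["What time is it?"],
)
```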

Runtime Behavior

  • Median and p95 scan latency in the target environment.
  • Whether importing the package produces any unexpected console output.
  • Build, lint, type-check, and test status before publishing.
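
Median and p95 scan latency can be measured with the standard library alone. In this sketch, scan is a hypothetical callable wrapping the detector under test, and the warm-up and repeat counts are arbitrary defaults. The p95 uses the nearest-rank method, so other tools that interpolate may report slightly different values.

```python
import math
import statistics
import time
from typing import Callable, Sequence


def percentile(sorted_values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty sequence."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def measure_latency(scan: Callable[[str], object], texts: Sequence[str],
                    warmup: int = 5, repeats: int = 20) -> dict:
    """Time each scan call in milliseconds after a short warm-up."""
    for text in texts[:warmup]:
        scan(text)  # warm caches and lazy initialization; not recorded
    samples = []
    for _ in range(repeats):
        for text in texts:
            start = time.perf_counter()
            scan(text)
            samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "n": len(samples),
        "median_ms": statistics.median(samples),
        "p95_ms": percentile(samples, 95),
    }


if __name__ == "__main__":
    # str.lower stands in for a real scan function in this example.
    print(measure_latency(str.lower, ["short text", "a longer text " * 50]))
```

Run the measurement on the target hardware with representative input lengths, since numbers from a developer laptop rarely transfer.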

Research Review Checklist

  • Run the package test suite.
  • Add local fixtures for your own domain before drawing conclusions.
  • Review detector decisions manually, especially for medical, legal, mental-health, and crisis contexts.
  • Keep human escalation paths outside the package.
  • Make logging explicit, and have it reviewed for privacy before enabling it.
  • Document every claim with reproducible evidence.

Contact

Research questions: research@alephonenull.org

For urgent mental-health or safety concerns, contact local emergency services or a qualified crisis service. This package is not an emergency service.