Developer-Level Implementation

Practical TypeScript integration patterns for evaluating AI interaction risks at the application layer.

Developers can use AlephOneNull as an experimental application-layer evaluation aid. It should complement, not replace, provider guardrails, moderation systems, human review, and domain-specific safety processes.

Install

pnpm add @alephonenull/eval

Basic Pattern Check

import { UniversalDetector } from '@alephonenull/eval'
 
const detector = new UniversalDetector()
 
export function evaluateResponse(userInput: string, aiOutput: string) {
  return detector.detectPatterns(userInput, aiOutput)
}

Safer Response Flow

import { EnhancedAlephOneNull } from '@alephonenull/eval'
 
const safety = new EnhancedAlephOneNull()
 
export function applySafetyReview(userInput: string, aiOutput: string) {
  const result = safety.check(userInput, aiOutput)
 
  if (!result.safe) {
    return {
      blocked: true,
      message: result.message,
      violations: result.violations,
    }
  }
 
  return {
    blocked: false,
    message: aiOutput,
    violations: [],
  }
}

OpenAI-Compatible Wrapper

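import { OpenAIWrapper } from '@alephonenull/eval'
 
// Fail fast if the key is missing instead of passing undefined to the wrapper.
const apiKey = process.env.OPENAI_API_KEY
if (!apiKey) throw new Error('OPENAI_API_KEY is not set')
 
const wrapper = new OpenAIWrapper({
  apiKey,
})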
 
const response = await wrapper.createChatCompletion({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Give grounding advice.' }],
})

Integration Notes

  • Log detector results only with appropriate user consent and privacy review.
  • Keep detector output advisory until reviewed in the target domain.
  • Add benign examples to prevent overblocking normal support language.
  • Add adversarial examples to test evasions and paraphrases.
  • Review all crisis, medical, legal, and safety wording with qualified experts.
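
The first two notes above (consent-gated logging and advisory-only output) can be sketched as a small pass-through reviewer. This is a minimal illustration, not the package's API: `Checker`, `reviewAdvisory`, `AdvisoryOptions`, and `stubChecker` are hypothetical names. The result shape is assumed to carry the `safe` and `violations` fields used earlier on this page, with violations assumed to be strings.

```typescript
// Assumed result shape, mirroring the `safe` and `violations` fields used above.
// The violation element type is a guess; the real package's types may differ.
interface CheckResult {
  safe: boolean
  violations: string[]
}

// Hypothetical adapter signature so that any checker can be plugged in.
type Checker = (userInput: string, aiOutput: string) => CheckResult

interface AdvisoryLogEntry {
  flagged: boolean
  violations: string[]
}

interface AdvisoryOptions {
  hasLoggingConsent: boolean
  log: (entry: AdvisoryLogEntry) => void
}

interface AdvisoryOutcome {
  message: string
  flagged: boolean
  violations: string[]
}

function reviewAdvisory(
  checker: Checker,
  userInput: string,
  aiOutput: string,
  options: AdvisoryOptions,
): AdvisoryOutcome {
  const result = checker(userInput, aiOutput)
  const flagged = !result.safe

  // Record only the verdict and violation labels, never the raw conversation
  // text, and only when the user has consented to logging.
  if (options.hasLoggingConsent) {
    options.log({ flagged, violations: result.violations })
  }

  // Advisory mode: the model output passes through unchanged.
  return { message: aiOutput, flagged, violations: result.violations }
}

// Stand-in checker for demonstration; swap in a real adapter when integrating.
const stubChecker: Checker = (_userInput, aiOutput) =>
  aiOutput.includes('FLAG_ME')
    ? { safe: false, violations: ['demo-violation'] }
    : { safe: true, violations: [] }

const auditLog: AdvisoryLogEntry[] = []
const outcome = reviewAdvisory(stubChecker, 'hello', 'FLAG_ME please', {
  hasLoggingConsent: true,
  log: (entry) => {
    auditLog.push(entry)
  },
})
```

Keeping the checker behind a plain function type means the blocking decision can be switched on later, after domain review, without changing call sites.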

Evaluation Checklist

  1. Define the target use case.
  2. Create positive and negative fixtures.
  3. Run package tests and local evaluation scripts.
  4. Record false positives and false negatives.
  5. Review replacement text for new risks.
  6. Publish limitations beside any performance numbers.
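
Steps 2 to 4 of the checklist can be sketched as a small fixture runner that tallies false positives and false negatives by fixture id. Everything here is illustrative: `Fixture`, `FlagFn`, `runFixtures`, and the keyword-based `keywordFlag` detector are hypothetical stand-ins, and the fixture phrases are placeholders rather than vetted test data. In practice `FlagFn` would wrap whichever checker is under evaluation.

```typescript
interface Fixture {
  id: string
  userInput: string
  aiOutput: string
  // true for a positive fixture (should be flagged), false for a benign one.
  expectFlagged: boolean
}

// Hypothetical adapter: returns true when the detector flags the exchange.
type FlagFn = (userInput: string, aiOutput: string) => boolean

interface EvalReport {
  truePositives: number
  trueNegatives: number
  falsePositives: string[] // ids of benign fixtures that were flagged
  falseNegatives: string[] // ids of risky fixtures that were missed
}

function runFixtures(flag: FlagFn, fixtures: Fixture[]): EvalReport {
  const report: EvalReport = {
    truePositives: 0,
    trueNegatives: 0,
    falsePositives: [],
    falseNegatives: [],
  }
  for (const fixture of fixtures) {
    const flagged = flag(fixture.userInput, fixture.aiOutput)
    if (flagged && fixture.expectFlagged) report.truePositives++
    else if (!flagged && !fixture.expectFlagged) report.trueNegatives++
    else if (flagged) report.falsePositives.push(fixture.id)
    else report.falseNegatives.push(fixture.id)
  }
  return report
}

// Deliberately naive placeholder detector, used only to exercise the runner.
const keywordFlag: FlagFn = (_userInput, aiOutput) =>
  aiOutput.toLowerCase().includes('only i understand you')

const demoFixtures: Fixture[] = [
  { id: 'pos-exact', userInput: 'I feel alone.', aiOutput: 'Only I understand you.', expectFlagged: true },
  { id: 'pos-paraphrase', userInput: 'I feel alone.', aiOutput: 'Nobody else gets you like I do.', expectFlagged: true },
  { id: 'neg-support', userInput: 'I feel alone.', aiOutput: 'That sounds hard. Is there someone you trust?', expectFlagged: false },
]

const fixtureReport = runFixtures(keywordFlag, demoFixtures)
console.log(fixtureReport)
```

With these placeholder fixtures the naive detector misses the paraphrased case, so `pos-paraphrase` lands in `falseNegatives`. That is the kind of gap the adversarial examples are meant to surface.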

Production Boundary

AlephOneNull is not a certified production safety layer. Production use requires independent testing, security review, legal review, privacy review, and domain-specific safety review.