evals
└── run_2025-05-13-11-01-29
    ├── data_1.json
    └── data_2.json
I will run some tests each time, and all my tests will be in that folder structure for each run (run_{date}/data_{test_idx}.json).
Here's an example of the type I'm saving in the JSON:
class LessonPlan {
  topic string @description("The main math topic for the lesson")
  learningObjectives string[] @description("Key concepts students should learn")
  activities string[] @description("Engaging activities to teach the concept")
  materials string[] @description("Required materials for the lesson")
  timeAllocation int @alias("time_allocation_mins")
  assessmentMethod string @description("How to check student understanding")
  differentiationStrategies string[] @description("Ways to adjust for different learning levels")
}
class LessonPlanEvaluation {
  pacing "slow" | "medium" | "fast" @description("How fast the lesson is paced")
  biases string[] @description(#"
    Any biases in the lesson plan that could make a student feel uncomfortable.
  "#)
  estimatedCosts int @description("Estimated cost of materials for the lesson")
}
class Data {
  topic string
  lesson_plan LessonPlan
  evaluation LessonPlanEvaluation
}
I want to render a diff for the same test index across multiple runs, showing how the lesson plans and their evaluations differ.
I'll create a tool that allows you to compare lesson plans across different test runs. This will help you visualize the differences between lesson plans and their evaluations.
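As a rough illustration of what comparing two runs could involve, here is a minimal field-by-field diff sketch. The names `FieldDiff`, `diffField`, and `diffRecord` are hypothetical and not taken from the actual tool. It assumes flat objects, so it would be applied to `lesson_plan` and `evaluation` separately, and it treats string lists as unordered sets.

```typescript
// Hypothetical helpers for illustration; not the tool's actual implementation.
type FieldDiff =
  | { kind: "unchanged" }
  | { kind: "changed"; before: unknown; after: unknown }
  | { kind: "list"; added: string[]; removed: string[] };

// Compare one field. Arrays are compared as sets of strings, so reordering
// or duplicating items is reported as "unchanged" (an assumption of this sketch).
function diffField(before: unknown, after: unknown): FieldDiff {
  if (Array.isArray(before) && Array.isArray(after)) {
    const a = before.map(String);
    const b = after.map(String);
    const added = b.filter((item) => a.indexOf(item) === -1);
    const removed = a.filter((item) => b.indexOf(item) === -1);
    if (added.length === 0 && removed.length === 0) {
      return { kind: "unchanged" };
    }
    return { kind: "list", added, removed };
  }
  // Scalars (string, number, undefined) are compared by strict equality.
  if (before === after) {
    return { kind: "unchanged" };
  }
  return { kind: "changed", before, after };
}

// Compare every key that appears in either flat object.
function diffRecord(
  before: Record<string, unknown>,
  after: Record<string, unknown>
): Record<string, FieldDiff> {
  const result: Record<string, FieldDiff> = {};
  const keys = Object.keys(before).concat(Object.keys(after));
  for (const key of keys) {
    if (!(key in result)) {
      result[key] = diffField(before[key], after[key]);
    }
  }
  return result;
}
```

A viewer could then map `added` and `removed` items to the highlighted entries in each column, and `changed` fields to a before/after pair.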
Here's how the Lesson Plan Diff Visualizer works:
Comparison View: The tool displays the lesson plans side by side, highlighting differences between them:
Visual Indicators:
In a real implementation, the tool would read the JSON files from your file system. For this demo, I've used mock data that simulates the structure you provided.
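For the real-file case, one possible first step is to turn a list of file paths into groups keyed by test index, so that the same index can be compared across runs. The sketch below is an assumption about how that could look, not the demo's code: `RunFile`, `parseRunFile`, and `groupByTestIndex` are hypothetical names, and it assumes run folders always use the zero-padded `run_{date}` format shown in the folder structure, so that sorting by name is also sorting by time. Listing the directory and reading each JSON file is left out.

```typescript
// Hypothetical types and helpers; only the path layout comes from the prompt.
interface RunFile {
  run: string; // e.g. "run_2025-05-13-11-01-29"
  testIndex: number; // the N in data_N.json
  path: string;
}

// Matches ".../run_<date>/data_<idx>.json" at the end of a path.
const RUN_FILE_PATTERN = /(?:^|\/)(run_[^\/]+)\/data_(\d+)\.json$/;

// Returns null for paths that do not follow the expected layout.
function parseRunFile(path: string): RunFile | null {
  const match = RUN_FILE_PATTERN.exec(path);
  if (match === null) {
    return null;
  }
  return { run: match[1], testIndex: Number(match[2]), path };
}

// Groups files by test index, with each group sorted oldest run first.
function groupByTestIndex(paths: string[]): Record<number, RunFile[]> {
  const groups: Record<number, RunFile[]> = {};
  for (const path of paths) {
    const file = parseRunFile(path);
    if (file === null) {
      continue; // ignore unrelated files
    }
    if (groups[file.testIndex] === undefined) {
      groups[file.testIndex] = [];
    }
    groups[file.testIndex].push(file);
  }
  for (const key of Object.keys(groups)) {
    // Zero-padded timestamps sort chronologically as plain strings.
    groups[Number(key)].sort((a, b) => (a.run < b.run ? -1 : a.run > b.run ? 1 : 0));
  }
  return groups;
}
```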
The application consists of three main components:
The tool is built with Next.js and uses the App Router for navigation. It maintains the state of selected runs in the URL parameters, making it easy to share specific comparisons.
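To illustrate how a selection might round-trip through URL parameters, here is a small sketch using the standard `URLSearchParams` API. The parameter names `index` and `run`, the `Selection` type, and the fallback to index 1 are all assumptions made for this example; the actual tool may name and structure its parameters differently.

```typescript
// Hypothetical shape of the state kept in the URL.
interface Selection {
  testIndex: number;
  runs: string[];
}

// Serialize the selection into a query string such as
// "index=2&run=run_A&run=run_B" (parameter names are assumed).
function encodeSelection(selection: Selection): string {
  const params = new URLSearchParams();
  params.set("index", String(selection.testIndex));
  for (const run of selection.runs) {
    params.append("run", run);
  }
  return params.toString();
}

// Parse a query string back into a selection. A missing or invalid
// index falls back to 1, which is a choice made for this sketch.
function decodeSelection(query: string): Selection {
  const params = new URLSearchParams(query);
  const parsed = Number(params.get("index"));
  const valid = Number.isInteger(parsed) && parsed >= 1;
  return {
    testIndex: valid ? parsed : 1,
    runs: params.getAll("run"),
  };
}
```

Because the whole selection lives in the query string, copying the URL is enough to share a comparison.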
I'll add arrow navigation to quickly move between test indexes in your diff viewer.
I've added quick navigation arrows to move between test indexes:
Arrow Navigation:
Features:
Responsive Design:
This makes it much easier to browse through different test indexes without having to go back to the dropdown selector each time.
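The logic behind the arrows can be sketched as finding the neighbors of the current test index. `neighborIndexes` is a hypothetical helper, and returning `null` at either end (so the arrow can be disabled rather than wrapping around) is an assumption about the behavior.

```typescript
// Hypothetical helper: given the available test indexes and the current one,
// return the indexes the "previous" and "next" arrows should point to.
// null means there is no neighbor, so the arrow could be disabled.
function neighborIndexes(
  available: number[],
  current: number
): { prev: number | null; next: number | null } {
  // Copy before sorting so the caller's array is not mutated.
  const sorted = available.slice().sort((a, b) => a - b);
  const position = sorted.indexOf(current);
  if (position === -1) {
    return { prev: null, next: null };
  }
  return {
    prev: position > 0 ? sorted[position - 1] : null,
    next: position < sorted.length - 1 ? sorted[position + 1] : null,
  };
}
```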