What makes an agent skill reliable, performant, and maintainable? We will explore a robust approach to skill design, starting with foundational best practices and moving into automated skill generation and validation. The second half of the talk focuses on the critical role of evaluation, demonstrating how tools like SkillGrade and benchmarks like SkillBench help developers catch regressions and keep their agents behaving predictably in complex environments.