Introduction to Speech Technology

Course Overview

This course introduces the fundamentals of speech synthesis and speech recognition and briefly covers their history. It also addresses the speech resources needed to build voice technology applications, and students acquire essential knowledge of data management requirements, privacy, and ethics.

Program: MSc Speech Technology
Level: Graduate
Credits: 5 ECTS
Semester: 1a (Fall)
Language: English

Learning Outcomes

Upon successful completion of this course, students will be able to:

  1. Explain the history of voice technology
  2. Explain the basic elements of speech synthesis and recognition and elaborate on the challenges and solutions in specific use cases
  3. Identify data resources for voice technology applications
  4. Describe data management requirements for collecting and storing speech and speaker data
  5. Elaborate on data management, licensing and privacy issues concerning speech and speaker data
  6. Discuss how human factors and context affect the interaction between humans and voice technology systems
  7. Describe how to investigate user acceptance of voice technology applications
  8. Collaborate to build and present a simple practical voice technology application

Course Structure

This course takes a hands-on approach in which students develop their own speaking clock application while learning key concepts in speech technology. The course includes:

  • Lectures on speech synthesis and recognition fundamentals
  • Workshops on data management and ethical considerations
  • Group project work on building voice applications
  • Discussions on human-computer interaction principles
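
To give a sense of the speaking clock project's scope, the sketch below shows one possible first step: turning a clock time into the sentence that a speech synthesizer would then read aloud. It is only an illustration, not the course's prescribed design. The function names (number_to_words, time_to_phrase), the 12-hour English phrasing, and the choice to stop before any audio output are all assumptions made for this example.

```python
from datetime import time

# Lookup tables for spelling out the numbers 0-59 in English.
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
    "fifteen", "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty"}


def number_to_words(n: int) -> str:
    """Spell out an integer between 0 and 59 (hypothetical helper)."""
    if not 0 <= n <= 59:
        raise ValueError("only values from 0 to 59 are supported")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def time_to_phrase(t: time) -> str:
    """Build the sentence a speaking clock could pass to a synthesizer."""
    hour12 = t.hour % 12 or 12          # map 0 and 12 to "twelve"
    period = "a.m." if t.hour < 12 else "p.m."
    hour_words = number_to_words(hour12)
    if t.minute == 0:
        return f"It is {hour_words} o'clock {period}"
    if t.minute < 10:
        # Single-digit minutes are read as "oh five", "oh nine", ...
        return f"It is {hour_words} oh {number_to_words(t.minute)} {period}"
    return f"It is {hour_words} {number_to_words(t.minute)} {period}"


if __name__ == "__main__":
    print(time_to_phrase(time(15, 42)))
```

Keeping text generation separate from audio output means the phrasing can be tested without a synthesizer, and the same sentence can later be passed to whichever synthesis method a group chooses.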

Assessment

  • Individual reports (40%)
  • Individual summer project (15%)
  • Group project (30%)
  • Research Data Management Plan (5%)
  • Participation (10%)

Teaching Philosophy

My approach to teaching this course emphasizes practical application of theoretical concepts. Students are encouraged to think critically about the technical, ethical, and user experience aspects of voice technology development. The speaking clock project serves as a vehicle for learning that integrates all these dimensions.