An Evaluation Framework for Automated Audio Description

Pacurar, Cristian 2025. An Evaluation Framework for Automated Audio Description. MPhil thesis, University of Westminster, Humanities. https://doi.org/10.34737/wzvw9

Title: An Evaluation Framework for Automated Audio Description
Type: MPhil thesis
Authors: Pacurar, Cristian
Abstract

The United Nations Convention on the Rights of Persons with Disabilities (CRPD, 2006) stipulates that all individuals have the right to access information and communicate through means of their choice. In practice, however, information is not always accessible to people with disabilities, particularly those with visual impairments. With recent advancements in AI models, such as GPT-4 with Vision by OpenAI and the Pegasus-1 model by Twelvelabs, the automation of the audio description process is becoming increasingly feasible.

The initial goal of this research was to create a system capable of generating audio description tracks automatically, thereby increasing accessibility for blind or partially sighted persons. The premise of the research was that the human audio description process could be split into smaller steps, each of which could be automated and then chained together.

Currently, there is no established framework for assessing the efficacy of algorithms in automating the audio description process. Although various algorithms can replace human audio describers in certain steps, there are no key performance indicators (KPIs) for analysis and comparison. Furthermore, there is no standardised method for evaluating and comparing multiple algorithms performing specific audio description tasks, which hinders objective decision-making.

To address this gap, the initial step involved analysing the stages of the human audio description process and breaking them down into self-contained actions suitable for automation. This led to the conceptualisation of an automated audio description system designed to replicate the entire human process.

To demonstrate the practicality of the evaluation framework, a partially automated system was developed, focusing on automating the creation of the audio description script. This system serves as a proof of concept for the usability and effectiveness of the proposed framework. Because there are no globally accepted audio description guidelines, multiple guidelines were compared and synthesised into a unified set of KPIs for use in the evaluation framework.

Year: 2025
File Access Level: Open (open metadata and files)
Project: An Evaluation Framework for Automated Audio Description
Publisher: University of Westminster
Publication dates: Published 07 Oct 2024
Digital Object Identifier (DOI): https://doi.org/10.34737/wzvw9

Permalink - https://westminsterresearch.westminster.ac.uk/item/wzvw9/an-evaluation-framework-for-automated-audio-description

