Publication:
Image-to-audio generation as a tool for stress relief


Publisher

Perception, Sage Journals

Abstract

Recent advancements in technology and machine learning allow images, music, and text to be produced in a straightforward way. This research evaluates the state of relaxation and calmness induced by audio that an AI system generates from an image without human supervision. The image-to-music process consists of the following steps: (i) generate a text description of an input image using the BLIP-2 vision-language pre-training (VLP) model (Li et al., 2023), (ii) enrich the generated text with more descriptive details using the OpenAI ChatGPT large language model to improve the quality of the generated audio, (iii) synthesize the audio output from the enriched text description using the AudioLDM text-to-audio model (Liu et al., 2023). The audio generated from a set of meditation images was tested on 17 participants (aged 26-43 years) as a stimulus for audio-guided relaxation. The level of relaxation and calmness (scaled from 1 to 1000) was evaluated using a portable single-channel dry-electrode Neurosky Mindwave EEG system placed on the user’s forehead. The Lucid Scribe software measures “Meditation” values corresponding to the user’s level of relaxation and calmness. The participants’ mean values ranged from 400 to 800 (average = 602.4), which corresponds to a slightly elevated relaxation level.
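
The three steps of the image-to-music process can be sketched as a simple chain of stages. The following Python outline is only an illustration of that data flow, not the authors' implementation: the names `image_to_audio`, `PipelineResult`, `captioner`, `enricher`, and `synthesizer` are hypothetical, and each stage is passed in as a plain callable so that wrappers around BLIP-2, ChatGPT, and AudioLDM (not shown here) could be supplied by the reader. The fallback to the raw caption when the enrichment step returns nothing is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PipelineResult:
    """Intermediate and final outputs of one image-to-audio run."""
    caption: str          # step (i): text description of the image
    enriched_prompt: str  # step (ii): caption with added descriptive detail
    audio: Any            # step (iii): whatever the synthesizer returns


def image_to_audio(
    image: Any,
    captioner: Callable[[Any], str],    # hypothetical wrapper, e.g. around BLIP-2
    enricher: Callable[[str], str],     # hypothetical wrapper, e.g. around ChatGPT
    synthesizer: Callable[[str], Any],  # hypothetical wrapper, e.g. around AudioLDM
) -> PipelineResult:
    # Step (i): describe the input image in text.
    caption = captioner(image).strip()
    if not caption:
        raise ValueError("captioner returned an empty description")

    # Step (ii): enrich the caption; fall back to the raw caption
    # if the enrichment step yields nothing (an assumption of this sketch).
    enriched = enricher(caption).strip() or caption

    # Step (iii): synthesize audio from the enriched description.
    audio = synthesizer(enriched)
    return PipelineResult(caption, enriched, audio)
```

Keeping the stages as injected callables means the chain can be exercised with lightweight stand-ins before any model is loaded, and any single stage can be swapped without touching the others.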
