Abstract. Language is closely related to how we perceive ourselves and signify our reality. Within this scope, we created Emotional Machines, an interactive art project that promotes the experience of affective virtual environments by adopting speech emotion as the leading source of control. Users can express their desires by speaking, singing, reciting poetry, or making vocal sounds to create an audiovisual representation in virtual reality using a head-mounted display. Our contribution combines two machine learning models: a long short-term memory network and a convolutional neural network that predict four main emotional categories from high-level semantic and low-level speech features. Predicted emotions are mapped to audiovisual representations by an end-to-end process that encodes emotion in virtual environments. We use a generative model of chord progressions, based on the Tonal Interval Space, to transfer speech emotion into music. We also implement a generative adversarial network to synthesize an image from the speech-to-text transcription. The generated visuals serve as the style image in a style-transfer process applied to an equirectangular projection of a spherical panorama selected for each emotional category. The result is an immersive virtual space that encapsulates emotions in spheres arranged in a 3D environment, where users can create new affective representations or interact with previously encoded instances. With this project, we highlight the need to build spaces stripped of all lack and driven by desire as a creative power, because an intelligent machine is an emotional machine: a machine that desires to be desired.
Keywords: Affective Computing, Speech Emotion Recognition, Intelligent Virtual Environments, Virtual Reality, Tonal Interval Space, Machine Learning.