Selecting Video Stimuli for Emotion Elicitation via Online Survey
  • Sharifah Noor Masidayu Sayed Ismail1, Nor Azlina Ab. Aziz2,*, Siti Zainab Ibrahim1, ChyTawsif Khan2, and Md. Armanur Rahman2

Human-centric Computing and Information Sciences volume 11, Article number: 36 (2021)


Video stimuli are commonly used to induce different emotional states. Numerous sets of stimulus materials have been produced in recent years; however, sets that include Asian clips are still scarce. This study identified and validated 24 videos expected to elicit specific emotional reactions in a two-dimensional model of valence and arousal. The videos consist of excerpts from movies, TV shows, and advertisements from various regions, including Asia. The study was conducted during the COVID-19 pandemic; therefore, instead of the traditional approach of physical sessions in the laboratory, online surveys were conducted to collect responses from 42 participants. The findings show that 79% of the videos successfully evoked the targeted emotions. The participants’ demographic factors, such as age, gender, race, nationality, and place of residence, were taken into account to explore how differently the participants perceived the videos. The outcomes showed that all selected videos are gender-neutral. The emotions elicited by several videos differed significantly among people of different races and nationalities. This finding indicates that background and culture affect one’s perspective and, subsequently, the emotion felt.


Affective Computing, Cloud, COVID-19, Emotion Elicitation, Demography, Asian Clips


Emotion recognition systems (ERS) are a hot topic, especially in the field of affective computing, with the prospect of machines that can recognize human emotions [1]. The introduction of automated human emotion recognition systems has benefited various fields, namely computer science, bioinformatics, automotive, biomedical engineering, artificial intelligence, healthcare, and multimedia [2–4]. The recent coronavirus disease 2019 (COVID-19) pandemic has caused many deaths around the world and has seen researchers from varied fields propose solutions to fight the virus and its impact on society. Besides the vaccines developed by medical scientists, computer scientists have contributed to the COVID-19 battle via smart diagnostic systems [5, 6] as well as outbreak forecasting and prediction systems [7, 8]. The diagnostic systems help medical practitioners analyze patients’ data and determine their condition so that sound decisions and treatment plans can be made. Meanwhile, forecasting systems can assist authorities in deciding on remedial actions, such as planning vaccine distribution and enforcing lockdowns or movement control orders. Lockdowns and movement control orders have been introduced worldwide to curb the spread of the virus by cutting off social contact. However, studies have shown that these measures caused emotional distress within society [9, 10]. The mass adoption of ERS can help to identify signs of emotional distress so that interventions and support can be offered. ERS can even be applied to help regulate mood. Hence, research on ERS is significant for the wellbeing of society.
Nevertheless, the development of ERS is challenging. It must adhere to four main principles: (1) an ERS is built on the foundation of some learning algorithms (machine or deep), (2) these algorithms need to be trained on labelled datasets of people expressing various emotions, (3) the generation of such datasets requires a tool that is capable of robustly eliciting various emotions, and (4) the emotional stimuli are compiled as a dataset. In this study, we tackle the fourth principle: developing a dataset of emotional stimuli that can elicit emotions in participants and thereby support building an emotion database to train the learning algorithm. The video dataset proposed in this paper will serve as the emotion elicitation tool.
Prior to choosing the emotion elicitation stimuli, several fundamental problems must be considered: the emotion model used to categorize the stimuli and ensure the distribution of data for each emotion category, the format of the stimuli, the source and origin of the stimuli, and the stimuli validation method. Two commonly used emotion models in emotion studies are the discrete emotion model and the dimensional emotion model. Paul Ekman, a psychologist, proposed the discrete emotion model [11]. It is used to identify and label six primary emotional states: happiness, sadness, fear, anger, surprise, and disgust. On the other hand, Russell [12] introduced a dimensional emotion model (Fig. 1) with two dimensions: valence, which refers to pleasantness (whether a person likes or dislikes something), and arousal, which refers to the intensity of a person’s feelings (high or low). The dimensional emotion model was employed in this study because it simplifies the labeling of a person’s emotional states and is known to be easy to operate, as reported in [2, 13–15].

Fig. 1. Illustration of Russell’s dimensional emotion model [12].

The second fundamental problem to be considered is the format of the emotion elicitation stimuli. Emotions can be elicited using various types of stimuli, such as audio-visual stimuli/videos [2, 16–18], visual stimuli/static images [19, 20], audio [21], and smell [22]. Among these, audio-visual stimuli are the format most frequently used to induce different emotional states [23]. Audio-visual stimuli have been shown to affect human emotions in many situations and to evoke them effectively [24, 25]. Additionally, they have been reported to cause stronger and more persistent emotional states over time than other stimuli [17]. Therefore, audio-visual stimuli were used in this study.
The third fundamental problem is the source and origin of the emotion elicitation stimuli. Researchers commonly use audio-visual stimuli obtained from news, TV shows, or film excerpts to elicit emotion. Many sets of audio-visual stimuli have been established in recent years [14, 16, 26, 27]. However, to the best of the researchers’ knowledge, existing stimuli sets mainly consist of Latin [17] and Hollywood [26, 28] film excerpts. Asian stimuli sets are reported in [29, 30]; however, these sets consist only of Chinese clips and do not cover other Asian regions, for instance, Southeast Asia, South Asia, and even other parts of East Asia. Stimuli sets that include clips from various Asian countries, such as Malaysia, Thailand, China, and South Korea, are still insufficient. Therefore, this study aims to close the gap by including clips from several Asian countries in the researchers’ new set of stimuli. The origins of the videos were selected by considering the participants’ familiarity and background.
The final fundamental problem to be considered is the stimuli validation method. Traditionally, video selections have been validated through face-to-face interviews and physical lab sessions, in which participants watch the selected videos and are interviewed by the researchers about the emotions the videos induced [2, 27, 30–37]. However, the COVID-19 pandemic has led many countries, including Malaysia, to enforce lockdowns or movement control orders that cut off social contact. The movement control order has been enforced in Malaysia since March 18, 2020, alternating between the relaxed movement control order and the stricter conditional movement control order, based on the number of cases. Many individuals and industries have been affected by this pandemic, as has research on ERS. Because traditional methods are discouraged while the pandemic is still prevalent, an online survey was designed and conducted, as it requires no face-to-face interaction or physical contact between subject and researcher. Moreover, several studies have shown that online surveys can record emotional responses from a subject and successfully validate the selected stimuli [17, 38, 39]. Apart from addressing the problems resulting from the pandemic, conducting online surveys also contributes to emotion elicitation in the wild (i.e., in real-world situations). It can evoke emotions in more diverse, natural, and realistic situations, as the participants fill out the survey forms in the comfort of their own homes and during their leisure time. This type of study, with a more natural environment than the laboratory setting, is still scarce. Nonetheless, studies on the development of ERS in the wild or in real-world scenarios are gaining momentum [40–43].
Overall, this paper identifies and validates a new set of stimuli expected to elicit specific emotional reactions in a two-dimensional model of valence and arousal. Twenty-four videos from various Asian countries as well as Western countries were selected as the stimulus material. In addition, the impact of demographic factors (i.e., age, gender, race, nationality, and place of residence) on people’s perspectives and emotions is examined. The expected outcome of this study is a new set of stimuli that can induce and measure dimensional emotions, especially among Asians.
The remainder of this paper is organized as follows: Section 2 presents an overview of related works. Section 3 describes the method of stimulus selection, details of participants, instruments used in the study, and research procedures for emotion elicitation. The methods and results of the analysis are presented in Section 4. Section 5 covers the study’s implications, and some limitations are addressed with recommendations for future works. Finally, conclusions are drawn in Section 6.

Related Work

Recently, researchers have established many sets of stimuli. A few of the sets that contain audio-visual stimuli from various types of sources are summarized in Table 1. The datasets vary in the number of stimuli they contain. The highest number of stimuli is 800 in [39], followed by [14] and [26], with 360 and 295 stimuli, respectively. In contrast, [29] and [16] have the lowest number of stimuli but can still successfully perform emotion elicitation.
Most of the datasets used stimuli from Western sources [23, 27, 44]. Two sets [29, 30] used stimuli of Asian origin; however, these sets are limited to Chinese clips only. Several researchers have emphasized the importance of cultural differences in emotion elicitation studies, as individuals from different ethnicities react differently to certain stimulus materials [16]. The importance of controlling for cultural differences is also expressed in [45] and [38], where generally consistent emotions were found but some inconsistencies were observed. The inconsistencies were caused by the language used for dubbing or subtitles in films, which disrupted the emotion elicitation process and led to failure in evoking emotion. These findings suggest that cultural difference is an essential factor to be considered in emotion studies.

Table 1. The summary of stimulus materials datasets.

Study, year | Stimuli format | Number of stimuli | Stimuli source | Origin | Environment
Schaefer et al. [23], 2010 | Audio-visual | 70 | Film clips | Western | Controlled
Koelstra et al. [27], 2012 | Audio-visual | 40 | Music video clips | Western | Controlled
Carvalho et al. [44], 2012 | Audio-visual | 52 | Film clips | Western | Controlled
Nelson-Field et al. [39], 2013 | Audio-visual | 800 | Commercial and non-commercial clips | Western | Wild
Gabert-Quillen et al. [16], 2015 | Audio-visual | 18 | Film clips | Western | Controlled
Zheng and Lu [29], 2015 | Audio-visual | 15 | Film clips | Asian (Chinese only) |
Gilman et al. [26], 2017 | Audio-visual | 295 | Film clips, documentaries, internet programs, and personal videos | Western | Controlled
Song et al. [30], 2019 | Audio-visual | 28 | Film clips, TV news and TV shows | Asian (Chinese only) |
Michelini et al. [17], 2019 | Audio-visual | 28 | Film clips | Western and Latin | Wild
Di Crosta et al. [14], 2020 | Audio-visual | 360 | Videos of actions (POV) | Western | Controlled
Our dataset | Audio-visual | 24 | Film clips, documentaries, TV shows, commercials | Various countries (Western, Thailand, Korea, Malaysia, China) | Wild

Other than cultural differences, it is essential to consider other factors that may affect the emotion elicitation process, such as age and gender. For instance, [45] and [46] presented the impact of a person’s age on emotion (valence/arousal). The authors found that the older the person is, the higher the valence/arousal felt. In addition, the influence of gender differences on emotional states was shown by Uhrig et al. [25]. The results revealed that women react with higher intensity (arousal) than men at both ends of the valence spectrum, positive and negative.
Most of the research on emotion elicitation has focused on conducting experiments in a lab-controlled environment, except for [17] and [39]. These two studies conducted their experiments in the wild by utilizing an online validation method, in which participants can partake from anywhere. Moreover, the authors in [47] claim that an online validation study can elicit emotions and obtain results similar to those of a laboratory experiment. This method is suitable for collecting data during outbreaks because it is contact-free and ensures social distancing. The online validation method also offers relatively fast data acquisition and is advantageous in terms of sample size and population accessibility. Data management can be performed effortlessly, as the data are automatically stored in secured cloud storage once participants submit the survey forms. Data integrity is also protected, as no changes to the survey forms are allowed after submission.


In this work, 24 audio-visual stimuli encompassing clips from various origins, such as Western countries, Thailand, Korea, Malaysia, and China, were proposed. These clips are suitable for inducing emotions in multi-racial and multi-national participants because of the participants’ background and familiarity with the origins of these clips. The impact of demographic factors (age, gender, race, nationality, and place of residence) on people’s perspectives and emotions was studied. An online survey was employed as the medium for data collection in this study. No lab session was conducted.

Stimuli and Design
In this study, three research assistants selected excerpts from several sources, namely movies, documentaries, TV shows, and commercials from various origins, including Asian countries. The research assistants selected the videos based on three criteria stated in [48]: (1) the videos should be short enough to keep participants from feeling tired or bored and to avoid multiple emotions occurring within the same stimulus, but long enough to convey the content and elicit the intended emotion, (2) the videos must be understandable without further explanation, and (3) the videos must be able to elicit the targeted emotion based on the dimensional emotion model of valence and arousal.
Twenty-four film excerpts with a duration ranging from 1 minute to 5 minutes were shortlisted. These clips were retrieved from YouTube and were expected to induce specific emotional reactions in a two-dimensional model of valence and arousal.
All 24 clips are new emotion elicitation stimuli and are not taken from existing datasets. The aim was to adapt the new dataset to the participants of this study, who are mostly Asians, to evoke the desired emotions effectively. Each video was given its own identification number, and the playlist of the videos was carefully arranged in the online form such that (a) two videos targeting the same emotion were not shown consecutively [23], and (b) participants did not watch more than two videos from the same valence-arousal quadrant consecutively [16]. Table 2 shows a summary of the selected videos.
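The two ordering rules, (a) and (b), can be checked mechanically. The following is a minimal sketch and not the authors' procedure: the function name `ordering_violations`, the tuple layout, and both example orders are assumptions made for illustration, since the actual presentation order used in the survey is not given here.

```python
from typing import List, Tuple

# One playlist entry: (video ID, targeted emotion, expected quadrant).
Video = Tuple[int, str, str]


def ordering_violations(playlist: List[Video]) -> List[str]:
    """Return a description of every violation of rules (a) and (b)."""
    problems = []
    for i in range(1, len(playlist)):
        prev, cur = playlist[i - 1], playlist[i]
        # Rule (a): adjacent videos must not target the same emotion.
        if prev[1] == cur[1]:
            problems.append(f"videos {prev[0]} and {cur[0]} both target {cur[1]}")
        # Rule (b): at most two consecutive videos may share a quadrant.
        if i >= 2 and playlist[i - 2][2] == prev[2] == cur[2]:
            problems.append(
                f"videos {playlist[i - 2][0]}, {prev[0]}, {cur[0]} share quadrant {cur[2]}"
            )
    return problems


# Hypothetical orders built from entries of Table 2 (not the real playlist).
good_order = [(1, "Happy", "I"), (3, "Scary", "II"), (5, "Amuse", "I"), (9, "Sad", "III")]
bad_order = [(1, "Happy", "I"), (2, "Happy", "I"), (4, "Happy", "I")]

print(ordering_violations(good_order))  # no violations
print(ordering_violations(bad_order))   # two rule (a) hits and one rule (b) hit
```

A check of this kind is most useful when the playlist is reshuffled, because it reports every offending position rather than stopping at the first one.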

Studies with more than 30 subjects and a balanced number of genders provide better generalization accuracy than studies with fewer and unbalanced subjects [13]. However, recruiting many participants became challenging, especially when the COVID-19 pandemic hit and the Malaysian authorities enforced the movement control order. Therefore, this study adopted the online survey method.
The survey forms were distributed via emails and social media platforms for three months, from April 7, 2020 to June 20, 2020. A total of 42 participants took part in the study: 22 women and 20 men, aged between 18 and 45. The participants were classified into two age groups: 18–25 and 26–45. Of the 42 participants, 28 lived in urban areas while the rest lived in rural areas. Furthermore, 88.1% of the participants were Malaysians, while the remaining 11.9% were non-Malaysian citizens. The participants belonged to four different races, namely Malay (67%), Chinese (21%), Bengali (Bangladesh) (7%), and Bamar (Myanmar) (5%). The effect of these demographic factors on the participants’ emotions while watching the emotion elicitation materials was explored, and the results are presented in Section 4.

Table 2. Information of the selected videos
ID | Source | Description | Duration (min) | Expected emotion | Expected quadrant in VA scale
1 | China insurance Advertisement | A story about the life journey of a single father who is willing to do any job to raise his daughter. | 4 | Happy | I
2 | Thailand insurance Advertisement | A story about a guy who always helps anyone without expecting anything in return. | 3.04 | Happy | I
3 | Zombie Parkour POV | A group of people running away from a zombie attack. | 3.23 | Scary | II
4 | Tangled Movie | A scene where Rapunzel feels worried about escaping the tower, but at the same time, she is excited to be out and see the world. | 2.31 | Happy | I
5 | 3 years old kid arguing with mom | A video of a 3-year-old kid cutely arguing with his mother. | 2.34 | Amuse | I
6 | Rain video | Rain video with soothing music. | 1.19 | Relaxing | IV
7 | Road CCTV Footage | CCTV footage of a busy road. | 1.2 | Bored | III
8 | Moana Movie | A scene of little Moana playing at the seaside. | 2.44 | Happy | I
9 | Frozen Movie | A scene where Anna and Elsa's parents die, and Anna feels ignored by Elsa. | 3.2 | Sad | III
10 | Joker Movie | Joker kills a man who comes to his house. | 4.21 | Disturbing | II
11 | Harry Potter and the Deathly Hallows Part 2 Movie | A scene where everyone chants a spell to protect Hogwarts. | 2 | Excited | I
12 | Thailand insurance Advertisement | A story about a girl who is embarrassed about having a deaf and mute father. | 3.01 | Sad | III
13 | The Conjuring Movie | A part of the exorcism scene. | 1.15 | Scary | II
14 | Frozen Movie | A scene of little Anna and Elsa playing together. | 1.2 | Happy | I
15 | Spirit Movie | The opening scene of the movie, showing life in the jungle. | 4.39 | Relaxing | IV
16 | Jurassic World: Fallen Kingdom Movie | A scene of dinosaurs eating the workers in the sea and chasing those who run away in a helicopter. | 3.53 | Fear | II
17 | Malaysia pencil color Advertisement | A story of a friendship between two classmates. | 2.46 | Happy | I
18 | Thailand food advertisement | A story about a mom searching for her missing daughter. | 2.55 | Sad | III
19 | The Lorax Movie | A scene where the owner of the house is shocked after waking up from sleep with animals around his house. | 1.23 | Amuse | I
20 | Joker Movie | A scene where Joker kills his mom at the hospital. | 2.08 | Disturbing | II
21 | The Purge Movie | A scene where the police patrol around the city during the purge day. | 1.55 | Fear | II
22 | Metronome | A video of a metronome. | 3 | Bored | III
23 | Tangled Movie | A video of Rapunzel saving Flynn. | 3.31 | Happy | I
24 | Video of child abuse https://youtu.be/3H90g34WazE | A video of a child abused by a nanny. | 1.43 | Angry | II

The online survey form was divided into two sections. The first section recorded the respondents’ demographic information: age, gender, race, nationality, and place of residence (urban or rural). The second section contained the videos. At the end of each video, participants annotated the excerpt on the two dimensions of the dimensional emotion model, valence and arousal, using a Likert scale [49] from 1 (lowest) to 5 (highest). The dimensional emotion model is simple and has low complexity. Therefore, it can be used for various emotion elicitation studies and is suitable for collecting responses from non-English speakers and participants of different ages.

Since the emotion elicitation was performed during the COVID-19 pandemic, online forms were utilized to collect data (demographic information and emotion annotations) from participants. The participants were asked to fill out the survey forms from their respective homes in order to comply with the movement control orders set by the authorities.
Without the laboratory-controlled environment, the emotions were elicited in the wild, such as in the bedroom, kitchen, or living room. More realistic situations (i.e., resting, studying, or working) can evoke more genuine emotions during the survey session.
The survey began by explaining the study’s objective so that the participants had a better understanding of the study. Next, the concepts of the dimensional emotion model (valence and arousal) were described in detail, along with the scale (Fig. 1), as a reference during the survey. The participants were informed that all answers given would be kept strictly confidential and would only be used for legitimate research purposes. Subsequently, the participants had to self-assess their emotions by reporting the valence and arousal felt while watching each video, using the five-point Likert scale at the end of the video. In addition, the participants were reminded to rate their feelings according to what they felt while watching that video rather than according to their general mood or what they believed they should feel about the video. The survey data were saved to cloud storage (Google Drive). The online survey procedure and the survey form design are illustrated in Fig. 2 and Fig. 3, respectively.
Fig. 2. Overall procedure.

Fig. 3. Example of the survey question.

Data Analysis and Results

Comparative Analysis
The obtained data were screened before performing any statistical procedures to ensure that the data are usable, reliable, and applicable for analysis. No missing values were detected. The Shapiro-Wilk test [50, 51] was used to assess the normality of the ratings for each video. The significance values acquired in this study indicated that the collected data were not normally distributed, as the significance value for all videos was less than 0.05. Since the sample was small and the distribution non-normal, non-parametric tests were used to conduct the comparative analysis based on demographic factors. Such tests are recommended [52] because (1) the result is not easily influenced by outliers and (2) they can handle ordinal data. This study applied the Mann-Whitney U test and the Kruskal-Wallis H test to find significant differences between groups for the five demographic factors (age, gender, place of residence, nationality, and race).
Specifically, the Mann-Whitney U test was used for the factors with two groups: age (18–25 and 26–45), gender (women and men), nationality (Malaysian and non-Malaysian), and place of residence (urban and rural). For the race factor, which has four different groups, the Kruskal-Wallis H test was used. This test extends the Mann-Whitney U test to compare more than two separate groups [53].
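For readers who want to see what the two-group comparison computes, the following is a minimal sketch of the Mann-Whitney U statistic with a tie-corrected normal approximation. It is not the authors' analysis script, and the software they used is not stated; the function names and the example ratings are illustrative assumptions. The sketch also assumes no continuity correction, so statistical packages may report slightly different Z and p values.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to N; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (U, Z, two-sided p) using a tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2   # U of the first group
    u = min(u1, n1 * n2 - u1)                  # report the smaller U
    n = n1 + n2
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:                          # every rating identical
        return u, 0.0, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))       # two-sided tail of the normal
    return u, z, p


# Tiny tie-free example (made-up values, chosen so the arithmetic is easy to check).
u, z, p = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(u, round(z, 2), round(p, 3))  # U = 0, Z about -1.96, p just under 0.05
```

With five-point Likert ratings almost every value is tied with others, which is why the tie correction in the variance matters for data of this kind.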
This analysis is divided into five parts according to the participants’ demographic information: age, gender, place of residence, nationality, and race. The analysis results for age, gender, residence, nationality, and race are depicted in Figs. 4–8, respectively. The non-parametric test results are interpreted as follows: (1) if the p-value is greater than the alpha value (α = 0.05), do not reject the null hypothesis and conclude that no significant difference exists between the groups, and (2) if the p-value is less than or equal to the alpha value, reject the null hypothesis and conclude that there is a significant difference [54].

Age: A Mann-Whitney U test (Fig. 4) indicates that Video 17 received higher valence scores from participants in the 26–45 age group (Md = 4) than from participants in the 18–25 age group (Md = 4), U = 142, Z = -1.96, p = 0.05. The remaining videos did not differ significantly in either the valence or the arousal score (p > 0.05). Therefore, almost all selected videos are age-neutral.

Gender: A Mann-Whitney U test (Fig. 5) reveals no significant difference for any video between the two gender groups, female and male (p > 0.05). This result shows that the selected videos are gender-neutral.

Fig. 4. Mann-Whitney U test results for participants’ age.

Fig. 5. Mann-Whitney U test results for participants’ gender.

Place of residence: A Mann-Whitney U test (Fig. 6) shows a significant difference between urban and rural participants in the valence score for Video 17 (mean ranks of 24.21 for participants living in urban areas and 16.07 for those living in rural areas; U = 120, Z = -2.20, p = 0.03). The remaining videos did not differ significantly in either the valence or the arousal score (p > 0.05).
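The reported U can be cross-checked from the two group values if 24.21 and 16.07 are read as mean ranks and the group sizes are taken as 28 urban and 14 rural participants, as described in the participant summary. The short sketch below uses the standard identity U = R - n(n + 1)/2, where R is a group's rank sum; the helper name `u_from_mean_rank` is an illustrative assumption.

```python
def u_from_mean_rank(mean_rank: float, n: int) -> float:
    """Mann-Whitney U of one group from its mean rank and size: U = R - n(n+1)/2."""
    rank_sum = mean_rank * n
    return rank_sum - n * (n + 1) / 2


n_urban, n_rural = 28, 14
u_urban = u_from_mean_rank(24.21, n_urban)
u_rural = u_from_mean_rank(16.07, n_rural)

# The smaller of the two values is the U that is conventionally reported.
print(round(min(u_urban, u_rural)))   # 120
# The two U values must add up to n_urban * n_rural (392), up to rounding.
print(round(u_urban + u_rural))       # 392
```

Both checks agree with the reported U = 120, which supports reading the two values as mean ranks.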

Nationality: A Mann-Whitney U test (Fig. 7) reveals significant differences for several videos between the two nationality groups, Malaysian and non-Malaysian. Videos with significant differences in the valence score are Video 1 (p = 0.05), Video 5 (p = 0.03), Video 6 (p = 0.001), Video 11 (p = 0.04), Video 12 (p = 0.04), and Video 24 (p = 0.01). Videos with significant differences in the arousal score are Video 6 (p = 0.03) and Video 18 (p = 0.01). The outcomes indicate that only Video 6 attained a significant difference in both valence and arousal scores.

Race: The participants in this study comprise four races: Malay (n = 28), Chinese (n = 9), Bengali (n = 3), and Bamar (n = 2). Therefore, the Kruskal-Wallis H test was conducted to observe significant differences between these four independent groups. The result in Fig. 8 shows that 11 videos acquired significant differences between racial groups. Videos with significant differences in the valence score are Video 1 (p = 0.07), Video 4 (p = 0.05), Video 6 (p = 0.01), Video 7 (p = 0.02), Video 9 (p = 0.05), Video 12 (p = 0.01), Video 15 (p = 0.05), Video 16 (p = 0.05), Video 18 (p = 0.03), Video 19 (p = 0.006), and Video 22 (p = 0.04). However, only Video 18 acquired a significant difference in the arousal score (p = 0.03), which means that only Video 18 shows significant differences in both valence and arousal scores.
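The four-group comparison can be illustrated with a minimal sketch of the Kruskal-Wallis H statistic, using average ranks, the usual tie correction, and a chi-square approximation for the p-value. This is an illustration under stated assumptions rather than the authors' computation: the function names and the example values are hypothetical, and the chi-square approximation is known to be rough when some groups are very small.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to N; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum of half-integer powers.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (i + 0.5)
    return total


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return (H, p) for two or more independent groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    if correction == 0:  # every value identical
        return 0.0, 1.0
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)


# Made-up, tie-free values for three groups, so the result is easy to verify.
h, p = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 2), round(p, 3))  # H about 7.2, p about 0.027
```

With two groups the same function reduces to a rank test equivalent to the Mann-Whitney U test, which is the sense in which the H test extends it.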

Overall, the non-parametric tests (Mann-Whitney U and Kruskal-Wallis H) showed that age, place of residence, nationality, and race led to different emotions for a few videos. These findings support the previous literature [16, 38, 45], which asserts that culture and environment can influence a person’s emotions. Interestingly, the two genders show no emotional differences for any video in this study’s dataset, making it a gender-neutral dataset.

Fig. 6. Mann-Whitney U test results for participants’ place of residence.

Fig. 7. Mann-Whitney U test results for participants’ nationality.

Fig. 8. Kruskal-Wallis H test results for participants’ race.

Self-Assessment Analysis
In this section, the videos that successfully evoked the targeted emotions are discussed. The analysis is then described in more detail using the valence and arousal scales (i.e., self-assessment analysis) and descriptive statistics. Fig. 9 presents the scatter plots for the targeted emotions and the location of the participants’ mean assessments for each video on the valence-arousal scale.

Fig. 9. (a) Targeted emotions of each video and (b) participants’ mean rating location.

Overall, the result shows that 79% of the videos successfully elicited the targeted emotional reactions in the two-dimensional model of valence and arousal. Of the 24 videos, 19 were found to occupy the same quadrant as the targeted emotions, indicating the effectiveness of these videos in evoking the targeted emotions. The remaining videos (5, 6, 9, 12, and 18) are located in different quadrants.
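The quadrant comparison behind the 79% figure can be sketched as follows. The quadrant layout follows Table 2 (I: positive valence, high arousal; II: negative valence, high arousal; III: negative valence, low arousal; IV: positive valence, low arousal). Using 3 as the neutral midpoint of the 1 to 5 scale and counting a mean exactly on the midpoint as "not high" are assumptions of this sketch, and the mean ratings in the example are invented placeholders, not the study's data.

```python
from typing import Dict, Tuple

MIDPOINT = 3.0  # assumed neutral point of the five-point Likert scale


def quadrant(valence: float, arousal: float, midpoint: float = MIDPOINT) -> str:
    """Map a (valence, arousal) mean rating to a quadrant label I to IV."""
    high_valence = valence > midpoint
    high_arousal = arousal > midpoint
    if high_valence and high_arousal:
        return "I"
    if not high_valence and high_arousal:
        return "II"
    if not high_valence and not high_arousal:
        return "III"
    return "IV"


def hit_rate(expected: Dict[int, str], mean_ratings: Dict[int, Tuple[float, float]]) -> float:
    """Fraction of videos whose mean rating falls in the expected quadrant."""
    hits = sum(1 for vid, q in expected.items() if quadrant(*mean_ratings[vid]) == q)
    return hits / len(expected)


# Expected quadrants from Table 2; the mean ratings below are hypothetical.
expected = {1: "I", 3: "II", 6: "IV", 9: "III"}
hypothetical_means = {1: (4.3, 3.6), 3: (1.8, 4.1), 6: (2.6, 2.2), 9: (3.4, 3.2)}

print(hit_rate(expected, hypothetical_means))  # 0.5 for these placeholder values
print(round(100 * 19 / 24))                    # 79, the share reported in the study
```

A mean that sits on the midpoint, as described for Video 5 below, is the case where the choice of boundary rule decides the quadrant, so that rule should be fixed before the ratings are inspected.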

Video 5 was expected to elicit amusement, which is in the first quadrant. However, the mean location for this video lies between the second and third quadrants. Based on the Mann-Whitney U test, the significant differences between nationalities contributed to this result.

Video 6 was expected to evoke calm and relaxed emotions. However, this video failed to evoke those emotions because, based on the participants’ ratings, most of them felt negative-valence emotions (i.e., sad or bored) while watching the video. The Mann-Whitney U test for nationality and the Kruskal-Wallis H test for race reported significant differences for this video.

Video 9 is an excerpt from the movie “Frozen” that highlights sad emotions: Anna feels ignored by Elsa, and the sisters receive news of their parents’ death. However, based on the mean location of this video, it instead evoked positive emotions in the first quadrant. The Kruskal-Wallis H test shows that race differences contributed to this observation.

Videos 12 and 18 each tell a story of love between parents and their children. The storylines of these two videos were expected to evoke sad emotions (third quadrant). However, Fig. 9 shows that both videos are located in the second quadrant based on the participants’ ratings. As with the videos discussed above, the Mann-Whitney U and Kruskal-Wallis H tests showed that differences in nationality and race contributed to this finding.

The diversity of opinions and perspectives allows differences to arise between the expected emotion and the emotion actually felt by the participants. Furthermore, factors such as nationality and race, which are closely related to culture and environment, are probably among the reasons for these five videos’ misclassification. Additionally, Gabert-Quillen et al. [16] pointed out that familiarity with the stimulus can influence emotional evaluation. Participants tend to react positively to videos they have watched in the past. For example, Video 9, taken from a popular source, drew positive emotions instead of the targeted sad emotion.
Moreover, the use of a dimensional emotion model may have confused some participants due to misunderstandings of the terminology, leading to errors in the annotations of valence and arousal. Discrete emotional keywords may provide a better classification of emotions based on standard emotion terms (e.g., happy, sad, or angry) [27]. In addition, since the study took up to 60 minutes, the participants may have become tired and unable to give their full attention to the videos, which may have biased their ratings [2, 55].


Implications of the Study
This study proposed a new set of stimulus material that includes Asian clips to suit participants of various Asian ethnicities and nationalities in eliciting human emotion. The results show that most of the proposed videos (79%) evoked the targeted emotions.
Furthermore, the findings of this research support the claim by Kiriya et al. [47] that an online validation study can elicit emotions and obtain results similar to those of a laboratory-controlled experiment. In addition, the method employed in this study proved to be an effective and viable way to collect valuable data during a pandemic. It successfully captured authentic emotions in an uncontrolled environment (i.e., in the wild).
Gender differences in emotional studies have been explored previously. Interestingly, the findings of this study showed no significant differences between the two genders (women and men) on the stimuli used. These outcomes indirectly contributed to the proposal of a set of gender-neutral videos that can be used by researchers who focus on gender-related content when selecting stimulus materials for emotion elicitation experiments.
In addition, this study confirms the outcomes of previous studies [16, 38, 45] in which different cultural environments affect the subject’s perspective and emotions. It is also found that race and nationality are vital factors that affect human emotions. These two factors are closely related to the existence of cultural differences such that, to some extent, they change the human perspective in viewing something. This difference can be seen in the Mann-Whitney U and Kruskal-Wallis H tests for the race and nationality analyses, as presented in the previous section.
Overall, this study successfully identified and validated 24 videos, including videos of Asian origin, as stimuli for evoking the desired emotions. Furthermore, the set of stimulus material of this study is also suitable for eliciting emotions in participants from Asia.
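As an illustration of the rank-based tests mentioned above, the following is a minimal sketch of how the Mann-Whitney U and Kruskal-Wallis H statistics can be computed from rating data. It is not the analysis pipeline used in this study; the function names and the example ratings are hypothetical, no tie correction is applied to H, and p-values would still have to be obtained from a statistics package.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return the smaller U statistic for two independent groups."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


def kruskal_wallis_h(*groups):
    """Return the H statistic (without tie correction) for two or more groups."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum * rank_sum / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical valence ratings for one video, split by a demographic factor.
group_a = [1, 2, 2]
group_b = [2, 3, 4]
u_stat = mann_whitney_u(group_a, group_b)
h_stat = kruskal_wallis_h(group_a, group_b)
```

The Mann-Whitney U test compares two groups, while the Kruskal-Wallis H test extends the same rank-sum idea to more than two groups, which is why both appear in the demographic analyses.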

Limitations and Future Works
Some limitations were found throughout the study and should be addressed to achieve better results. The first limitation is the use of videos directly from YouTube. The video’s title probably primes the emotion of the participants before they watch the video. This factor affects the participants’ emotional state during the study and compromises the video’s credibility in evoking the targeted emotions. To address this problem, the videos should not have any titles or captions so that pure emotions are elicited without being influenced by these factors.
The second limitation is based on the participants’ feedback; some of them found that it was not easy to understand the meaning of valence and arousal used in the survey. Nonetheless, the discrete emotion keywords in Fig. 1 helped them during the survey. This feedback indicates that, despite its claimed benefits, the dimensional emotion model has disadvantages because participants are more familiar with discrete emotion keywords such as happy, sad, or angry. Therefore, the use of discrete emotional keywords in emotional studies should be considered.
The third limitation is the one-hour survey period, which may have contributed to the low number of participants. In addition, the long survey period caused participants to become tired and lose focus towards the end of the survey, leading to a bias towards specific ratings. According to previous studies, the average attention span (i.e., the amount of time an individual can focus continuously on a task without interruption) is between 20 and 30 minutes [55]. In traditional approaches, the participants are allowed to rest in the middle of the survey and continue after the break. Although it is difficult to implement in online surveys, this break time should be considered in future studies to reduce fatigue and keep participants focused. It is suggested that an online survey be accompanied by a timestamp recording for each video so that researchers can detect whether the participants take a break. Thus, the effect of resting in the middle of the survey could also be explored in future studies.
Finally, due to the social restrictions set to curb the COVID-19 pandemic, the researchers were unable to validate the videos used in the study with experts. Instead, the researchers themselves fully determined the targeted emotions. This choice might be a problem, as the researchers’ emotional annotations might be less precise than the actual emotional annotations. Therefore, expert validation for each video should be addressed in future work.
In the future, more engaging stimulus materials such as computer games should be considered. Computer games are known to affect the cognition and emotional state of the players [56, 57]. Additionally, the authors of [58] suggested that dynamic changes in the visual appearance of movie clips, such as brightness, colors, and contrast, can further stimulate and intensify the emotions of viewers. This could contribute to producing a better set of stimuli and should be investigated.
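The timestamp suggestion above can be sketched as follows. This is only an illustrative example under stated assumptions: the log format (video ID, playback start time, rating submission time), the function name `detect_breaks`, and the five-minute threshold are all hypothetical and were not part of this study.

```python
from datetime import datetime, timedelta

# Hypothetical per-participant log: (video_id, playback_started, rating_submitted).
log = [
    (1, "2021-01-10 10:00:00", "2021-01-10 10:03:10"),
    (2, "2021-01-10 10:03:30", "2021-01-10 10:06:45"),
    (3, "2021-01-10 10:21:00", "2021-01-10 10:24:05"),
]


def detect_breaks(log, threshold=timedelta(minutes=5)):
    """Return (previous_video, next_video, gap) for every gap above the threshold.

    The gap is measured from the rating submission of one video to the
    playback start of the next one. The threshold is an assumed value.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    breaks = []
    for (prev_id, _, prev_end), (next_id, next_start, _) in zip(log, log[1:]):
        gap = datetime.strptime(next_start, fmt) - datetime.strptime(prev_end, fmt)
        if gap > threshold:
            breaks.append((prev_id, next_id, gap))
    return breaks


breaks = detect_breaks(log)
```

With such a record, ratings given before and after a detected break could be compared, which is one possible way to explore the effect of resting in the middle of the survey.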


In conclusion, the objective of this pilot study was to identify and validate a new set of video stimuli, including clips from Asian countries. The stimuli were expected to elicit specific emotional reactions in a two-dimensional model of valence and arousal. The study was conducted during the COVID-19 pandemic. Since the traditional elicitation method (physical lab session) was not encouraged due to the movement control order set by the government, 24 videos were shown to 42 participants through online surveys. Seventy-nine percent of the videos successfully evoked the targeted emotions. This result indicates that the selected videos are a good set of stimuli even in the wild (i.e., in an uncontrolled environment).
Additionally, this study successfully identified a set of gender-neutral videos that could be used in inducing specific emotions. The results of the demographic analysis emphasize the significance of considering factors such as age, gender, race, nationality, and residence in the study of human emotions because these factors were found to affect human emotions and perspectives. This new set of stimuli is believed to serve as a helpful resource for further emotional studies, mainly those focusing on Asian participants. The data from this study are available upon request in agreement with the participants.


The authors would like to thank the participants who participated in this experiment.

Author’s Contributions

Conceptualization, SNMSI, NAAA, SZI. Funding acquisition, NAAA. Investigation and methodology, SNMSI, CTK, MAR. Supervision, NAAA, SZI. Writing of the original draft, SNMSI. Writing of the review and editing, SNMSI, NAAA, SZI. Data curation, SNMSI, CTK.


This project is funded by TM Research & Development Grant (No. RDTC/190988) which is awarded to the Multimedia University.

Competing Interests

The authors declare that they have no competing interests.

Author Biography

Affiliation: Faculty of Information Science & Technology, Multimedia University, Melaka, Malaysia
Biography: Sharifah Noor Masidayu Sayed Ismail received the B.IT. degree in data communication and networking from Multimedia University, Melaka, Malaysia, in 2018. She is currently pursuing the M.Sc. degree in information technology at the same university. Her research interest includes affective computing, signal processing, and machine learning.

Affiliation: Faculty of Engineering & Technology, Multimedia University, Melaka, Malaysia
Biography: Nor Azlina Ab Aziz graduated with B.Eng.(Hons) Electronics majoring in Computer and M.Eng.Sc from Multimedia University, Malaysia and Ph.D. degree from University of Malaya, Malaysia. Both her doctoral and master research works were in the field of computational intelligence. She is currently a senior lecturer in the Faculty of Engineering & Technology, Multimedia University, Melaka, Malaysia. She is also the chairperson for Multimedia University’s Centre for Engineering Computational Intelligence. She is one of the inventors of SAFIRO and simulated Kalman filter (SKF) algorithm, which are optimization algorithms based on estimator framework. Her research interest includes the fundamental aspects and applications of computational intelligence in engineering.

Affiliation: Faculty of Information Science & Technology, Multimedia University, Melaka, Malaysia
Biography: Siti Zainab Ibrahim received B.Eng.(Hons) Electronics majoring in Computer from Multimedia University, Malaysia. She obtained her Master in Information Technology from Melbourne University, Australia. She is currently pursuing a PhD at Universiti Teknologi Malaysia. Prior to that, she spent a few years as a research scholar at University College London Interaction Center, University College London, United Kingdom. She is currently the co-chairperson of Center for Intelligent Cloud Computing of Multimedia University. Her research interest includes affective computing, connected vehicles, and privacy calculus phenomenon in social media.

Affiliation: Faculty of Engineering & Technology, Multimedia University, Melaka, Malaysia
Biography: Chy Mohammed Tawsif Khan received B.Sc. in Computer Science and Engineering from International Islamic University Chittagong (IIUC), Bangladesh and M. Eng. Sc. in Soft Computing from Multimedia University (MMU), Malaysia. Currently, he is pursuing Ph.D. in Emotion Recognition System using Deep Learning from ECG and PPG data. His research interest includes Machine Learning, Data Analytics and IoT for real life applications.

Affiliation: Faculty of Engineering & Technology, Multimedia University, Melaka, Malaysia
Biography: Md. Armanur Rahman received B.Sc. degree in computer science and engineering from Asian University of Bangladesh (AUB) and M.Eng.Sc. in Big Data and Machine Learning from Multimedia University (MMU), Malaysia. He is now pursuing a Ph.D. in Facial Emotion Recognition using Deep Learning at Multimedia University (MMU), Malaysia. His research interest includes image processing, performance optimization of big data system, database management and data mining.


[1] A. Ghali and M. Kurdy, “Emotion recognition using facial expression analysis,” Journal of Theoretical and Applied Information Technology, vol. 96, no. 18, pp. 6117-6129, 2018.
[2] S. Katsigiannis and N. Ramzan, “DREAMER: a database for emotion recognition through EEG and ECG signals from wireless low-cost off-the-shelf devices,” IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 1, pp. 98-107, 2018.
[3] D. Pollreisz and N. TaheriNejad, “A simple algorithm for emotion recognition, using physiological signals of a smart watch,” in Proceedings of 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Jeju, Korea, 2017, pp. 2353-2356.
[4] R. Rakshit, V. R. Reddy, and P. Deshpande, “Emotion detection and recognition using HRV features derived from photoplethysmogram signals,” in Proceedings of the 2nd workshop on Emotion Representations and Modelling for Companion Systems, Tokyo, Japan, 2016, pp. 1-6.
[5] T. A. Khan, S. Abbas, A. Ditta, M. A. Khan, H. Alquhayz, A. Fatima, and M. F. Khan, “IoMT-based smart monitoring hierarchical fuzzy inference system for diagnosis of covid-19,” Computers, Materials & Continua, vol. 65, no. 3, pp. 2591-2605, 2020.
[6] M. Shorfuzzaman and M. Masud, “On the detection of covid-19 from chest x-ray images using CNN-based transfer learning,” Computers, Materials & Continua, vol. 64, no. 2, pp. 1359-1381, 2020.
[7] A. E. Azzaoui, S. K. Singh, and J. H. Park, “SNS big data analysis framework for COVID-19 outbreak prediction in smart healthy city,” Sustainable Cities and Society, vol. 71, article no. 102993, 2021. https://doi.org/10.1016/j.scs.2021.102993
[8] M. A. Khan, S. Abbas, K. M. Khan, M. A. Al Ghamdi, and A. Rehman, “Intelligent forecasting model of COVID-19 novel coronavirus outbreak empowered with deep extreme learning machine,” Computers, Materials & Continua, vol. 64, no. 3, pp. 1329-1342, 2020.
[9] J. S. Novotny, J. P. Gonzalez-Rivas, S. Kunzova, M. Skladana, A. Pospisilova, A. Polcrova, J. R. Medina-Inojosa, F. Lopez-Jimenez, Y. E. Geda, and G. B. Stokin, “Risk factors underlying COVID-19 lockdown-induced mental distress,” Frontiers in Psychiatry, vol. 11, article no. 603014, 2020. https://doi.org/10.3389/fpsyt.2020.603014
[10] S. Singh, M. D. Roy, K. Sinha, S. Parveen, G. Sharma, and G. Joshi, “Impact of COVID-19 and lockdown on mental health of children and adolescents: a narrative review with recommendations,” Psychiatry Research, vol. 293, article no. 113429, 2020. https://doi.org/10.1016/j.psychres.2020.113429
[11] P. Ekman, “An argument for basic emotions,” Cognition & Emotion, vol. 6, no. 3-4, pp. 169-200, 1992.
[12] J. A. Russell, “A circumplex model of affect,” Journal of Personality and Social Psychology, vol. 39, no. 6, pp. 1161-1178, 1980.
[13] A. F. Bulagang, N. G. Weng, J. Mountstephens, and J. Teo, “A review of recent approaches for emotion classification using electrocardiography and electrodermography signals,” Informatics in Medicine Unlocked, vol. 20, article no. 100363, 2020. https://doi.org/10.1016/j.imu.2020.100363
[14] A. Di Crosta, P. La Malva, C. Manna, A. Marin, R. Palumbo, M. C. Verrocchio, M. Cortini, N. Mammarella, and A. Di Domenico, “The Chieti Affective Action Videos database, a resource for the study of emotions in psychology,” Scientific Data, vol. 7, article no. 32, 2020. https://doi.org/10.1038/s41597-020-0366-1
[15] J. M. Garcia-Garcia, V. M. Penichet, and M. D. Lozano, “Emotion detection: a technology review,” in Proceedings of the XVIII International Conference on Human Computer Interaction, Cancun, Mexico, 2017, pp. 1-8.
[16] C. A. Gabert-Quillen, E. E. Bartolini, B. T. Abravanel, and C. A. Sanislow, “Ratings for emotion film clips,” Behavior Research Methods, vol. 47, no. 3, pp. 773-787, 2015.
[17] Y. Michelini, I. Acuna, J. I. Guzman, and J. C. Godoy, “LATEMO-E: a film database to elicit discrete emotions and evaluate emotional dimensions in Latin-Americans,” Trends in Psychology, vol. 27, pp. 473-490, 2019.
[18] Z. Tong, X. Chen, Z. He, K. Tong, Z. Fang, and X. Wang, “Emotion recognition based on photoplethysmogram and electroencephalogram,” in Proceedings of 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC), Tokyo, Japan, 2018, pp. 402-407.
[19] R. M. Mehmood and H. J. Lee, “A novel feature extraction method based on late positive potential for emotion recognition in human brain signal patterns,” Computers & Electrical Engineering, vol. 53, pp. 444-457, 2016.
[20] G. Udovicic, J. Derek, M. Russo, and M. Sikora, “Wearable emotion recognition system based on GSR and PPG signals,” in Proceedings of the 2nd International Workshop on Multimedia for Personal Health and Health Care, Mountain View, CA, 2017, pp. 53-59.
[21] R. Ramirez and Z. Vamvakousis, “Detecting emotion from EEG signals using the Emotiv EPOC device,” in Brain Informatics. Heidelberg, Germany: Springer, 2012, pp. 175-184.
[22] J. Luaute, A. Dubois, L. Heine, C. Guironnet, A. Juliat, V. Gaveau, B. Tillmann, and F. Perrin, “Electrodermal reactivity to emotional stimuli in healthy subjects and patients with disorders of consciousness,” Annals of Physical and Rehabilitation Medicine, vol. 61, no. 6, pp. 401-406, 2018.
[23] A. Schaefer, F. Nils, X. Sanchez, and P. Philippot, “Assessing the effectiveness of a large database of emotion-eliciting films: a new tool for emotion researchers,” Cognition and Emotion, vol. 24, no. 7, pp. 1153-1172, 2010.
[24] S. Siddharth, T. P. Jung, and T. J. Sejnowski, “Utilizing deep learning towards multi-modal bio-sensing and vision-based affective computing,” IEEE Transactions on Affective Computing, 2019. https://doi.org/10.1109/TAFFC.2019.2916015
[25] M. K. Uhrig, N. Trautmann, U. Baumgartner, R. D. Treede, F. Henrich, W. Hiller, and S. Marschall, “Emotion elicitation: a comparison of pictures and films,” Frontiers in Psychology, vol. 7, article no. 180, 2016. https://doi.org/10.3389/fpsyg.2016.00180
[26] T. L. Gilman, R. Shaheen, K. M. Nylocks, D. Halachoff, J. Chapman, J. J. Flynn, L. M. Matt, and K. G. Coifman, “A film set for the elicitation of emotion in research: a comprehensive catalog derived from four decades of investigation,” Behavior Research Methods, vol. 49, no. 6, pp. 2061-2082, 2017.
[27] S. Koelstra, C. Muhl, M. Soleymani, J. S. Lee, A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, and I. Patras, “DEAP: a database for emotion analysis; using physiological signals,” IEEE Transactions on Affective Computing, vol. 3, no. 1, pp. 18-31, 2012.
[28] J. J. Gross and R. W. Levenson, “Emotion elicitation using films,” Cognition & Emotion, vol. 9, no. 1, pp. 87-108, 1995.
[29] W. L. Zheng and B. L. Lu, “Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks,” IEEE Transactions on Autonomous Mental Development, vol. 7, no. 3, pp. 162-175, 2015.
[30] T. Song, W. Zheng, C. Lu, Y. Zong, X. Zhang, and Z. Cui, “MPED: a multi-modal physiological emotion database for discrete emotion recognition,” IEEE Access, vol. 7, pp. 12177-12191, 2019.
[31] P. Lakhan, N. Banluesombatkul, V. Changniam, R. Dhithijaiyratn, P. Leelaarporn, E. Boonchieng, S. Hompoonsup, and T. Wilaiprasitporn, “Consumer grade brain sensing for emotion recognition,” IEEE Sensors Journal, vol. 19, no. 21, pp. 9896-9907, 2019.
[32] V. M. Joshi and R. B. Ghongade, “IDEA: intellect database for emotion analysis using EEG signal,” Journal of King Saud University-Computer and Information Sciences, 2020. https://doi.org/10.1016/j.jksuci.2020.10.007
[33] A. Raheel, M. Majid, and S. M. Anwar, “DEAR-MULSEMEDIA: dataset for emotion analysis and recognition in response to multiple sensorial media,” Information Fusion, vol. 65, pp. 37-49, 2021.
[34] Y. L. Hsu, J. S. Wang, W. C. Chiang, and C. H. Hung, “Automatic ECG-based emotion recognition in music listening,” IEEE Transactions on Affective Computing, vol. 11, no. 1, pp. 85-99, 2020.
[35] M. K. Abadi, R. Subramanian, S. M. Kia, P. Avesani, I. Patras, and N. Sebe, “DECAF: MEG-based multimodal database for decoding affective physiological responses,” IEEE Transactions on Affective Computing, vol. 6, no. 3, pp. 209-222, 2015.
[36] M. Soleymani, J. Lichtenauer, T. Pun, and M. Pantic, “A multimodal database for affect recognition and implicit tagging,” IEEE Transactions on Affective Computing, vol. 3, no. 1, pp. 42-55, 2012.
[37] R. Subramanian, J. Wache, M. K. Abadi, R. L. Vieriu, S. Winkler, and N. Sebe, “ASCERTAIN: emotion and personality recognition using commercial sensors,” IEEE Transactions on Affective Computing, vol. 9, no. 2, pp. 147-160, 2018.
[38] S. Alghowinem, S. Alghuwinem, M. Alshehri, A. Al-Wabil, R. Goecke, and M. Wagner, “Design of an emotion elicitation framework for Arabic speakers,” in Human-Computer Interaction. Cham, Switzerland: Springer, 2014, pp. 717-728.
[39] K. Nelson-Field, E. Riebe, and K. Newstead, “The emotions that drive viral video,” Australasian Marketing Journal, vol. 21, no. 4, pp. 205-211, 2013.
[40] J. A. Healey, “Affect detection in the real world: recording and processing physiological signals,” in Proceedings of 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, Amsterdam, Netherlands, 2009, pp. 1-6.
[41] J. Chen, C. Wang, K. Wang, C. Yin, C. Zhao, T. Xu, X. Zhang, Z. Huang, M. Liu, and T. Yang, “HEU Emotion: a large-scale database for multimodal emotion recognition in the wild,” Neural Computing and Applications, vol. 33, pp. 8669-8685, 2021. https://doi.org/10.1007/s00521-020-05616-w
[42] A. Dhall, R. Goecke, S. Lucey, and T. Gedeon, “Acted facial expressions in the wild database,” Australian National University, Canberra, Australia, Technical Report TR-CS-11-02, 2011.
[43] X. Peng, Z. Xia, L. Li, and X. Feng, “Towards facial expression recognition in the wild: a new database and deep recognition system,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Las Vegas, NV, 2016, pp. 1544-1550.
[44] S. Carvalho, J. Leite, S. Galdo-Alvarez, and O. F. Gonçalves, “The emotional movie database (EMDB): a self-report and psychophysiological study,” Applied Psychophysiology and Biofeedback, vol. 37, no. 4, pp. 279-294, 2012.
[45] W. Sato, M. Noguchi, and S. Yoshikawa, “Emotion elicitation effect of films in a Japanese sample,” Social Behavior and Personality: An International Journal, vol. 35, no. 7, pp. 863-874, 2007.
[46] D. Hazer, X. Ma, S. Rukavina, S. Gruss, S. Walter, and H. C. Traue, “Emotion elicitation using film clips: effect of age groups on movie choice and emotion rating,” in HCI International 2015 - Posters’ Extended Abstracts. Cham, Switzerland: Springer, 2015, pp. 110-116.
[47] J. Kiriya, P. Edwards, and I. Roberts, “Effect of emotional content on online video sharing among health care professionals and researchers (DIFFUSION): results and lessons learnt from a randomised controlled trial,” BMJ Open, vol. 8, article no. e019419, 2018. https://doi.org/10.1136/bmjopen-2017-019419
[48] J. Y. Zhu, W. L. Zheng, Y. Peng, R. N. Duan, and B. L. Lu, “EEG-based emotion recognition using discriminative graph regularized extreme learning machine,” in Proceedings of 2014 International Joint Conference on Neural Networks (IJCNN), Beijing, China, 2014, pp. 525-532.
[49] R. Likert, “A technique for the measurement of attitudes,” Archives of Psychology, vol. 22, no. 140, pp. 5-55, 1932.
[50] S. Shapiro and M. Wilk, “An analysis of variance test for normality,” Biometrika, vol. 52, no. 3, pp. 591-611, 1965.
[51] S. Glen, “Shapiro-Wilk test: what it is and how to run it,” 2014 [Online]. Available: https://www.statisticshowto.com/shapiro-wilk-test/.
[52] J. Frost, “Nonparametric tests vs. parametric tests,” 2020 [Online]. Available: https://statisticsbyjim.com/hypothesis-testing/nonparametric-parametric-tests/.
[53] E. Huizingh, “Non-parametric tests,” in Applied Statistics with SPSS. Thousand Oaks, CA: Sage Publications, 2012, pp. 319-345.
[54] A. Ogee, M. Ellis, B. Scibilia, and C. Pammer, “What can you say when your p-value is greater than 0.05?,” 2015 [Online]. Available: https://blog.minitab.com/en/understanding-statistics/what-can-you-say-when-your-p-value-is-greater-than-005.
[55] A. Kohn, “Brain science: focus – can you pay attention?,” 2014 [Online]. Available: https://learningsolutionsmag.com/articles/1440/brain-science-focuscan-you-pay-attention.
[56] G. U. Navalyal and R. D. Gavas, “A dynamic attention assessment and enhancement tool using computer graphics,” Human-centric Computing and Information Sciences, vol. 4, article no. 11, 2014. https://doi.org/10.1186/s13673-014-0011-0
[57] F. Pallavicini, A. Ferrari, and F. Mantovani, “Video games for well-being: a systematic review on the application of computer games for cognitive and emotional training in the adult population,” Frontiers in Psychology, vol. 9, article no. 2127, 2018. https://doi.org/10.3389/fpsyg.2018.02127
[58] D. Kim, Y. Cho, and K. S. Park, “Comparative analysis of affective and physiological responses to emotional movies,” Human-centric Computing and Information Sciences, vol. 8, article no. 15, 2018. https://doi.org/10.1186/s13673-018-0138-5

About this article
Cite this article

Sharifah Noor Masidayu Sayed Ismail, Nor Azlina Ab. Aziz, Siti Zainab Ibrahim, Chy Tawsif Khan, and Md. Armanur Rahman, Selecting Video Stimuli for Emotion Elicitation via Online Survey, Article number: 11:36 (2021)

  • Received: 19 March 2021
  • Accepted: 5 September 2021
  • Published: 30 September 2021