Human-centric Computing and Information Sciences volume 12, Article number: 48 (2022)
This paper describes two empirical research studies that investigated how to improve naïve users’ mental models to support end-user development (EUD) of Internet-of-Things (IoT). Specifically, we intended to evaluate the effectiveness of two different strategies, namely nudging and informing, to support trigger-action (TA) rule programming. To this aim, we analyzed non-expert users’ performance and their verbal reports (Studies 1 and 2, respectively) in a task requiring the identification of the outcomes of the execution of specific sets of TA rules in different IoT scenarios. The triggering part of TA rules typically involves instantaneous and/or protracted events, and previous studies have shown that users’ poor understanding of the distinction between these two types of events, as well as of the way in which the rules interact with each other, can result in poor TA programming performance. The first (experimental and quantitative) study shows that a nudging strategy (i.e., using two different temporal conjunctions, WHEN and WHILE, to introduce the rules’ triggering conditions that refer to the two types of events instead of using the more common and generic IF) improves participants’ understanding of the rules’ behavior. It also provides some evidence that an informing strategy (i.e., providing participants with an explicit description of how the rules are evaluated and activated) can improve participants’ accuracy in identifying the rules that did not realize the desired situation. The second (observational and qualitative) study suggests that the use of WHEN and WHILE in the triggering part of the rule helps participants distinguish the two types of events and understand their semantics. This work extends the current literature in EUD by providing both critical information about users’ mental models in IoT and useful suggestions to make appropriate (linguistic and structural) choices when designing the interface that guides users in defining the rules.
End-User Development, Internet of Things, Trigger Action Programming, Human-Computer Interaction, Human Factors in Computing Systems
As the Internet of Things (IoT) is pushing for digitalizing everyday objects, it becomes increasingly important to explore new means for users to control sensors and devices. End-user development (EUD) is defined as the possibility for people without programming experience to create or modify their applications. In this respect, it provides an interesting approach to dealing with the IoT. “Smarter” objects are often less easily accepted by users, and the possibility for naïve, non-expert users to actively control them might be a key to acceptance. While user-centered design advocates for users’ involvement in the design phases, EUD calls for empowering users beyond these phases and proposes that design, learning, and development are inherent parts of the technology in use [6, 7].
The effectiveness of EUD in the context of IoT-based smart devices has been well demonstrated by the success of initiatives like IFTTT. This popular web-based service allows users to create conditional statements triggered by changes in either devices or web apps. This metaphor is readily applicable to IoT since IoT devices are usually either sensors that detect events in the world or actuators that operate changes in the world (or both). The IFTTT approach is an example of a programming approach based on contextual rules that have evolved into so-called trigger-action programming (TAP). A trigger-action (TA) rule takes the specific form of an action that is performed upon the occurrence of an event. Several commercial tools use a similar approach, for example, Amazon's Alexa with the so-called Alexa Routines.
Indeed, programming is complicated because it often requires expressing solutions in ways that are not familiar to non-experts . The concept of TA rule provides an intelligible metaphor for the programming of digital technologies because it embeds the idea that specific actions must be taken in specific situations .
However, the simplicity of this event-action paradigm is also its limitation. In a study conducted with over 300 MTurk workers, the authors collected 1,590 trigger-action programs in the domain of a smart home. Their analysis revealed that 77.9% of program behaviors could be expressed with rules involving single triggers and single actions, but 16.9% required multiple triggers and possibly multiple actions (the remaining 5.2% required a single trigger but multiple actions). To allow effective programming of IoT devices, people need more expressive triggering conditions and more elaborate actions than those provided by common single-trigger, single-action TA rules.
Actually, several research prototypes [2, 13–15] and some commercial tools (e.g., SmartThings and SharpTools [16, 17]) permit complex triggering conditions, with multiple triggering events, and multiple actions. Indeed, the TAP paradigm inherits from the so-called ECA (event-condition-action) rules that expert programmers use as a framework for effective control of databases [18, 19] and workflows [20, 21]. In the ECA rules, the condition part can be quite complex (i.e., it may not be limited to the check of the occurrence of an event), and the action part can have the form of an elaborate routine. That allows effective control of the flow of operations while maintaining a fully expressive programming power [21, 22].
However, understanding complicated conditions is problematic for end-users [23, 24]. When the triggering conditions become more complex, the simplicity of the rule-based metaphor drastically diminishes, and users are more prone to errors. The inaccurate composition of events is among the most common errors [25, 26].
Some authors have tried to introduce a simplified version (with a fixed, simple structure) of the condition part of the ECA rules in TA rules. For example, Truong et al.  suggest limiting the condition to a syntactical specification of the location (WHERE) in which the event should take place in order for the action to be executed; similarly, the tool EFESTO-5W  only supports the specification of temporal (WHEN) and/or spatial (WHERE) aspects concerning the event in the triggering condition. In the present study, we aim to investigate the effectiveness of an approach that provides for constraining the condition part of TA rules, while allowing a richer expressivity than that of the “structural” approaches described above. In doing so, we focus on the difference between two types of events (i.e., instantaneous vs. protracted events) and propose a specific linguistic frame to nudge users to understand and use this distinction.
The distinction between these types of events is grounded in the semantics of natural language and often codified in lexical choices [29, 30]. Events are often specifically conceptualized as properties of moments. Instantaneous events are timeless (for example, “to catch a flu”). In contrast, protracted events have a duration (for example, “the presidential campaign”) but they are characterized by undefined or fuzzy time boundaries .
In the field of EUD programming for IoT environments, this distinction has been examined by Huang and Cakmak , who called instantaneous events simply “events” and protracted events “states.” They proposed that state-based programming might be exploited as an alternative to (or in combination with) event-driven programming. However, they also noted that this distinction might be problematic for the user to understand. Support for clarifying the distinction between events and states at the graphical interface level has been proposed but not further developed by Mattioli and Paterno .
We propose exploiting natural language to help users understand the difference between events and states (for the sake of simplicity, we used this terminology rather than instantaneous and protracted events).
Indeed, several languages use different conjunctions to introduce longer events (i.e., the “states” in our terminology, e.g., “while” in English, “während” in German, “mientras” in Spanish, and “mentre” in Italian) and short events (i.e., the “events” proper, e.g., “when” in English, “als” in German, “cuando” in Spanish, and “quando” in Italian). In several cases, these pairs of conjunctions can be used interchangeably. However, in multi-clause sentences, the while clause usually describes the longer event that represents the ground in which the shorter event (in the when clause) is interpreted .
Therefore, we propose to express TA rules in the form WHEN ＜event＞ WHILE ＜set of states＞ THEN ＜list of actions＞. The ＜event＞ part of the rule specifies a single event. The ＜set of states＞ part is a conjunction of logical propositions on the world that can be evaluated as true or false; the set can be empty (in other words, the WHILE part can be omitted). The ＜list of actions＞ part is a sequence of actions executed in the specified order, only if the conjunction of logical propositions holds when the specified event occurs.
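As a minimal sketch of how this rule form could be represented, the Python fragment below models a rule as one event name, a possibly empty list of state names, and an ordered list of actions. The `Rule` class, the dictionary-based world model, and the event, state, and action names are illustrative assumptions; they are not part of the system used in the studies.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical world model: state name -> whether the state currently holds.
World = Dict[str, bool]


@dataclass
class Rule:
    when: str                                  # the single triggering event
    then: List[Callable[[World], None]]        # actions, executed in order
    while_states: List[str] = field(default_factory=list)  # may be empty

    def fire(self, event: str, world: World) -> bool:
        """Run the actions only if `event` matches WHEN and all WHILE states hold."""
        if event != self.when:
            return False
        if not all(world.get(state, False) for state in self.while_states):
            return False
        for action in self.then:
            action(world)
        return True


def turn_on_light(world: World) -> None:
    # Illustrative action: record that the light is now on.
    world["light_on"] = True


# WHEN user_enters_room WHILE it_is_dark THEN turn_on_light
rule = Rule(when="user_enters_room",
            while_states=["it_is_dark"],
            then=[turn_on_light])

world = {"it_is_dark": True, "light_on": False}
fired = rule.fire("user_enters_room", world)
```

Leaving `while_states` empty corresponds to omitting the WHILE part: `all()` over an empty list is true, so the rule then depends on the event alone.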
As Huang and Cakmak suggested, we hypothesize that this structure for TA rules induces a more effective mental model of the system in naïve users. Indeed, the primary source of confusion in interacting with an artifact is due to users having a wrong or inaccurate mental model of the actual functioning of the system. Mental models are internal representations of (parts of) the world that explain and regulate how people interact with the world. A mental model of a complex artifact is a representation of the mechanism and working of the artifact that is developed by the user to make sense of the artifact itself and to effectively use it [34, 36]. The understanding of users’ mental models is critical for the comprehension of the interaction between users and the artifact.
A user’s mental model does not need to be complete and accurate, but it should represent the core mechanisms of the artifact. Proper design can (and should) implicitly induce effective mental models in the user of a given system. However, an adequate representation of how the system works can also be explicitly communicated. How the system is described to the users can have a strong impact on the users’ mental models of that system, and this, in turn, may result in different user-system interactions. For example, Halasz and Moran proposed two different, albeit both correct, descriptions of the functioning of a reverse-polish calculator to two groups of participants and showed that these descriptions led to different levels of performance. One aspect that often confuses non-programmers concerns how constructs are expressed in programming languages. For example, Pane and Myers noted that “then” is often interpreted as “afterward” instead of “in these conditions.” Therefore, when designing EUD systems, it is crucial to consider how naïve users interpret the language used to express the conditions in TA rules. We argue that the form “IF-THEN,” although supposedly simple, does not help naïve users understand the needed complexity of TA rules. In contrast, the “WHEN-WHILE-THEN” form might be more effective in suggesting the differences between events and states by nudging users toward a more effective mental model.
In a seminal work, Brackenbury et al. report several bugs, many of which can be related to the confusion between events and states. Such confusion alone obviously does not account for all the problems with the more complex forms of TA rules. Another relevant aspect is the proper understanding of the temporality of the rule mechanism, that is, the fact that the rules are cyclically applied. The lack of understanding of the cyclical mechanism gives rise to what is called the “repeated triggering” bug: when an action is conditioned on a state whose duration is longer than a single cycle, the rule can be triggered repeatedly and, therefore, the action is performed several times (for example, “IF I come within 1 mile of a pizza shop THEN order me a pizza” ends up ordering many pizzas). We suggest that naïve users might not be fully aware of the cyclical mechanism and may form an inaccurate mental model of the system. In this work, we propose that a more explicit description of this mechanism should be provided to users, rather than simply describing the form of the rules (following the example by Halasz and Moran, discussed above). In our view, TAP systems do not (always) need to be walk-up-and-use tools, and a (possibly short) learning phase is beneficial and often inevitable. Accordingly, it is critical to understand how to instruct users properly. This work is a first step in such a direction.
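The repeated-triggering bug can be illustrated with a small simulation, sketched here under the assumption that rules are re-evaluated once per discrete cycle. The function name `count_orders`, the boolean timeline, and the two evaluation modes are hypothetical; they only contrast a state-style reading of the trigger with an event-style reading and do not describe any specific TAP engine.

```python
def count_orders(timeline, mode):
    """Count how many times the action runs over successive evaluation cycles.

    timeline: one boolean per cycle, True while the user is within 1 mile
              of the pizza shop (a protracted condition, i.e., a state).
    mode:     "state" re-checks the condition at every cycle;
              "event" fires only on the cycle where the condition becomes true.
    """
    orders = 0
    previously_near = False
    for near in timeline:
        if mode == "state":
            triggered = near
        elif mode == "event":
            triggered = near and not previously_near
        else:
            raise ValueError(f"unknown mode: {mode}")
        if triggered:
            orders += 1  # the action: order a pizza
        previously_near = near
    return orders


# The user stays near the pizza shop for three consecutive cycles.
timeline = [False, True, True, True, False]

print(count_orders(timeline, "state"))  # 3: one pizza per cycle spent nearby
print(count_orders(timeline, "event"))  # 1: one pizza, on arrival only
```

In this sketch the only difference between the two modes is whether the previous cycle is taken into account, which is the piece of the mechanism that a description of the cyclical evaluation makes visible to users.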
In summary, although there is a wide agreement that better mental models of TAP can improve the effectiveness of EUD, there is not much evidence of how this can be done. In order to investigate this issue, we conducted two separate but strictly related studies. They were aimed to analyze naïve users’ behaviors in a task that required the identification of the outcomes of the execution of given sets of TA rules in different fictitious scenarios involving a TAP system that rules an automated “smart home.” In Study 1 we analyzed participants’ performance in this task, whereas in Study 2 we analyzed participants’ verbal reports while they were performing the task. We investigated the effectiveness of two different strategies to improve users’ mental models of either the TAP system or the specific user-system interactions involved in the different scenarios: (1) a nudging strategy consisting of a language-based manipulation of the TA rules and (2) an informing strategy that consists of clarifying the iterative operational nature of rule-based systems (i.e., the cyclical mechanism). The original contribution of our research lies in providing evidence that naïve users can indeed create more effective TAP mental models when these two strategies are used. In this respect, our results support and extend the recent literature in the field of EUD [23–28, 32].
IoT is a recent approach to infrastructure information technology that provides a framework to instantiate and leverage other emerging technologies, such as edge computing . IoT environments consist of a large set of resource-constrained devices (from simple sensors to smartphones) with independent identities that operate in orchestrated ways to accomplish large and pervasive tasks. Recently, the metaphor of social networks has been proposed to account for the communication complexity arising from IoT structures . IoT poses several problems both at technological and socio-technical levels . Among the latter type of problems, security, privacy, and trust issues are particularly important. They require not only new architectural and modelling approaches , but also new approaches to interact with end-users . EUD can be leveraged as a new framework for a more responsible use of IoT . Since its very beginning, EUD has been proposed as an approach for empowering users in their relationship with technology . More recently, it has been recognized that the distinction between developers and end-users is not straightforward since end-users interested in customizing their tools may range from totally naïve users (even children programming their toys ) to technical operators with high programming skills .
In recent years, several approaches have been proposed to allow end-users with different programming expertise to program heterogeneous sets of devices. These approaches can be classified according to (1) the extent to which they can be used in different domains; (2) their coverage of either the interactive or functional part of an application; and (3) the extent to which the implementation details are hidden.
Those approaches that mainly cover the functional aspects of programming often use a flow-based approach to model the structure of the task (e.g., ), while those focusing on the interactive part of the application often rely on an event-driven paradigm [4, 9, 11, 15, 45]. In several domains, the most important aspect for users is controlling the interaction with their devices; therefore, event-driven approaches are the most used in EUD . Flow-diagrams and event-driven approaches might be combined in a single tool (e.g., [46, 47]). Block-based programming  has been largely used for presenting either flow-diagrams or event-driven paradigms to end-users in IoT contexts  and has demonstrated good usability for non-programmers [4, 50].
In order to tackle more complex problems, the flow-diagrams approach has been recently extended in so-called skills-based programming. A skill is a composition of sensing and manipulation primitives that expert programmers define. Specific tasks can be composed by non-programmers applying the skills to their devices .
Other approaches include using natural language instructions [52, 53] and the combination of natural language instructions with block-based event-driven instructions . In order to support users in expressing instructions and representing the sensors and objects to which these instructions apply, techniques of augmented reality  and tangible interaction  have also been proposed.
In this work, we adopted an event-driven approach based on ECA rules that might be easily adapted to environments wherein these techniques are employed. We used an authoring tool similar to those presented in previous studies [32, 28, 56], but with specific attention to the language used to express the instructions (i.e., the ECA rules). We did not employ block-based programming, and, in this respect, our approach is close to those leveraging on natural language.
Study 1 aimed to evaluate two different hypotheses. In line with the evidence discussed above, we posited that expressing TA rules in a linguistic form that nudges the event/state distinction improves the understanding of the effects of the rules (Hypothesis 1). Specifically, we proposed the following format: WHEN ＜event＞ WHILE ＜set of states＞ THEN ＜list of actions＞. The ＜event＞ part of the rule specified a single event. The ＜set of states＞ part was a conjunction of logical propositions on the world that could be evaluated as true or false; the set could be empty (in other words, the WHILE part could be omitted). The ＜list of actions＞ part was a sequence of actions executed, in the specified order, only if the conjunction of logical propositions held true when the specified event occurred. In order to evaluate this hypothesis, we planned to compare participants’ performance when they had to deal with rules in the WHEN/WHILE/THEN format with that observed when they faced rules in the (more common) IF/THEN format.
We also posited that an explicit description of the cyclical mechanism of evaluation and activation of the rules might prevent some of the bugs in TAP from occurring (Hypothesis 2). In particular, we hypothesized that this description should prevent bugs related to temporality. Accordingly, we decided to compare two descriptions of how a TAP system works: a richer description in which the cyclical nature of the rules’ evaluation-and-activation mechanism was made explicit (i.e., a depiction of the system communicating a Computational model of this system) and a simpler description of the possible rule structures and of the possible combinations of logical propositions within the rules (i.e., a depiction of the system communicating a Descriptive model of this system).
With the aim of assessing these two hypotheses, Study 1 was designed as a controlled experiment in which participants were presented with eight scenarios, each describing an intended goal, together with a set of rules which were supposed to achieve that goal. In some cases, the rules were correct (i.e., they correctly achieved the intended goal). In contrast, in other cases, they were “buggy” (i.e., conditions described in the scenario did not activate the rules, or their activation had outcomes other than those intended). For each scenario, the participants had to assess whether the rules were correct or not and express their confidence in their assessment.
The experiment had two between-participants conditions (Computational vs. Descriptive depictions of the rules’ evaluation and activation mechanism) and two within-participants conditions (WHEN-WHILE-THEN vs. IF-THEN structures of the rules). The results of a smaller study, which was used as a pilot for the present one, have been published by Gallitto et al.
Following our hypotheses, we expected that mental models induced by the WHEN-WHILE-THEN rule structure and by the Computational description were more likely to correctly represent the distinction between events and states and the changes of the rule triggering conditions over time (i.e., the temporality of the TAP system), respectively. That, in turn, should result in more accurate performances. Specifically, following Hypothesis 1, we expected participants to be more accurate when facing rules with the WHEN-WHILE-THEN structure than when they dealt with the IF-THEN structure. Furthermore, according to Hypothesis 2, we expected better performances in the case of participants to whom the Computational, rather than Descriptive, depiction of the TAP system was given. In particular, the “computational” representation of this system should help participants to detect bugs (e.g., the “repeated triggering” bug) in the buggy scenarios, which critically depended on the understanding of the cyclical nature of the rules’ evaluation and activation mechanism.
Conversely, no significant differences between performances observed with the two rule structures or with the two system descriptions were expected if these manipulations were ineffective in eliciting more appropriate mental models of the single rules or the whole TAP system. Quite the opposite, participants might not only be unable to take advantage of either richer rule structures or richer system descriptions but they might also be confused by them. According to the idea that “easy is (almost always) better” when dealing with non-programmers (cf., the approach underlying IFTTT), we might even find an advantage for either the IF-THEN or the Descriptive condition.
The study material—tutorial and scenarios—was prepared in Italian. In the rules that accompanied the scenarios, we used the Italian conjunctions “SE,” “QUANDO,” and “MENTRE” for the English “IF,” “WHEN,” and “WHILE,” respectively.
The tutorial was realized as a short, written document illustrating a “smart house” as a home environment equipped with a set of electronic devices (automatic doors and windows, automatic lights, weather station, sensors of movements and presence). Management of a smart home is an interesting application for TAP [13, 58] and it is easy to communicate. The tutorial briefly explained how these devices could be used as sensors and actuators by providing a few examples. Then, an additional short example involving a kettle was presented to illustrate the difference between events and states. A graphical representation supported the description of this example (Fig. 1). The rules were presented to participants both in the “IF-THEN” and the “WHEN-WHILE-THEN” forms. The last part of the tutorial was provided in two different versions: one aimed at communicating a Computational model for the working of the rules and another one aimed at communicating a Descriptive model. The whole tutorial (both the Italian and English versions) is included in Appendix A.
| Depiction of the TAP system | Format | Proportions of correct responses | Likert confidence rating scores |
The main objective of the second study was to get a better understanding of naïve users’ mental representations of interactions with a TAP system (i.e., the automated smart home presented in Study 1).
Following other studies aimed at eliciting mental models of technologies [e.g., 37, 61–63], we decided to employ a qualitative approach with an interpretative stance. Participants were interviewed while performing the same task as that administered in Study 1, and verbal reports were collected and analyzed. To analyze interview data, we used the so-called thematic analysis [65–67]. Thematic analysis is a very common type of analysis in qualitative research. It is largely used in social and health research, and it has also been successfully employed in HCI and software engineering (e.g., [68, 69]). Being a qualitative type of analysis, this approach does not strive to account for quantitative differences in the collected data but rather aims to explore the “why” of observed phenomena. Specifically, this type of analysis strives to identify patterns of topics, concepts, meanings, and ideas that come up repeatedly in the interview data. For these reasons, it was the optimal choice to explore the actual mental models created by non-expert users while interacting with the (fictitious) smart-home system based on TA rules that we proposed in the present research work. That is, it was the optimal tool to analyze the thoughts, observations, and remarks that participants freely expressed while they were performing the task and trying to predict the outcomes of these rules.
Materials, Participants, and Procedure
Study 2 used the same scenarios and rules as those in the previous study. The procedure was adapted to a qualitative study. Participants were not alone while they were performing the task, but a facilitator was present via video conference. After a short introduction of the smart home context orally presented by the facilitator, participants read the scenarios in individual sessions. The tutorial was purposefully not used in this study to let the participants freely think about the different concepts involved in the scenarios and related questions. Only the Descriptive depiction of the system was presented. Half of the participants were presented with the rules in the IF-THEN format and the other half were presented with the rules in the WHEN-WHILE-THEN format.
Participants were asked to answer the scenarios’ questions. Besides choosing one of the four alternatives, they had to verbally explain their understanding of the rules. Furthermore, participants were prompted by the facilitator to discuss their understanding of the distinction between states and events.
A total of 14 subjects (7 males and 7 females from 20 to 40 years old) participated in the study. They had been recruited using a snowball procedure starting from personal acquaintances. The inclusion criteria were the lack of any computational experience and no knowledge of programming languages. All participants were native Russian speakers except for P10, a native Portuguese speaker, and all spoke (fluent) English as a second language. Each participant was interviewed individually for about 30 to 40 minutes (8 hours overall). The interviews were conducted in English; they were audio-recorded and transcribed for analysis.
The data (participants’ verbal reports) were analyzed following the tenets of thematic analysis [65–67]. In thematic analyses, participants’ verbal reports are systematically analyzed in order to detect common topics (called “codes”). Specifically, the coding of the verbal reports is done iteratively, initially with an inductive, data-driven, approach (i.e., data are first coded without trying to fit the coding process into a preexisting coding frame and thus without focusing on the specific aims of the questions that were asked to participants). Then, the codes are grouped into clusters that are called “themes” and represent theoretical dimensions that can explain the data. Eventually, the codes are retrospectively reconsidered with a deductive approach .
The analysis of our interviews was performed by two independent evaluators who compared the outputs of their analyses and converged on a limited number of codes and themes. In this analysis, six codes were identified (i.e., events as actions, states as longer activities, states as situations, events as the starting and ending points of states, states as movements, and states and events as a function of sensors), which were related to three themes (i.e., distinction between events and states, temporality of events, and actions as an overarching category) (Fig. 2).
Many participants focused on actions as a conceptual primitive for the notion of event. For example, “When I come home, it's an event because it is an action.” (P9), “It's just one moment to step into the house.” (P13), “It's an action. Someone must leave the house.” (P5), and “When you enter, it’s the action.” (P8).
Sometimes, states are also conceptualized as activities that take place in the house, but they are seen as having a different duration from that of events. For example, “When you are entering the backyard, it's a very quick action. When you are inside the backyard, that's a longer action.” (P8) and “What if I just went to my backyard to take my ...to take a tool that I need to work inside the house. I didn't stay in the backyard for at least one minute let's say.” (P8).
Our main goal was to evaluate two different strategies to improve the mental models of naïve users for programming IoT environments. We propose a nudging strategy (i.e., changing the usual structure of the rules and using two different temporal conjunctions to introduce the event- and state-parts of the triggering conditions) and an informing strategy (i.e., providing an explicit description of the cyclical nature of the rule mechanism). Our work has several limitations, including the impromptu nature of the tasks, the small number of participants involved, the fact that they were all young adults (mostly students), thus not representative of all possible users of TAP systems, and that they did not really “test” a TAP system in real life, but just discussed scenarios and chose an answer among four options, based on their comprehension of the scenarios. Nevertheless, we believe that our mixed approach, experimental and exploratory, allowed us to draw two meaningful lessons to inform the evolution of EUD systems for IoT.
The first lesson is that the two strategies do work. In particular, the WHEN-WHILE-THEN rule structure appears to be truly helpful. By using it, we can exploit the implicit linguistic knowledge about the difference between states and events. Properly designed graphical interfaces may use this structure to guide users in defining rules [25, 26, 56]. This proposal is consistent with the idea that natural language can be effectively exploited to assist TAP and with those approaches to programming based on natural language instructions (cf., [47, 52, 53]). Both Studies 1 and 2 indeed provide evidence that the use of the temporal conjunctions WHEN and WHILE may help users interpret the rules correctly when these rules involve events and states.
Nevertheless, the second study also suggests that, without detailed explanations using specific and different terms for the two types of occurrences, some users may still not be able to rationalize this difference, even though they can discuss the difference between longer and shorter events. Multi-clause rules, involving both an event and a state introduced by WHEN and WHILE, respectively, seem to have driven a better understanding of the distinction between the two types of occurrences.
These data are consistent with linguistic and psycholinguistic evidence on the comprehension of temporal sentences. As observed by de Vega et al. , when temporal sentences involve two simultaneous occurrences, one of them tends to be interpreted as the main one while the other occurrence is seen as the “ground” (i.e., the context) where the main event occurs. These authors found that (1) the occurrence that takes more time is usually seen as the ground and (2) sentences in which the longer (ground) occurrence is introduced by WHILE are judged as more acceptable and sensible than sentences in which this occurrence is introduced by WHEN. They conclude that WHILE is the temporal conjunction that people usually see as introducing prolonged occurrences (the “states” in our terminology) that act as the context for other occurrences. Accordingly, when participants are presented with two-condition rules in which the event and state conditions are introduced by WHEN and WHILE, respectively, they may more easily identify the context (the state) in which something (the event) is happening (i.e., participants may more “naturally” understand the semantics of the state and event contained in the rule, thus better understanding the distinction between them and the meaning of the whole rule). It is worth noticing, however, that WHEN and WHILE do not simply act as cues able to help participants distinguish the two different parts of the rules’ triggering conditions. Indeed, in Study 1 the event and state parts were written in different colors (in both the WHEN-WHILE-THEN and IF-THEN rules) and a list of all the possible events and states of the automated smart home was always available to participants. No confusion about whether a given occurrence was a state or event could occur. Nevertheless, the use of the WHEN-WHILE-THEN format proved to have a beneficial effect on performance.
When this format was used, there was no additional effect of the system description provided to participants: the Computational depiction of the system helped participants detect buggy scenarios but the advantage of the Computational group over the Descriptive group was only significant when the IF-THEN rules were considered. Based on these findings, we may conclude that, in such buggy conditions, either an appropriate rule format or a proper system description is enough to help people understand that rules do not work as expected. However, as noted above, by using the difference between the performances of the two groups, we may have underestimated the effect of the informing strategy: the Computational depiction of the system may have been more informative as to the cyclical mechanism of rules’ evaluation and activation, but the Descriptive depiction may have been more informative as to the conditions required to trigger the rules. Accordingly, the system mental model of participants from the Computational group might be more appropriate regarding the former aspect but less appropriate with regard to the latter.
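To make the distinction between events and states more concrete, the following minimal sketch shows one way a WHEN-WHILE-THEN rule could be evaluated cyclically. It is purely illustrative and is not the system used in Studies 1 and 2: the `Rule` class, the `run_rules` function, the tick-based timeline, and the event, state and action names are all assumptions introduced here for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Rule:
    when_event: str   # instantaneous occurrence that triggers the rule (WHEN)
    while_state: str  # protracted condition that must hold at that moment (WHILE)
    then_action: str  # operation performed by the system (THEN)


# One tick of the hypothetical evaluation cycle: the events that occurred
# during the tick and a snapshot of which states currently hold.
Tick = Tuple[Set[str], Dict[str, bool]]


def run_rules(rules: List[Rule], timeline: List[Tick]) -> List[Tuple[int, str]]:
    """Evaluate every rule once per tick and return (tick, action) pairs."""
    fired: List[Tuple[int, str]] = []
    for tick, (events, states) in enumerate(timeline):
        for rule in rules:
            event_occurred = rule.when_event in events
            state_holds = states.get(rule.while_state, False)
            # The rule is activated only if the event happens while the state holds.
            if event_occurred and state_holds:
                fired.append((tick, rule.then_action))
    return fired


# WHEN the user enters the room WHILE the room is dark THEN turn on the light.
rule = Rule("user_enters_room", "room_is_dark", "turn_on_light")
timeline: List[Tick] = [
    (set(), {"room_is_dark": True}),                  # state holds, no event
    ({"user_enters_room"}, {"room_is_dark": True}),   # event occurs while state holds
    (set(), {"room_is_dark": True}),                  # state persists, no new event
    ({"user_enters_room"}, {"room_is_dark": False}),  # event occurs, state does not hold
]
print(run_rules([rule], timeline))  # prints [(1, 'turn_on_light')]
```

In this sketch the state alone never activates the rule, however long it lasts, and the event activates it only in the tick in which it occurs and only if the state acts as its context. This is one possible reading of the semantics discussed above, chosen for simplicity.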
The second lesson drawn from Studies 1 and 2 is that, in order to design effective interfaces, we need to know how people actually learn EUD and how they can develop appropriate mental models of the system with which they interact. When dealing with naïve users (i.e., the main target group in EUD), the complexity of the programming constructs is not the whole story and making them simpler is not the only solution to pursue. Results of the second study suggest that the understanding of the rules and of the task itself may be compromised by participants’ previous naïve assumptions about what sensors are, as well as by the confusion between events and states and between the user’s actions, which are often involved in the rule triggering conditions (i.e., the parts of the rule introduced by IF, WHEN or WHILE), and the system’s operations (i.e., the action part of the rule that is introduced by THEN). In fact, in participants’ verbal reports, the notion of “action” appears to be an overarching category that includes both the operations performed by the system and the description of the situation in which they are performed. The use of different terms, more specifically linked to the notion of “action” and “operation”, to introduce the action part of the rule (e.g., DO instead of THEN) may be helpful to prevent such confusion.
To our knowledge, no research has been published on how people learn EUD. Although there have been a few longitudinal studies on how people control smart homes (e.g., [9, 58]), they targeted tech-savvy users and usually focused on appropriation more than learning. A better investigation of how people learn EUD might also shed new light on the timely topic of computational thinking. EUD and computational thinking are related but, to some extent, opposite concepts. The goal of EUD is basically to allow users without technical experience to program [6, 7]. Therefore, EUD promotes a kind of programming that does not heavily rely on specific computational skills. In contrast, computational thinking is the kind of analytical thinking that underlies programming. How much computational thinking is needed for EUD is still an unaddressed question.
In this paper, we discuss two strategies to improve the mental models of naïve users for programming IoT environments: a nudging strategy and an informing strategy. Both studies described here provide some evidence that, when the triggering conditions of TA rules involve events and states, the rules are better interpreted if these conditions are introduced by different temporal conjunctions: WHEN (for events) and WHILE (for states). When this nudging strategy is applied, the addition of the second (informing) strategy does not provide any further significant benefit. The second study suggests that, even when the two conjunctions are used in the rules, naïve users may still not be able to rationalize the difference between events and states. This study also emphasizes the importance of (1) the mental representations that naïve people have of how automatic systems work and (2) the lexical choices made when presenting the problems to the users. Our research has several limitations (e.g., the small number of participants and the limited meaningfulness of the administered task; see Section 5 General Discussion) and further studies are undoubtedly needed to fully explore mental models in EUD, possibly with tasks in which participants are required to compose, rather than evaluate, TA rules in order to program their own IoT devices. However, we believe that the research work presented here highlights two crucial aspects: (1) how the task is explained (i.e., lexical choices, how the arguments are phrased, etc.) is essential and can be used to nudge users to create effective mental models of both the system and their interactions with it; (2) the representation of the domain and its components also has an impact on EUD mental models: users’ knowledge of the domain needs to be taken into account and lexical choices need to be made carefully in order to prevent users from forming inappropriate mental representations.
That constitutes the original contribution of our research: it supports and extends the recent literature in the field of EUD [23–28, 32], thus helping this field get closer to its ultimate goal of empowering non-expert users in a more personalized approach to IoT.
Conceptualization, BT, MZ. Funding acquisition, BT. Investigation and methodology, BT, MZ, GG. Project administration, GG, DY. Supervision, BT, MZ. Writing of the original draft, BT, MZ. Writing of the review and editing, BT, MZ. Formal analysis, BT, MZ. Data curation, GG, DY. All the authors have proofread the final version.
This research was supported by the “Progetti di Rilevante Interesse Nazionale (PRIN) 2017” Program funded by the Italian Ministry of University and Research (MUR) (No. 2017MX9T7H, EMPATHY: EMpowering People in deAling with internet of Things ecosYstems).
The authors declare that they have no competing interests.
Name : Massimo Zancanaro
Affiliation : Department of Psychology and Cognitive Science, University of Trento (ITALY)
Biography : Massimo Zancanaro is a full professor of Computer Science at the Department of Psychology and Cognitive Science of the University of Trento and the head of the Intelligent Interfaces and Interaction Research Unit at Fondazione Bruno Kessler. His research interests are in the field of Human-Computer Interaction and specifically on the topic of Intelligent Interfaces, where he investigates aspects related to design as well as the reasons for the use and non-use of digital technology.
Name : Giuseppe Gallitto
Affiliation : Predictive Neuroimaging Laboratory (PNI-Lab), University Hospital Essen (Germany).
Biography : Giuseppe Gallitto is a PhD student at the Predictive Neuroimaging Laboratory of the University Hospital Essen. His research activity focuses on the study of pain perception and placebo analgesia through the implementation of predictive models based on brain imaging data. Before starting his PhD, he was a research fellow at the Department of Psychology and Cognitive Science of the University of Trento.
Name : Dina Yem
Affiliation : Department of Psychology and Cognitive Science, University of Trento (ITALY)
Biography : Dina Yem recently graduated from the University of Trento with a Master’s degree in cognitive science.
Name : Barbara Treccani
Affiliation : Department of Psychology and Cognitive Science, University of Trento (ITALY)
Biography : Barbara Treccani is an associate professor of General and Experimental Psychology at the Department of Psychology and Cognitive Science of the University of Trento. Her research interests focus on both fundamental and applied research in cognitive psychology and human factors: cognitive control, response selection, stimulus-response compatibility, numerical cognition and magnitude representation, contingency learning, mental models in end-user-development.
Massimo Zancanaro1,2, Giuseppe Gallitto2,3, Dina Yem1, and Barbara Treccani1,*, Improving Mental Models in IoT End-User Development, Article number: 12:48 (2022)