Introduction
The use of computers, smartphones, and other connected devices is becoming part of the daily routine of an ever-increasing number of people around the world. The increased usage of computing devices and their processing power allows us to solve problems that we could not tackle before; at the same time, it has made the way society works more complex, leading to an increased presence of nonroutine work [1]. Even people without computing skills need to use computers to carry out specific tasks in their daily lives. In this hyperconnected era, individuals must be aware of how to make the most of computers, which involves being fully capable of communicating with them and of extracting all their computing potential to solve complex problems in a wide range of domains [2]. As such, computational thinking (CT) is part of the essential skill set that a student should master in order to solve problems in the digital era [3]. CT may include several key concepts, such as abstraction, decomposition, pattern recognition, and algorithms [4]. However, the assessment of CT competence is not straightforward [5] due to the plethora of concepts involved, the fact that frameworks differ across authors, and the lack of validated tools. Furthermore, the recent COVID-19 health crisis has introduced additional complexity, as many courses can no longer rely on in-class teaching support and have to be taught exclusively online. These different factors mean that educators should pay particular attention to engaging students in active learning to increase learning gains [6] and to promoting specific pedagogies likely to increase their motivation, such as gamification [7], [8]. In majors outside of computer science, CT might not be perceived as essential, and students might lack the intrinsic motivation to engage fully in learning, which may prevent them from acquiring solid CT skills [9].
Computational notebooks are promising tools for teaching students how to solve complex problems using a programming language [10], [11]. These tools allow students to recreate and simulate exercises in an interactive manner, where they can manipulate chunks of code and observe the results of their actions in real time.
This study tackles this specific issue and brings new insights through a multidimensional evaluation of CT skills using multiple sources of data. Quantitative scores, insights into the problem-solving strategies deployed by students, and usage data from the computational notebook used as course support were analyzed. This article also includes a controlled experiment with a gamified feedback feature. In particular, it makes the research contributions outlined below.
A. Contributions
First, this article introduces a novel computational notebook environment using the Graasp open digital education platform with associated learning scenarios [12]. The computational notebook application offers a rich learning environment with dynamic code execution, integrated learning analytics, and modular gamification features.
Second, this article presents a field study conducted using data captured during a full semester introductory course on information technology for first-year undergraduate students in business and economics. More specifically, we analyzed the data of 115 students who took the lecture course between February and June 2021 and who agreed to participate in this study.
Third, in the context of non-CS undergraduate students, this article investigates whether computational notebooks can support active learning scenarios for promoting CT skills in non-CS students (RQ1), how engagement with computational notebooks is associated with student situational motivation (RQ2), and how gamification can contribute to increased engagement with computational notebooks (RQ3).
B. Roadmap
The rest of this article is organized as follows. Section II defines CT and discusses related work about computational notebooks and related motivational aspects and gamification mechanisms that may influence engagement in the context of CT knowledge acquisition. Section III presents Graasp, the education platform used for this study, as well as the learning scenarios. Section IV presents the research case study, the data used to carry out the analysis, and the techniques applied to address this research. Section V presents the results about each research question. Section VI summarizes the main insights. Finally, Section VII concludes this article.
Related Work
CT and its use in educational settings stem from the work of Seymour Papert at the Massachusetts Institute of Technology in the second half of the 20th century [13]. Papert proposed that computers should be an integral tool of young people's learning and put forward the use of programming languages such as Logo. The topic of CT has re-emerged as an increasingly relevant issue in education over the past few years. Jeannette Wing—considered to be the author who coined the term CT—asserts that it is a fundamental competence for everyone, not just for computer scientists [14]. Although there is a plethora of definitions and conceptualizations of the term, Wing has conceived CT as the thought processes involved in formulating problems and their solutions so that the solutions are represented in a form that can be effectively carried out by an information-processing agent. However, the promotion of CT in the classroom is challenging because—among other reasons—research on how to teach and learn CT in the classroom is scarce and does not provide clear measures as to which pedagogical methods are most effective [15]. If we want to improve these teaching practices at the university level, we must be able to distinguish effective methodologies and motivational affordances, such as gamification. The tools built up to now to evaluate CT in higher education are, to the best of our knowledge, still somewhat limited [16]. A literature review of computational notebooks, motivational aspects, gamification mechanics, and existing evaluation tools in a CT educational context is presented in the following.
A. Computational Notebooks
Previous studies demonstrated that students increase their conceptual comprehension, critical thinking, and interpersonal skills when they participate actively in their study [17]. Such active participation is better known in the literature as active learning and is a teaching method that pushes students to continually assess their understanding by doing things. Active learning is an effective alternative to more passive types of knowledge acquisition, such as attending lectures [17]. One way to apply active learning in CT-related knowledge acquisition relies on blended learning. Blended learning combines traditional face-to-face learning with digital interaction in class or at home [18]. The shift to blended learning has been a key trend in education in the past decade. Currently, most learning activities are delivered using a blended approach to some degree [19], [20]. Blended learning also provides digital education platforms with the possibility to integrate learning analytics into the instructor's awareness and reflection processes, potentially allowing instructors—and other stakeholders (e.g., parents and researchers)—to assess how students are performing and to predict student success or failure early on in the course [21]. Furthermore, a blended learning approach can also potentially be used in a fully online learning context [22].
Blended learning is particularly applicable to introductory programming courses [23], which often incorporate rich learning environments with dynamic code integration, such as computational notebooks. Computational notebooks, which are widely used in data science education [24], combine code snippets and text with other multimedia content to create rich interactive environments for data exploration and programming [25]. Combining an online coding environment without the need for external software, and the ability to run code embedded in text and multimedia content, makes computational notebooks a tool well suited to teaching CT [26]. Previous work has explored the use of computational notebooks to teach CT in different learning activities. For instance, researchers evaluated their use for 1) lectures; 2) reading; 3) homework; and 4) exams [26].
The Jupyter Notebook [27] (henceforth Jupyter)—a popular computational notebook—has seen a particularly significant increase in popularity over the past few years, becoming a valuable teaching tool. One of the keys for Jupyter's rise in popularity is its support for the Python programming language, whose simplicity and readability make it attractive as an introductory programming language [28]. As such, Jupyter is becoming more popular in introductory Python courses [29], [30], despite the fact that there are many other web-based tools that have been suggested for teaching Python [31], [32]. This preference for Jupyter could be explained by the fact that it offers many features aimed at students, including the ability to work on coding assignments without having to switch between the assignment's instructions and the coding software [33]. Furthermore, Jupyter includes many tools that are specifically made for teaching, such as grading modules [33]. Certain personalized learning environments (PLEs) allowing the creation of rich interactive learning spaces, gamified learning experiences, and learning analytics have also started to provide support for computational notebook integration [12].
Nevertheless, these notebooks can also have a negative effect on learning. Some argue that they encourage poor coding practices, given that it is not straightforward to break down code into smaller, reusable modules, and that it is hard to write and run tests [27]. Furthermore, the fact that computational notebooks are used both for exploratory and explanatory purposes can also lead to complications, since it takes a lot of effort to transform a messy exploratory notebook into a clean one that can be shared with others [34]. Moreover, these environments lack support for greater interaction, collaboration, activity awareness, access control, and other features [25]. Therefore, it has been argued that while computational notebooks can be useful for introductory-level students, they are not suitable for more experienced learners [35]. To address this issue, notebooks can be customized according to learning preferences, programming experience, and learning context [26]. The above observations lead to the following research question:
RQ1. Can computational notebooks support active learning scenarios for promoting CT skills in non-CS students?
B. Motivation
Active learning scenarios rely on voluntary student engagement, which raises the question of the underlying motivations that drive or hinder engagement. As non-CS undergraduates may not perceive CT as essential, which could make it difficult for them to develop strong CT skills [9], it is critical to understand the motivational aspects of students engaging in active learning scenarios. This observation leads to the following research question:
RQ2. How is the engagement with computational notebooks associated with student situational motivation?
Over the past 60 years, self-determination theory (SDT) has emerged as a fundamental theory of human motivation [36]. SDT's basic premises propose that motivation operates on three levels: global, contextual, and situational [37], [38]. Motivation on a global scale reflects how an individual interacts with his or her surroundings in general [38]. A motivating tendency toward a certain setting, such as a job or education, is known as contextual motivation [37]. Situational motivation relates to the "here and now" of motivation or the motivation felt when participating in a certain activity [37]. All three levels can be further refined and described by various constructs, among them the motivational factors proposed by SDT [39], [40]: intrinsic motivation, identified regulation, external regulation, and amotivation, constituting a self-determination continuum from self-determined to non-self-determined motivation. Intrinsically motivated behaviors are those that are done for the purpose of doing them or for the pleasure and satisfaction that comes from doing them [39]. In contrast, extrinsic motivation refers to a wide range of behaviors in which the goals of action are not limited to those that are inherent in the activity [39]. Different types of extrinsic motivations have been proposed by SDT; these are external and identified regulations [39], [40]. External regulation happens when behavior is regulated by rewards or to avoid negative consequences. Identified regulation, on the other hand, happens when a behavior is valued and viewed as one's own choice. However, the motivation still remains extrinsic because the activity is done as a means to an aim rather than for its own sake. Amotivation describes completely nonautonomous behavior, with no drive to speak of, in which a person is likely struggling to have any of their needs met. To measure a person's situational motivation, the Situational Motivation Scale (SIMS) can be used, as it demonstrates good reliability and factorial validity in educational contexts [41].
C. Gamification
Regarding the use of gamified settings to promote CT, Kotini and Tzelepi [42] find that gamification—e.g., using grading characteristics comparable with those of video games, such as points or levels—can increase the engagement of students. There are many types of settings one can apply, and instructional designers have to be careful not to promote only external goals, such as points and prizes related to performance, because this would only increase the extrinsic motivation of students. The educational setting also has to integrate aspects that can grow students' interest in mastering their learning, thus promoting intrinsic motivation as well. One key element is whether gamification can provide feedback and scaffolding for students and, if so, by which means. Providing feedback for learning activities has long been identified as an important component allowing students to identify gaps and to assess their learning progress [43]. Some experiments [44] have shown that gamified environments in which the digital environment itself produces the scaffolds are necessary for students to acquire CT skills. In another study, where a mobile app game was used to promote CT [45], the authors found that, generally, the average time that students spent on a level in the game increased with the level of progression. Other studies [46] found that it is the didactic sequence itself that scaffolds the students to acquire CT, and the authors report an increased learning rate in the experimental group compared to the control group.
However, the literature does not clarify what role gamification can play in affecting learning outcomes and student engagement in the context of higher education, specifically in the case of non-CS students aiming to acquire CT skills. Given the different kinds of tools that appear in the literature, it seems wise to use a combination of tools that can provide greater reliability to evaluate students’ CT skills and cover the different facets of their competence. This is precisely the perspective that will be adopted in this article, where we will use multiple instruments to assess a student's CT expertise based both on programming and nonprogramming activities. The above observations lead to the following research question:
RQ3. How can gamification contribute to increased engagement with a computational notebook?
D. Tools for Evaluating CT Skills
Competency-based tests propose abstract items for assessing CT skills. For example, Gouws et al. [47] created a test to evaluate CT performance in higher education students. Sometimes, tests created for other purposes have been used as a tool to measure CT skills (e.g., including tasks related to conservation or probabilistic reasoning). That is the case for the GALT test [48], which was used, for instance, in the context of higher education [49]. Recently, Lafuente et al. [50] have developed a psychometric test to evaluate algorithmic thinking skills. The authors validated the test based on factor analyses and opinions of experts in the field, obtaining a 20-item test capable of discriminating experts in CT from students without any training in computational issues.
Self-assessment tools have been developed so that students can evaluate by themselves to what extent they have mastered different skills related to CT [51], [52]. These tests have been validated by researchers and used by students in higher education. However, self-reported questionnaires may yield measurement errors based on an overestimation of the student's own skills or lack of understanding of the concepts involved in the questionnaire [53]. This type of tool also includes interviews, which are used to extract qualitative evidence, mainly of the thought processes used by students to solve CT tasks [54].
Exams and other ad hoc tools are probably the most frequently used tools to evaluate CT [55]. The authors usually construct an artifact with tasks that resemble very much the ones used in the classroom for teaching and learning the subject (i.e., the evaluation tool is an exam), and very often the tools include the use of programming in a language that students have been learning in the class. These tools are mainly oriented to evaluating a student's CT-related knowledge. Likewise, portfolios and reports constructed by students are also used to evaluate CT competence, using evidence of understanding and achievement in CT-related activities [56]. Furthermore, the ability to properly assess a student's acquisition of CT skills could also provide valuable insight into how CT should be taught in the classroom, which is an active area of research [57].
This article will make use of this body of research to design, implement, and evaluate adequate support for promoting CT skills.
Graasp Digital Notebook
The digital notebook environment studied in this article is built using the Graasp PLE. Graasp is an open digital education platform providing two interfaces [12]. An authoring view allows instructors to combine and configure resources that they use to create their online lessons, which we refer to as learning capsules. Learning capsules can then be broken down into step-by-step exercises, which can be contextualized with text, images, links, chatrooms, and other interactive content (see Fig. 1). The second interface is the live view, a student-oriented environment that can be accessed through a link. By clicking on the link, students can take part in the online lesson, navigating through pages that contain lectures and exercise materials prepared by the instructor.
To provide a context resembling computational notebooks, we designed an open-source coding application (henceforth the code app1) to provide a ready-made Python environment within Graasp. The code app uses the Pyodide2 library to execute Python directly on the browser without any additional dependencies. Students can read and write files, provide manual input, and generate graphics using libraries such as Matplotlib [58]. The code app also includes a command-line interface to display output and to allow students to navigate a virtual file system, as well as a feedback functionality that allows instructors to review and annotate the code written by students. To enable advanced features such as custom configuration, saving student-generated code, and tracking learning analytics, the code app can leverage application programming interfaces (APIs) exposed by digital education platforms. In our case, we use Graasp's API to preconfigure the code app with sample code, data files, and instructions for students. Within the live view, students could then write, execute and save code, review feedback provided by the instructor, and visualize any graphical output.
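The content of a preconfigured cell is not reproduced in this article; as an illustration only, the following Python sketch shows the kind of code such a cell might contain, combining file access with a Matplotlib plot. The file name and values are hypothetical.

```python
# Hypothetical content of a preconfigured code cell; the file name and
# values are illustrative, not taken from the actual course material.
import matplotlib.pyplot as plt

# A small data file as it might be placed in the virtual file system.
with open("temperatures.txt", "w") as f:
    f.write("12\n15\n14\n18\n21\n")

# Students can read the file, modify the code, and re-run it at will.
with open("temperatures.txt") as f:
    values = [float(line) for line in f]

plt.plot(values)
plt.title("Daily temperatures")
plt.show()
```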
The code app was then coupled with two other Graasp apps to gamify the active learning experience. The first additional app is a simple answer app,3 which allows students to enter an answer and receive feedback on whether it is correct. The second app is a point counter app.4 This point counter is a gamification app added to the learning space that reads the output of the answer app, i.e., it adds points to the score if an answer is correct and removes points if a hint is displayed.
A. Learning Scenario 1: Active Lectures
This learning scenario supports knowledge transmission by an instructor in a live session, whether remote or in-class. It aims to make traditional lectures more interactive by providing dynamic slides to students who can write and execute their own code during the lecture. The goal is that, in a first step, students follow the code that the instructor presents. Then, in a second step, students are encouraged to deviate from the presented code in order to test some corner cases or validate some expected behaviors. Using the computational notebook, this scenario allows instructors to structure the course content into blocks or slides, each with an independent space to write and execute code, possibly accompanied by images, videos, or other interactive content. In this real-time learning scenario, it is expected that students move along the slides at the same pace as the instructor. Fig. 2 shows a typical example of a learning capsule with different slides (e.g., definition and memory). In the selected slide (Definition), there is a block of static text with the interactive code app below. Concretely, the learning activity in Fig. 2 depicts one of the Python lessons and showcases one of the hands-on exercises performed during the theoretical part of the course. In this example, students are presented with a list they should print and index, providing an introduction to the concept of lists in Python.
Interactive lecture: a computational notebook learning capsule on Graasp. The instructor and students can write and execute code during the course.
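A minimal sketch of the kind of exercise shown in Fig. 2, in which students print and index a list, is given below; the list contents are hypothetical.

```python
# Illustrative exercise: printing and indexing a list (contents are hypothetical).
fruits = ["apple", "banana", "cherry", "kiwi"]

print(fruits)        # the whole list
print(fruits[0])     # first element: "apple"
print(fruits[-1])    # last element: "kiwi"
print(fruits[1:3])   # slice: ["banana", "cherry"]
```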
B. Learning Scenario 2: Self-Guided Laboratories
This learning scenario aims to support self-guided knowledge acquisition during laboratory sessions. The idea is to present students with exercises and to include autocorrection and formative feedback. Several tools can be included within the learning capsule to provide formative feedback. A simple input app allows students to submit text, while a real-time communication app enables students to spontaneously ask questions and to respond to multiple-choice questions posed by the instructor. Students could also use the app to complete homework assignments and provide answers to the problems presented during the laboratory sessions. Fig. 3 shows three apps in the learning capsule to support laboratory sessions. The first is the code app, which allows students to run code. It should be noted that it can make use of hidden lines of code that can be executed before or after the visible code. Second, there is the answer app, which allows students to enter an answer and receive feedback on whether it is correct. This app also allows teachers to set a hint for each question, which students can display if they wish. Third, there is the point counter app on the right-hand side of the live view. This point counter app reads the output of the answer app, showing the accumulation of points for each correct answer given, but also the loss of points when asking for a hint. The goal was to increase the time spent by students on activities by decreasing their need for help, i.e., the number of hints requested.
Laboratory support with Graasp. A self-guided learning activity with visual point feedback.
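To illustrate the hidden-code mechanism mentioned above, the following Python sketch shows how hidden lines appended after a student's visible code might check a student-defined function and print formative feedback. The function, test cases, and messages are hypothetical and not taken from the actual course material.

```python
# --- Visible part (what a student might write; illustrative only) ---
def double(x):
    return 2 * x

# --- Hypothetical hidden lines executed after the visible code ---
# They check the student's function and print formative feedback;
# the test cases and messages are illustrative, not the real ones.
try:
    ok = double(4) == 8 and double(-3) == -6
except Exception:
    ok = False

print("Tests passed" if ok else "Tests failed, check your function")
```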
Methodology
In this section, we present the research case study, the data we used to carry out this analysis, and the techniques we applied to address our research questions. The case study for this article is a full semester introductory course on information technology for first-year undergraduate students in business and economics (February–June 2021). This course consisted of two 45-min periods per week of theoretical lectures and two 45-min periods per week of laboratory sessions (see Table I). This course covers an introduction to CT concepts (two weeks), introduction to spreadsheet formulas and computational models (five weeks), Python programming (four weeks), web technologies (two weeks), and a final week with an exam dry run. The course is evaluated through a 1-h online exam. During the first and the last week, respectively, students filled in a pre- and a post-test survey, which inquired about their CT skills and attitudes. Out of the 115 students in the course, 112 gave their consent for this study.
A. Learning Outcome Data
There were no prerequisites for this course, and the learning outcomes of the course were for students: 1) to be able to conceptualize problems computationally, i.e., use CT principles to describe and attempt to solve problems; and 2) to be able to solve simple problems algorithmically using the Python programming language. These learning outcomes were measured informally at the beginning (pretest) and at the end of the course (post-test) and formally during the written exam at the end of the semester.
The pre- and post-tests were each composed of six problem-solving questions and six Python programming questions (examples are given in Figs. 4 and 5). The problem-solving questions were extracted from the Algorithmic Thinking Test for Adults [50]. For all questions, there was one correct answer. For all problem-solving questions, besides asking students for an answer, we asked them to provide a textual description of their problem-solving strategy for tackling the problem. This second part was not taken into account for the scoring of their answer, but allowed us to get an impression of how many CT concepts and higher levels of thinking were used in the process of solving or attempting to solve the questions.
Example of a general problem-solving question. Note that the question has two components. The first is quantitative and requires a precise answer; the second is qualitative and requires an open-ended answer describing the problem-solving strategy.
Example of three basic programming questions. These questions each require a precise answer.
More specifically, we analyzed the six problem-solving questions of the pre- and post-tests, where students had to explain their reasoning process. We sought to determine whether the key terms and concepts presented during the course had been assimilated and reused in the explanations given by the students, using an approach inspired by grounded theory [59], [60] and open coding techniques, letting the categories emerge from the theoretical content of the course, which resulted in 22 different concepts.
For each word of the student explanations and for each target concept presented above, we performed lemmatization to transform words with roughly the same semantics into one standard form. Lemmatization was performed with the WordNet corpus of the Python Natural Language Toolkit (NLTK). WordNet is a large, freely and publicly available lexical database for the English language that establishes structured semantic relationships between words.5
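The following minimal Python sketch illustrates this lemmatization and concept-counting step with NLTK; the concept subset and the example explanation are placeholders, not taken from the actual student data.

```python
# Minimal sketch of the lemmatization and concept-counting step;
# the concept subset and the example explanation are placeholders.
import nltk
from nltk.stem import WordNetLemmatizer

nltk.download("wordnet", quiet=True)   # WordNet corpus used by the lemmatizer
nltk.download("omw-1.4", quiet=True)   # needed by some recent NLTK versions

lemmatizer = WordNetLemmatizer()
concepts = ["loop", "variable", "condition", "function"]   # placeholder subset of the 22
explanation = "I used nested loops and a variable to count the conditions"

# Lemmatize every word of the explanation, then count concept occurrences.
words = [lemmatizer.lemmatize(w.lower().strip(".,")) for w in explanation.split()]
counts = {c: words.count(lemmatizer.lemmatize(c)) for c in concepts}
print(counts)   # {'loop': 1, 'variable': 1, 'condition': 1, 'function': 0}
```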
The final exam consisted of five open-ended Python questions, asking for simple functions or programs, such as: “Write a function that takes two parameters as input (a string called word and an integer called
B. Lecture Data
During the lectures, we used learning analytics in Graasp to track student attendance and visual analysis to evaluate whether students followed the lecture. As an example, Fig. 6 shows a learning dashboard to track user activity. More specifically, it shows the order in which each student has visited the pages available in the live view, as well as the time spent on each of them. If the instructor uses the live view at the same time, then the instructor's data can be compared against the student's data. Each color represents a page inside the live view. If students were to be perfectly synchronized with the instructor, their color patterns would all be the same. The activity of each student could then be visually evaluated by assigning a score of 0 if the student was absent, 1 if participation was passive, and 2 if it was active (i.e., perfectly synchronized).
C. Laboratory Session Data
During the Python programming laboratory sessions, students were randomly split into a treatment group (70 students) and a control group (45 students). The students went through four series of 15 exercises, for a total of 60 exercises. The series of exercises corresponded to the different topics introduced each week in the theoretical course: 1) variables and conditions; 2) loops; 3) lists; and 4) functions. The treatment group was provided with extensive gamified feedback, including a level visualization (the point counter app), as shown in Fig. 3, while the control group had limited feedback, only knowing whether their answers were right or wrong. The gamified feedback appears on the right-hand side of the interface in the form of a chain of bubbles that scrolls along with the score. To help them in the resolution of an exercise, students could ask for hints. For each exercise, the code block of the computational notebook was preconfigured to perform a list of tests on the execution of the students' code, providing validation keys. Once the student's algorithm could execute properly, a validation key was returned. For each exercise, two different validation keys could be received by the students: in the first case, the algorithm acts as expected, while in the second, it does not. There were no limits on the number of tests and executions of the algorithms, and students were not aware of the meaning of the validation key received. Validation keys were randomly predefined and, therefore, different for each exercise. These validation keys then had to be submitted by the students in the answer app. Upon submission of the validation key, a check mark or a cross provided feedback to the students on the correctness of their algorithm. Furthermore, for each correct answer on the first attempt, students received three points. For each correct answer provided after the first attempt, students received two points. For every hint revealed, students lost one point. The control group had no visual feedback of its score. Fig. 3 illustrates one exercise of the self-guided laboratory session regarding Python functions. The hidden hint for that specific exercise was "You should be able to write this algorithm in one line."
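As an illustration of the point rules just described, the following minimal Python sketch reimplements the scoring behind the point counter app. The class and method names are hypothetical; only the point values (+3, +2, -1) are taken from the text.

```python
# Hypothetical reimplementation of the point rules described above;
# only the point values (+3, +2, -1) come from the text.
class PointCounter:
    def __init__(self):
        self.score = 0

    def record_answer(self, correct: bool, attempt: int) -> None:
        """Add 3 points for a correct first attempt, 2 points afterwards."""
        if correct:
            self.score += 3 if attempt == 1 else 2

    def record_hint(self) -> None:
        """Remove one point for every hint revealed."""
        self.score -= 1


counter = PointCounter()
counter.record_hint()                            # -1
counter.record_answer(correct=True, attempt=2)   # +2
print(counter.score)                             # 1
```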
D. Psychometric and Demographic Data
In addition to the above data, we also collected demographic data, student situational motivation, and the computational notebook usability level.
Student motivation was assessed in the post-test survey, which aimed to measure their motivation to perform the laboratory sessions through the computational notebook. Situational motivation was assessed using the 16-item SIMS. SIMS is designed to assess intrinsic motivation, identified regulation, external regulation, and amotivation [41].
In order to evaluate the usability level of the computational notebook, the students answered ten questions about the computational notebook, based on the system usability scale (SUS) [61] at the end of the post-test.
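For reference, SUS scores are computed on a 0-100 scale from the ten 5-point items using the standard scoring rule; the following Python sketch applies it to one hypothetical set of responses.

```python
# Standard SUS scoring applied to one hypothetical set of responses:
# odd-numbered items contribute (response - 1), even-numbered items
# contribute (5 - response), and the sum is scaled by 2.5 to give 0-100.
def sus_score(responses):  # responses: 10 answers on a 1-5 scale
    assert len(responses) == 10
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # 0-based even index = odd-numbered item
        for i, r in enumerate(responses)
    )
    return total * 2.5

print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0
```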
E. Path Model and Analysis
To provide a global view of the different factors influencing the learning outcomes, we designed a path model and conducted a partial least squares (PLS) analysis using SmartPLS. PLS is a variance-based structural equation modeling (SEM) analysis technique increasingly popular for analyzing explanation and prediction of information systems phenomena [62]. Central to PLS is the path model, which can be visualized by a diagram that displays the hypotheses and variable relationships to be estimated in an SEM analysis [63]. T-statistics are used to test the proposed hypotheses for the standardized path coefficients by specifying the same number of cases as in the dataset and bootstrapping 1000 resamples. The resulting design of the path model for this analysis is depicted in Fig. 7. It contains three main independent variables: 1) initial skills, as measured by the score on the pretest; 2) situational motivation, as measured by the SIMS; and 3) gamified feedback, which indicates whether the student was in the gamified feedback condition or not. Note that situational motivation can be further broken down into its four components (intrinsic motivation, identified regulation, external regulation, and amotivation). These variables potentially influence laboratory performance positively [64], as measured with the score on the laboratory exercises and the engagement on the online platform. Engagement is measured by tracking student interactions (i.e., number of clicks, number of code executions, and text written) on the Graasp platform. The "need for help" construct is measured by the number of hints requested by a student (the more hints, the greater the need for help). Gamified feedback can motivate people to perform tasks that will increase virtual rewards (e.g., points) [65]. As such, we hypothesize that it will increase laboratory performance and reduce the need for help. In other words, this would mean that gamification of the activity, as well as increased motivation, would lead students to try to get the answers on their own to get more points, without asking for hints. Finally, laboratory performance and initial skills potentially positively influence the learning outcome [66], as measured by the grade of the exam.
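The PLS estimation itself was carried out in SmartPLS. As a purely conceptual illustration of how a t-statistic can be obtained from 1000 bootstrap resamples of a standardized path coefficient, the following Python sketch uses synthetic placeholder data and a single-predictor path, where the standardized coefficient reduces to a correlation; it does not reproduce the SmartPLS procedure.

```python
# Conceptual sketch (not the SmartPLS procedure) of deriving a t-statistic
# for a standardized path coefficient from 1000 bootstrap resamples.
# The data below are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(0)
n = 115
x = rng.normal(size=n)             # e.g., a laboratory performance proxy
y = 0.5 * x + rng.normal(size=n)   # e.g., a learning outcome proxy

def std_path_coef(x, y):
    """Standardized single-predictor path coefficient = Pearson correlation."""
    return np.corrcoef(x, y)[0, 1]

estimate = std_path_coef(x, y)
boot = []
for _ in range(1000):              # resamples of the same number of cases
    idx = rng.integers(0, n, size=n)
    boot.append(std_path_coef(x[idx], y[idx]))

t_stat = estimate / np.std(boot, ddof=1)  # coefficient / bootstrap standard error
print(round(float(estimate), 3), round(float(t_stat), 2))
```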
Path model. Positive influence (+) is expected among the linked constructs of the model.
To validate the reflective constructs of our path model (i.e., laboratory performance, intrinsic motivation, identified regulation, external regulation, and amotivation), we evaluated their reliability, convergence, and discriminant validity.
1) Reliability
We used composite reliability (CR) and average variance extracted (AVE) as indicators. As shown in Table II, the CR of the constructs was greater than 0.7 and the AVE greater than 0.5; thus, these constructs are reliable [62].
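For reference, the definitions of these two indicators commonly used in PLS-SEM, expressed in terms of the standardized outer loadings \(\lambda_i\) of a construct measured by \(n\) indicators, are as follows.

```latex
\mathrm{CR} = \frac{\left(\sum_{i=1}^{n}\lambda_i\right)^{2}}
                   {\left(\sum_{i=1}^{n}\lambda_i\right)^{2} + \sum_{i=1}^{n}\left(1-\lambda_i^{2}\right)},
\qquad
\mathrm{AVE} = \frac{1}{n}\sum_{i=1}^{n}\lambda_i^{2}
```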
2) Convergent Validity
We used the outer loadings and the AVE as convergent validity indicators [67]. The outer loadings of all our reflective variables were above 0.7, which is the standard threshold [67], except for one variable in the amotivation construct, which was only above 0.5. As this study should be deemed exploratory research, indicators between 0.4 and 0.7 were kept, as recommended by Hair, Jr., et al. [67].
3) Discriminant Validity
We used the heterotrait–monotrait ratio as a measure of discriminant validity [62]. Values lower than 0.85 are considered as acceptable for conceptually distinct constructs [62]. As shown in Table III, with the exception of Amotivation
Results
A. Can Computational Notebooks Support Active Learning Scenarios for Promoting CT Skills in Non-CS Students? (RQ1)
To answer the first research question, we evaluate if the notebook was considered usable, if it was used as intended in the learning scenarios, and whether there were learning gains.
1) Usability
The average SUS score is equal to 67.4 (
2) Learning Scenario
Using data from the learning dashboard presented in Fig. 6, we examined usage patterns from week 10 to week 12, as shown in Fig. 8.
The dashboard gives a visual impression of how synchronized students are during the lecture. Note that the first slide (blue at the bottom) is always a pen-and-paper exercise, which explains why students are not always looking at the slide on the computational notebook. A visual analysis shows that during the lecture in week 10, 62 students followed at least part of the lecture on the computational notebook, and 40 of them (64.5%) followed actively (meaning that around 80% of the lecture material was followed in the same order as the instructor, switching slides at around the same time); the other 22 students were considered to be following the course passively. In week 11, there were a total of 49 students online, 39 of whom were active (79.5%). In week 12, there were a total of 50 students online, of whom 39 were again active (78%); 33 of these were the same students as in the previous lecture.
The overall engagement of students during the live online lecture is depicted in Fig. 9. It shows how many students were mostly active, mostly passive, or absent during these three lectures (
Finally, we also analyzed whether students watched the recordings of the course that were put online after the lecture. Fig. 10 shows student engagement with the lecture (live on the computational notebook and, after the lecture, by viewing videos) over three weeks. Real-time activity is reported as a percentage of active participation in the real-time lecture discussed above. The video watching activity is reported as a percentage of the total time of the videos posted for the three weeks, capped at 100%.6 The total length of videos posted was 149 min for the three lectures. The results show that a significant number of students (around 20%) did not participate actively in the live lessons, but nevertheless watched the videos at home. One student spent more than 900 min watching videos.
Student engagement with the lecture in number of minutes of attention, live on the computational notebook, or later by watching videos (
3) Learning Gains
The main goal of this analysis is to explore the evolution of the CT skills of the students and to see if we can observe differences between their initial knowledge and their learning outcomes (Python score and CT score). To answer this question, we compared student scores on the pre- and post-tests as well as the evolution of their problem-solving strategies (CT concepts). Fig. 11 provides a visual overview of the results of the mean scores in percentage points with pretest results as baseline. To perform the analysis, mean scores were normalized to range between 0 and 1, where 1 represents the maximum achievable score.
Pre- and post-test results (pretest used as baseline). ***
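As a conceptual illustration of how such a pre/post comparison can be carried out, the following Python sketch normalizes synthetic placeholder scores to the [0, 1] range and applies a paired t test with SciPy; the numbers are not the study data, whose actual results are reported below and in Fig. 11.

```python
# Conceptual sketch with synthetic placeholder scores; the actual study
# results are reported in the text and in Fig. 11.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_students, max_score = 115, 6                      # six questions per part
pre = rng.integers(0, max_score + 1, size=n_students)
post = np.clip(pre + rng.integers(0, 3, size=n_students), 0, max_score)

pre_norm = pre / max_score                          # normalize to [0, 1]
post_norm = post / max_score                        # 1 = maximum achievable score

t_stat, p_value = stats.ttest_rel(post_norm, pre_norm)  # paired comparison
print(round(float(t_stat), 2), round(float(p_value), 4))
```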
When it comes to the Python score, a statistically significant difference was found for the Python exercises (
Regarding CT score, a paired
Finally, in terms of CT concepts, we analyzed student answers to the CT questions from a semantic and linguistic point of view. To observe the potential evolution of the use of such conceptual terms in the problem-solving explanations given by the students, we compared the appearance frequency of each concept in students’ pre- and post-test explanations. The results show a statistically significant (
4) Learning Outcome
Fig. 12 gives an overview of the final grades of the course (
Looking at engagement with the lecture material in real time on the computational notebook, a median split of the student grade results (pass/fail) ordered according to the time that they spent following the lecture created two natural groups: one with high engagement and one with low engagement. A
B. How Is the Engagement With Computational Notebooks Associated With Student Situational Motivation? (RQ2)
To answer this second research question, we evaluate whether the engagement with computational notebooks is associated with students' intrinsic motivation, identified regulation, external regulation, and amotivation. The analysis was performed on the subsample of 84 students who filled in the pretest survey (46 males and 38 females). The means of the student motivational aspects, as measured by the SIMS, were calculated. Fig. 13 presents student motivational aspects in participating in the laboratory sessions during weeks 10–12. The highest motivational aspect was identified regulation (
Student mean intrinsic motivation (IM), identified regulation (IR), external regulation (ER), and amotivation (AM), as measured by the SIMS, to use computational notebooks in the context of weeks 10–12 laboratory sessions.
C. How Can Gamification Contribute to Increased Engagement With a Computational Notebook? (RQ3)
To answer this research question, we evaluate whether the gamification of the laboratory sessions contributed to increasing laboratory session performance and allowed a decrease in undesired behavior, in this case the need for help. The analysis was performed on the same subsample and the same path model as for RQ2 (see Fig. 14). The gamified feedback functionality was implemented, as shown in Fig. 3, rewarding students for accurate answers and penalizing them for hints revealed. A control group (
1) Increasing Laboratory Session Engagement
Fig. 14 shows that the link between gamified feedback and laboratory session performance is significant
Laboratory session engagement, resulting from activity tracking on Graasp, for students with gamified feedback (treatment) versus students without gamified feedback (control).
Laboratory session scores for students with gamified feedback (treatment) versus students without gamified feedback (control).
2) Decreasing Need for Help
Fig. 14 shows that the link between gamified feedback and need for help is significant
Need for help (i.e., hints requested) for students with gamified feedback (treatment) versus students without gamified feedback (control).
Discussion
This article provides the results of a semester-long field study with 115 non-CS students on the use, and the gamification, of an innovative computational notebook environment aimed at developing CT skills. The field study was conducted in an introductory course on information technology for first-year undergraduates in business and economics, where CT may not be seen as necessary by the students. This research assessed computational notebooks and CT skills making use of pre- and post-test surveys, learning analytics, and student-generated data from laboratory sessions. We showed that it is feasible and valuable to teach CT competence to non-CS students, and that computational notebooks are an appropriate tool for introducing CT and programming to students with less technical academic backgrounds. Furthermore, our results convey the fact that gamification can increase engagement with these notebooks. A detailed discussion is presented in the following, with limitations and potential further work.
A. Computational Notebooks in a Distance Learning Context
The pre–post approach and the multifaceted pool of assessment tools that we used allowed us to find that students gained CT skills both in terms of general problem-solving (CT score) and programming skills (Python score), as well as in terms of adopting more analytical problem-solving strategies (CT concepts). The integrated nature of the computational notebooks allowed students, among other things, to avoid having multiple open files and programs on their computers, which also avoided installation issues. At the same time, such notebooks still enabled dynamic class activities.
As such, this article endorses the notion that computational notebooks can support active learning scenarios for promoting CT skills in non-CS students (RQ1). Unsurprisingly, we found that initial knowledge—operationalized with the programming scores of the pretest—was strongly linked to the learning outcomes. Yet, this study demonstrated that self-directed laboratory performance was an even more important predictor of the learning outcomes. The better a student performed in the online labs, the better their final grades. Furthermore, initial knowledge was not a significant predictor of self-directed laboratory performance, which indicated that students were not put off by the technology.
Our analyses showed that participating in the real-time lecture was associated with an increased pass rate compared to students who chose not to attend or had only limited engagement in the live online lectures. This finding supports previous research, which shows that active learning is more effective for knowledge acquisition [17]. However, this finding is not completely aligned with literature claiming the importance of active learning in both live and remote learning contexts [69], [70]. In fact, we have some reservations about remote offline active teaching scenarios, which, in this context and in contrast with real-time scenarios, were not linked with learning outcomes. These distance-teaching observations were made in the particular context of the COVID-19 pandemic in which this study took place, when in-class teaching was not possible. Students could either follow the course online in real time or watch recorded videos of the course at home. This study showed that the pass rate of the more active students was 76%, whereas the pass rate of the less active was only 47%. It should be noted, however, that around 32% of the students who took the exam were not engaged at all with the real-time lectures. This result should be seen in perspective with a previous preliminary study, which showed around 90% of students actively following the lecture in a physical in-class setting [10].
This work is not without limitations. Indeed, although this study covered one course during a full semester with 115 students, the conclusions could be more generalizable if the results integrated more courses with more instructors. In fact, our results are valid for our own class, but not necessarily for all other non-CS classes, as we do not know whether this class is representative of all other classes. From the point of view of remote learning, this study was able to demonstrate the strengths of tools such as Graasp, but the measurement of remote engagement is always subject to variables that cannot be easily controlled. For instance, remote engagement via videos replayed offline is potentially subject to overcounting or undercounting, as a video can be viewed by a student without them paying attention to it. Finer-grained learning analytics and links between several dissociated learning systems (learning management system, computational notebook, and video repository) could shed light on such differences. It would be interesting to conduct a similar study over a longer period of time with a larger number of participants, allowing the positive long-term results to be verified. Future work could investigate the added value of lectures viewed on replay and the kinds of scenarios in which student learning can be supported by them. This is especially important with remote teaching becoming more prevalent.
Another limitation lies in the simple but innovative approach used to measure the evolution of students' CT concepts. The approach was to ask participants to adopt a kind of think-aloud protocol to describe how they approached the problem. The strategy used for the analysis was rudimentary (counting the frequency of keywords) and could be extended and improved in future work. This syntactic analysis of the cognitive approach to solving CT problems seemed to be an interesting line of investigation and deserves to be explored further. It should be noted that we first conducted an analysis using advanced lexical analysis tools, namely the Linguistic Inquiry and Word Count (LIWC) software for extracting linguistic features [71], more precisely the 2015 version of the LIWC dictionary [72]. However, such tools did not appear to generate valid results, as a simple and inconsistent problem-solving strategy description such as "Yes" received a higher score than some obviously more appropriate ones such as "By trial and error." In our opinion, future research making use of problem-solving reflections and of such advanced tools for their interpretation deserves to be pursued.
B. Computational Notebooks: Motivational Aspects and Gamification
This article has found motivation and gamified feedback to be strong predictors of laboratory performance. This finding supports previous research showing that technological platforms can provide scaffolding for CT skill acquisition in gamified settings [45]. This study has shown how various motivational aspects may influence student performance and behavior in gamified settings. The results of this research have shown that the engagement with computational notebooks is associated with student situational motivation (RQ2). Specifically, we found that the influences of intrinsic and extrinsic motivation were completely different: while intrinsic motivation led to better laboratory performance and better learning outcomes, extrinsic motivation (understood here as external regulation) decreased a student's self-regulation, making them look for more hints to solve the task quickly. Moreover, this study demonstrated that gamification can contribute to increased engagement with the computational notebook (RQ3). This study showed that gamified feedback positively influenced self-directed laboratory performance (score and engagement) and negatively influenced unwanted behavior, such as requesting hints to complete the exercises quickly and possibly less thoughtfully. We believe that the use of computational notebook environments such as Graasp, integrated with other applications, especially those introducing gamification, can open doors to other interesting avenues of research. The results presented in this article have already demonstrated the benefits of such a gamification approach, aligning with previous studies [44], [45] showing that the technological environment itself can successfully provide feedback to drive CT skill learning in gamified settings. These results demonstrate that learning activity designers can encourage certain desired behaviors and, at the same time, discourage certain undesirable ones by using a well-designed gamification mechanism. Computational notebook environments—when used in combination with multiple instruments allowing us to assess students' CT expertise and covering different facets of students' knowledge acquisition—are clearly opening a new perspective, but this perspective still needs to be investigated further.
Unfortunately, even though the sample size was adequate for the results presented above, it was too small to assess more fine-grained group differences and interaction effects (e.g., female versus male, advanced versus beginner students). Future research should assess whether the effects of gamification play out in a similar direction for such subgroups. It is particularly important to confirm that students with less initial knowledge or who are less inclined toward CT are not left behind or negatively affected by certain gamification features. Based on such finer-grained analyses, personalized gamification mechanisms could be deployed to adequately motivate students if no one-size-fits-all mechanism is found. We believe that future research and development on more integrated computational notebook environments will be encouraged by our results. The combination of learning analytics with real-time coding support and gamification features was central to our study. Nevertheless, most computational notebooks do not yet allow the composition of such rich learning activities. Open formats and the availability of tracking data on student activities should be encouraged across the various computational notebook environments; this would potentially open the doors to further improvements in knowledge acquisition of complex skills such as CT.
Conclusion
This article addressed the issue of teaching CT to non-CS students. We conducted a field study in a real classroom with 115 students using a computational notebook app as support. This study evaluated computational notebook support for non-CS students from multiple perspectives. In order to evaluate the progress of the students in terms of competence in CT, we carried out a pretest and a post-test composed of problem-solving and programming questions. Students were also monitored during live lectures and self-directed laboratory sessions, allowing us to observe not only differences between real-time and replayed student engagement, but also influences of motivational aspects and gamified feedback on laboratory performance and learning outcome. We conclude by noting that computational notebooks can support active learning scenarios for promoting CT skills in non-CS students, that engagement with computational notebooks is associated with student motivation, and that gamification can contribute to increased engagement with the computational notebook. Finally, this article underlines the importance of continuing to investigate methods to engage people with little apparent interest in CT with active learning, computational notebooks, and gamification mechanisms.