
Dataset collection

This study uses the Canvas Network dataset, published on Harvard Dataverse in 2016, which is widely used in educational data mining and recommendation-system research26. The dataset contains several kinds of information, including the basic attributes of online courses, user profiles, and user interaction behaviors such as browsing, clicking, collecting, and purchasing, providing reliable data support for building an efficient personalized recommendation system. To ensure relevance to the research goal, courses related to music education are screened from the dataset, yielding a sub-dataset of 783 music courses. These courses span many music learning fields, including but not limited to music theory, piano playing, guitar skills, vocal training, and composition and arrangement, which provides a solid data basis for studying the applicability of personalized recommendation in different music learning scenarios. For sample selection, strict screening criteria are formulated to ensure data quality and experimental reliability. First, only courses with high user interaction are retained, namely courses with browsing or clicking records from at least 50 users, so that the recommendation task can fully learn the correlations between courses. Second, to improve the generalization ability of the recommendation model, only active users who have participated in at least five courses and have complete learning-behavior records are selected. This criterion eliminates low-activity users with only sparse interaction data, reducing the impact of data noise on model training.
In addition, to reduce the influence of data sparsity on the experimental results and ensure that the recommendation system learns from the latest interest trends, this study applies a time-window filter: only interaction data from the past 12 months are retained, so that the model can better adapt to users' dynamically changing learning interests.

For feature construction, nine core features are selected: five course attributes (course category, course difficulty, course duration, instructor, and course score), three user attributes (age, study preference, and study duration), and the user's historical behavior sequence (browsing, clicking, and collecting). These features are pre-processed to improve the learning effect of the model. The main label in the experiment is whether the user browses a specific course, enabling supervised training of the recommendation model. For categorical features, bucket encoding is used to transform string features into discrete values for model processing. In addition, because temporal leakage may inflate the model's evaluation results and mislead model decisions, the time dimension is explicitly considered when dividing the training and test sets. Specifically, the split is based on time order, with the first 80% of the dataset as the training set and the remaining 20% as the test set. This yields 62,400 user-course interaction records for model training and 15,600 user-course interaction records for model evaluation.
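The time-ordered 80/20 split and the bucket encoding of string features described above can be sketched as follows. This is a minimal illustration on a made-up interaction log; the column names and values are hypothetical and do not reflect the actual Canvas Network schema.

```python
import pandas as pd

# Hypothetical interaction log; columns and values are illustrative only.
logs = pd.DataFrame({
    "user_id":   [1, 1, 2, 2, 3, 3, 3, 1, 2, 3],
    "course_id": [10, 11, 10, 12, 11, 13, 10, 12, 13, 12],
    "category":  ["theory", "piano", "theory", "guitar", "piano",
                  "vocal", "theory", "guitar", "vocal", "guitar"],
    "timestamp": pd.date_range("2015-01-01", periods=10, freq="30D"),
})

# Bucket (label) encoding: map each string category to a discrete integer id.
logs["category_id"] = logs["category"].astype("category").cat.codes

# Time-ordered 80/20 split: sort by timestamp, then cut, so no interaction
# from the test period can leak into training (avoiding temporal leakage).
logs = logs.sort_values("timestamp").reset_index(drop=True)
cut = int(len(logs) * 0.8)
train, test = logs.iloc[:cut], logs.iloc[cut:]

# Every training interaction precedes every test interaction.
assert train["timestamp"].max() <= test["timestamp"].min()
```

Splitting by time rather than at random is what prevents the falsely high evaluation results the text warns about.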

To evaluate the contribution of different features to the course recommendation effect, the SHapley Additive exPlanations (SHAP) method is used to explain the model and quantify the influence of each feature on the recommendation decision. The SHAP value measures each feature's contribution to the final prediction: the greater its absolute value, the more important the feature is to the model's decision. Table 2 shows the ranking of the mean SHAP contribution of each feature.

Table 2 SHAP value ordering of different characteristics.

The experimental results show that the user's historical behavior sequence plays a central role in the course recommendation task, mainly because this feature directly reflects the evolution of students' interests and helps the model accurately capture individual needs. Course category and learning preference also strongly influence the recommendation results, indicating that students usually refer to their own areas of interest and study habits when choosing courses. In addition, learning duration, an important indicator of learning engagement, contributes substantially to the recommendation effect. However, the SHAP values of the course score and the instructor are low, which shows that when choosing online music courses, students tend to decide based on their own interests and study habits rather than on external course evaluations or teacher information. Therefore, when optimizing the recommendation model, the focus should be on user behavior features and personalized interest modeling to improve the accuracy and adaptability of the recommendation system.
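A ranking like the one in Table 2 is typically produced by averaging the absolute SHAP value of each feature over all samples. The sketch below illustrates only that aggregation step; the SHAP matrix, the numbers, and the feature names are made up for illustration and are not the paper's results.

```python
import numpy as np

# Hypothetical per-sample SHAP values (rows = samples, columns = features);
# in practice these would come from a SHAP explainer run on the trained model.
shap_values = np.array([
    [ 0.30, -0.10, 0.02],
    [-0.25,  0.15, 0.01],
    [ 0.40, -0.05, 0.03],
    [ 0.35,  0.20, 0.02],
    [-0.30, -0.10, 0.01],
])
feature_names = ["behavior_sequence", "course_category", "course_score"]

# Global importance = mean absolute SHAP value per feature.
importance = np.abs(shap_values).mean(axis=0)
ranking = [feature_names[i] for i in np.argsort(importance)[::-1]]
print(ranking)  # behavior_sequence ranks first, course_score last
```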

In personalized recommendation, different methods differ significantly in feature utilization, user interest modeling, and computational efficiency. To verify the validity of the CRM-SLIE model, this study selects five representative recommendation methods for comparison: neural collaborative filtering (NCF)27, Deep & Cross Network (Deep & Cross)28, GRU for Recommendation (GRU4Rec)29, the Short-Term Attention/Memory Priority Model (STAMP)30, and Lightweight Self-Attention Networks (LightSANs)31. The characteristics of each model are as follows. NCF is a deep-learning-based collaborative filtering method that uses a neural network to automatically learn implicit representations of users and items, but it has limitations in handling long behavior sequences. Deep & Cross combines explicit cross features with deep features, improving feature interaction, but it is weak in dynamic sequence modeling. GRU4Rec uses a GRU to model the user behavior sequence and works well on time-series data, but its computational cost is large. STAMP's attention-based sequence recommendation can dynamically adjust user interests, but its ability to model long sequences is limited. LightSANs, a self-attention recommendation model based on the Transformer structure, can efficiently model long-term interest evolution, but its computational complexity is high. In contrast, the CRM-SLIE model combines interest evolution modeling with an attention mechanism, effectively capturing users' long-term interest changes while striking a good balance between computational efficiency and recommendation quality. Area Under the Curve (AUC) and Recall@k are taken as evaluation indexes, where @k denotes the top k recommended courses. The equations are as follows:

$$AUC=\frac{\sum_{j\in \mathrm{positive}}rank_{j}-\frac{P(P+1)}{2}}{P\times N}$$

(21)

$$Recall@k=\frac{TP}{TP+FN}$$

(22)

\({rank}_{j}\) is the rank of positive sample j. P is the number of positive samples, i.e., the number of courses the user clicked; N is the number of negative samples, i.e., the number of courses the user did not click. TP (true positives) is the number of courses correctly recommended by the model and clicked by the user; FN (false negatives) is the number of courses the user was actually interested in but that the model failed to recommend or recommended incorrectly.
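Equations (21) and (22) can be sketched directly for a single user's candidate list. The scores and labels below are made up for illustration, and ties in scores are ignored for simplicity.

```python
import numpy as np

def rank_auc(scores, labels):
    """AUC via Eq. (21): sum of positive-sample ranks minus the minimum
    possible rank sum P(P+1)/2, normalized by the number of pos-neg pairs."""
    scores, labels = np.asarray(scores, float), np.asarray(labels)
    ranks = scores.argsort().argsort() + 1          # rank 1 = lowest score
    P, N = labels.sum(), (1 - labels).sum()
    return (ranks[labels == 1].sum() - P * (P + 1) / 2) / (P * N)

def recall_at_k(scores, labels, k):
    """Eq. (22): fraction of clicked courses appearing in the top-k list
    (TP = clicked courses in top-k, TP + FN = all clicked courses)."""
    topk = np.asarray(scores).argsort()[::-1][:k]
    tp = np.asarray(labels)[topk].sum()
    return tp / np.asarray(labels).sum()

scores = np.array([0.9, 0.8, 0.3, 0.6, 0.1])   # predicted click probabilities
labels = np.array([1,   0,   1,   0,   0])     # 1 = clicked, 0 = not clicked
```

Here the positive at score 0.9 outranks all three negatives and the positive at 0.3 outranks one, so AUC = 4/6; the top-2 list contains one of the two clicked courses, so Recall@2 = 0.5.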

Experimental environment and parameters setting

The experimental environment and parameter settings are shown in Table 3.

Table 3 Experimental environment and parameter settings.

After experimental verification, the key hyperparameters are optimized to improve the performance and stability of the model. The learning rate is set to 0.001, within the common range for deep-learning recommendation systems (0.0001~0.01); comparing learning rates of 0.01, 0.005, 0.001, and 0.0005 shows that 0.001 achieves the best balance between convergence speed and stability. The number of training epochs is set to 10: model performance stabilizes after 10 epochs, and further training yields limited AUC improvement and may cause over-fitting. The batch size is set to 256; among the tested sizes of 64, 128, 256, and 512, 256 achieves a good balance between training efficiency and model performance. The optimizer is Adam, because it adapts the learning rate effectively and performs well on non-convex optimization problems, making it well suited to deep-learning recommendation models. The loss function is binary cross-entropy, used in the binary recommendation task to measure the match between recommended courses and users' actual behavior. This parameter optimization ensures the balanced performance of the CRM-SLIE model in convergence speed, generalization ability, and recommendation effect, guaranteeing the reliability of subsequent experimental results and the applicability of the model.
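The training objective can be sketched as follows. The hyperparameter values echo those reported above, while the loss function is a generic binary cross-entropy implementation; this is a sketch, not the authors' code.

```python
import numpy as np

# Hyperparameters as reported in the text / Table 3.
CONFIG = {"lr": 1e-3, "epochs": 10, "batch_size": 256, "optimizer": "Adam"}

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """BCE loss: scores how well predicted click probabilities match the
    observed browse/no-browse labels. Clipping avoids log(0)."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# A confident, correct prediction yields a small loss; a wrong one, a large loss.
good = binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.9, 0.1]))
bad  = binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.1, 0.9]))
assert good < bad
```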

Performance evaluation

The influence of embedding dimension on model performance

The length of the student behavior sequence is set to 15, and the AUC values of the different models are compared under embedding dimensions of 32, 64, and 128; the results are shown in Fig. 3.

Fig. 3

AUC results of different models under different embedding dimensions.

In Fig. 3, when the embedding dimension is 64, the AUC of the CRM-SLIE model is highest, reaching 0.871, superior to the comparison models. At a dimension of 128, the AUC decreases slightly, indicating that increasing the embedding dimension within a certain range improves model performance, but an excessively large dimension may introduce noise that degrades it. Overall, CRM-SLIE outperforms the other models at every dimension, showing stronger adaptability and robustness in the course recommendation task. This may be because CRM-SLIE adopts an attention-based GRU that captures the dynamic changes of students' interest states and weights historical behaviors more effectively, making the choice of embedding dimension more precise. At 64 dimensions, the model can adequately express course and user-interest information while avoiding the over-fitting caused by excessively high dimensions, thus improving the AUC.

The influence of sequence length on model performance

The embedding dimension is set to 64, and the student behavior sequence length is set to 5, 8, 10, 15, 20, 25, and 30. The AUC of each model under the different sequence lengths is compared, and the results are shown in Fig. 4.

Fig. 4

AUC results of different models under different sequence lengths.

Figure 4 shows that the AUC of the CRM-SLIE model rises as the length of the student behavior sequence increases, stabilizing once the sequence length reaches 20 at 0.872, the highest among all models. This shows that CRM-SLIE effectively captures the evolution of students' long-term interests, is insensitive to changes in sequence length, and maintains high performance. The reason is that the AGRU module weights students' historical behaviors, automatically selecting the history most valuable for the current recommendation, so the interest-drift effect of long sequences is effectively suppressed. In addition, because the attention mechanism assigns different weights to different historical behaviors, the model maintains a high AUC even for long sequences, without suffering information redundancy or vanishing gradients.
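The attention-weighted GRU described above can be sketched as an attention-update-gate GRU (in the style of AUGRU), where the attention score scales the update gate so that low-relevance behaviors barely move the hidden interest state. This formulation is an assumption for illustration; the exact CRM-SLIE mechanism is not given here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def agru_step(h, x, att, W, U, b):
    """One attention-gated GRU step (AUGRU-style sketch). `att` in [0, 1] is
    the attention score of the current behavior; W, U, b stack the parameters
    of the update, reset, and candidate gates."""
    z = sigmoid(W[0] @ x + U[0] @ h + b[0]) * att   # attention-scaled update gate
    r = sigmoid(W[1] @ x + U[1] @ h + b[1])         # reset gate
    h_tilde = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(0)
d = 4  # toy hidden size
W = rng.normal(size=(3, d, d))
U = rng.normal(size=(3, d, d))
b = np.zeros((3, d))
h0, x = np.zeros(d), rng.normal(size=d)

# With attention 0 the state is unchanged; with attention 1 it updates fully.
h_low = agru_step(h0, x, 0.0, W, U, b)
h_high = agru_step(h0, x, 1.0, W, U, b)
assert np.linalg.norm(h_low - h0) < np.linalg.norm(h_high - h0)
```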

The influence of recommended quantity on model performance

With the embedding dimension set to 64 and the student behavior sequence length to 30, the recall rate of each model under different numbers of recommendations is evaluated. The Recall@k of the different models is shown in Fig. 5.

Fig. 5

Recall@k comparison of different models.

In Fig. 5, as the number of recommendations k increases, the recall rate of the CRM-SLIE model gradually surpasses the other comparison models, indicating that its advantage in the recommendation task grows with the number of recommendations. Especially at larger k (such as Recall@10), CRM-SLIE remains ahead, showing that the model better captures the changes in students' interests and provides recommendations that match their actual needs. This may be because the AGRU mechanism adopted by CRM-SLIE, combined with attention-weight adjustment, evaluates the importance of different historical behaviors, allowing the model to maintain a high recall rate across recommendation counts. In addition, the CRM-SLIE model incorporates dynamic interest evolution into recommendation, updating students' interest states in real time and making the results more accurate, so it outperforms the comparison models on Recall@5 and Recall@10.

Performance comparison of different models

The embedding dimension is 64 and the student behavior sequence length is 30. With other conditions held constant, the AUC, Recall@5, and Recall@10 of each model are compared, as shown in Fig. 6.

Fig. 6

Performance comparison results of different models.

The results in Fig. 6 show that, under the same conditions, CRM-SLIE achieves an AUC of 0.872, higher than the other models, and demonstrates a notable lead in Recall@5 (0.262) and Recall@10 (0.364). Although GRU4Rec and STAMP achieve AUC values close to that of CRM-SLIE, CRM-SLIE significantly outperforms them on the Recall metrics, indicating that it can more accurately identify user interests and surface relevant courses in recommendation tasks. Compared with LightSANs, CRM-SLIE has a slightly higher AUC and a more pronounced advantage in Recall@10, suggesting that it adapts better as the number of recommendations grows. The main reason lies in CRM-SLIE's use of a self-attention mechanism to capture the evolution of student interests, enabling the model to better understand long-term interest changes and make precise recommendations. At the same time, CRM-SLIE optimizes the processing of student behavior sequences, maintaining high performance even with longer sequences.

Ablation experiment

To verify the influence of PE and item crossing on model performance, an ablation experiment is designed around two factors: with versus without the PE module, and different item interaction methods. With the PE module, the model can capture the temporal evolution of student interests; without it, the model's performance in the absence of temporal information is simulated. The item interaction methods cover four scenarios: no interaction features, inner-product interaction, Hadamard-product interaction, and the combination of inner product and Hadamard product. With no interaction features, only the original features are used for modeling. Inner-product interaction captures relationships between courses via inner products; Hadamard-product interaction captures them via element-wise products; and the combined scheme uses both to enhance the model's ability to capture complex relationships between courses. The result is shown in Fig. 7.
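The three interaction schemes can be illustrated on a pair of hypothetical course embeddings (the vectors below are made up; a real model would learn them):

```python
import numpy as np

# Two hypothetical 4-dimensional course embeddings.
a = np.array([0.2, -0.5, 0.1, 0.7])
b = np.array([0.4,  0.3, -0.2, 0.6])

inner = np.dot(a, b)   # scalar: overall similarity between the two courses
hadamard = a * b       # vector: per-dimension interaction signal

# The combined scheme simply feeds both signals to the downstream layers.
combined = np.concatenate([[inner], hadamard])

# The inner product is the sum of the Hadamard terms: the Hadamard product
# preserves the per-dimension detail that the inner product collapses away,
# which is why combining them captures richer relationships.
assert np.isclose(inner, hadamard.sum())
```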

Fig. 7

Result of ablation experiment.

Figure 7 shows that both PE and the item interaction method have an obvious influence on model performance. Adding the PE module improves both AUC and Recall@10, showing that it effectively captures the temporal evolution of student interests and strengthens the modeling of long-term interest changes; without it, the model cannot fully account for the dynamics of student interests over time, and recommendation performance declines. Regarding item interaction, performance drops significantly when no interaction features are used, indicating that original features alone cannot capture the complex relationships between courses. The interaction methods improve performance to different degrees, and the combination of inner-product and Hadamard-product interaction achieves the best results: it integrates the advantages of both, provides richer feature interactions, and delivers the greatest performance improvement. Overall, the ablation results demonstrate that combining the PE module with effective feature interaction is key to improving course recommendation accuracy.
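For reference, a PE module is often implemented as a sinusoidal positional encoding added to the behavior embeddings, giving each position in the sequence a distinct temporal signature. The sketch below assumes the standard Transformer-style formulation, which may differ from the paper's actual PE module.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Standard sinusoidal positional encoding: even dimensions use sine,
    odd dimensions use cosine, with geometrically spaced frequencies."""
    pos = np.arange(seq_len)[:, None]        # (seq_len, 1)
    i = np.arange(d_model)[None, :]          # (1, d_model)
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

# One encoding row per behavior in a length-30 sequence with 64-d embeddings
# (matching the sequence length and embedding dimension used above).
pe = positional_encoding(seq_len=30, d_model=64)
assert pe.shape == (30, 64)
assert not np.allclose(pe[0], pe[1])  # distinct positions get distinct signatures
```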

Analysis of running time and calculation cost

To evaluate the computational efficiency of the CRM-SLIE model and its scalability under different data scales, the training and inference times of CRM-SLIE and the other models in the same experimental environment are compared; the results are shown in Fig. 8.

Fig. 8

Training time and inference time of different models.

From the experimental results in Fig. 8, the CRM-SLIE model strikes a good balance between training time and inference time. Compared with GRU4Rec, STAMP, and LightSANs, the training time of CRM-SLIE is shorter, only 85.3 s, showing that its optimized interest-evolution mechanism improves computational efficiency. Its inference time of 0.012 s/batch is also better than GRU4Rec and LightSANs, indicating faster recommendation. Analysis suggests two reasons for this efficiency: the attention mechanism effectively reduces redundant computation when the model captures dynamic interest changes, and the structural optimization of CRM-SLIE avoids the overhead of an overly deep network, improving the model's scalability on large-scale data.

Model performance analysis

To further validate the effectiveness of the CRM-SLIE model, this section presents specific recommendation examples and analyzes the model’s performance in different scenarios. Several students’ behavior sequences are selected to demonstrate the model’s strengths and weaknesses in recommendation accuracy.

In a typical recommendation task, the CRM-SLIE model successfully recommends courses highly aligned with a student's interests and learning progress. For a student performing well in mathematics, the model recommends several advanced mathematics courses closely related to the student's past learning trajectory, such as advanced algebra and calculus. By capturing the student's long-term interest in mathematics and considering their learning progress, the model effectively recommends courses that match the student's interests. However, CRM-SLIE does not always provide ideal recommendations. For students with diverse interests and scattered learning trajectories, its recommendations may fall short of expectations: for example, the model recommends several computer science courses to a student whose main interests are in the humanities, such as literature and history. In this case, the model fails to capture the student's interest changes, producing recommendations that do not align with the student's actual needs. The root cause of these performance differences may lie in the diversity and dynamics of student interests. When students' interests are too broad or change too quickly, the model cannot accurately capture the shifts; although CRM-SLIE attempts to capture long-term interest evolution through the PE module, it may struggle to extract representative interest features when learning trajectories are highly scattered, affecting recommendation accuracy. The recommendation effect may also be influenced by the course content itself: if the available courses have low relevance to the student's learning trajectory or potential interests, even a model that captures some interest features may produce poor recommendations for lack of highly relevant course data.
Therefore, further improving the model's interest-fusion mechanism across multiple interest domains could enhance its recommendation accuracy in complex interest-change scenarios.

Discussion

In summary, the proposed CRM-SLIE course recommendation model based on the evolution of students' interests can effectively capture long-term changes in students' interests and improve the accuracy of personalized recommendation. Compared with existing research, the model shows stronger adaptability and robustness in handling the dynamic changes of students' learning interests and complex course relations. Wang et al. (2021) proposed a graph-neural-network-based recommendation method to address the shortcomings of existing models in explicitly expressing item structural relationships and item timeliness; their experiments showed clear advantages in improving recommendation performance and accurately predicting learners' preferences32. However, the computational cost of graph neural networks is large, especially on big data, and the time cost of training and inference may rise significantly. By introducing an attention mechanism and a GRU structure, CRM-SLIE captures the changes of students' interests while effectively reducing computational cost and maintaining high recommendation efficiency; it also has clear advantages in modeling the long-term evolution of students' interests, adapting better to dynamic interest changes with strong robustness. Jena et al. (2022) proposed an e-learning course recommendation system based on collaborative filtering, using models such as K-nearest neighbour, singular value decomposition, and neural-network-based collaborative filtering to suggest courses according to users' preferences; K-nearest neighbour performed best in hit rate and average reciprocal hit rate, with the lowest mean absolute error33.
Although the K-nearest-neighbour algorithm performs well on some traditional metrics, collaborative filtering suffers from data sparsity and cannot effectively handle dynamic changes in student interests. In contrast, CRM-SLIE excels at capturing the long-term evolution of student interests and avoids the limitations of collaborative filtering in sparse-data scenarios, providing more accurate personalized recommendations. Safarov et al. (2023) proposed a deep-neural-network recommendation method combining synchronous sequences and heterogeneous features to improve the recommendation accuracy of an e-learning platform, achieving Top-1 and Top-5 course accuracies of 0.626 and 0.492, respectively34. Combining different features improves accuracy, but the approach may face challenges in feature engineering and multi-modal data fusion, and its high model complexity leads to longer training and inference times. In contrast, by combining attention mechanisms with a GRU in a simplified network structure, CRM-SLIE efficiently captures student interest changes while reducing reliance on feature engineering, making it more efficient in recommendation tasks while maintaining high recommendation accuracy. Moreover, the efficiency and adaptability of CRM-SLIE enable it to maintain strong performance across datasets of different sizes, demonstrating greater robustness than the other models.
