www.rsisinternational.org
Page 664
ILEIID 2025 | International Journal of Research and Innovation in Social Science (IJRISS)
ISSN: 2454-6186 | DOI: 10.47772/IJRISS
Special Issue | Volume IX Issue XXIV October 2025
Predicting Students’ Final Scores in An Advanced Grammar Course
using Multiple Linear Regression
Faizah Mohamad, Mazura Anuar, Laura Christ Dass*, Asha Latha Bala Subramainam
UiTM Shah Alam Selangor
*Corresponding Author
DOI: https://dx.doi.org/10.47772/IJRISS.2025.924ILEIID0070
Received: 23 September 2025; Accepted: 30 September 2025; Published: 31 October 2025
ABSTRACT
Continuous assessments encourage sustained learning that contributes to deeper knowledge retention and
academic success. Investigating continuous assessments, such as quizzes and tests, is important in predicting
final scores because they provide a reliable measure of students’ learning progress across the semester. The
present study examined the extent to which quiz scores and test scores predict final scores in a
grammar course using multiple linear regression analysis. Data were collected from 223 first-semester students
enrolled in the course. Preliminary analyses of linearity, normality, reliability of measurement and
homoscedasticity confirmed that the assumptions of multiple linear regression were met. The results revealed
that quiz scores and test scores were positively associated with final scores, indicating that students who
performed well in formative and summative assessments were more likely to achieve higher final scores. The
results also showed that both quiz scores and test scores significantly predicted final scores and together
accounted for a substantial proportion of the variance in students’ final achievement. A regression model was developed using the SPSS
software and the formulated model provides useful insights for educators in identifying early indicators of
students’ final performance and in designing instructional strategies that support academic achievement,
ultimately enhancing the overall quality of instruction.
Keywords: advanced grammar, continuous assessment, formative assessment, summative assessment,
multiple linear regression
INTRODUCTION
A good grasp of the English language is increasingly important, particularly among English language majors
at university, as institutions compete for global rankings. Central to this
is the role played by grammar in English Language Studies, where Advanced Grammar is a compulsory
subject. Advanced grammar courses are often challenging for students, requiring consistent reinforcement and
practice. Quizzes, tests and oral presentations in these courses are designed to help learners master the various
grammar components and to track their understanding and application of complex grammatical structures.
As such, exploring the relationship between formative and summative assessments is crucial
for identifying students at risk of failing and for consolidating pedagogical practices. Multiple linear regression
allows researchers to evaluate the combined effect of multiple independent variables, offering a more accurate
predictive framework in which formative assessments act as predictors and the summative assessment is the
outcome. Prior studies have highlighted the usefulness of such models in mathematics (Darman
et al., 2019), in overall first-year performance (Dagdagui, 2022), and in forecasting achievement in language
education (Khaing & Cho, 2019). However, relatively few studies have examined advanced grammar
courses specifically, despite their significance in language proficiency development. This study aims to address
that gap by investigating how well grammar quiz and test scores predict final exam scores among first-semester
English language major undergraduates enrolled in the Advanced Grammar course at UiTM. Specifically,
it aims to (i) develop a multiple regression model to predict final scores and (ii) compare predicted outcomes
with actual results. Findings from this research provide evidence for the effectiveness of continuous
assessment as a reliable
predictor of academic achievement.
The objectives of the study are:
1. to develop a multiple regression model for predicting the students’ final grammar scores based on their
performance on continuous assessments, namely the quiz and the test.
2. to compare the actual final grammar scores with those predicted by the developed regression
model.
LITERATURE REVIEW
The Role of Assessment in Language Education
Proficiency in English grammar is a cornerstone of success for
English language majors, particularly in the context of competitive global higher education (Kasim & Sukarno,
2024). Within this domain, advanced grammar courses are often challenging, requiring consistent practice and
robust assessment mechanisms to track mastery of complex structures. Assessment, therefore, is not merely an
endpoint but an integral part of the learning process. It serves to diagnose understanding, provide feedback,
and ultimately, certify competence. As noted by Rodríguez Rincón, Munárriz and Magreñán Ruiz (2024),
continuous assessment is valued for its ability to motivate students and provide an objective basis for final
evaluation, making it a critical tool in language education.
Formative and Summative Assessment: Purposes and Interrelationships
Educational assessment is broadly categorized into formative and summative types. Formative assessments,
such as quizzes, are low-stakes evaluations designed to provide ongoing feedback to students and instructors
during the learning process. Their primary purpose is to reinforce learning and identify areas needing
improvement (Hanna & Dettmer, 2004). In contrast, summative assessments, like final exams or major tests,
aim to evaluate student learning at the end of an instructional unit by comparing it against a standard or
benchmark. The relationship between these two is crucial; formative assessments are theorized to build the
foundational knowledge and skills that are ultimately measured through summative assessments. Effective
learning environments strategically use formative tasks to prepare students for summative success.
The Predictive Validity of Continuous Assessment on Academic Achievement
A growing body of research supports the notion that performance on continuous, formative assessments can
significantly predict final academic outcomes. For instance, Dagdagui (2022) used multiple linear regression to
demonstrate that high school performance and admission test scores could predict 67.3% of the variance in
first-year university students' grades. This finding is reinforced in a specific subject context by Darman et al.
(2019), who successfully predicted students' final grades in a mathematics module using scores from
continuous assessments. Most directly relevant to the current study, Madsen (2020) investigated a university-
level grammar course and found that results from continuous exercises were a strong predictor of final exam
results using linear regression. These studies collectively affirm that formative assessment performance is a
reliable early indicator of summative achievement.
Multiple Linear Regression in Educational Forecasting
The methodological approach of using Multiple Linear Regression (MLR) is well-established in educational
research for forecasting student performance. MLR is a powerful statistical technique that allows researchers to
evaluate the combined effect of several independent variables (e.g., quiz scores, test scores) on a dependent
variable (e.g., final exam score) (Uyanık & Güler, 2013). Its application extends beyond education to fields
like biological assays, underscoring its robustness for quantitative prediction (Jarantow et al., 2023). When
applying MLR, it is critical to test its key assumptions, including linearity and the absence of
multicollinearity, a condition where predictor variables are too highly correlated, which can be detected using
statistical measures like the Variance Inflation Factor (VIF) (Shrestha, 2020). Studies by Khaing and Cho
(2019) have effectively utilized MLR in educational contexts, validating it as a suitable method for this study.
Gaps in the Literature: The Specific Case of Advanced Grammar Courses
Despite the established predictive validity of continuous assessments and the proven utility of MLR, a specific
gap remains concerning advanced grammar courses in English language programs. While studies like
Madsen's (2020) exist, they are relatively scarce compared to research in mathematics (Darman et al., 2019) or
on general academic performance (Dagdagui, 2022). Furthermore, many studies in language education focus
on affective factors like anxiety and its correlation with skills such as speaking (Kasim & Sukarno, 2024; Nety
& Purnomo, 2023), rather than on the predictive power of objective assessment data within a grammar
curriculum. Therefore, this study aims to address this gap by developing and validating a multiple linear
regression model specifically to predict final scores in an Advanced Grammar course based on continuous
assessment marks, thereby contributing a data-driven perspective to this critical area of language proficiency
development.
METHOD
This study employs multiple linear regression analysis to examine the relationships between the quiz and test scores
(independent variables) and the final grammar scores (dependent variable) obtained from 223 students
enrolled in an advanced English grammar course. The analysis also shows the proportion of variance in the
dependent variable (final grammar scores) that is accounted for by the independent variables and produces
a predictive model (Uyanık & Güler, 2013). The predictive model for this study is as follows:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$$
where
$X_1$: Quiz scores
$X_2$: Test scores
and the $\beta$'s denote the regression coefficients.
To ensure that the assumptions of multiple linear regression analysis are met, linearity, normality, reliability of
measurement, and homoscedasticity are analysed before the regression analysis is performed.
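As a rough illustration of how the coefficients in the model above can be estimated, the following sketch fits a two-predictor model by ordinary least squares using the normal equations. The study itself used SPSS; the helper name `fit_ols` and the toy data are hypothetical and are not the study's data.

```python
def fit_ols(rows):
    """Estimate [b0, b1, b2] for y = b0 + b1*x1 + b2*x2 by ordinary least squares.

    rows is a list of (x1, x2, y) tuples. Hypothetical helper, not the SPSS procedure.
    """
    # Design matrix with a leading column of ones for the intercept.
    design = [(1.0, float(x1), float(x2)) for x1, x2, _ in rows]
    ys = [float(y) for _, _, y in rows]
    n = 3
    # Normal equations: (X^T X) b = X^T y
    a = [[sum(r[i] * r[j] for r in design) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - tail) / a[i][i]
    return coef


# Toy data generated exactly from y = 2 + 3*x1 + 0.5*x2 (made up for illustration).
toy_rows = [(x1, x2, 2.0 + 3.0 * x1 + 0.5 * x2)
            for x1, x2 in [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]]
b0, b1, b2 = fit_ols(toy_rows)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the toy outcome is an exact linear function of the two predictors, the fitted coefficients should recover the generating values up to floating-point error; with real scores the fit would also leave residuals.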
RESULTS AND DISCUSSION
Linearity
Multiple regression can effectively estimate the relationship between the independent variable(s) and the
dependent variable, provided that their association is linear. A test of linearity was used to determine whether
the relationship between these variables is linear. Table 1 and Table 2 below show that a linear relationship exists between each
independent variable (quiz and test) and the dependent variable (the final grammar scores).
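Table 1 Linearity test (Final grammar scores * Quiz)

| Source | Sum of Squares | df | F | Sig. |
| --- | --- | --- | --- | --- |
| Between Groups (Combined) | 12437.8 | 39 | 13.5 | .001 |
| Between Groups: Linearity | 11325.2 | 1 | 480.6 | .001 |
| Between Groups: Deviation from Linearity | 1112.56 | 38 | 1.243 | .175 |
| Within Groups | 4312.09 | 183 | | |
| Total | 16749.9 | 222 | | |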
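Table 2 Linearity test (Final grammar scores * Test)

| Source | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- |
| Between Groups (Combined) | 12343. | 42 | 293.890 | 12.00 | .001 |
| Between Groups: Linearity | 10917. | 1 | 10917.4 | 445.9 | .001 |
| Between Groups: Deviation from Linearity | 1425.8 | 41 | 34.778 | 1.421 | .063 |
| Within Groups | 4406.5 | 180 | 24.48 | | |
| Total | 16749.9 | 222 | | | |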
Table 1 and Table 2 show that the p values for deviation from linearity are .175 for quiz and .063 for test,
both of which are above 0.05. These non-significant results indicate that there is a linear relationship
between each independent variable (quiz and test) and the dependent variable (final grammar scores). According
to Kasim and Sukarno (2024) and Nety and Purnomo (2023), a relationship between variables is considered linear
when the significance value for deviation from linearity exceeds 0.05.
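The F statistics in the linearity test are ratios of mean squares, so they can be checked from the reported sums of squares and degrees of freedom. The sketch below does this for the quiz values in Table 1; the function name `f_ratio` is a hypothetical helper, and small differences from the table are expected because the inputs are rounded.

```python
def f_ratio(ss_effect, df_effect, ss_within, df_within):
    """F = (effect mean square) / (within-groups mean square)."""
    return (ss_effect / df_effect) / (ss_within / df_within)


# Sums of squares and degrees of freedom as reported in Table 1 (quiz).
f_deviation = f_ratio(1112.56, 38, 4312.09, 183)
f_linearity = f_ratio(11325.2, 1, 4312.09, 183)
print(round(f_deviation, 3), round(f_linearity, 1))
```

A small F for deviation from linearity alongside a large F for the linear term is the pattern that supports treating the relationship as linear.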
Normality
The regression model assumes that the variables follow a normal distribution. Non-normal distributions may
misrepresent relationships and affect the accuracy of significance testing. To assess normality, the
Kolmogorov-Smirnov and Shapiro-Wilk tests were conducted, as shown in Table 3 below. The results indicate
significance values of 0.087, 0.200 and 0.200 (all p values are > 0.05) for the Kolmogorov-Smirnov test and
0.429, 0.907 and 0.193 (all p values are > 0.05) for the Shapiro-Wilk test, suggesting that all variables under
investigation are normally distributed. These findings are further supported by the Q-Q plot for quiz, test and
final grammar scores (see Figure 1, Figure 2 and Figure 3), where the data points align closely with the
diagonal line, reinforcing the assumption of normality. Darman et al. (2019) and Kasim and Sukarno (2024)
stated that a normality test with a p-value of more than 0.05 signifies a normal data distribution.
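Table 3 Tests of normality

| Variable | Kolmogorov-Smirnov (a): Statistic | Kolmogorov-Smirnov (a): df | Kolmogorov-Smirnov (a): Sig. | Shapiro-Wilk: Statistic | Shapiro-Wilk: df | Shapiro-Wilk: Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| QUIZ | .056 | 223 | .087 | .993 | 223 | .429 |
| TEST | .032 | 223 | .200* | .997 | 223 | .907 |
| FINAL GRAMMAR SCORES | .046 | 223 | .200* | .991 | 223 | .193 |

*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction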
Figure 1 Q-Q plot for quiz
Figure 2 Q-Q plot for test
Figure 3 Q-Q plot for final grammar scores
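For readers who want to see how the points of a normal Q-Q plot are obtained, the sketch below pairs each sorted score with the quantile expected under a normal distribution fitted to the same data. It uses `statistics.NormalDist` from the Python standard library; the plotting position `(i + 0.5) / n`, the function name `qq_points`, and the scores are assumptions for illustration and may differ from what SPSS uses.

```python
from statistics import NormalDist, mean, stdev


def qq_points(scores):
    """Return (theoretical quantile, observed score) pairs for a normal Q-Q plot."""
    n = len(scores)
    fitted = NormalDist(mean(scores), stdev(scores))
    ordered = sorted(scores)
    # Plotting position (i + 0.5) / n for i = 0..n-1; an assumed convention.
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]


# Hypothetical scores, not the study's data.
points = qq_points([10.0, 12.0, 14.0, 16.0, 18.0])
for theoretical, observed in points:
    print(round(theoretical, 2), observed)
```

If the scores are approximately normal, the pairs fall close to the line where the theoretical and observed values are equal.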
Reliability of measurements
Multiple linear regression assumes the absence of multicollinearity among the independent variables.
Multicollinearity arises when the predictors are highly correlated with one another (Darman et al., 2019). This
assumption can be assessed using Pearson’s bivariate correlation. As shown in Table 4 below, the correlation
between quiz and test is 0.416, which is below the threshold of 0.80. According to Shrestha (2020), if the
absolute value of the Pearson’s correlation coefficient is well below 0.80, collinearity is unlikely to exist. Thus,
the analysis confirms that the assumption of measurement reliability is met.
Table 4 Correlation

| | | QUIZ | TEST |
| --- | --- | --- | --- |
| QUIZ | Pearson Correlation | 1 | .416** |
| | Sig. (2-tailed) | | <.001 |
| | N | 223 | 223 |

**. Correlation is significant at the 0.01 level (2-tailed).
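The literature review mentions the Variance Inflation Factor (VIF) as another way to detect multicollinearity. With exactly two predictors, the VIF of each predictor is 1 / (1 - r²), where r is the correlation between them. The sketch below applies this to the reported correlation of 0.416; it is a supplementary check rather than something reported in the study, and `vif_two_predictors` is a hypothetical helper name.

```python
def vif_two_predictors(r):
    """VIF for either predictor in a two-predictor model, given their correlation r."""
    return 1.0 / (1.0 - r ** 2)


# r = 0.416 is the quiz-test correlation reported in Table 4.
vif = vif_two_predictors(0.416)
print(round(vif, 3))
```

A VIF close to 1 indicates little variance inflation; commonly cited rules of thumb only raise concern at values of about 5 or 10.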
Homoscedasticity
A scatter plot provides an effective method for assessing homoscedasticity. It allows researchers to examine
whether the residuals are evenly distributed around the regression line (Jarantow et al., 2023). Figure 4 below
shows that most of the data points fall within the range of -2 to 2, which suggests that the assumption of
homoscedasticity is met (Darman et al., 2019).
Figure 4 Homoscedasticity scatterplot
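Reading the range of -2 to 2 in this way assumes that the plotted residuals are standardized, that is, divided by the standard error of the estimate (2.17615 in the model summary). Under that assumption, which the paper does not state explicitly, a single residual can be standardized as in the sketch below. The observed and predicted scores are those listed for student 5 in the comparison of observed and predicted scores; `standardized_residual` is a hypothetical helper name.

```python
STD_ERROR_OF_ESTIMATE = 2.17615  # as reported in the model summary


def standardized_residual(observed, predicted, s=STD_ERROR_OF_ESTIMATE):
    """Residual divided by the standard error of the estimate (assumed definition)."""
    return (observed - predicted) / s


# Observed 80.00 and predicted 78.53 are the values listed for student 5.
z = standardized_residual(80.00, 78.53)
within_band = abs(z) <= 2
print(round(z, 3), within_band)
```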
The findings above show that the four assumptions of multiple regression are fulfilled; thus, the independent
and dependent variables are appropriate for regression model development.
Regression model development
Table 5: Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | 0.868 | 0.753 | 0.752 | 2.17615 |

a. Predictors: (Constant), TEST, QUIZ
b. Dependent Variable: FINALGRAMMARSCORES
Table 6: ANOVA

| Model | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- |
| Regression | 15708.06 | 2 | 7854.03 | 1658.4 | 0.001 |
| Residual | 1041.842 | 220 | 4.736 | | |
| Total | 16749.91 | 222 | | | |

a. Dependent Variable: FINALGRAMMARSCORES
b. Predictors: (Constant), TEST, QUIZ
Table 7: Coefficients

| Model | Unstandardized Coefficients (B) | Std. Error | Standardized Coefficients (Beta) | t | Sig. |
| --- | --- | --- | --- | --- | --- |
| (Constant) | 18.298 | 0.814 | | 22.49 | 0.001 |
| QUIZ | 1.237 | 0.039 | 0.588 | 31.8 | 0.001 |
| TEST | 1.042 | 0.034 | 0.563 | 30.42 | 0.001 |

a. Dependent Variable: FINALGRAMMARSCORES
Table 5 reveals that the multiple correlation coefficient (R) is 0.868, indicating a strong positive relationship
between the independent variables (Quiz and Test) and the dependent variable (Final Grammar Scores). The
coefficient of determination (R²) is 0.753, which means that 75.3% of the variance in the final grammar scores
can be explained by quiz and test, while the remaining 24.7% is attributable to other factors not included in the
model. These results suggest that the independent variables have strong predictive power for students’ final
grammar scores.
The ANOVA results presented in Table 6 show a p-value of 0.001, which is below the 0.05 threshold. This
confirms that quiz and test scores significantly predict students’ final grammar scores.
Based on the coefficients in Table 7, the multiple linear regression model can be expressed as:
Predicted Final Grammar Scores = 18.298 + 1.237(Quiz) + 1.042(Test)
Thus, it can be concluded that quiz and test can significantly predict students’ final grammar scores, with
regression coefficients of 1.237 and 1.042, respectively.
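The ANOVA quantities can be recomputed from the reported sums of squares and degrees of freedom: each mean square is a sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, and the standard error of the estimate is the square root of the residual mean square. The sketch below uses the values in Table 6; `anova_summary` is a hypothetical helper name, and small differences from the table are due to rounding.

```python
import math


def anova_summary(ss_regression, df_regression, ss_residual, df_residual):
    """Mean squares, F ratio and standard error of the estimate for a regression ANOVA."""
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "f": ms_regression / ms_residual,
        "std_error_of_estimate": math.sqrt(ms_residual),
    }


# Values as reported in Table 6.
summary = anova_summary(15708.06, 2, 1041.842, 220)
for name, value in summary.items():
    print(name, round(value, 3))
```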
Multiple linear regression has been used to develop predictive models in
many studies, e.g., Dagdagui (2022) and Khaing and Cho (2019) in predicting students’ academic
performance, Darman et al. (2019) in predicting Math scores, and Oflaz (2019) in predicting speaking skills.
Therefore, this technique is deemed suitable for predicting the final grammar scores of the
students.
Comparing observed and predicted final grammar scores
Table 8 compares the predicted and observed final grammar scores for 15 selected students. The relatively
small differences (below 2 marks) between the predicted and actual scores suggest that the regression model provides
an adequate fit and is appropriate for use. The results of this study are in line with Darman et al. (2019) and
Khaing and Cho (2019); however, the differences between the observed and predicted scores in those studies were 3 marks and 4
marks, respectively. In conclusion, the regression model provides a good fit, with differences below 2 marks
indicating stronger accuracy and better predictive performance compared to the larger variations of 3 to 4
marks reported in previous studies. This suggests that the model used in this study offers a more precise
estimation of scores.

Table 8 Comparison between observed and predicted final grammar scores

| Student | Quiz | Test | Observed final grammar score | Predicted final grammar score | Difference |
| --- | --- | --- | --- | --- | --- |
| 1 | 10.50 | 18.00 | 50.00 | 50.04 | -0.16 |
| 2 | 12.00 | 23.00 | 56.00 | 57.15 | -1.45 |
| 3 | 14.00 | 23.50 | 59.50 | 60.10 | -0.60 |
| 4 | 22.50 | 27.50 | 75.00 | 74.78 | 0.22 |
| 5 | 23.00 | 30.50 | 80.00 | 78.53 | 1.47 |
| 6 | 21.50 | 21.00 | 65.00 | 66.78 | -1.78 |
| 7 | 18.50 | 20.50 | 61.50 | 62.54 | -1.04 |
| 8 | 17.00 | 23.00 | 65.00 | 63.29 | 1.71 |
| 9 | 16.50 | 22.50 | 63.00 | 63.10 | 0.10 |
| 10 | 19.00 | 30.50 | 74.50 | 73.50 | 1.00 |
| 11 | 11.50 | 11.50 | 44.00 | 44.23 | 0.23 |
| 12 | 24.50 | 35.00 | 85.00 | 85.08 | 0.08 |
| 13 | 12.00 | 25.50 | 60.50 | 59.75 | 0.75 |
| 14 | 17.00 | 25.00 | 66.00 | 65.38 | 0.62 |
| 15 | 23.00 | 29.00 | 77.00 | 76.97 | 0.03 |
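The comparison above can be reproduced for individual students by substituting their quiz and test scores into the fitted equation. The sketch below does this for four of the listed students, using the coefficients reported in the paper; `predict_final` is a hypothetical helper name, and the difference is taken as observed minus predicted.

```python
INTERCEPT, B_QUIZ, B_TEST = 18.298, 1.237, 1.042  # coefficients reported in the paper


def predict_final(quiz, test):
    """Predicted final grammar score from the fitted regression equation."""
    return INTERCEPT + B_QUIZ * quiz + B_TEST * test


# (student, quiz, test, observed) rows copied from the comparison table.
students = [
    (3, 14.00, 23.50, 59.50),
    (5, 23.00, 30.50, 80.00),
    (8, 17.00, 23.00, 65.00),
    (15, 23.00, 29.00, 77.00),
]
results = []
for student, quiz, test, observed in students:
    predicted = predict_final(quiz, test)
    results.append((student, round(predicted, 2), round(observed - predicted, 2)))
    print(student, round(predicted, 2), round(observed - predicted, 2))
```

For these rows the absolute differences stay below 2 marks, in line with the reading of the comparison given in the text.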
CONCLUSION
Using multiple linear regression, this study examined the extent to which the results of formative assessments
are able to predict students’ performance in the summative assessment. The study found a strong relationship
between the two assessment types, with quiz and test scores explaining 75.3% of the variance in final scores. The close alignment between actual and
predicted scores, with differences generally below two marks, highlights the reliability of the regression model
as a practical tool for forecasting academic outcomes. These findings are consistent with those outlined in
the literature and emphasise the importance of continuous assessment not only as a means of tracking student
progress but also as a reliable indicator of achievement in the summative assessment. The model is
a valuable framework for identifying underperformers, enabling educators to undertake remedial
steps with a degree of confidence. For curriculum developers, the study underscores the need to design
meaningful quizzes and tests that both facilitate learning and provide predictive insights into student
performance. Beyond classroom practice, the research contributes to the wider field of educational analytics by
demonstrating the applicability of regression models in language learning contexts. While literature exists on
similar studies carried out in subjects such as mathematics and general academic performance, this study adds
evidence from the subject of Advanced Grammar, a critical but less-explored domain in English language
education. In conclusion, this study shows that multiple regression is an effective method for predicting
grammar achievement, confirming the strong predictive value of continuous assessments and supporting
data-driven, student-centred teaching practices. The study, while acknowledging its limitations in scope, proposes
that similar studies be conducted in English language courses, particularly in the area of grammar, to enhance
the generalisability of the findings. Future research could incorporate additional independent variables such as
gender, entry-level proficiency, and socio-economic background, which would provide a more comprehensive
understanding of the factors influencing learning outcomes.
REFERENCES
1. Dagdagui, R. T. (2022). Predicting students’ academic performance using regression analysis. American
Journal of Educational Research, 10(11), 640-646. https://doi.org/10.12691/education-10-11-2
2. Darman, H., Musa, S., Ramasamy, R., & Rajeswari, R. (2019). Predicting students’ final grade in
mathematics module using multiple linear regression. International Journal of Recent Technology and
Engineering, 7(5), 331-335. https://www.ijrte.org/wp-content/uploads/papers/v7i5s/ES2162017519.pdf
3. Hanna, G. S., & Dettmer, P. A. (2004). Assessment for effective teaching: Using context-adaptive
planning. Boston, MA: Pearson A&B.
4. Jarantow, S. W., Pisors, E. D., & Chiu, M. L. (2023). Introduction to the use of linear and nonlinear
regression analysis in quantitative biological assays. Current Protocols, 3(6).
https://doi.org/10.1002/cpz1.801
5. Kasim, N., & Sukarno, S. (2024). The correlation between students’ anxiety and their speaking ability in
EFL classroom. International Journal of Multicultural and Multireligious Understanding, 11(10), 382.
https://doi.org/10.18415/ijmmu.v11i10.6258
6. Khaing, Y. M., & Cho, A. (2019). Forecasting academic performance using multiple linear regression.
International Journal of Trend in Scientific Research and Development, 3(5), 1011-1015.
https://www.ijtsrd.com/papers/ijtsrd26517.pdf
7. Madsen, R. S. (2020). The learning curve: Can the results of the grammar exam be predicted? Globe: A
Journal of Language, Culture and Communication, 11, 43-58.
https://journals.aau.dk/index.php/globe/article/view/6283/5537
8. Nety, N., & Purnomo, B. (2023). The correlation between students’ speaking anxiety and speaking ability
at SMA Negeri 4 Baubau. English Education Journal, 9(1), 28-36.
a87fe3062edc67bcf3d6918c7c92c9d3fd2a.pdf
9. Oflaz, A. (2019). The effects of anxiety, shyness and language learning strategies on speaking skills and
academic achievement. European Journal of Educational Research, 8(4), 999-1011.
https://doi.org/10.12973/eu-jer.8.4.999
10. Rodríguez Rincón, Y., Munárriz, A., & Magreñán Ruiz, A. (2024). A new approach to continuous
assessment: Moving from a stressful sum of grades to meaningful learning through self-reflection. Social
Sciences & Humanities Open, 10(1). https://doi.org/10.1016/j.ssaho.2024.100986
11. Shrestha, N. (2020). Detecting multicollinearity in regression analysis. American Journal of Applied
Mathematics and Statistics, 8(2), 39-42. https://doi.org/10.12691/ajams-8-2-1
12. Uyanık, G. K., & Güler, N. (2013). A study on multiple linear regression analysis. Procedia - Social and
Behavioral Sciences, 106, 234-240. https://doi.org/10.1016/j.sbspro.2013.12.027