Grammatical Ability Assessment Software for ESL Students with Limited English Proficiency
Mr. Thomas Eric C. Paulin, LPT, MAEd
Colegio de San Juan de Letran
DOI: https://dx.doi.org/10.47772/IJRISS.2024.803010S
Received: 14 March 2024; Accepted: 02 April 2024; Published: 01 May 2024
The birth of the K to 12 curriculum in the Philippine education system added English core subjects to Senior High School. However, these English subjects redirect their focus away from the extensive teaching of grammar. Despite recent trends toward teaching English through the communicative approach, various studies agree that grammar is still the foundation of second language acquisition. In order to determine the specific contents of a remediation program for the limited English SHS students of the locale, a grammatical ability assessment software was developed. Subscribing to the concept of the weighted average, the students’ frequency of errors in the different word classes served as the basis for the weight transmutation used in the software’s automated computation features. Among the selected 100 limited English students of Lyceum of Alabang Inc., it was found that their most common mistakes are on verbs, followed by adverbs. On average, the students also find difficulty in noun, adjective, preposition, and conjunction usage. Their least common errors are on pronouns. The values assigned in the automated checking, recording, and reporting features of the software are based on these results. These inputs directed the development of the software; bugs and other software issues were later fixed and finalized. After its development, the software was presented to the English faculty members of the locale, the students as prospective users, and IT specialists, who then provided their evaluation. Results show that the software earned a highly satisfactory mean score, with suggestions focusing on its adaptability to mobile devices. Likewise, the results on the students’ taxonomy of errors in grammar inform the recommended content of remediation programs in English.
Background of the Study
With the birth of the K to 12 curriculum in the Philippines, two more years were added to the framework of secondary education on top of the then four years of high school, paving the way for the design of senior high school (SHS). In these levels, more subjects are added to the curriculum to further equip Filipino students with authentic and significant skills. These subjects aim to serve specific functions conforming to the 21st century skills necessary for today’s learners. Subjects are categorized as core curriculum subjects, specialized subjects, and applied track subjects. English subjects in this curriculum are mostly core curriculum subjects; however, they move away from the customary teaching of English, which places extensive focus on grammar rules and sentence construction, including grammatical functions meant to be memorized and long lists of vocabulary learned by heart (Renau, 2016).
It is firmly postulated that language teaching to English language learners (ELLs) should promote a communicative approach where grammar rules are set aside. Recent trends in communicative language teaching suggest that many language teachers no longer see learning grammar as necessary when learning English (Ly, 2020). However, Saaristo (2015) suggested that the notion of communication without learning grammar stems from a perceived contrast between lexis and grammar, in which grammar is given a lesser communicative and interactional role. Saaristo further forwarded that lexis and grammar should share communicative responsibilities and should not be separated.
When teaching ELLs, English teachers tend simply to make the learners memorize grammar rules. Eventually, the learners commit mistakes in writing, such as sentence fragments, incorrect punctuation, confusing modifiers, or collocation errors. Still, there are aspects of grammar that students need to apply when writing in English. Hence, English teachers need to reconsider the role of grammar in writing in English, including rethinking approaches through which grammar can be taught effectively (Phuwarat & Boonchukusol, 2020). A paradigm review is therefore a challenge for teachers handling English subjects in senior high school, specifically in Oral Communication, Reading and Writing, English for Academic Purposes, Creative Nonfiction, and even in 21st Century Literature from the Philippines and the World. The same limitations are likewise met in other subjects that require proficiency in English, such as Practical Research and other specialized courses.
Lin (2008) states that grammar “help(s) students discover the nature of language, i.e., that language consists of predictable patterns that make what we say, read, hear, and write intelligibly” (p. 3). This magnifies the role grammar teaching and learning play in acquiring language mastery. In line with this, teachers can explain abstract grammatical terminology to help students write and read with better proficiency and confidence. Recognizing the labels assigned to each grammatical function therefore helps learners use these functions in the right place and at the right time.
According to Zhang (2009), grammar as the foreground of second language teaching is indispensable since grammar and vocabulary knowledge form the base of the English language. Without them, the communicative goal, which is the goal of learners studying English, will not be attained. Grammar, then, is still necessary to achieve such goals.
In line with the endeavor to reinsert grammar teaching into the K to 12 curriculum is the need to assess and evaluate it. The measurement of students’ grammatical ability has always been a challenge altogether since it does not directly provide quantifiable data. In his educational video in 2019, Northbrook directly stated that measuring one’s grammatical ability in English is not advisable. In fact, Northbrook highly suggested that measuring proficiency should be disregarded; instead, the focus should be on the process of improvement. This is because, although there are ways to measure proficiency, it is still “notoriously difficult” to do so.
Meanwhile, language testing has tended toward separate large-scale proficiency testing for admission or certification purposes (Durairajan, 2019). Formative assessments and other alternative assessment practices are seen as more relevant than summative or final exams. Durairajan further discussed that tests and assessment tools in general are both integral and beneficial to teaching and learning. The specific trend now is the shift from systematic standardized testing to teacher-created or contextualized tests.
Sims (2015) further emphasized that although there are commercially available proficiency exams on the market, most of them, if not all, are costly, time-consuming to conduct, and inappropriate to the setting of local language programs. That is why most schools have been developing their own language proficiency examinations to meet their specific needs and contexts.
Evangelisti (2020) emphasized that mastering English grammar is essential for everyone who wants to speak and write effectively in English. He added that grammar practice tests help achieve this goal since they help one understand word classes distinctively, use punctuation correctly, avoid grammar errors consistently, and write cohesively. On the same site, he mentioned that there are tons of standardized tests in reading, English, and grammar.
For instance, British-Study.com (2021) has developed a standardized English grammar test for takers to gauge how good their English is. It is a short 40-item multiple-choice test about English grammar. The test’s ideal administration time is ten to fifteen minutes, and the correct answers are provided at the end. However, the site where the test can be accessed has put up a disclaimer that it is not an accurate placement test.
A similar 40-item multiple-choice test was also developed by Oxford Online English (n.d.). In random order, the test covers all levels from elementary to advanced. Likewise, the correct answers are revealed at the end of the test.
In addition, tests.com (2022) has made available a list of varied resources for grammar tests. “Grammar testing is a basic tool to help teachers evaluate a student’s command of the English language,” according to the site. These tests may either stand alone or form part of other language tests. The basic grammar test is similarly a multiple-choice test in which the answers to fill-in-the-blank questions are selected. Just like the previously cited grammar tests, a score is presented upon completion based on the number of items the test taker answered correctly.
Another multiple-item grammar test was developed by Grammaring (2022). It has the same structure as the other grammar tests, only this time with 50 items. The test covers grammar topics such as modals, verb tenses, conditionals, reported speech, passive voice, gerunds, infinitives, clauses, articles, types of nouns, and other facets of English.
The questions are arranged from easiest to most difficult. Like the first test mentioned, this free online grammar test carries a statement that it is not an accurate indication of one’s English grammar level and is not a basis for placement.
Taking these online English grammar tests into consideration, it is evident that most, if not all, of them are merely form-focused assessments and do not openly measure one’s English grammatical ability. They provide test scores according to the number of items answered correctly, but these scores are not indicative of how well learners can use the language in written or spoken discourse.
Lee, a senior marketing writer for TurnItIn.com, wrote in 2022 that multiple-choice examinations do not assess learning accurately. For instance, if a student can narrow the choices down to two possible answers without full understanding of the concepts, there is a huge possibility that the correct answer will be picked by chance. Lee further adds that multiple-choice tests rely only on recall and not on higher-order thinking skills; students do not necessarily display understanding of concepts when selecting a response. Nonetheless, the depth of questioning in multiple-choice test types may be improved by deepening both the stem and the distractors. This may likewise be done by targeting the higher levels of the revised Bloom’s taxonomy.
A study by Gakis et al. (2020) aimed to explore the standardization of templates for grammar errors to check the respondents’ grammatical ability in the Greek language. This led to the development of an electronic tool for recognizing and correcting grammatical errors. The study likewise explored its authentic applications in the classroom setting, such as how it contributes to the teaching of the mother tongue. As a result, it was found that, upon categorization, errors mostly fall under grammar and mechanics. Nevertheless, the electronic tool found it difficult to register other categories of errors, such as those on form and semantics.
Koizumi et al. (2011), on the other hand, developed a grammar test called EDiT Grammar (English Diagnostic Test of Grammar). This diagnostic test is for Japanese learners of English. It aims to determine its test takers’ basic knowledge of many aspects of grammar, with a focus on noun phrases. Unlike other multiple-choice grammar test tools in which choices are limited to a word or two, EDiT Grammar’s choices are complete noun phrases. A “stem” is provided in each question: a complete sentence written in Japanese. The noun phrases in the choices determine whether the test taker knows which among them is the grammatically correct English translation of the stem. Instead of making the learners write their own English sentences, this grammar test focuses merely on text translation in acquiring the second language (L2).
Although this research is focused on the development of a grammar testing tool, a teaching tool is also seen as an integral basis for the undertakings of this study. Mantasiah, Yusri, and Jufri’s (2018) grammar teaching tool follows a procedure in which grammar errors are analyzed and considered. The students’ test scores in translation and writing serve as a basis for the development of the teaching material. Despite being very different from the intended grammatical ability test tool, it is anchored on a similar idea: that the developmental approach be aligned with error and contrastive analysis. This is to determine the respondents’ actual weaknesses among the various aspects of grammar.
A similar grammar testing tool was developed by Liou (1991). Liou created a computer grammar checker that likewise devotes its primary function to error analysis of writing samples. The errors found in the students’ outputs are categorized into 14 main types and 93 subtypes; mainly, the types used in the thematic analysis are the word classes and their facets. Similarly, the categories of errors are ranked into a taxonomy of mistakes according to their frequency of occurrence and comprehensibility. To do so, a built-in dictionary with 1,402 word stems and other features was installed into the program. Grammar rules and patterns were also encoded into the program to recognize a wider scope of possible errors. Even with this technique of inputting an electronic dictionary and other error-detecting features, a total of seven error types went unrecognized by the computer grammar checker. Aside from its outdated development, an automated strategy of error checking leaves room for limitations.
Therefore, this study necessitates the formulation of a grammatical ability assessment software to assess the current level and achievements of ESL students with limited English proficiency, specifically students in senior high school. The same endeavor shall be utilized as a diagnostic tool to help in the school’s remediation and placement programs. With the birth of modern processes and the advances in information and communication technology, a software to achieve the stated purposes is developed through various data gathering procedures and analyses, as well as a careful review of the current needs and context of its predetermined respondents.
The development of the grammatical ability assessment software considers the variables resulting from the first stage of this study’s data gathering procedures. It first seeks to determine the grammar facets that students find problematic. Specifically, the research will identify what word classes students misuse, as well as the frequency of these faults.
Simultaneous with the identification of these data is the formulation of how grammatical ability is to be computed using a table of specifications. This method contextualizes the grammatical ability assessment software to the current level of learners with limited English language proficiency, specifically in the research locale. Once the respondents’ grammatical ability scores are computed and analyzed, these statistics may be put to many different practical purposes.
The data may be used as a starting point for English teachers to continue their language development using various approaches to language teaching. Contextualized remediation programs may likewise be developed based on the recommendations of this study. Moreover, the results may also be used as a basis for students’ college entrance tests and job applications, specifically those in line with language use. Finally, the results may be used as an assessment of a school’s performance in language teaching.
Grammatical ability
On his website, Richards (2022) responded to a query about terminologies in grammar. The question asked for a clear distinction among the terms grammatical knowledge, grammatical ability, grammatical competence, and communicative competence. Richards answered that grammatical knowledge and grammatical competence likely refer to the same thing. However, he added that grammatical ability refers to one’s knowledge of how grammar is used in communication. While grammatical competence is merely knowledge of lexis, morphology, semantics, and syntax, the ability to execute this knowledge in actual conversation is the focus of grammatical ability.
This is supported by Purpura (2004). In the book “Assessing Grammar,” Purpura clearly separated grammatical ability from grammatical knowledge, which covers only grammatical forms and their semantic and pragmatic meanings. Grammatical ability, on the other hand, covers topical knowledge, socio-cognitive abilities, and personal attributes. It is the combination of grammatical knowledge and strategic competence; it is specifically defined as the capacity to realize grammatical knowledge accurately and meaningfully in testing or other language-use situations. These two lead to the actual use of grammar, which includes listening, reading, speaking, and writing abilities.
Neumann (2014) further sustained that grammatical ability encompasses the learners’ ability to use and apply their “theoretical grammatical knowledge accurately and meaningfully in language use situations.” Neumann added that, when gauging one’s grammatical ability, most studies rely on the operational use of grammatical accuracy in conjunction with complexity to make language more meaningful and comprehensible. Grammatical ability may thus be further categorized into the control of grammar (accuracy) and the range of linguistic forms (complexity).
This clarity on the differences among the terms directs the focus of this study to the grammatical ability of the students. Considering their grade level, SHS students have already gone through extensive grammar instruction. Thus, their grammatical ability will serve as the parameter for the conduct of this research.
Form-focused vs. meaning-focused language learning
Stepbystep.com (2022) emphasized that focus on form is still relevant when learning a second language. However, it does not elaborate on meaning-centered instruction, which originally puts emphasis on the negotiation of meaning. The author of the article disagreed with De la Fuente and posited that language learning is not taught through an explicit form-based approach; meaning is the basis of learning new words and concepts. This opposes the conclusion of De la Fuente’s study, which states that the term form should not be limited to grammar points but should include all aspects of second language learning. Therefore, what that study lacks is a clear line separating the form-based from the meaning-based.
Szopa (2013) mentioned in her study Clark’s (2008) statement that form-focused and meaning-focused instruction exhibit no such difference, and that neither is superior to the other. When a differentiated language instructional approach was utilized in the classroom, it was found that there was no significant difference in student performance. Despite language use being the goal of language learning, learners should not lack formal rules and grammatical structures. It is also mentioned that when learners are not aware of their mistakes in form, they are unable to make self-corrections.
The need to teach and measure grammatical ability
In light of contemporary educational frameworks, specifically in the teaching of English, the trend has been to focus more on the communicative approach. Various theories in ESL teaching and learning neglect the role of grammar teaching in the holistic development of language among learners. Also, in line with popular belief, people who are proficient in a language must not only be proficient in the four macro skills (speaking, listening, reading, and writing) but must also possess grammatical competence. Rutherford (1988), as mentioned in the study of Mohamed et al. (2015), posited that grammar is important in language learning and acquisition. The same study adds that Savage et al. (2002) emphasized that effective communication cannot be attained without grammatical competence.
Ahangari and Barghi (2012) claimed that the most intricate component of linguistic competence is still knowledge of grammar. However, it is argued in this paper that language test takers reject the significance of linguistic competence and instead base their competence on the current language that they use. Despite the explicit focus on the use of language based on the learner’s context, it is still undeniable that grammar is the foundation of language acquisition.
In the same paper, it is highlighted that grammar remains central to language description and is still validated as a significant factor in testing and measuring proficiency. It is observed that language learners perform better in language tests when they realize the importance of grammar knowledge, which leads them to practice language use out of context. Richards and Renandya (2002) added that “students do not learn English: they learn grammar at the expense of other things that matter as much or more.”
Nonetheless, other literature still pushes for the practical use of language, though anchored on grammatical competence. The term “grammaring” was forwarded by Larsen-Freeman (2009). The addition of “-ing” to grammar signals the active application of grammar knowledge in practice. To realize this endeavor, understanding grammatical structures is not enough; language learners must also find meaningful ways to employ grammar. This also means that whatever students learn in terms of grammar must be transferred to actual situations and contexts outside the classroom. Therefore, although there is a call to put language actively into practice, grammatical competence remains significant as the base for language use.
As stated in the book by Hyltenstam and Pienemann (1985), however, grammar knowledge is differentiated from proficiency. Proficiency in a language was earlier defined as entailing knowledge of grammar rules and vocabulary, which can be measured through the learner’s grammatical and lexical knowledge. Grammatical knowledge includes recalling and even applying grammar rules, while proficiency is being able to utilize that knowledge in actual language performance. In addition, testing language knowledge and its practical proficiency depends on the environment in which the language has been learned or acquired.
Taking note of this evidence that grammatical knowledge remains a significant part of the foundation of language proficiency, it is imperative to teach grammar and include it in the assessment of language acquisition. Furthermore, although grammar knowledge may be separated explicitly from language proficiency, grammatical competency is still present and measurable in language use outputs, applications, and utilizations.
Common errors in grammar
Having unconscious knowledge of grammar may be sufficient, even for simple language use. Nonetheless, Debata (2013) posited that people who wish to communicate in an “artistic manner with well-defined structures” must opt to dig into the greater depth of understanding and proficiency offered by the study of grammar.
The inclusion of grammar topics in SHS subjects is an essential part of language teaching and learning. However, there is a need first to assess the grammatical ability of the learners and identify the parts of speech in English grammar where they need to improve. In 2020, amigosporvida.com wrote that knowledge of word meaning, sentence structure, and the parts of speech plays a great role in comprehension and writing. Although pronunciation has a great effect on fluency, it still does not ensure comprehension. This research consequently aims to identify the word classes or parts of speech that language speakers commonly misuse.
Most of the common errors in English cited in various studies concern the misuse of two or more words that are similar in nature. For instance, Global Exam (2020) listed 30 common errors and confusions in English. Some of the confusions included in the list are the misuse of “a while” and “awhile”, “advice” and “advise”, “a lot” and “alot”, “among” and “between”, apostrophes, “assure”, “ensure”, and “insure”, and many more. Although this is true for most language users and learners, the errors gauged here do not focus merely on “grammar”. Instead, the confusions emphasized are the interchange of two or more “similar” words. Thus, lapses in vocabulary, not grammatical skill, are the issue here.
The University of Waterloo (2010) has ventured into a similar approach to listing grammar errors. Although it does not fully embrace the idea of word usage confusions, its list mainly features errors in the language’s form. Some common errors included are sentence construction, use of punctuation, rules for abbreviations and acronyms, redundancy, parallelism, and many more. Errors in the use of verbs are also highlighted here, specifically in tense and structure. Nonetheless, the list likewise does not focus on grammar itself but on its practical use and the purpose it serves.
Perhaps a more grammar-centered approach to identifying errors in English focuses on the mistakes committed in the different parts of speech or word classes. According to the English Language Centre (n.d.), there is often confusion among noun, adjective, adverb, and verb forms. Although nouns define or name something and verbs express actions, their forms and structures are sometimes shared, as with “record” and “present”, and language users often interchange these word forms. Relatedly, language learners and users, especially in the Philippine setting, commonly commit errors in subject-verb agreement; in this case, the error lies in the usage of verbs, most particularly when the sentence is in the present tense. Similarly, nouns based on verbs can end in either ‘-ing’ (gerund) or another ending such as ‘-tion’ or ‘-ment’. If there is an object in the sentence, the gerund form is usually correct.
Although the use of pronouns looks simple, for instance the difference between he and she, the difficulty lies in the difference between subject pronouns and object pronouns. The use of personal and possessive pronouns likewise often causes difficulty.
The structures of nouns and adjectives (descriptive words) are often misused in English grammar; adjectives are used in sentences where nouns are supposed to be used. The correct use of verb tenses and their various structures, subject-verb agreement, and pronoun usage are also seen as common error areas among English writers (Unacademy.com, 2022).
Most of the time, verbs are used as state words instead of action words. The verb then changes its form to function as an adjective, which happens through the use of the verb’s past participle form.
Taking these into consideration, this study will identify the common errors in parts of speech that the respondents often commit. This will serve as the basis for the development of the grammatical ability measurement software tool.
Available grammar proficiency tests
Although this research aims to produce what is basically a grammar proficiency test, it cannot be denied that various grammar tests already exist. These tests have also been proven effective and functional in gauging learners’ levels of grammatical ability. Perhaps the distinction of this particular testing tool is that it aims to be more grammar-centered and more automated with the aid of technology. It also aims to feature a systematic approach to computing the proficiency level instead of just basing it on raw test scores compared against the test’s highest possible score. Nonetheless, the various available grammatical ability tests will be cited as points of comparison and reference.
Ambridge (2012) focused language assessment on production and comprehension paradigms. The approach to language proficiency testing is more experimental and invites language learners to use and produce language in a communicative and practical environment. For instance, it is observed that learners can duplicate and/or generalize the use of verb forms in situations outside the controlled environment. On the other hand, the repetition or elicited imitation paradigm tasks the learners with duplicating a sentence that is either simple or complex. It is observed that errors in the reproduction of sentences, even simple ones, are still committed.
This results from the finding that when learners are asked to duplicate a sentence, they do not copy the words verbatim. Instead, they decode the message and meaning of the sentence and reconstruct it during the reproduction task. The other paradigms explored in that study likewise focus on language production, wherein grammatical ability, specifically on the word classes, cannot be systematically measured.
Kitao (1996), meanwhile, offered a list of test types fitted for the assessment and measurement of grammar skills. The study first emphasizes that although testing grammar does not cover the measurement of language usage, it succeeds in testing the production of correct grammar. While multiple-choice tests ask learners to choose the correct use of grammar from among the choices, error correction tests are of the same nature but provide a sentence with a grammatical error to be corrected. The same type of test is also utilized in word/sentence order assessments, wherein test takers are asked to choose the correct order of words to finish an incomplete sentence. Similarly, completion tests observe whether the learners can correctly finish a sentence based on their grammar knowledge. Other tests aimed at measuring grammatical ability include transformation items, word changing items, and sentence combining exercises.
The rest of the grammar tests available in print and online are mostly multiple-choice tests. They tend to invite learners to correct errors, choose the correct answer, and complete sentences. Therefore, this research aims to produce a contemporary approach to grammatical ability testing. Instead of being provided with bits of words and sentences to correct, the learners will write their own paragraph. The students’ grammatical ability rate and level will then be based on the errors committed in the word classes used by the test takers in their essay.
Principles of assessment in grammatical ability
Language Testing International (2022) defined language testing as a broad term for testing and assessing a person’s ability to communicate and understand in a particular language. This is done for a variety of purposes. In the school setup, however, language testing is used to assess students’ current abilities and progress for academic placement. General forms of language testing include aptitude tests, diagnostic tests, placement tests, achievement tests, and proficiency tests. Proficiency refers to a person’s competency in using a particular skill, and language proficiency tests assess a person’s practical language skills. Thus, this study focuses on students’ proficiency in English as measured through the software tool.
Assessing language capabilities may be done through traditional testing, but this may not elicit authentic and reliable results. According to Durairajan (2019), language capability assessment may be done through successive assignments, projects, and term papers wherein students can really think about what they need to write and do, and have time to revise their responses on their own. In the traditional assessment of language testing, it is assumed that grading is always involved. Durairajan further adds that in language testing, teachers must move from simply evaluating students’ work to experimenting with different types of testing practices, which advocates alternative assessment practices as well. Hence, this study also offers an alternative test of language proficiency that focuses on authentic outputs rather than the traditional mode of language testing.
Chandio and Jafferi (2016) defined alternative assessments as various modes of assessment which may be employed for both formative and summative purposes. When a student does badly in a formal test, it does not mean that learning is not achieved; thus, there are alternative ways and means to assess language proficiency. This study explores such non-traditional ways of testing language proficiency. The measurement software tool aims to provide a platform in which students can fully express their language usage and understanding.
Developing and validating software tools for assessing grammatical ability
Although computers have particularly excelled in the fields of business and communication technology, teachers nowadays also find use for such technology in teaching, owing to its huge capabilities and efficiency in the field of education. Moreover, computers are becoming cheaper while also becoming more adaptable and easier to handle. A study by Abu Naba’h in 2012 investigated the impact of computer-assisted grammar teaching on the academic performance of pupils learning English as a foreign language. The pupils were grouped such that the experimental group was introduced to the computer-assisted grammar software. It was then found that the academic performance of the experimental group was significantly higher than that of the control group, as exposure to the technology as an aid in teaching a foreign language was seen to be effective.
Contrary to that, the paper of Guerrero et al. (2010) posited that computer-mediated communication mechanisms can be detrimental to students’ acquisition of a second language; using tools like chat systems does nothing to improve the grammar skills of students. However, the study developed a software tool that aims to assist collaborative work in English. The tool intends to aid the teacher in creating, implementing, and monitoring activities in English. It automatically corrects outputs and even provides statistical reports on students’ performance.
Thus, while computer-assisted mechanisms may be seen as irrelevant to teaching grammar, they can be very helpful and beneficial in assessing performance in English.
Despite these observations, it is well known that the properties of software systems frequently change and can be adapted to the specific needs of users. A software developer must construct a system that can easily be changed and modified according to the set needs. As such, a grammatical approach was employed in the development of a software tool in the study of Kosar et al. (2004). The different attributes of grammar were utilized to support the software specification of their tool, and the relationships of domain concepts were identified as a context-free grammar; that is, the sole focus of this software tool is the assessment of grammatical skills. This approach is based on the Grammar Oriented Object Design (GOOD) model, wherein the software tool is constructed from the development of use cases and a class diagram. These classifications are input into the program, making GOOD an efficient tool to assess grammar.
Specifically, Boguraev et al. (1988) forwarded that syntactic theory can now be facilitated through computational tools, despite the finding that a huge part of grammar does not necessarily have to rely on the development of programs. If the objective of the tool is to process natural language, modifications to the nature of the tool have to be executed. This supports the previous literature stating that software tool development entails flexibility and adjustability.
During the time of their study, however, upon comparing numerous software systems for grammar development, it was found that they were not yet sufficient to cover a wide range of grammar features.
Such development of new software tools, meanwhile, may be validated following the process described by MD101 Consulting (2016). It is stated that quality is the primary aspect that should be observed, including the software’s document management of records, its workflow, and its lifecycle activities. From a more technical aspect, the generation of errors should likewise be examined, and the workflow tools should be checked to assess whether they would put users at risk. Although software validation is most rigorously applied to medical usage, O’Donnell (2022) generally defines software validation as the process of collecting empirical evidence that ensures the correct development and installation of computer systems. It also assesses whether a software tool meets the needs of its users and functions according to its intended usage.
O’Donnell further explained that software validation shall include the recording of evidence to prove that the software tool satisfies the specifications and attributes needed for its purpose, including the planning of how it is to be used. Ultimately, software tools should not only work according to their users’ commands but must also be examined for other aspects such as negative impacts and possible misuse.
Thus, the 4Q lifecycle model is forwarded, which includes design qualification, installation qualification, operational qualification, and performance qualification. Each qualification targets how the software tool would function effectively and efficiently in all aspects before, during, and after its intended use. If a software tool passes all qualifications, the validation is successful.
Although the foregoing literature pertains to the validation of software tools used in the medical field, the process is still deemed appropriate for software validation regardless of purpose. The instruments available for assessing a software’s efficacy are adjustable according to the set needs of the developers and users. Ultimately, the goal of software validation is to ensure usability and safety.
Theoretical Framework
According to Mao (2022), Bernard Spolsky in 1978 distinguished three historical periods of modern language testing: the pre-scientific period, the psychometric-structuralist period, and the integrative-sociolinguistic period. The latter, also known as the psycholinguistic-sociolinguistic testing period (1960s-present), proposed integrative language testing.
This kind of testing puts an emphasis on communication, authenticity, and context. John Oller in 1979 made such tests easier to score, following criticism of their exclusive measurement of knowledge. Oller proposed the unitary competence hypothesis, which meant that tests would pay attention to one’s general language proficiency. It primarily proposed the use of cloze tests and dictation to measure reading comprehension, grammar, vocabulary, as well as language background knowledge.
Magrath in 2015 further posited that designing activities is better done when various language skills are integrated in a realistic fashion. A variety of sources were utilized when constructing integrative tests, including syntax, vocabulary, “schema”, cultural awareness, reading skills, pronunciation, and grammar. These were all factors that had to be kept in mind by both test makers and test takers. “The integrative test is generally considered to be a more reliable instrument for measuring language competence.”
Earlier, in 1965, Noam Chomsky proposed the concepts of linguistic competence (view of language) and linguistic performance (language use). In 1972, Hymes supported this theory by forwarding the theory of communicative competence, which produced significant relevance to language teaching and testing. It put an emphasis on the idea that knowing a language goes beyond knowing grammar rules; there were culture-specific rules and other features of communicative contexts that had to be considered.
On the aspect of promoting computer-aided instruction, the Computer Assisted Language Learning (CALL) model serves as an interactive framework for the utilization of software tools in teaching and assessing language learning. Kumar and Sreehari (2011) describe the CALL model as a method of instruction that helps learners achieve their educational goals at their own pace and capability through computer technologies. In this model, computer technology is used to teach and assess all stages of language learning, including feedback. Furthermore, CALL is useful in collaborative and cooperative learning and is likewise ideal for carrying out practice drills. Thus, the assessment of grammatical ability through error identification can be aided greatly by following this method.
Lastly, the basis for the assignment of the software’s transmutation weights is the concept of the weighted average, which takes into account the varying degrees of importance of the numbers in a data set. Ganti (2023) explains this as the assigning of weights that determine the relative value of each data point in advance. The weighted average is calculated and utilized to neutralize the frequency of each item in a particular data set. To obtain it, each item in the data set is multiplied by its assigned weight, the products are summed, and the sum is divided by the total of the weights. The results reflect the relative significance of each observation, making the average more descriptive. Weighted averaging can also make the data treatment more accurate.
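To make the computation concrete, the following is a minimal C# sketch of the weighted average described above, in the spirit of the software’s automated computation features. The word classes, error counts, and weights shown are hypothetical placeholders; the study derives its actual weights from the respondents’ observed error frequencies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class WeightedAverageDemo
{
    static void Main()
    {
        // Hypothetical error counts per word class and their assigned weights;
        // the study's actual weights come from the observed error frequencies.
        var errorCounts = new Dictionary<string, int>
        {
            ["Verb"] = 12, ["Adverb"] = 9, ["Noun"] = 6, ["Pronoun"] = 2
        };
        var weights = new Dictionary<string, double>
        {
            ["Verb"] = 0.35, ["Adverb"] = 0.30, ["Noun"] = 0.20, ["Pronoun"] = 0.15
        };

        // Weighted average: multiply each count by its weight, sum the
        // products, then divide by the total of the weights.
        double weightedSum = errorCounts.Sum(kv => kv.Value * weights[kv.Key]);
        double totalWeight = weights.Values.Sum();
        double weightedAverage = weightedSum / totalWeight;

        Console.WriteLine($"Weighted average error score: {weightedAverage:F2}");
    }
}
```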
Conceptual Framework
The conceptual framework of this study highlights the design, development, and test validation of the grammatical ability assessment software. It commences with the identification of the students’ frequency of errors, which determines the weight distribution to be assigned to each word class. Once these data become available, the software will be developed with grammatical ability test items validated by experts in ESL. The test items will be classified into the different levels of the revised Bloom’s taxonomy in the form of a table of specifications. Consequently, both the researcher and the IT collaborator will look into bugs and issues exhibited by the first version of the software. Once fixed, finalized, and test validated, the software tool undergoes evaluation by three significant groups: English teachers as the administrators-to-be of the software, SHS students as actual users and takers of the grammatical ability assessment, and IT specialists as experts on software development and validation.
Figure 1. Conceptual Framework of the Study
Research Objectives
This research aims to develop a grammatical ability assessment software for ESL students with limited English proficiency. Specifically, this research seeks to address the following objectives:
To determine the grammatical features that the students frequently misuse in a grammar test as the basis for the development of the taxonomy of errors.
Scope and Limitations
Grammatical ability deals with the capability of the students to utilize and understand the English language effectively. This entails the notion of having minimal errors in written outputs, particularly in the parts of speech. Students’ mastery of the language will be measured by comparing their errors in the parts of speech against the total number of words in their written outputs.
However, despite the parts of speech having many different subcategories, interjections (expressions of strong emotions, informal remarks, and/or abrupt exclamations) will not be included in this study.
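As an illustration of the scope just described, here is a minimal C# sketch of how errors in the covered parts of speech might be compared against the total word count, with interjections excluded. All names and figures are illustrative assumptions, not the study’s actual implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class ErrorRateDemo
{
    // Interjections are outside the scope of the study, so they are
    // excluded from the per-class error tally.
    static readonly string[] CoveredClasses =
        { "Noun", "Pronoun", "Adjective", "Verb", "Adverb", "Conjunction", "Preposition" };

    // Error rate = errors committed in the covered word classes
    // relative to the total number of words in the written output.
    static double ErrorRate(Dictionary<string, int> errorsByClass, int totalWords)
    {
        int coveredErrors = CoveredClasses.Sum(c => errorsByClass.GetValueOrDefault(c));
        return totalWords == 0 ? 0 : (double)coveredErrors / totalWords;
    }

    static void Main()
    {
        var errors = new Dictionary<string, int>
        {
            ["Verb"] = 5, ["Adverb"] = 3, ["Interjection"] = 2 // interjections ignored
        };
        Console.WriteLine($"Error rate: {ErrorRate(errors, 120):P1}");
    }
}
```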
When it comes to the features of the software, it highlights automated checking through the teacher’s answer key input. This feature prevents the researcher from developing a grammatical ability assessment that requires test takers to input their own sentences or expressions, since open-ended test items would require numerous, if not indefinite, possible correct answers for consideration. And so, this intent had to be sacrificed. Instead, the test items developed are ensured to possess a high-level approach as guided by the higher-order thinking skills in the revised Bloom’s taxonomy. The software will only be developed for offline purposes, eliminating the production of an online version of the research project, which would serve more purposes but would invite more labor for the IT collaborator.
Grade 11 senior high school students are the subjects of this study, to whom the grammatical ability assessment software tool would be most useful for various academic purposes. The tool will be used for their diagnosis, which will result in the recommendation of a remediation program in English. Lyceum of Alabang Inc. is the locale in which the study will be conducted. The respondents will be selected only from the bona fide students of the school who are currently enrolled in the academic year 2022-2023.
Lyceum of Alabang, recently renamed Lyceum of Alabang Inc., is an educational institution located at Km. 30 National Road, Brgy. Tunasan, Muntinlupa City. It caters to students from nursery to the tertiary level. The school is also where the researcher is currently employed as a senior high school faculty member, thus making it a convenient and operative locale for the conduct of this study. Founded in October 2003, Lyceum of Alabang Inc. started as a premier institution of the south dedicated to identifying, preserving, and promoting public awareness in Science and Technology.
Originally located in Putatan, Lyceum of Alabang Inc. transferred to its new building in Tunasan in 2012. It became the first private higher educational institution (PHEI) in Muntinlupa to be certified under the ISO 9001:2015 Quality Management System. In senior high school, it offers two tracks (Academic and TVL) and seven strands (ABM, GAS, HE, HUMSS, ICT, IA, and STEM).
Definition of Terms
This study utilizes terminologies which may be either familiar or unfamiliar to readers. This section provides a list of terms frequently used in this research, together with the conceptual and operational definitions of each, to help readers become more familiar with the discussion incorporated in this paper.
ESL. English as a second language. This refers to the use of English for non-native speakers like Filipinos who are only using the language as secondary to their first language or mother tongue.
ELL. English language learners. In this study, they are also the respondents.
Grammatical ability. The combination of grammatical knowledge and strategic competence; it is specifically defined as the capacity to realize grammatical knowledge accurately and meaningfully in testing or other language-use situations.
Lyceum of Alabang (LOA). An educational institution located in Muntinlupa. It serves as the locale for the conduct of this research. Particularly, data gathering procedures focus on its senior high school students.
Measurement. It refers to the process by which the attributes or dimensions of some physical object are determined. When used in the context of learning, it refers to applying a standard scale or measuring device to an object, series of objects, events, or conditions, according to practices accepted by those who are skilled in the use of the device or scale (Kumar and Kumar Rout, 2016).
Proficiency. It is generally recognized that the concept of proficiency in a second or foreign language comprises the aspects of being able to do something with the language (‘knowing how’) as well as knowing about it (‘knowing what’). Accordingly, language proficiency encompasses a language learner’s or user’s communicative abilities, knowledge systems, and skills (Harsch, 2017).
Software tool. Computer software, or just software, is a collection of computer programs and related data that provide the instructions for telling a computer what to do and how to do it. It is any set of instructions that guides the hardware and tells it how to accomplish each task (CSCA0101 Computing Basics, 2019).
Word classes. This is the contemporary term used in replacement of parts of speech and will be utilized in this paper. In English grammar, there are eight parts of speech: nouns, pronouns, adjectives, verbs, adverbs, conjunctions, prepositions, and interjections. The articles “a”, “an”, and “the” fall under the category of adjectives.
Research Design
In order for the development of the software to have a dynamic structure, the Kemp Model is utilized to develop and validate the assessment software of this study. Also known as the “Morrison, Ross, and Kemp Model”, it offers insight into the potential advantages of utilizing such a framework. Kurt (2016) describes the Kemp Model as an innovative approach to instructional design with interconnected components.
Contrary to the ADDIE Model, the Kemp Model is an interactive cycle that involves planning, revision, formative evaluation, and project management. Consequently, these four parts of the process further involve intertwined micro steps. The development of the software considers instructional problems, learner characteristics, tasks, instructional objectives, content sequencing, instructional strategies, message design, pedagogies, and evaluation instruments, which are not examined one after another.
They all work simultaneously before, during, and after the development of the software. Thus, such a model is appropriate for software targeted at learning assessment, particularly the grammatical ability assessment tool forwarded by this study.
Figure 2. Kemp Model as Utilized in the Study
As per the structural design of the Kemp Model, the initial phase of the software’s development will ensure that instructional problems, learner characteristics, message design, evaluation instruments, and instructional delivery are accounted for. These will all be determined through the initial grammar testing aimed at identifying the grammatical features that the students frequently misuse. This step will simultaneously provide information on the students’ needs as well as the difficulties they encounter in computer-assisted instruction and assessment. Consequently, the content sequencing, task analysis, and instructional objectives will materialize out of this data gathering procedure. These will all lead to the commencement of the actual writing and development of the grammatical ability assessment software. It will stem from a planning phase wherein the researcher will lay out the development procedures to the IT collaborator, who will also provide the researcher with a user guide to serve as the tool’s support service. The accomplishment of the initial version of the software will lead to the conduct of formative and summative assessments. The results of the tool evaluation will give the researcher and the IT collaborator grounds for its revision and further improvement. The evaluation process will then be conducted among the teachers and students as well as other IT specialists as software validators. Their assessment of the software will focus on its usability, maintainability, and sustainability.
The measurement of grammatical ability is a very broad undertaking. In various related literature cited in this chapter, grammar errors that may be counted include the misuse of words due to their similarities in spelling or construction. Other forms of grammar testing only observe whether language learners can continue, reproduce, duplicate, or reconstruct language evidence. In this particular study, the respondents will be tasked with distinguishing the grammatically correct expression among a set of similar sentences to fully gauge their ability to use and execute their grammar knowledge. In terms of grammar mechanics, each respondent must also observe the correct use of punctuation marks, capitalization, pluralization, and spelling. Errors in mechanics may affect the correctness of word class usage across different sentences; therefore, errors against these rules will be considered “mistakes” and will be counted in the grammatical ability measurement software. For instance, if a respondent writes a proper noun in lowercase, that particular word will be counted as a noun usage error. Errors committed in the use of punctuation marks will also be counted even though punctuation is not part of the word classes, because the correct distinctions among commas, periods, question marks, and even exclamation points must at least be intrinsic to the grammatical knowledge of ESL students.
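The following is a small C# sketch of the counting rule just described, where a mechanics violation such as a lowercase proper noun is tallied as a noun usage error. The rule shown is a simplified illustration of the design, not the software’s actual code.

```csharp
using System;

class MechanicsRuleDemo
{
    // Illustrative rule from the design above: a proper noun written in
    // lowercase is counted as a noun usage error, so mechanics violations
    // feed into the word-class error tallies.
    static string ClassifyError(string token, bool isProperNoun)
    {
        if (isProperNoun && char.IsLower(token[0]))
            return "Noun"; // capitalization lapse recorded under noun errors

        return "None";
    }

    static void Main()
    {
        Console.WriteLine(ClassifyError("manila", isProperNoun: true)); // Noun
        Console.WriteLine(ClassifyError("Manila", isProperNoun: true)); // None
    }
}
```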
The results of the first data gathering regarding the respondents’ frequent word class errors will serve as the basis for the researcher to designate the weight distribution for the transmutation of each word class error. This will then lead to the identification of their grammatical ability level according to the descriptors “Beginner”, “Intermediate”, “High”, “Advanced”, and “Advanced High”.
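A minimal sketch, assuming hypothetical cut-off bands, of how a transmuted weighted score might map to the five descriptors; the actual weight distribution and thresholds will come from the first data gathering, so the numbers below are placeholders only.

```csharp
using System;

class TransmutationDemo
{
    // Hypothetical cut-off bands: the study derives the actual weight
    // distribution and thresholds from the respondents' error frequencies.
    static string DescribeLevel(double transmutedScore) => transmutedScore switch
    {
        >= 90 => "Advanced High",
        >= 80 => "Advanced",
        >= 70 => "High",
        >= 60 => "Intermediate",
        _     => "Beginner"
    };

    static void Main()
    {
        Console.WriteLine(DescribeLevel(85.5)); // Advanced
        Console.WriteLine(DescribeLevel(52.0)); // Beginner
    }
}
```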
Research Locale
The study is conducted at Lyceum of Alabang, an educational institution located at Km. 30, National Road, Brgy. Tunasan, Muntinlupa City. It caters to students from the pre-elementary to the tertiary level. Specifically in senior high school, the K to 12 curriculum mandates the teaching of English courses that do not dwell much on grammar. These subjects, in compliance with the skills necessary for this level, employ language applications both in speaking and in writing. Subjects like Oral Communication in Context, Reading and Writing, 21st Century Literature, English for Academic Purposes, and even Practical Research I and II dwell more on the academic and technical usage of English as a medium. These core and specialized subjects are taken by all SHS students regardless of their strands; therefore, the relevance of English to these courses is highly manifested. However, discussions on sentence construction, forms of words, and other grammatical concepts are no longer emphasized. This can also be attested by the lack of grammatical topics in the curriculum guide and in the list of most essential learning competencies (MELCs) of these subjects as provided by the Department of Education.
Despite the expectation that students in Grades 11 and 12 already possess confidence in and command of the language, it is still undeniable that students at the SHS level have deficiencies in their grammatical ability. Thus, to cope with the competencies required in the above-mentioned subjects offered at Lyceum of Alabang, the development of a grammatical ability measurement software is necessitated. The findings that will result from the initial testing of this tool will aid the English area of LOA, particularly in its remediation programs.
Sampling and Participants
The strand to which the respondents belong does not have to be predefined since the data collection is administered to students in the Grade 11 level. Thus, for this particular study, 15 Grade 11 students from each of the seven strands at LOA are selected using a criterion-based sampling technique, totaling 105 samples. The students are taken as samples using a purposive selection method, in which the samples are chosen according to their academic performance in the core subjects Reading and Writing and Oral Communication in Context (Cohen et al., 2007). In this particular study, the SHS students who obtain an average of 79 and below during the first quarter grading period will be considered respondents with limited English. The subjects to be investigated are the English writing subjects under the K to 12 curriculum, namely Reading and Writing and English for Academic and Professional Purposes. This sampling criterion is derived from the leveling of proficiency as mandated by the Department of Education. As stipulated under DepEd Order No. 31, the five levels of proficiency are beginning, developing, approaching proficiency, proficient, and advanced. In this study, the students considered as having limited English are those who fall under the beginning to developing levels; the developing level constitutes grade averages ranging from 75 to 79.
However, an equal number of respondents will be sampled from all seven strands. The samples are distributed equally across the strands to obtain a variety of text evidence regardless of the students’ interests, educational backgrounds, and areas of specialization.
Lyceum of Alabang offers seven strands in the senior high school, which are divided into two tracks, the Academic Track and the Technological-Vocational Livelihood (TVL) Track. Under the Academic Track are the Accountancy, Business and Management (ABM), General Academic Strand (GAS), Humanities and Social Sciences (HUMSS), and Science, Technology, Engineering, and Mathematics (STEM) strands. Under the TVL Track are the Home Economics (HE), Industrial Arts (IA), and Information and Communication Technology (ICT) strands. In summary, there are 1,416 Grade 11 students in senior high school in the current academic year, 2022-2023.
Once the participants have been selected, the instrument will be provided and discussed. The instrument will include both the general and specific objectives of the research for their perusal. The participants will be given opportunities to clarify parts of the research conduct that they find unclear. Likewise, the results and the data analysis will be returned to the participants upon their request.
On the other hand, the same introduction of the objectives will be conducted among the second group of participants. These include a group of English teachers from the locale selected through purposive sampling, SHS students selected through availability sampling, and IT specialists selected through convenience sampling. The nature of the software tool will be explained in detail, together with its purpose and usage, so that they have a clear understanding of how to approach the software’s feedback.
Instruments
In the actual data gathering procedure, the students will answer a digital test provided by the researcher. The test tool that will be used for the measurement of the participants’ grammatical ability will be adapted from existing conversational grammar activities. The test tool will cover the various aspects of grammar such as the correct use of the different word classes. Moreover, the participants will also be tasked to construct or rephrase sentences to measure the conversational application of their grammatical ability. When the measurement software is already developed, the respondents will input their answers directly in the software.
The questions that measure grammatical ability will be hardcoded into the software together with the answer key for automated checking. Through the validation of the instrument, the content of the grammar questions will be checked in terms of the appropriateness of the instructions, the attainment of their intended assessment outcomes, and the correctness of the questions on the word classes each level aims to assess. Likewise, other ethical considerations will be checked in the instrument to avoid exclusionary or inappropriate questions.
In the development of the software, the IT specialist with whom I will collaborate will use Windows Forms (C#) in Visual Studio 2022. Once developed, the software will be packaged as a portable application which can be installed on laptops and desktop computers or even run from flash drives.
Finally, a questionnaire in the form of a Likert scale will be developed and administered among the English teachers, SHS students, and IT specialists who will also serve as the users of the software tool. This form will determine the reactions and comments of the tool users towards the grammatical ability measurement software tool.
Data Gathering Procedure
Since the researcher is no longer professionally connected with the locale, a letter of permission for the conduct of the study will be addressed to the vice president for basic education. The letter will also be addressed secondarily to the SHS principal, with a copy furnished to the area head of English. Once the request to conduct the study and the dates for actual data gathering have been approved, I will proceed with the testing of the software tool among the target participants. The data gathering will be done onsite.
In their subject Reading and Writing (one of the English subjects offered to Grade 11 in the Second Semester), the respondents submit book reviews as part of their requirements. Their book reviews are written in parts, with the students tasked to synthesize what the story is about per chapter. For this study, I handpick one chapter from the outputs of each student. These selected chapters are encoded unabridged and serve as the basis for data gathering in an error analysis. It is ensured, however, that the students are well informed regarding the instructions specified in the scope and limitations. The collection, input, and analysis of discourse data constitute an account of the phenomena that the research targets (Cohen et al., 2007).
Pretesting Phase
In each phase of the development of the grammatical ability measurement software tool, I will review the tool and the developer will revise the code according to my comments. After the first try-out, the feedback of the English teachers who will serve as tool users will be solicited. A survey form adapted from the software evaluation guide by Jackson et al. (2021) will be provided. Since their site provides multiple evaluation forms depending on who will evaluate the software, the specific evaluation tool will be pre-selected. Specifically, the evaluation form for user-developers will be utilized when soliciting the English teachers’ feedback since their feedback will contribute to the development and improvement of the output.
Development Phase
For the initial stage of data gathering, the software tool’s prototype is developed as a Windows Forms application (C#) in Visual Studio 2022. The design, features, and functionality of the grammatical ability measurement software tool are discussed with the IT expert tasked to develop the software. These include its basic aesthetics, the buttons and their functions, as well as the computations that need to be programmed into the application in order to achieve the desired results.
The software tool undergoes three phases of debugging in which double checking is done in every phase. Code errors in each phase are corrected and modified according to my comments and another senior IT expert’s. After three successful debugging phases, the prototype is now ready for its first try-out.
Following the actual writing of the software tool, a table of specifications (TOS) will be developed. The TOS will be structured following the revised Bloom’s taxonomy. The pool of items to be included in the actual software tool will be based on the demands of each tier in the cognitive taxonomy. The questions are intended to measure the critical thinking skills of its users according to each level. Ten items will be written per level, summing up to 60 items that the students need to accomplish. Language teachers will serve as the experts who will validate the test items to be included in the software. The items’ validity with respect to each tier in the taxonomy will be checked, as well as their reliability in presenting scores and measuring proficiency.
To aid the researcher in classifying each item in the grammatical ability test, the specific verbs applicable to each level of the revised Bloom’s taxonomy are utilized. The collection of appropriate verbs is shown in Table 1.
Table 1. Revised Bloom’s Taxonomy Verbs
Remember | Understand | Apply | Analyze | Evaluate | Create |
Choose | Classify | Apply | Analyze | Agree | Adapt |
Define | Compare | Build | Assume | Appraise | Build |
Find | Contrast | Choose | Categorize | Assess | Change |
How | Demonstrate | Construct | Classify | Award | Choose |
Label | Explain | Develop | Compare | Choose | Combine |
List | Extend | Experiment with | Conclusion | Compare | Compile |
Match | Illustrate | Identify | Contrast | Conclude | Compose |
Name | Infer | Interview | Discover | Criteria | Construct |
Omit | Interpret | Make use of | Dissect | Criticize | Create |
Recall | Outline | Model | Distinguish | Decide | Delete |
Relate | Relate | Organize | Divide | Deduct | Design |
Select | Rephrase | Plan | Examine | Defend | Develop |
Show | Show | Select | Function | Determine | Discuss |
Spell | Summarize | Solve | Inference | Disprove | Elaborate |
Tell | Translate | Utilize | Inspect | Estimate | Estimate |
What | | | List | Evaluate | Formulate
When | | | Motive | Explain | Happen
Where | | | Relationships | Importance | Imagine
Which | | | Simplify | Influence | Improve
Who | | | Survey | Interpret | Invent
Why | | | Take part in | Judge | Make up
 | | | Test for | Justify | Maximize
 | | | Theme | Mark | Minimize
 | | | | Measure | Modify
 | | | | Opinion | Original
 | | | | Perceive | Originate
 | | | | Prioritize | Plan
 | | | | Prove | Predict
 | | | | Rate | Propose
 | | | | Recommend | Solution
 | | | | Rule on | Solve
 | | | | Select | Suppose
 | | | | Support | Test
 | | | | Value | Theory
Source: Handayani, A. (2019, November). HOTS-based assessment: the story of English teacher’s knowledge, beliefs, and practices. Jurnal Bahasa Lingua Scientia, 11. https://doi.org/10.21274/ls.2019.11.2.273-290
The classification of the test items according to the levels of the revised Bloom’s taxonomy is illustrated in Table 8. Despite the test being a multiple-choice type of assessment, the manner of taking the test requires the students to utilize their application, analysis, and evaluation skills.
Likewise, each level in the taxonomy will be assigned with a predetermined weight for the computation of their transmutation. The weight distribution will be determined from the test results of the software’s pilot testing. The frequency of errors of the students with respect to each level of Bloom’s revised taxonomy will be the consideration when writing the transmutation formulas. The weight distribution will be included in the TOS.
Once the TOS and the pool of questions have been written, they will be encoded in the software for execution and testing. The grammatical ability levels must be reflected on screen immediately after the responses of a student are recorded by the software.
Through the test items developed for the software tool, the taxonomy of errors in grammar is identified. Ten questions are asked per word class for a total of 70 items (excluding interjections), and each word class is labeled as levels 1 to 7. After gathering the answers and comparing the numbers of errors and correct answers, the percentage of error per level is defined. The taxonomy of errors is created after determining which word classes are frequently answered correctly and incorrectly by the respondents. Once the errors have been averaged and the percentages computed, each word class is assigned a description of its frequency level. Below is the table illustrating the basis for the assignment of frequency descriptors according to the percentage error of each word class.
Table 2. Basis for the Frequency of Error Descriptors
Percentage Error | Frequency of errors |
100% | Always |
80% to 99% | Most Frequent |
60% to 79% | Frequent |
40% to 59% | Sometimes |
20% to 39% | Rarely |
1% to 19% | Most Rarely |
0% | Never |
The table shows a 7-point scale of frequencies. “Always” is assigned a 100% percentage of error while 0% is assigned to “Never”. The rest of the scale in between is assigned ranges of percentages derived from dividing the remaining interval equally among the levels. Once this is established, the frequency of error per word class may be assigned a descriptor. The data are computed from the total number of respondents, which is 100, and the total number of test items, which is 10 per word class (70 items in all). The percentage errors of the students, as well as the frequency descriptors assigned to each word class, are shown in Table 6.
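To make the mapping concrete, the descriptor assignment in Table 2 can be read as a simple threshold lookup. The sketch below is illustrative only; the method name FrequencyDescriptor and the handling of fractional percentages between the table’s bands are assumptions, not the software’s actual code.

```csharp
// A minimal sketch of the Table 2 mapping from percentage error to a
// frequency descriptor. Boundary handling for fractional percentages
// between the table's bands is assumed.
static string FrequencyDescriptor(double percentageError)
{
    if (percentageError >= 100) return "Always";
    if (percentageError >= 80)  return "Most Frequent";
    if (percentageError >= 60)  return "Frequent";
    if (percentageError >= 40)  return "Sometimes";
    if (percentageError >= 20)  return "Rarely";
    if (percentageError >= 1)   return "Most Rarely";
    return "Never";
}
```

For instance, the 70.1% error later reported for verbs in Table 6 would map to “Frequent” under this lookup.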
The second trial of the grammatical ability measurement tool will be executed once the prototype of the software tool is fully developed and improved. All the comments and suggestions by both the researcher and the thesis adviser will now be applied in order to come up with the final version of the software’s prototype prior to the oral presentation. Likewise, the grammatical ability measurement software tool will now be conducted among the specified number of participants for data gathering.
For the transmutation table of the word classes to be finalized, the average error of the actual number of participants will now be ultimately considered.
The table of specifications for the test items to be included in the software tool will be based on the test items validated by ESL experts. This TOS will likewise be utilized to determine the weight distribution depending on the participants’ average frequency of errors. Interjections, being expression words, are not included in the list of word classes tested in this study.
Using a basic percentage formula, the average number of word errors is computed from the total number of words in the document. The distribution of weight per word class is based on the identified word classes which are commonly misused. This weight distribution prescribes the factor by which the frequency of errors is multiplied for each word class:
Total Word Errors = Σ (word class errors × weight per word class)
The average word errors in the outputs are to be computed out of the total number of words used by the respondents. The tallies for each word class error will be transmuted using the predetermined computation with the assigned weight distribution for each type of word class. Consequently, the result is deducted from the perfect score of one hundred percent grammatical ability, which then yields the respondents’ average error:
Average Error = (Total Word Errors / Total Number of Test Items) × 100%
Grammatical Ability Score = 100% − Average Error
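A minimal C# sketch of these formulas follows. The method and parameter names (GrammaticalAbilityScore, errorsPerWordClass, weightPerWordClass) are hypothetical; only the arithmetic mirrors the formulas above.

```csharp
using System.Collections.Generic;
using System.Linq;

static class Scoring
{
    // Hypothetical helper mirroring the three formulas above;
    // not the software's actual source code.
    public static double GrammaticalAbilityScore(
        IDictionary<string, int> errorsPerWordClass,    // tallied errors per word class
        IDictionary<string, double> weightPerWordClass, // transmutation weights (Tables 3 and 7)
        int totalTestItems)                             // e.g., 70 items
    {
        // Total Word Errors = Σ (word class errors × weight per word class)
        double totalWordErrors = errorsPerWordClass
            .Sum(kv => kv.Value * weightPerWordClass[kv.Key]);

        // Average Error = (Total Word Errors / Total Number of Test Items) × 100%
        double averageError = totalWordErrors / totalTestItems * 100.0;

        // Grammatical Ability Score = 100% − Average Error
        return 100.0 - averageError;
    }
}
```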
In line with this, the word classes that get a high frequency count of misuse are given a smaller weight in the computation of grammatical ability, while the word classes that get a low frequency count of misuse are given a higher weight. The transmutation weight thus depends on how frequently the students commit mistakes on each word class in a negative correlation: the value of the weight decreases as the frequency of errors increases and, vice versa, increases as the frequency of errors drops. Below is the table showing the basis for the assignment of transmutation weights, negatively correlated with the frequency of errors. It serves as the baseline when assigning a weight to each grammar ability category.
Table 3. Specification of Weight Distribution
Frequency of errors | Transmutation weight |
Always | 0 |
Most Frequent | 0.5 |
Frequent | 1 |
Sometimes | 1.5 |
Rarely | 2 |
Most Rarely | 2.5 |
Never | 3 |
This transmutation of word errors will still change depending on the results of the data gathering. After distributing the assigned weights to the word classes, the frequency of errors per word class of each respondent is computed to get their proficiency level. The average number of grammar errors is computed from the total equivalent of the frequency of errors. Subtracting it from 100 results in the grammatical ability of the respondents.
As the base framework of this study, the concept of weighted average will be applied in the assignment of transmutation weights. The values will be determined from the students’ taxonomy of errors per word class.
After identifying the grammatical ability scores of the respondents, a scale is developed based on five grammatical ability levels. A wider range is given to level 1, under which the respondents are least likely to be listed. The distribution narrows down from levels 2 to 5. The terms Beginner, Intermediate, High, Advanced, and Advanced High are used as descriptors for the respective levels. The said scale is shown in Table 4.
Table 4. Grammar Proficiency Levels and their Descriptors
Level | Scale | Range | Descriptors |
1 | 1-35% | 35 | Beginner |
2 | 35.1-60% | 25 | Intermediate |
3 | 60.1-75% | 15 | High |
4 | 75.1-90% | 15 | Advanced |
5 | 90.1-100% | 10 | Advanced High |
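The Table 4 scale can likewise be read as a threshold lookup. The sketch below, with the hypothetical method ProficiencyLevel, assumes that a score falling exactly on a boundary (e.g., 35%) belongs to the lower level, as the ranges in Table 4 suggest.

```csharp
// Illustrative mapping of a grammatical ability score (0–100) to the
// Table 4 level and descriptor; boundary handling is assumed.
static (int Level, string Descriptor) ProficiencyLevel(double score)
{
    if (score <= 35.0) return (1, "Beginner");
    if (score <= 60.0) return (2, "Intermediate");
    if (score <= 75.0) return (3, "High");
    if (score <= 90.0) return (4, "Advanced");
    return (5, "Advanced High");
}
```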
Validation Phase
The data gathering from the student participants will cover the answering of the grammatical ability test and the showing of results. The accomplishment of this phase will introduce the research to the next step in the process, which is to garner feedback from language experts. All the English teachers in Lyceum of Alabang will be purposively selected to use, observe, evaluate, and criticize the software.
A self-made evaluation tool will be provided to the teachers to determine their initial and expert reaction towards the tool. Their responses in the tool will be summed up, summarized, and analyzed in order to generate concrete recommendations for the software tool’s improvement. These collated comments will be communicated to the IT expert who has developed the software during the developmental phase. The coding of the tool will be modified and adjusted according to the overall comments of the language experts. After the tool’s modification, the software tool will now be considered as the final version and will then be used among the participants for second testing. The results of the second testing will now be considered as the final findings of the study. The final findings will be used as grounds for the recommendations of this research, particularly in the designing of appropriate remediation programs among the SHS students with limited English.
On June 26, 2023, the grammatical ability measurement software is presented to the senior high school English/language teachers of the research locale for evaluation and feedback discussion. Eight faculty members out of the total of 12 teachers in the target population are present in the data gathering activity through availability sampling. I present the objectives of the study as a backgrounder and discuss the specific purposes of the software. The teachers are also informed about the results of the first data gathering among the students, which lead to the distribution of the transmutation weights for each word class. Afterwards, the actual software tool is demonstrated to the teachers for their evaluation, questions, comments, and recommendations.
Consequently, questionnaire forms are handed out to the teachers for their evaluation of the software. The questionnaire is composed of two parts focusing on the software’s usability and sustainability/maintainability. A 4-point Likert scale accompanies the questionnaire, with numbers 1 to 4 representing the descriptions “Strongly Disagree,” “Disagree,” “Agree,” and “Strongly Agree” respectively. To analyze the data sets coming from the feedback of all three groups of tool evaluators, the Wilcoxon Signed Rank Test is utilized. It is a nonparametric statistical tool that produces an inferential analysis by testing hypothesized medians against the Likert scores gathered through the questionnaire tool.
To inferentially determine where the Likert responses of several respondents to a questionnaire item tend to gravitate, a series of Wilcoxon Signed Rank Tests was performed. The steps are outlined in Figure 3 below:
Figure 3. A Flowchart of a Series of Wilcoxon Rank Tests to Determine the Inferential Median of Several Four-Point Likert Responses to a Survey Question
At first, the null hypothesis H0 that the median was equal to 1 (equivalent to “Strongly Disagree”) was tested against the alternative hypothesis Ha that the median was greater than 1. If the p-value was greater than or equal to α, chosen to be 0.05, then H0 was not rejected and a conclusion was reached: the median was equal to 1, meaning the respondents tended to strongly disagree with that particular survey questionnaire item. If the p-value was less than α, then H0 was rejected and another Wilcoxon Signed Rank Test was conducted, this time for the H0 that the median was equal to 2 against the Ha that the median was greater than 2. The process was repeated until a conclusion was reached with less than 100α% probability of committing a Type I error.
The use of the nonparametric Wilcoxon Signed Rank Test is justified since Likert data are ordinal in nature; because the equality of the intervals between scale points cannot be ascertained, normality of the distribution cannot be assumed.
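The sequential procedure in Figure 3 can be sketched as follows. This is an illustrative C# implementation, not the statistical package actually used: it applies the normal approximation to the one-sample Wilcoxon signed rank statistic (omitting the tie correction to the variance for brevity), so its p-values will differ slightly from exact ones on small samples such as the eight teacher responses.

```csharp
using System;
using System.Linq;

static class MedianLocator
{
    // One-sided, one-sample Wilcoxon signed rank test of H0: median = m0
    // against Ha: median > m0, using the normal approximation.
    static double WilcoxonPValue(int[] data, double m0)
    {
        // Differences from the hypothesized median; zero differences are dropped.
        double[] d = data.Select(x => x - m0).Where(x => x != 0).ToArray();
        int n = d.Length;
        if (n == 0) return 1.0; // no evidence against H0

        // Average ranks of |d|, with tied values receiving the mean of their ranks.
        double[] abs = d.Select(Math.Abs).ToArray();
        int[] order = Enumerable.Range(0, n).OrderBy(i => abs[i]).ToArray();
        double[] ranks = new double[n];
        int k = 0;
        while (k < n)
        {
            int j = k;
            while (j + 1 < n && abs[order[j + 1]] == abs[order[k]]) j++;
            double avgRank = (k + j + 2) / 2.0; // ranks are 1-based
            for (int t = k; t <= j; t++) ranks[order[t]] = avgRank;
            k = j + 1;
        }

        // W+ = sum of the ranks of the positive differences.
        double wPlus = Enumerable.Range(0, n).Where(i => d[i] > 0).Sum(i => ranks[i]);
        double mean = n * (n + 1) / 4.0;
        double variance = n * (n + 1) * (2 * n + 1) / 24.0; // tie correction omitted
        double z = (wPlus - mean - 0.5) / Math.Sqrt(variance); // continuity correction
        return 0.5 * Erfc(z / Math.Sqrt(2.0)); // upper-tail p-value
    }

    // Complementary error function (Abramowitz & Stegun 7.1.26 approximation).
    static double Erfc(double x)
    {
        double t = 1.0 / (1.0 + 0.3275911 * Math.Abs(x));
        double y = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                 + t * (-1.453152027 + t * 1.061405429)))) * Math.Exp(-x * x);
        return x >= 0 ? y : 2.0 - y;
    }

    // Step the hypothesized median up from 1 until H0 can no longer be rejected.
    public static int InferentialMedian(int[] likertResponses, double alpha = 0.05)
    {
        for (int m0 = 1; m0 < 4; m0++)
            if (WilcoxonPValue(likertResponses, m0) >= alpha) return m0;
        return 4; // all lower medians rejected: respondents tend to strongly agree
    }
}
```

Calling MedianLocator.InferentialMedian on a vector of 4-point Likert responses reproduces the stepwise decisions reported in the tables below: the returned median corresponds to the “Description” column (1 = Strongly Disagree through 4 = Strongly Agree).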
Ethical Issues
Social value
The grammatical ability measurement software tool will serve as a means to measure one’s grammatical ability authentically, validly, and reliably. Compared with other existing grammar tests, the product of this research endeavor aims at providing accurate measurement and reporting for various purposes, including the placement of learners, admission decisions, reports, and grades. The tool may likewise be utilized as a rubric for grading grammar correctness in essays and other prose outputs, particularly in formal and business writing. Although this software tool relies on the quantification of grammar errors in a written output, the reporting of these errors may also serve as a learning platform for the learners to improve on their weaknesses in writing.
Informed consent
Upon the issuance of instruments, the respondents will be extensively introduced to the framework of the research. Although their demographic profile will be collected as preliminary requirements in the instrument (with the writing of their names as optional), their identities as respondents will be treated with utmost confidentiality and solely for the purpose of this research. Their consent will also be solicited as to whether they agree to have their essays included in the study’s appendices or not.
Vulnerability of the research participants
Also, since the respondents are Grade 11 students, they are recent finishers of junior high school (Grades 7 to 10). Thus, their grammatical ability scores may reflect the kind of input that they had during their formative years in education, perhaps even tracing back to their elementary education. Indeed, language development requires a strong and meaningful foundation and is not something easily remedied once the learner is already in senior high school. In relation to this, the reality that the respondents come from different geographical locations is also considered. Some respondents may have come from the province or other far-flung areas where education and language teaching may not be as evident and meaningful. If these respondents get below-average grammatical ability scores, they may attribute it to regional differences and may experience self-discrimination and anxiety. Thus, it is also important that I or the test proctor give careful instructions and explanations regarding the test’s purpose and implications.
Lyceum of Alabang also has a distinction between two kinds of students in senior high school. First are the regular students who pay the actual tuition fee of the school (private sections), and second are the students subsidized by the government through its voucher program (public sections). Since the method employed in sampling respondents from the population involves randomized selection, a combination of students from the private and public sections is unavoidable. Therefore, this distinction might also be an issue in interpreting grammatical ability results.
Risk and benefits
Measuring grammatical ability also has implications both for the respondents and for their educational background. Many students neglect the importance of grammar in their education without considering the fact that the English language is utilized almost anywhere and anytime. Therefore, getting low grammatical ability marks may have a negative effect on their cognition and might further discourage them from learning the language. Especially since the respondents are already in Grade 11, getting low proficiency scores in grammar might seem like an unusual or embarrassing phenomenon.
Privacy and confidentiality
The respondents in this research will be anonymously labeled in each text evidence. In each essay entered in the software tool, the owner will simply be titled with a number (e.g., Student 1). The intended outcomes of the data gathering procedure do not require the identification of the respondents. However, their identities will later be utilized once the results are produced, if and when a respondent wishes to determine his or her grammatical ability level. The frequency of errors in the essays may also be reported to the respondents after the research process for informative purposes. Nonetheless, their names and other identification details will not be included in the publication of the research.
Transparency
After the essays of the respondents have been processed using the grammatical ability measurement software tool, the details as to how the software has arrived at their grammatical ability level will be reflected extensively. The frequency of errors that the respondents commit for each of the different word classes will likewise be displayed, together with the average errors they have garnered in relation to the total number of words in their essays. A results page will be displayed by the software tool after its computation process, revealing every detail needed for the transparency and information of a respondent’s grammatical ability.
Qualifications of the researcher
The researcher is a graduate of Bachelor in Secondary Education Major in English and has been a licensed professional teacher since 2016. He has been in the teaching profession for more than six years, including experience in administrative positions such as heading the English cluster and advising a student publication. His expertise in grammar in English as a second language for senior high school students serves as his qualification to venture into the research topic of developing a grammatical ability measurement software tool. With the help of an IT developer, the accuracy of the research outcome will be ensured.
Adequacy of facilities
Lyceum of Alabang is abundant with IT experts and programmers. The school offers programs related to IT and programming, both in senior high school and college. The school also has reputable facilities in the field of IT, specifically computer laboratories. The program developer of this research will have all the resources needed for the development of the grammatical ability measurement software tool.
On the other hand, the abundance of students in the senior high school will also enable the study to garner the ample amount of data needed for the study’s endeavor. Lyceum of Alabang offers English-related courses every semester, enabling the research to collect data through the writing of essays. This task may be executed alongside the subjects’ requirements and in compliance with the outcomes-based education curriculum that the school abides by.
Community involvement
The English faculty members in the locale will have a pivotal contribution to the conduct of this research as they are the pilot users of the software tool. Their expert criticism after the pilot testing will highly influence the finalization of the software tool’s source code as well as its other features. The administrative leg of the school, specifically the admissions office, may also take a look at the research outcome and recommend its suitability to their placement and admission function.
Taxonomy of errors
After the conduct of the grammatical ability test, the taxonomy of errors is identified. The number of correct items per word class is subtracted from the total number of items per level, resulting in the total number of errors of each student. The number of students in each frequency level for each word class is tabulated below.
Table 5. The Number of Students in Each Frequency of Errors for Each Word Class
Word Classes | Never | Most Rarely | Rarely | Sometimes | Frequent | Most Frequent | Always |
Noun | 10 | 19 | 17 | 7 | 7 | 17 | 23 |
Pronoun | 16 | 27 | 11 | 6 | 3 | 29 | 8 |
Adjective | 16 | 22 | 11 | 7 | 0 | 4 | 40 |
Verb | 3 | 6 | 16 | 16 | 14 | 8 | 37 |
Adverb | 3 | 25 | 12 | 13 | 4 | 4 | 39 |
Preposition | 10 | 19 | 13 | 13 | 6 | 1 | 38 |
Conjunction | 5 | 15 | 21 | 9 | 23 | 10 | 17 |
The students who answered the grammatical test are 100 in total. Their frequencies of errors in each word class are counted and labeled according to frequency level.
Table 5 shows that the numbers of errors in each word class are scattered across different frequency levels. This demonstrates that the correct answers of the students and the items that they find difficult do not possess any particular similarities.
After the scores have been counted, they are subtracted from the total number of items in order to get the percentage of the students’ frequency of errors. They are then labeled with the frequency of error descriptors found in Table 2. The percentages of error frequency per word class are shown in the table below.
Table 6. Frequency of Grammatical Features that the Students Misuse
Word classes | Number of correct answers | Number of errors | Percentage of error | Frequency of error |
Noun | 448 | 552 | 55.2% | Sometimes |
Pronoun | 528 | 472 | 47.2% | Sometimes |
Adjective | 459 | 541 | 54.1% | Sometimes |
Verb | 299 | 701 | 70.1% | Frequent |
Adverb | 389 | 611 | 61.1% | Frequent |
Preposition | 425 | 575 | 57.5% | Sometimes |
Conjunction | 417 | 583 | 58.3% | Sometimes |
The percentage of error per word class is derived from the average error over the total number of test items. Considering that there are 10 items per word class and 100 respondents, the overall total number of responses per word class is 1,000. The students commit mistakes on nouns sometimes, with a percentage error of 55.2. With a 47.2% error, the students’ frequency of error on pronouns is considered sometimes as well. Students also err sometimes on adjectives, with a percentage error of 54.1. The data gathering also resulted in verbs emerging as the word class on which the students most commonly commit mistakes; with a percentage error of 70.1, it is assigned the “frequent” descriptor. Students likewise frequently find adverbs difficult, with a frequency of error of 61.1%. The rest of the word classes, namely prepositions and conjunctions, get an error frequency of “sometimes”; their percentages of error are 57.5 and 58.3, respectively.
Similar to the information provided by the English Language Centre (n.d.), people usually encounter confusion with the use of nouns, adjectives, adverbs, and verbs. For instance, even though nouns and verbs are two very different word classes, their forms are used interchangeably by language users. Similarly, the results of the current study show that verbs and adverbs are the word classes on which students most frequently commit errors. These are backed up by Unacademy.com (2022), which states that the structures of nouns and adjectives are also commonly misused. However, the results of the current study reveal that students have confusions in nouns and adjectives only sometimes.
Weight distribution
Consequent to the establishment of each word class’ frequency of error, the transmutation weight may now be assigned to each level. Revisiting the frequency of errors assigned to each word class, the transmutation weights may now likewise be distributed. The word class(es) on which the students are most likely to commit errors will be assigned a lower transmutation weight. On the other hand, a higher transmutation weight will be given to the word class(es) on which the students commit errors the least. The table below shows the transmutation weight distribution assigned to each word class.
Table 7. Transmutation Weight Distribution to each Word Class
Word class | Frequency of error | Transmutation weight |
Noun | Sometimes | 1.5 |
Pronoun | Sometimes | 1.5 |
Adjective | Sometimes | 1.5 |
Verb | Frequent | 1 |
Adverb | Frequent | 1 |
Preposition | Sometimes | 1.5 |
Conjunction | Sometimes | 1.5 |
Based on the transmutation weight distribution listed in Table 7, the following values are assigned: a transmutation weight of 1.5 is assigned to nouns, on which the students commit mistakes only sometimes. The same goes for pronouns and adjectives, which exhibit a frequency of errors of “sometimes”. Since verbs and adverbs get the most errors in the grammatical ability test, they are assigned the lowest transmutation weight of 1. Prepositions and conjunctions are assigned a transmutation weight of 1.5 after being considered word classes on which students sometimes commit mistakes.
Now that all of the values have been assigned for the word classes, the code for the grammatical ability software is ready for writing. The automated computation, one of the software’s features, will be based on the values determined from the above-mentioned results.
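Encoded in the software, the Table 7 values might take a form like the following; the dictionary representation is an assumption for illustration, not the software’s actual data structure.

```csharp
using System.Collections.Generic;

// Transmutation weights per word class, as finalized in Table 7.
var transmutationWeight = new Dictionary<string, double>
{
    ["Noun"] = 1.5, ["Pronoun"] = 1.5, ["Adjective"] = 1.5,
    ["Verb"] = 1.0, ["Adverb"] = 1.0,
    ["Preposition"] = 1.5, ["Conjunction"] = 1.5
};
```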
Results of the content validity of the developed writing assessment test
The development of the grammatical ability software is preceded by the writing and validation of the test items. The objectives of the test items are to provide a valid and reliable assessment of grammatical ability and language proficiency as well as to hit the higher levels of the revised Bloom’s taxonomy. In order to satisfy the need for test items with validity in assessing and measuring one’s grammatical competence, the researcher constructed sets of three expressions in English that are similar both in construction and in highlighting the word class focused on in each part of the test. The instruction is for the student takers to distinguish the expression that is grammatically incorrect in terms of syntax and actual use. Such a task encompasses the application of grammatical knowledge and the analysis and evaluation of the sentences’ appropriateness to the context of each item. Below is the table of specifications developed based on the approved test items, as anchored on the revised Bloom’s taxonomy.
Table 8. Table of Specification for Test Items Based on the Revised Bloom’s Taxonomy
Level | Test Items |
Remembering | 7, 10, 11, 13, 16, 21 |
Understanding | 1, 2, 3, 5, 12, 14, 15 |
Applying | 17, 19, 22, 23 |
Analyzing | 4, 6, 8, 9, 18, 20 |
Evaluating and Creating | 24, 25, 26, 27, 28
The development and validation of the grammatical ability assessment software
The development of the grammatical ability software consists of three major phases. It commences with the actual writing of its code using the data manifested from the results of the students’ test. The fields that will later on contain the different sections and features of the software are put in place, together with the areas that the teacher administrator can modify for future purposes.
As the temporary sole teacher user of the software, I am given access through an administrator credential. Through this access, the proponent can add subjects, modify their details and instructions, edit the test items and answer keys, and adjust the transmutation weights. The administrator’s account can also access the summary of data once respondents’ answers are recorded in the software tool. The said access is illustrated in the screenshots below.
Figure 4. Two Types of Accesses to the Software
Figure 5. Input Field for Administrator Access
The proponent of this study also inputs the test items and answer keys into the software tool, completing all levels comprising the 7 word classes included in the test. The transmutation weights, average and percentage computations, and the grammatical ability level that will manifest as the results are likewise encoded by the software developer under the guidance and instruction of the proponent. The transmutation weight for each word class is based on the results found in Table 7. The fields that may be modified by the administrator, where the subject title, level number, code name, instructions, test items, and answer keys may be indicated, are illustrated below.
Figure 6. Input Field for Subject Details, Test Items, and Answer Key
Figure 7. Editable Input Field for Other Subject Details and Transmutation Weight
The software tool is run for a series of test trials to check for malfunctioning features and other bugs. Sample answers are encoded on the test tool and the results are checked for errors on the software’s automated checking, recording, and reporting results. In the series of test validations, several issues and bugs are encountered and they are as listed in the table below.
Table 9. Issues and Bugs Encountered During the Software’s Test Runs
Issue/Bug | Date reported | Date fixed |
The automated checking of the answers is not case sensitive. It counts answers as correct even if some letter casing is incorrect. | June 21, 2023 | June 22, 2023 |
The test items in the summary of results become duplicated when the administrator accesses the data. The test items also shuffle on the respondent access upon taking the test. | June 22, 2023 | June 22, 2023 |
After taking the test, the results do not reflect on the respondent access. Also, the transmutation weight on the administrator access only allows input of whole numbers. | June 23, 2023 | June 24, 2023 |
After the previous bugs are fixed, the code responsible for showing the summary of results on the administrator side is affected, causing it to stop working. The software developer also adds another feature allowing the administrator to delete data in the summary of results, and corrects a few spelling errors in the software. | June 24, 2023 | June 24, 2023 |
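The first bug in Table 9 comes down to a one-line decision in the checking routine. Below is a sketch of the fix using a hypothetical IsCorrect helper: switching the comparison to StringComparison.Ordinal makes the check case-sensitive, which is the behavior the finalized software enforces (see the limitations discussed later).

```csharp
using System;

// Hypothetical checking helper. Before the fix, a case-insensitive comparison
// (e.g., StringComparison.OrdinalIgnoreCase) accepted answers with wrong casing;
// the ordinal comparison below makes the automated checking case-sensitive.
static bool IsCorrect(string response, string answerKey) =>
    string.Equals(response?.Trim(), answerKey.Trim(), StringComparison.Ordinal);
```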
After the issues and bugs have been fixed by the software developer, the software tool is ready for the input of actual data from the students who have taken the grammatical ability test. Their correct and wrong answers are recorded precisely, and their grammatical ability averages, percentages, and levels are all computed and reported accurately. A total of 100 test entries are encoded into the software for further validation testing. Once the data are in the software’s offline database, the teacher with administrator access may view the summary of answers and results, as shown in the figure below.
Figure 8. Summary of Results via the Administrator Access
In accordance with the data previously shown in Table 5, the summary of results as presented by the software shows that verbs are where the students most frequently commit errors while pronouns are their least common mistakes. The breakdown of the actual error count for each word class is likewise presented on the right side of the window, and the individual error frequency and transmuted error of each respondent can be seen on the leftmost side.
The administrator may also check the results of individual respondents by clicking a name from the list found on the left side of the window. This leads to a new window showing the actual answers of the selected respondent: their correct answers, mistakes, error count per word class, error percentage, transmuted error, grammatical ability score, grammatical ability percentage, and grammatical ability level. All of these can be seen in the illustration below.
Figure 9. Summary of Results of a Selected Respondent
Feedback of the teachers as tool administrators
Using a series of Wilcoxon Signed Rank Tests, the responses of the teachers as tool administrators on a 4-point Likert scale are analyzed. Table 10 shows the results of the language teachers’ evaluation of the software as it is run through the statistical tests.
Table 10. Wilcoxon Rank Test Results of the Grammatical Ability Assessment Software as Evaluated by the Teachers as Tool Administrators
Questionnaire Items | p-value, H0: µ = 4 ;Ha: µ > 4 | p-value, H0: µ = 3 ;Ha: µ > 3 | p-value, H0: µ = 2 ;Ha: µ > 2 | p-value, H0: µ = 1 ;Ha: µ > 1 | Decision, H0: µ = 4 ;Ha: µ > 4 | Decision, H0: µ = 3 ;Ha: µ > 3 | Decision, H0: µ = 2 ;Ha: µ > 2 | Decision, H0: µ = 1 ;Ha: µ > 1 | Description |
The software records inputs properly. | 1.000 | 0.003 | 0.003 | 0.003 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software tool responds to commands well. | 1.000 | 0.003 | 0.003 | 0.003 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software tool is free from bugs and other inconveniences. | 0.963 | 0.036 | 0.007 | 0.005 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software tool is portable and can be accessed offline. | 0.977 | 0.005 | 0.004 | 0.004 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software tool is installable to all kinds of computers/laptops. | 1.000 | 0.003 | 0.003 | 0.003 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software works properly in all kinds of computers. | 0.977 | 0.005 | 0.004 | 0.004 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software is easy to use. | 0.977 | 0.005 | 0.004 | 0.004 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The different features of the software are easy to learn. | 1.000 | 0.003 | 0.003 | 0.003 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software’s code is simple and is easily managed. | 1.000 | 0.003 | 0.003 | 0.003 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software’s code can easily be modified for improvements. | 1.000 | 0.003 | 0.003 | 0.003 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software’s management can be transferred easily to moderators or co-hosts. | 0.977 | 0.005 | 0.004 | 0.004 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software is portable to any device. | 0.988 | 0.212 | 0.010 | 0.006 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The software can be adapted for similar research endeavors. | 1.000 | 0.003 | 0.003 | 0.003 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software can easily be debugged and troubleshoot. | 0.970 | 0.010 | 0.005 | 0.005 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software’s features are adjustable. | 1.000 | 0.003 | 0.003 | 0.003 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software as a whole is flexible to meet other similar purposes and processes. | 1.000 | 0.003 | 0.003 | 0.003 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software tool generally receives an excellent rating from the teachers, with most of the mean scores being 4 or “Strongly Agree” on the positive descriptions of the tool. Although still rated very satisfactorily, one concern to which the teachers did not give a perfect rating is the presence of minor bugs, which gets a mean score of 3.625. It is followed by the software’s portability for offline access, its unavailability on other hosts such as mobile phones and tablets, and the software’s ease of use, which all get a mean score of 3.875.
Feedback of the students as tool users
Table 11. Wilcoxon Rank Test Results on the Grammatical Ability Assessment Software by the Students as Tool Users
Questionnaire Items | p-value, H0: µ = 4 ;Ha: µ > 4 | p-value, H0: µ = 3 ;Ha: µ > 3 | p-value, H0: µ = 2 ;Ha: µ > 2 | p-value, H0: µ = 1 ;Ha: µ > 1 | Decision, H0: µ = 4 ;Ha: µ > 4 | Decision, H0: µ = 3 ;Ha: µ > 3 | Decision, H0: µ = 2 ;Ha: µ > 2 | Decision, H0: µ = 1 ;Ha: µ > 1 | Description |
The software records inputs properly. | 0.997 | < .001 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software tool responds to commands well. | 0.986 | < .001 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software tool is free from bugs and other inconveniences. | 0.999 | 0.175 | < .001 | < .001 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The software tool is portable and can be accessed offline. | 0.999 | 0.055 | < .001 | < .001 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The software tool is installable to all kinds of computers/laptops. | 0.963 | < .001 | < .001 | < .001 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The software works properly in all kinds of computers. | 0.998 | 0.02 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software is easy to use. | 0.995 | < .001 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The different features of the software are easy to learn. | 0.998 | < .001 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software’s code is simple and is easily managed. | 0.995 | 0.007 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software’s code can easily be modified for improvements. | 0.981 | < .001 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software’s management can be transferred easily to moderators or co-hosts. | 0.997 | 0.004 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software is portable to any device. | 0.997 | 0.004 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software can be adapted for similar research endeavors. | 0.997 | < .001 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software can easily be debugged and troubleshoot. | 0.998 | 0.02 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software’s features are adjustable. | 0.991 | < .001 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software as a whole is flexible to meet other similar purposes and processes. | 0.998 | < .001 | < .001 | < .001 | Fail to reject H0 | Reject H0 | Reject H0 | Reject H0 | Strongly Agree |
The software tool generally receives an excellent rating from the students, with most of the mean scores above 3.5 (or “Strongly Agree” if the mean scores are rounded off) on the positive descriptions of the tool. Despite giving a satisfactory rating, the students share the same concerns as the teacher administrators. The students also observe that the software tool lacks portability and offline features (x̄ = 3.28). Moreover, the students also see that the software tool is not totally free of issues and bugs (x̄ = 3.17).
Feedback of the IT specialist as tool experts
Table 12. Wilcoxon Rank Test Results on the Grammatical Ability Assessment Software as Evaluated by the IT Specialists as Tool Experts
Questionnaire Items | p-value, H0: µ = 4 ;Ha: µ > 4 | p-value, H0: µ = 3 ;Ha: µ > 3 | p-value, H0: µ = 2 ;Ha: µ > 2 | p-value, H0: µ = 1 ;Ha: µ > 1 | Decision, H0: µ = 4 ;Ha: µ > 4 | Decision, H0: µ = 3 ;Ha: µ > 3 | Decision, H0: µ = 2 ;Ha: µ > 2 | Decision, H0: µ = 1 ;Ha: µ > 1 | Description |
The software records inputs properly. | 0.963 | 0.065 | 0.013 | 0.009 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The software tool responds to commands well. | 0.986 | 0.074 | 0.010 | 0.010 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The software tool is free from bugs and other inconveniences. | 0.990 | 0.940 | 0.172 | 0.017 | Fail to reject H0 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Disagree |
The software tool is portable and can be accessed offline. | 0.990 | 0.967 | 0.425 | 0.027 | Fail to reject H0 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Disagree |
The software tool is installable to all kinds of computers/laptops. | 0.990 | 0.973 | 0.556 | 0.049 | Fail to reject H0 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Disagree |
The software works properly in all kinds of computers. | 0.990 | 0.964 | 0.412 | 0.049 | Fail to reject H0 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Disagree |
The software is easy to use. | 0.988 | 0.386 | 0.015 | 0.010 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The different features of the software are easy to learn. | 0.963 | 0.065 | 0.013 | 0.009 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The software’s code is simple and is easily managed. | 0.988 | 0.825 | 0.087 | 0.010 | Fail to reject H0 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Disagree |
The software’s code can easily be modified for improvements. | 0.979 | 0.383 | 0.027 | 0.011 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The software’s management can be transferred easily to moderators or co-hosts. | 0.981 | 0.212 | 0.016 | 0.010 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The software is portable to any device. | 0.990 | 0.967 | 0.425 | 0.027 | Fail to reject H0 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Disagree |
The software can be adapted for similar research endeavors. | 0.979 | 0.383 | 0.027 | 0.011 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The software can easily be debugged and troubleshoot. | 0.993 | 0.681 | 0.013 | 0.009 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The software’s features are adjustable. | 0.972 | 0.242 | 0.024 | 0.010 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Reject H0 | Agree |
The software as a whole is flexible to meet other similar purposes and processes. | 0.991 | 0.890 | 0.173 | 0.009 | Fail to reject H0 | Fail to reject H0 | Fail to reject H0 | Reject H0 | Disagree |
When it comes to the IT specialists, the software tool receives disapproval on its practicality. With several of its positive descriptions receiving “Disagree,” the evaluators find that the software tool has limited usability. As experts in software development, they observe that the program has bugs and issues. They also find issues with the use of the software on other technological platforms and computers.
One commenter suggests the addition of more generated reports to improve the quality of use for the administrator. Another evaluator emphasizes that, in terms of usability, the software runs only on Windows computers with .NET installed. For the software’s ease of use, one IT expert highly recommends the addition of a “Help” feature.
Although one IT specialist commends the software for exhibiting good program-to-user quality, the other evaluators have expert inputs on the software’s maintainability. One IT expert recommends the addition of comments in the code from the programmer, which can serve as reference for maintenance and updates when checked by other programmers. In terms of debugging the software, users need knowledge of the C# programming language, thus compromising its sustainability. In order to optimize functionality and to prevent future issues, another evaluator suggests making the results exportable to Excel files.
Summary of the Study
The development and testing of the grammatical ability software for limited English SHS students was aimed at the provision of an assessment tool for the users’ difficulties in grammar. Despite trends promoting a communicative approach in the teaching and learning of English as a second language, Azar (2007) and Zhang (2009) still argued that grammar is the foundation of language acquisition. Neumann (2014) defined grammatical ability as the use of “theoretical grammatical knowledge accurately and meaningfully in language use situations.” Thus, the development of the test items for the assessment tool was grounded on the concept that grammar facets shall be used in authentic and meaningful sentences. The grammatical ability test items were validated by two language experts prior to their administration.
When the student respondents finished taking the assessment, the taxonomy of errors was established and the word classes were ranked according to their frequency of error. The frequencies in the taxonomy of errors served as the basis for the assignment of transmutation weights, which denoted the value equivalent of errors in each word class.
The software tool was developed containing the expert-validated test items and values predetermined in the first part of data gathering. The test items were ensured by the ESL experts to possess qualities that would assess the syntax and actual use of the language targeting the higher-order thinking skills in the revised Bloom’s taxonomy. Consequently, a table of specifications anchored to these results was developed. An IT developer was commissioned as collaborator for the software’s code writing. Through a series of validation testing, debugging and the fixing of issues were conducted to further improve the software’s functions and features.
The software tool was then presented to the English teachers of the research locale for evaluation as the tool administrators. An evaluation tool was handed immediately after the tool’s demonstration and their comments were taken into consideration in providing a more specific set of recommendations. The same evaluation was conducted among randomly selected SHS students as tool users and available IT specialists as tool experts. The results of this entire research also served as the basis for the construction of proposals for remediation programs among the limited English SHS students of the locale.
Summary of Results
After the grammatical ability assessment had been administered among the student respondents, it turned out that they frequently encounter difficulties with verbs and adverbs. The next highly ranked word classes in the taxonomy of errors were nouns, pronouns, conjunctions, and prepositions; in these facets of grammar, students committed errors “sometimes”. Therefore, a low transmutation value was assigned to verb and adverb errors, and an average value to nouns, pronouns, conjunctions, and prepositions. Aside from the frequency of these occurrences, the nature of the errors was also analyzed for the construction of specific recommendations for the students’ remediation program in English.
These values, together with the matrix for the assignment of students’ grammatical ability levels, were input into the coding of the software tool. It was then tested for bugs and other issues, which were all quickly resolved by the collaborator. The finalization of the software led to the opportunity of its presentation to the English teachers of the research locale, SHS students, and IT specialists.
Consequent to the software’s demonstration, the three groups of respondents evaluated its usability, sustainability, and maintainability. In most items, the software received a collective mean score of 4 (strongly agree) on its positive features. The only item that got a mean score of 3.25 (agree) is the software’s adaptability to versions available on mobile devices. The qualitative statements of the evaluators supported this concern by stating that, although the software tool was already excellent as it was, a mobile version of the software would have been useful.
Limitations of the Study
Relevant to the actual construction of the test items included in the grammatical ability test, the researcher was limited to the inclusion of answer keys to satisfy the software’s automated checking feature. As much as it was desired for the answers to be more student-generated, the test was obliged to become fill-in-the-blank in nature. In line with this challenge, I was also obliged to provide choices for every question to limit the possibility of multiple correct answers. Also, since the software’s automated checking feature is case-sensitive, the correct answers were limited to those precisely matching the answers encoded in the key. In summary, one key limitation of the grammatical test was that it was form-focused.
In the conduct of data gathering, there were also anomalies observed in the answers of the respondents. A portion of them committed casing errors, such as unnecessarily uppercasing their answers. This noncompliance with the instructions led to their answers being recorded as mistakes. The repercussion of these observations was that, although the targeted number of respondents was met, a number of data points were considered outliers due to these noncompliance incidents. The grammatical ability levels of these students were also incorrectly reported after they got very low scores or even no score at all. Their noncompliance with the casing instruction led to them getting negative values in their frequencies of errors as well as in their grammatical ability scores. These data likewise became irrelevant when counting the frequency of errors per word class since the mistakes were observed to be unintentional. Although these errors seemed unintended, they were still considered inexcusable in answering language tests such as this.
The unavailability of some English teachers during the demonstration of the software also led to certain limitations on its critiquing and evaluation. Out of the 12 teachers of English in the senior high school department, only eight teachers were present to witness the demo. Nonetheless, the comments and evaluation score given by those in attendance were deemed as sufficient in representing the evaluation of the English area.
Conclusion
Based on the actual findings of this study, the information provided by the literature about the grammatical ability of language learners is partially verified. The results of this research are also utilized to establish a grammatical ability assessment software for the checking, recording, and reporting of students’ proficiency. Consequent to the successful development of the software tool, the following conclusions are specifically drawn from the actual data gathered:
Recommendations
Subsequent to the above-listed conclusions, action-based statements may now be established for the benefit of this research’s beneficiaries. These recommendations are seen as necessary to improve pedagogies in language teaching and learning based on the findings of this research. In full detail, below are my recommendations as the ultimate conclusion to this study.