Recent research by the Language Assessment Research Group.

Research on OLBI language tests

The Language Assessment Sector at OLBI has an enviable reputation for quality and efficiency in the area of language testing. We currently have over 30 separate testing projects, many of them for high-stakes gatekeeping purposes such as hiring and certification of language proficiency. We are consulted regularly for our expertise in areas related to language assessment, and we are committed to expanding our ongoing research and test validation activities.

The Language Assessment Sector invites proposals from graduate students and other researchers in the area of language assessment for research activities linked to any of our tests. Interested parties are encouraged to inquire with the Director, Language Assessment, about project ideas and the data and institutional support we have available for such projects. Project proposals are subject to approval by the Director and must obtain approval from the University’s Research Ethics Office.

Recent validation research

2022–2023: Alignment of the Second Language Certification Test to the CLB and CEFR

Language Testing Services and members of LARG carried out a major project to align OLBI’s Second Language Certification Test (ESL 3100 and FLS 3500) with the Canadian Language Benchmarks and the Common European Framework of Reference. The resulting table of equivalence will enable more meaningful interpretations of test takers’ language abilities for the public. This project was shared by Yongfei Wu (Head of Language Testing Services) and Rosalie Hirsch (Test Validation Officer) through a poster session at the 2023 meeting of the International Language Testing Association.

2017–2019: Creating a validity argument for an existing testing program for university admissions and professional certification

Angel Arias and Beverly Baker, OLBI, University of Ottawa, and Carol Chapelle, Distinguished Professor, Iowa State University

This project reorients our sector's validation activities into a program of research informed by a validity argument framework. More specifically, our ongoing test validation activities are being reinterpreted in view of the backing they can provide for claims in the validity argument for our test interpretation and use.

2017–2018: Mixed-methods development of a test-taker-oriented writing rating scale

Beverly Baker and Angel Arias, OLBI, University of Ottawa, and Maryam Homayounzadeh, Shiraz University

This study tracks the mixed-methods development of a test-taker-oriented rubric for the writing section of the CanTEST. As this test is used for credentialing requirements for internationally trained professionals as well as for university admissions for international students, it is important to provide validated, transparent tools to support test preparation.

2017: Validation Report on the TESTCan and CanTEST: Test Taking Processes and Cultural Bias Reviews

Prepared by the students of the BIL5103 course: Evaluation of Second Language Competence

MA in Bilingualism Studies, Official Languages and Bilingualism Institute

(Professor: Beverly A. Baker)

Elaaf Alsomali, Sarah Auyeung, Laura Castano Laverde, Majida Harb, Raquel Llama, Giselle Lehman, Stephanie Marshall, Nasren Musa Elsageyer

This team conducted think-aloud protocols on cloze and scanning questions of the TESTCan, as well as cultural bias reviews of CanTEST reading items, to inform test revision.

Program of research: Revising test specifications for the listening component of a high-stakes English assessment: A conceptual and data-driven approach (2015–2016)

Project Lead:

Angel Arias, Test Validation Officer, Official Languages and Bilingualism Institute

This program includes three current projects.

Project 1:

This project involves the revision of test specifications for the listening component of the CanTEST. The CanTEST is a standardized English proficiency test offered by OLBI to determine whether or not candidates meet the language requirements of Canadian postsecondary institutions and Canadian professional licensing associations. This project involves three stages. First, data from representative samples of three test versions (n = 1,789; n = 1,550; n = 846) are being used to screen out less optimal items. Second, new items are being reverse-engineered through an iterative process, drawing on the content of the items and the literature on second language listening (Buck, 2001; Lynch, 2011; Rost, 1990; Field, 2011; Flowerdew, 1995; Taylor & Geranpayeh, 2011). Finally, new item specifications will be drafted. This combination of a conceptual and data-driven approach to test specification revision will create an empirically grounded and defensible document for developing future versions of the CanTEST.

Project 2:

In the language testing community, it has become common practice to use fixed rules of thumb or guidelines on numerical ranges for acceptable residual-based fit statistics to assess item and person fit (e.g., Aryadoust, Goh & Lee, 2011; Eckes, 2005; Hsieh, 2013; Winke, 2011). However, the use of fixed numerical ranges for misfit diagnosis across different testing situations varying in sample size leads to both overdetection and underdetection of misfit. This study reports the results of a Rasch analysis that takes into account sample size for misfit diagnosis when considering fit mean square statistics. The data (n = 845; n = 1,548; n = 1,759) stemmed from three listening subtests of the CanTEST, and the focus of the analysis was to flag misfitting items for review purposes. The results suggest that adhering to sample-sensitive criteria for misfit diagnosis in Rasch measurement helps in signalling misfitting items that may not be flagged under traditional fixed cutoff criteria. Results have implications for item review processes that use misfit information as an indicator for item fine-tuning. In addition, inferences about item quality and person standing on a given language construct are more precise and theoretically founded when misfit diagnosis takes sample size into account.
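The contrast between fixed and sample-sensitive cutoffs can be illustrated with a minimal sketch. The study does not state which criteria it applied, so everything here is an assumption: the sample-sensitive bounds follow a rule of thumb often attributed to Smith, Schumacker and Bush (1998), namely 1 + 2/√n for infit and 1 + 6/√n for outfit mean squares; the fixed upper bound of 1.3 is one commonly quoted value; and the function names and the example mean square of 1.15 are hypothetical.

```python
import math

# Assumed fixed rule-of-thumb upper bound for mean square fit statistics.
FIXED_UPPER = 1.3


def infit_upper_bound(n: int) -> float:
    """Sample-sensitive upper bound for infit mean square (assumed rule: 1 + 2/sqrt(n))."""
    return 1 + 2 / math.sqrt(n)


def outfit_upper_bound(n: int) -> float:
    """Sample-sensitive upper bound for outfit mean square (assumed rule: 1 + 6/sqrt(n))."""
    return 1 + 6 / math.sqrt(n)


def flag_infit(mnsq: float, n: int) -> dict:
    """Report whether an item's infit mean square is flagged under each criterion."""
    return {
        "fixed": mnsq > FIXED_UPPER,
        "sample_sensitive": mnsq > infit_upper_bound(n),
    }


if __name__ == "__main__":
    # Sample sizes reported for the three listening subtests.
    for n in (845, 1548, 1759):
        print(n, round(infit_upper_bound(n), 3), round(outfit_upper_bound(n), 3))
    # Hypothetical item with infit mean square 1.15 in the n = 845 sample.
    print(flag_infit(1.15, 845))
```

Under these assumed bounds, the hypothetical item passes the fixed cutoff but exceeds the sample-sensitive one, which is the kind of discrepancy the project describes.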

Project 3: Skimming, scanning, and reading comprehension: An exploration of construct through exploratory factor analysis

Skimming and scanning are emphasized in the classroom and in textbooks as important and useful reading strategies. Intuitively, this emphasis seems justified, on the grounds that people need to quickly digest large amounts of reading material in order to cope with the information explosion in academic and professional contexts. However, there is almost no research on the construct(s) underlying skimming and scanning, or on their relationship to overall reading proficiency. This study was undertaken to collect theoretical, empirical and practical evidence to justify the continued use of the skimming/scanning subtest of the CanTEST.

Testing bilingualism: Incorporating translanguaging into a listening task for university professors

Project Lead: Amelia Hope, Head of Language Testing Services

This validation study accompanies the development of a translanguaged French and English listening task for professors at the University of Ottawa. Translanguaging can be defined as a dynamic process whereby resources from more than one language are employed in meaning making (Garcia, 2009). As professors have the right to use either English or French in all contexts on campus, meetings frequently alternate between the two. Therefore, passive bilingual ability for this task was operationalized as the ability to follow meetings that alternate between French and English, with production in the L2 not being required. The task will be piloted, and feedback on it will be solicited through a survey of test takers and other test stakeholders, with specific questions related to the use of both languages within the task.

Oral admissions testing for university entrance: Interactive functions and perceptions of anxiety (2015)

Project Supervisor: Beverly Baker, Director, Language Assessment

Research Team: Joselyn Brooksbank, Irina Goundareva, Valerie Kolesova, Jessica McGregor, & Mélissa Pesant

This small-scale validation project examined the telephone interview used for university admissions purposes, focusing on test-taker anxiety and profiles of interactional functions elicited in French and English on identical tasks. Audio responses were analysed with an adapted observational checklist (O’Sullivan, Weir, & Saville, 2002). A questionnaire was also developed to investigate test takers’ comfort levels during the telephone interviews. Additional details about this project can be furnished upon request.