ICMI is part of the Informa Tech Division of Informa PLC

How to Use—and When to Avoid—Interview-Style Language Testing

Sponsored post

Around the world, most contact centers that want to screen for language ability consider an interview-style assessment. Several commercial test products use interview-style tasks, but these tests are generally very expensive and slow to return scores. As a result, many organizations move language screening in-house and use their existing language resources, if they have any, to conduct screening interviews. Throughout my career, I have worked with language teachers, administrators, and other professionals to develop their skills in this type of assessment. If you plan to use an in-house solution such as an interview-style test, here are some tips to consider:

1. Interview in the mode that the position being filled most requires.

If you are hiring for a position that will mostly require face-to-face communication, interview face-to-face. If you are hiring for a position that will mainly use technology-mediated communication, interview in the mode most frequently used. For a contact center, that usually means interviewing over the phone or another remote channel rather than face-to-face. If you have experience learning a second language, you understand that face-to-face communication can be very different from communicating over the phone. How these differences affect performance varies greatly across candidates. Do not assume that performance in person will always translate directly to performance in other modes. It does not.

2. Focus questions on language function.

I love getting to know people, and an interview assessment is a great way to do that. However, it is important to remain focused on the purpose of this particular interview, which is to elicit evidence of language ability. Questions should be specific to that objective. If you are interested in a description of some of the key functions that you should include in a well-rounded language ability interview, check out this list.

3. Record the interviews and rate the performance after the fact.

Even when questions are pre-planned, conducting an interview in a way that is consistent and fair requires a lot of concentration on your behaviors as an interviewer. It is difficult—if not impossible—to split concentration between executing the interview and simultaneously making precise and reliable evaluations about the language performance of a candidate. Taking a break between the stress of giving the interview and playing it back for scoring will ensure your rating is more reliable. It will also provide you the opportunity to more objectively review your own execution of the interview and improve your skill going forward.

4. Know your limits and pace yourself.

Although we have identified a few things that will make your efforts more sustainable, a single interviewer can only conduct so many interviews before fatigue starts to affect how they run the interview and, in turn, the performance of the person being interviewed. To avoid this, take breaks and spread the interviews out over an appropriate amount of time. An interviewer who does not do this will start to suffer from intra-rater reliability issues: an interview they conduct or a rating they assign early in the effort will differ notably from one conducted or assigned later in the cycle.
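One rough way to check for this kind of drift is to have the same interviewer re-rate a few recordings late in the cycle and compare the two sets of scores. The sketch below is only an illustration: the function name `rating_drift`, the 1-5 scale, and the scores are all made up, and it is not a substitute for a formal reliability study.

```python
from statistics import mean


def rating_drift(first_pass, second_pass):
    """Compare two ratings by the same rater of the same recordings.

    Returns (mean signed difference, mean absolute difference).
    A signed difference far from zero suggests the rater has become
    harsher or more lenient; a large absolute difference suggests
    the ratings are simply inconsistent.
    """
    if len(first_pass) != len(second_pass) or not first_pass:
        raise ValueError("need two non-empty score lists of equal length")
    diffs = [later - earlier for earlier, later in zip(first_pass, second_pass)]
    return mean(diffs), mean(abs(d) for d in diffs)


# Hypothetical scores on an assumed 1-5 scale for the same five recordings,
# rated once early in the cycle and once again late in the cycle.
early = [3, 4, 2, 5, 3]
late = [2, 4, 1, 4, 3]
bias, spread = rating_drift(early, late)
print(f"mean shift: {bias:+.2f}, mean absolute difference: {spread:.2f}")
```

In this made-up example the later ratings run lower on average, which would be a cue to take a break before scoring more interviews.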

5. Calibrate multiple interviewers.

Most situations will require multiple interviewers, multiple days of interviews, or both. While either option increases the number of candidates who can be screened, both also increase risks to reliability. In addition to extensive training before interviews begin, ongoing calibration is necessary. A pre-planned interview structure will help consistency, but before anyone scores an interview, they should listen to one or two previously scored samples and assign each a score. If a score they assign does not match the previously determined score, they should practice with a few more recordings until they are consistent with established standards.
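The calibration check described above can be sketched in a few lines. This is a minimal illustration under assumptions: `is_calibrated`, its `tolerance` and `min_agreement` parameters, and the sample scores are hypothetical, and the default simply mirrors the rule in this tip that every practice score must match the reference score.

```python
def is_calibrated(rater_scores, reference_scores, tolerance=0, min_agreement=1.0):
    """Check a rater's practice scores against previously determined scores.

    tolerance: how far a score may be from the reference and still count
               as a match (0 means exact match, as in the tip above).
    min_agreement: share of samples that must match (1.0 means all).
    Returns (calibrated, agreement_rate).
    """
    if len(rater_scores) != len(reference_scores) or not reference_scores:
        raise ValueError("need two non-empty score lists of equal length")
    matches = sum(
        abs(rated - ref) <= tolerance
        for rated, ref in zip(rater_scores, reference_scores)
    )
    agreement = matches / len(reference_scores)
    return agreement >= min_agreement, agreement


# Hypothetical example: two previously scored samples on an assumed 1-5 scale.
reference = [3, 4]
practice = [3, 5]
ready, rate = is_calibrated(practice, reference)
if not ready:
    print(f"Agreement {rate:.0%}: practice with a few more recordings before scoring.")
```

Loosening `tolerance` or `min_agreement` trades strictness for speed; whether that is acceptable depends on how much reliability matters for the role.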

If this sounds like a lot of work, it is! If this sounds expensive, it is!

An interview-style approach has a reasonably low startup cost, but its maintenance costs are unrelenting, particularly when reliability matters. There are direct expenses in hiring, training, and maintaining the right staff to conduct these screening interviews, and there are also indirect expenses in using a screening method that is simply less reliable than other forms of testing. A test is reliable if it returns consistent results across different circumstances and different test takers. Like a digital scale that shows the same reading each time it measures the same weight, a language test must produce the same score each time it measures the same level of ability. An interview approach to language screening, even at its most refined, simply introduces too much potential variance for its reliability not to be at risk.

How much does it cost an organization to hire the wrong person or not hire the right person?

It is important to consider the risks on both sides of this coin. How much would it cost if a person hired does not actually have the language ability needed to do the work effectively? False positives are certainly inconvenient and disruptive. Conversely, and just as likely, how much would it cost if a person not hired did have the language ability needed to do the work, and the interviewer simply failed to recognize it? False negatives are also an expensive outcome. Can you afford to make the wrong choice? When you use a less-reliable assessment method, you are conceding that you will make wrong decisions more frequently. When you use a less-reliable assessment method and do not follow some of the best practices outlined above, reliability further deteriorates and the problem compounds.
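To make the trade-off concrete, the arithmetic can be sketched as follows. Every number here is a placeholder chosen for illustration, not data from any real test or organization, and the function name `expected_error_cost` and its parameters are assumptions; substitute your own estimates.

```python
def expected_error_cost(
    n_candidates,
    share_qualified,
    false_negative_rate,
    false_positive_rate,
    cost_false_negative,
    cost_false_positive,
):
    """Estimate the total cost of wrong screening decisions.

    false_negative_rate: share of qualified candidates wrongly rejected.
    false_positive_rate: share of unqualified candidates wrongly hired.
    """
    qualified = n_candidates * share_qualified
    unqualified = n_candidates - qualified
    missed_hires = qualified * false_negative_rate
    bad_hires = unqualified * false_positive_rate
    return missed_hires * cost_false_negative + bad_hires * cost_false_positive


# Placeholder inputs (assumptions only): 200 candidates, half of them
# qualified, and a bad hire costing four times as much as a missed hire.
common = dict(
    n_candidates=200,
    share_qualified=0.5,
    cost_false_negative=1_000,
    cost_false_positive=4_000,
)
less_reliable = expected_error_cost(
    false_negative_rate=0.20, false_positive_rate=0.20, **common
)
more_reliable = expected_error_cost(
    false_negative_rate=0.05, false_positive_rate=0.05, **common
)
print(f"less reliable method: {less_reliable:,.0f}")
print(f"more reliable method: {more_reliable:,.0f}")
```

The point of the sketch is the shape of the calculation, not the figures: error rates multiply across every candidate screened, so even small differences in reliability add up over a hiring cycle.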

Every language assessment has two components: 1) execution of tasks that elicit language performance, and 2) evaluation of the language performance that has been elicited. Even well-trained and well-supported interviewers and raters introduce error in both components. Fortunately, an interview-based approach is not the only solution to the problem of screening candidates for language ability. Not surprisingly, technology-mediated language testing solutions have improved greatly in both accessibility and accuracy.

Organizations no longer have to rely on interview-style assessments to screen for language ability.

Technology-mediated language testing solutions use technology to elicit language performance. This presents significant advantages in efficiency and scalability. An interview approach can really only assess candidates one at a time per interviewer. Most computer-based testing can test all candidates who need to be tested at the same time and, very importantly, test them all under very similar conditions. However, most computer-based language assessments still rely on human raters for the second component, evaluation of the collected language performance, and so unnecessarily cling to the inefficiencies and vulnerabilities that come with them.

If we have learned anything in 2020, it is that a solution is only as good as it is flexible. Coordinating an interview approach to language screening adds further complexity at a time when such complexity is increasingly costly. Artificial intelligence, as used in Emmersion’s TrueNorth Speaking Test, helps to simplify and improve the process so that no more time and money is wasted wondering which candidates really are fit for the job.