
Sifting: How to Choose a Foreign Language Text for YOUR Level

This tutorial will help you choose the best textbook or reader for your UNIQUE level

Teachers of world languages have graded the difficulty of their materials in literally hundreds of ways (see the tutorial on graded readers, here). When I first started teaching almost 20 years ago, I had to use zero-beginner, beginner, elementary, pre-intermediate, lower-intermediate, upper-intermediate, and advanced. Oh, and there was also proficient. Then the boffins at the Council of Europe devised their own levels, ranging from A1 to C2 (A1 being the lowest). Then I started learning Chinese and began working with the HSK system (1-11). Eventually, that was revised, and now the HSK comes in only 6 (much easier!) levels.

These are useful labels for the universal characterisation of broad levels, but often they don’t reveal whether a particular text is suitable for a particular person. 

In this article, I’ll teach you how to use an easy and fun way to immediately gauge the suitability of a text in accordance with YOUR UNIQUE level*. As usual, I’ll use the What? Why? How? section titles to structure the tutorial. Feel free to jump around. 

What?


I created Sifting after watching one of Professor Arguelles’ instructional series on the suitability of a text (here). He argues that the sweet spot for acquiring a new language through reading comes at about 95%-98% comprehension of a text. My take-home from those videos was to turn this information into an easy, convenient way of applying his theories in my own study regime: an objective marker for gauging suitability, and a prepackaged method to keep in our toolkit for instant application.

Why?



To see why Sifting is important, we need to review some of the findings in Professor Arguelles’ aforementioned video series. He was himself building on Krashen’s fundamental distinction between studying and acquiring a language.

Without going down a rabbit hole of linguistic theory, let’s look at studying versus acquiring a language:


Study: The slow, careful process of intentionally retaining linguistic information (memorising word lists, verb forms, gap fill, etc);


Acquisition: The subconscious process of learning a language through listening and reading.


For Krashen, Arguelles, and your humble author, language should for the most part be acquired rather than studied. Whether we can acquire a language through reading alone depends entirely on the difficulty of the text. If the text is too hard, we can only learn a new word by looking it up. If the text is too easy, we will never improve. However, if the text is just right, we can acquire new words from their context. Here’s an example:


I bloinked the dog.


As a native English speaker, I can still find meaning in this ridiculous word, because I can subconsciously extract a lot of useful information from it. I know:


  1. Bloink is a regular verb (-ed ending);

  2. Bloink is transitive (it can take an object: I can bloink something, in this case the dog).


Now, if this is one of the only words that I don’t know on the page (unknowns), I can continue on with the text without my overall comprehension suffering too much. Imagine later I come across this in the same text:


Then, I bloinked the fish and the rabbit.


So, I can subliminally extract semantic information from these examples:


Bloink + dog / fish / rabbit = Bloink + animal


I am building on what psycholinguists call a mental representation**.


Then, further along in my (weird) book, I come across this sentence:


My dog growled because I bloinked him for two days. My fish died because I bloinked it for too long and my rabbit looked sick because I had bloinked it for over 24 hours.


What does bloink mean? Obviously it’s nonsense, but the meaning I have arbitrarily assigned to it for the purpose of this tutorial is . . . wait for it . . .


Let’s review. Bloink:


  1. Is a regular transitive verb: FORM

  2. Takes an animal as its logical object: FUNCTION

  3. Is a negative action, leading to threatening behaviour (dog), death (fish), and sickness (rabbit): FUNCTION

  4. The negative qualities of bloinking manifest after a certain (relatively long) duration: two days (dog); too long (fish); over 24 hours (rabbit).


The meaning I have in mind for bloink is not feeding an animal (my ten-year-old daughter guessed that it means mistreat, which is a perfectly sensible supposition). Obviously, this example is extreme, as the clues are neatly packed into a short text***, but the process, by and large, is identical across different texts.


What Professor Arguelles discovered was that if we understand 95%-98% of words in a text, we accomplish two things:


  1. We can understand enough of the text to continue without confusion, and

  2. We meet just enough unknowns in the text to force progress.


For example, if we understand less than 95%, then the accumulation of unknowns (words we don’t know) will cause an exponential increase in the overall degree of ambiguity. In other words, the story will become increasingly vague to the point of frustration. Imagine a character who is described as being siden. Is he a good or a bad character? Now, the siden man yesints his neighbour. What’s siden, and what does it mean to yesint somebody? The unknowns accumulate too rapidly, and the reader’s frustration and confusion build up as a result. It’s when unknowns build up too quickly that we begin to lose interest and give up. But even if we were to march on stoically, we’d have no idea what was happening, so reading without a dictionary would be a waste of time. Too few people know that if we need a dictionary for every paragraph, what we are doing is intensive reading rather than extensive reading, and this is the process of studying a language rather than acquiring it.


Coming back to the example, if we knew that yesint means to thrash with a broom, we could posit that the character was bad and make an educated guess about the word siden. Professor Arguelles demonstrates quite convincingly that once the proportion of unknowns exceeds 5%, our ability to make accurate assumptions falls as that proportion rises. As for the 98% ceiling, this matters because learners need to be exposed to enough unknowns to increase their vocabulary, and a text with fewer than 2% unknowns would be too easy. I don’t know if a text can be too easy, but it’s a useful benchmark and one that I personally use (what do you think? Leave a comment below!).


Let’s go back to traditional levels and why they are often not as good at gauging our individual and unique level. I am passionate about history and have accumulated a relatively advanced vocabulary in several languages relating to certain aspects of this subject. For example, at the time of writing, I can understand words relating to witchcraft in French, but cannot formally apply for a job in the language. Now, witchcraft is generally thought to be more of an advanced, or niche, area than a job application, but for me, it’s the other way around. For me, a text in a French B1 textbook entitled Histoire de sorcières (History of Witches) would be easier than a text about common occupations at A2. And that’s understandable; traditional levels have to deal with the universal, rather than the particular. The compiler of my Chinese textbooks didn’t know that I couldn’t talk about ice-skating in my third year of Chinese but was able to discuss the civil examination system of the Ming Dynasty better than most natives. Sifting will show YOU which text suits YOU.


So, let’s take a look at how we can use this method to gauge a text’s readability.

How?



Here is a table of the basic steps of Sifting. I’ll explain each step in slightly more detail below:

[Image: table of the basic steps of Sifting]

Step One: Gather resources. Find a book or text that you like and think is suitable for your level.


Step Two: Count the first 100 words and put a mark to indicate the 100th word, even if it’s in mid-sentence.


Step Three: Underline all the unknowns in the selection and count them. This gives your estimated level of comprehension for the whole book/text. For example, if you find three unknowns, that would mean that, on average, you didn’t understand 3% of the book, which is a 97% comprehension rate. If your score is between 95% and 98%, then you can start.


Step Four: If your score falls below 95%, then leave the book and come back in one week (assuming you will have done other work in the language in the meantime). It is important that you don’t actively seek out the meaning of a word with a dictionary or by other means. The idea is that you know this level of vocabulary naturally.


Step Five: When you pick up the book again in 7 days’ time, continue to tick off words you know. Hopefully, your vocabulary will have increased over the week, so your score will be higher. If you still don’t know enough, then put it away again for another week. Once you understand 95%-98% of the words, you can begin the text****.
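
For anyone who keeps their vocabulary in a file rather than in pencil marks, the five steps above can be roughly sketched in Python. This is only an illustration under some loud assumptions: it supposes you have a set of words you already know (`known_words`), it replaces underlining by hand with a simple lookup, and the names `sift`, `verdict`, and `start`, as well as the crude way of splitting text into words, are hypothetical choices made for the example.

```python
import re

LOWER, UPPER = 95.0, 98.0  # the 95%-98% range discussed in this article


def sift(text, known_words, sample_size=100, start=0):
    """Estimate comprehension (%) from a sample of a text.

    known_words is a set of lowercase words the reader already knows
    (an assumption: the method itself is done by hand with a pencil).
    start skips that many words first, e.g. 100 to sample words 101-200.
    """
    # Crude tokenizer: runs of letters, allowing inner apostrophes/hyphens.
    words = re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text.lower())
    sample = words[start:start + sample_size]
    if not sample:
        raise ValueError("no words found in the chosen sample")
    # Step Three: every word not in the known set counts as an unknown.
    unknowns = [w for w in sample if w not in known_words]
    rate = 100.0 * (len(sample) - len(unknowns)) / len(sample)
    return rate, unknowns


def verdict(rate):
    """Turn a comprehension rate into the advice from Steps Three to Five."""
    if rate < LOWER:
        return "put the book away and sift again in a week"
    if rate <= UPPER:
        return "in the sweet spot: start reading"
    return "start reading (the text may be on the easy side)"


# Tiny demonstration with the nonsense sentence used earlier.
known = {"i", "the", "dog", "then", "fish", "and", "rabbit"}
rate, unknowns = sift("I bloinked the dog.", known)
print(rate, unknowns, verdict(rate))
```

Note that the sketch counts every occurrence of an unknown word, just as you would when underlining on paper; whether repeated unknowns should count once or each time is a judgment call.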



Here are a few examples from my latest attempts at learning French:

[Image: the first page of an A1 French reader]

Above is the first page of an A1 French reader. I started reading this about nine months ago and decided it was far too difficult (87% comprehension rate). I came back each week and found that I was ready to study the text after a few weeks of studying other textbooks (Assimil and Cortina). It was invigorating ticking the words off each week!


Update: I’ve just done the test on three B1 readers, and scored 96%, 96%, and 99%. I’m pretty happy that I have a 97% comprehension rate of B1 French texts :) I’m actually studying A1 texts and am just about to start a three-month reading challenge at this level. I’ll continue to work slowly through the levels (even though it seems I could completely skip A2). I enjoy extensive reading and opt for a solid foundation rather than a cursory understanding.


There it is: Sifting as a way to gauge your own, unique level regarding a text. It may happen that the first 100 words are disproportionately easy/difficult relative to the rest of the text. If you feel that this is so, simply do the same with the next 100 words (101-200). If you want your exact comprehension rate as a percentage, use the formula:


SCORE = (no. of words - no. of unknowns) / no. of words × 100


For example, if there were 453 words in a text and you didn’t know 52 of them, your comprehension rate would be about 89% (401 / 453 × 100 = 88.5%).
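
As a quick check of the arithmetic, here is a minimal Python sketch that computes a comprehension rate as the share of words you do know. The function name `comprehension_rate` and its parameter names are hypothetical, chosen only for this illustration.

```python
def comprehension_rate(total_words, unknown_words):
    """Percentage of words understood: known words divided by all words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    if not 0 <= unknown_words <= total_words:
        raise ValueError("unknown_words must be between 0 and total_words")
    known = total_words - unknown_words
    return 100.0 * known / total_words


# The 453-word example: 401 known words out of 453.
print(round(comprehension_rate(453, 52), 1))  # 88.5

# The 100-word sample from Step Three with three unknowns.
print(comprehension_rate(100, 3))  # 97.0
```

With a 100-word sample the calculation collapses to "100 minus the number of unknowns", which is why counting exactly 100 words is so convenient.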


There are occasions where Sifting is not an ideal solution. For example, novellas in Latin are wonderful ways to learn grammar. New words will often be very few, because the focus of the text isn’t necessarily vocabulary acquisition. Another time this method might be inappropriate is with an intensive reading text (which I believe too many people study too often). Sometimes, we simply have to work at some texts with our toolbox of dictionaries and cannot avoid the grind.



Sifting is a method devised for gauging the difficulty of a text. It is different from other grading schemes as it tries to account for a personal and unique level, as opposed to a generic one. The method applies the theoretical work done by such linguists as Krashen and Arguelles in terms of the suitability of material for language acquisition. It is also highly motivating to visualise the usually abstract quality of progress by ticking “acquired” items off a list (lexis/vocabulary). However, some texts may be disproportionately easy/difficult at the beginning, which may skew the results, although this particular problem is easily remedied. 

As it is a convenient, fairly accurate way of assessing the readability of a text in accordance with one’s own UNIQUE level, Sifting is a wonderful tool to know about for any language learner.


I hope you enjoyed this article and can help spread the word by sharing! As usual, I’d love to hear your comments on this method!


Challenge: Pick out a few texts that you would like to Sift. Post your results in the comments section and keep us updated!


*I devised this method primarily for readers, but it can be applied to any section of text of 100 words or more.

**Don’t be fooled by the name; psycholinguists aren’t killer linguists (those would be psychotic linguists). Rather, they work in the devilishly delicious world of the psychology of language.

***If educators and content creators are reading this, I hope they might consider these issues of vocabulary acquisition.

**** I’ve found this system to be incredibly motivating, and it comes with many of the psychological rewards of positive habit formation.
