For the analysis, the 'Upbringing' textbooks for grades 1 to 11 (published in 2020-2021) were collected and converted into electronic text. To analyse the structure of the textbooks, we manually divided all the content into categories: 'How a child should look and behave'; 'Personal development, success, business'; 'Moral qualities and ethics'; 'Society (friends, mahalla, school)'; 'General knowledge'; 'Homeland and patriotism'; 'Family and family values'; 'Other'. We used the chapter titles to assign the category. If the title did not make the category clear, we relied on the text of the chapter.
For text analysis, we used the Python programming language and its libraries. To analyse the frequency of word usage, we cleaned the entire text of punctuation, lemmatized it, and removed stop words. Then, using the Mystem library, we identified nouns and adjectives, excluding words that serve as section headings: question, activity (creative activity), reflection (for reflection), etc.
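The frequency count described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code used in the project: the stop-word and heading lists are hypothetical, the `lemmatize` function is a placeholder where a morphological analyser such as Mystem would be called, and the noun/adjective filter is omitted.

```python
import re
from collections import Counter

# Hypothetical lists, for illustration only
STOP_WORDS = {"the", "a", "and", "of", "to", "is", "in"}
HEADING_WORDS = {"question", "activity", "reflection"}


def lemmatize(token):
    """Placeholder: returns the token unchanged.

    In a real pipeline a morphological analyser would reduce
    each word to its dictionary form here.
    """
    return token


def word_frequencies(text):
    # Lowercase and keep only word characters, which drops punctuation
    tokens = re.findall(r"\w+", text.lower())
    lemmas = [lemmatize(t) for t in tokens]
    # Drop stop words and words that only appear as section headings
    kept = [w for w in lemmas if w not in STOP_WORDS and w not in HEADING_WORDS]
    return Counter(kept)


sample = "Question: what is the homeland? The homeland is a family of families."
freq = word_frequencies(sample)
print(freq.most_common(3))
```

Because the placeholder does no real lemmatization, 'family' and 'families' are counted separately in this sketch; with an actual lemmatizer they would be merged.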
To compile a list of the most frequently mentioned personalities, we used the Natasha library, which helped us count all the names in the text. We then manually corrected inaccuracies, since the library does not always recognize a first name and surname as a single name (for example, Amir Temur or Ibn Sina). An incomplete version of a proper name (only a surname or only a first name) was counted if it was clear from the text who was being referred to. Different spellings of the same name (for example, Avicenna and Abu Ali Ibn Sino) were counted as references to the same person.
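One way to merge spelling variants and incomplete names after extraction is a hand-made alias table. The sketch below assumes the mentions have already been extracted from the text by a named-entity tool; the alias table and the `count_people` function are hypothetical and only illustrate the idea of the manual correction step.

```python
from collections import Counter

# Hypothetical alias table compiled by hand: variant -> canonical name
ALIASES = {
    "avicenna": "Ibn Sina",
    "abu ali ibn sino": "Ibn Sina",
    "ibn sina": "Ibn Sina",
    "amir temur": "Amir Temur",
    "temur": "Amir Temur",
}


def count_people(mentions):
    counts = Counter()
    for mention in mentions:
        # Normalise case and whitespace before looking up the alias
        key = " ".join(mention.lower().split())
        # Unknown names are counted under their own (trimmed) spelling
        counts[ALIASES.get(key, mention.strip())] += 1
    return counts


mentions = ["Avicenna", "Abu Ali Ibn Sino", "Amir Temur", "Temur", "Ibn  Sina"]
people = count_people(mentions)
print(people.most_common())
```

A blanket rule such as mapping every 'Temur' to 'Amir Temur' is a simplification; as described above, incomplete names were only counted when the context made the person clear.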
To analyse the connections between words, we created a list of words ('Homeland', 'Love', 'Value', 'Religion', etc.), and using code, we checked which words often appear in the same sentence as our selected words from the list.
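A same-sentence co-occurrence check of this kind might look like the sketch below. It is an assumption-laden illustration, not the project's code: the target list is hypothetical, sentences are split naively on terminal punctuation, and no lemmatization or stop-word removal is applied.

```python
import re
from collections import Counter, defaultdict

# Hypothetical target words, for illustration only
TARGETS = {"homeland", "love"}


def cooccurrences(text, targets):
    """For each target word, count the words sharing a sentence with it."""
    result = defaultdict(Counter)
    # Naive sentence split on ., ! and ?
    for sentence in re.split(r"[.!?]+", text):
        # A set, so each word counts at most once per sentence
        words = set(re.findall(r"\w+", sentence.lower()))
        for target in targets & words:
            result[target].update(words - {target})
    return result


sample = "Love for the homeland is a value. The homeland needs love. Family is a value."
pairs = cooccurrences(sample, TARGETS)
print(pairs["homeland"].most_common(3))
```

Counting each word once per sentence keeps a single long sentence with many repetitions from dominating the result.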
In working on the project, we used automation tools, including natural language processing methods. Practically no library of this kind is perfect, and occasional errors are possible.
*The data visualisations and the text of the article were translated from Russian.