Analyzing the LLM of AI Services for Niitive
Client: Niitive (based in Singapore)
Sector: IT/Edtech
My Role: UX Researcher
Project Time: Oct-Nov 2024
Project Overview:
The large language model (Azure AI services) used in the Niitive App AI enables context-sensitive translation across languages, especially in nuanced areas such as religious and educational content. This research evaluates its accuracy and contextual sensitivity, using Likert-scale ratings and thematic analysis to identify strengths and areas for improvement. The findings will guide enhancements that make the model more reliable and user-friendly for diverse audiences.
Research Goal:
To identify strengths and improvement areas in the model’s translation capabilities, enhancing its ability to deliver accurate, context-aware translations for diverse user needs.
Methodology:
This study involves 40 participants who are native speakers of widely spoken languages, including Telugu, Hindi, Tamil, Bengali, and Mandarin. The participants have a moderate understanding of English, which lets them assess translation accuracy by judging whether the output matches what a native speaker would expect in their language.
Participant Selection:
- Language Diversity: Participants are selected based on their native language proficiency to ensure that translations are evaluated from multiple linguistic perspectives, particularly within Indian languages and Mandarin.
- Cultural and Contextual Representation: To capture context-sensitive insights, participants are chosen from varied cultural and religious backgrounds, covering Christianity, Hinduism, New Age beliefs, Islam, Buddhism, and other religions.
- Educational Backgrounds: Participants are also familiar with specific academic disciplines (such as Physics, Philosophy, Theology, Metaphysics, and Psychology), allowing them to evaluate educational content accurately.
Context Selection:
- Religious Contexts: Participants review translations of content rooted in different religious beliefs to assess how well the model conveys theological nuances and cultural relevance.
- Educational Contexts: Evaluations of educational content include translations of academic concepts in subjects like Physics and Philosophy, testing the model’s ability to preserve technical terms and concepts across languages.
Scenario-Based Tasks
Task 1
Select an English educational video on a Physics concept, such as Newton's Laws, and translate it into your native language. Imagine explaining this to a high school student in your community. Evaluate if the translation maintains technical terms and simplifies complex concepts for clear understanding.
Task 2
Choose a philosophical lecture in English, perhaps on a topic like "The Nature of Reality," and translate it into your native language. Assess whether the translation preserves philosophical nuances, such as metaphors or abstract ideas, in a way that matches the expectations of someone familiar with similar concepts in your language.
Task 3
Pick an English documentary clip on mental health or psychology, and translate it for a group of laypersons in your community. Evaluate if the translation captures sensitive and accurate terms related to mental health in a way that respects cultural perceptions and resonates with local understanding.
Task 4
Translate an English religious video (e.g., a sermon or discussion) for a mixed audience of various faith backgrounds within your community. Evaluate whether the translation conveys respect for diverse religious beliefs and accurately interprets spiritual language without losing essential meaning.
Task 5
Find a video lecture in English on a theological topic, such as "The Concept of the Soul," and translate it for an audience of theological students or practitioners. Review the translation to see if it effectively conveys complex theological ideas in a way that someone studying the subject would expect.
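Ratings collected across the tasks above could be summarized per task and per rating criterion. The following is a minimal sketch only: it assumes a 5-point Likert scale and criterion names such as "accuracy" and "tone", neither of which is specified in this write-up, and the rating values are hypothetical placeholders rather than study results.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical placeholder ratings: (task, language, criterion, score).
# Assumes a 1-5 Likert scale; these values are NOT results from the study.
ratings = [
    ("Task 1", "Telugu", "accuracy", 4),
    ("Task 1", "Hindi", "accuracy", 5),
    ("Task 1", "Tamil", "accuracy", 3),
    ("Task 4", "Telugu", "tone", 2),
    ("Task 4", "Hindi", "tone", 3),
]

def summarize(rows):
    """Group scores by (task, criterion) and report count, mean, and median."""
    grouped = defaultdict(list)
    for task, _language, criterion, score in rows:
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range: {score}")
        grouped[(task, criterion)].append(score)
    return {
        key: {
            "n": len(scores),
            "mean": round(mean(scores), 2),
            "median": median(scores),
        }
        for key, scores in grouped.items()
    }

summary = summarize(ratings)
for (task, criterion), stats in sorted(summary.items()):
    print(task, criterion, stats)
```

Reporting the median alongside the mean is a common choice for ordinal Likert data, since the mean alone can hide a split between very high and very low ratings.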
Thematic Analysis

Initial Coding excerpts
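As a rough sketch of how initial codes can be rolled up into broader themes during thematic analysis, the snippet below counts coded excerpts per theme. The theme names match the improvement areas reported in this study. The code labels, participant IDs, and excerpts are hypothetical assumptions for illustration, not study data.

```python
from collections import Counter

# Theme names come from the study's findings; the code labels on the left
# are illustrative assumptions, not the codes actually used.
CODE_TO_THEME = {
    "too_informal": "Tone and Formality",
    "tone_mismatch": "Tone and Formality",
    "culturally_insensitive_term": "Cultural Sensitivity",
    "inconsistent_term": "Technical Accuracy",
    "term_left_untranslated": "Technical Accuracy",
}

# Hypothetical coded excerpts: (participant id, code). Not real study data.
coded_excerpts = [
    ("P01", "too_informal"),
    ("P02", "inconsistent_term"),
    ("P03", "tone_mismatch"),
    ("P03", "culturally_insensitive_term"),
    ("P04", "inconsistent_term"),
]

def theme_counts(excerpts, mapping):
    """Count excerpts per theme; unmapped codes go to 'Uncategorized'."""
    counts = Counter()
    for _participant, code in excerpts:
        counts[mapping.get(code, "Uncategorized")] += 1
    return counts

counts = theme_counts(coded_excerpts, CODE_TO_THEME)
for theme, n in counts.most_common():
    print(f"{theme}: {n}")
```

Keeping an "Uncategorized" bucket makes codes that do not yet fit a theme visible, which helps when the theme list is still being refined.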





Key Findings:
- Strengths: The AI model performed well in making complex topics accessible, with high ratings for clarity and ease of understanding. Translations were generally effective for broad audiences, enhancing accessibility.
- Improvement Areas:
  - Tone and Formality: Inconsistent tone and formality, especially in religious contexts, made translations feel unsuitable for specific audiences, such as elders.
  - Cultural Sensitivity: Translations lacked cultural nuance in sensitive topics like mental health and religion, leading to potential misinterpretations.
  - Technical Accuracy: Specialized terms (e.g., in philosophy and science) were translated inconsistently, reducing the consistency and precision that academic and professional use requires.
Recommendations:
Enhancing cultural sensitivity, tone adaptation, and consistent technical accuracy will help the AI model deliver translations that are both clear and contextually precise for diverse global users.
Conclusion:
- Guided Improvements: The research findings highlighted key areas for enhancement in tone, cultural sensitivity, and technical accuracy. This led the AI engineering team to explore alternative language models to meet these needs.
- POC Development: The team will conduct a proof of concept (POC) to evaluate custom models and training data that can better adapt to specific tones, cultural nuances, and technical terms in areas like religion and academia.
- Custom Language Model: A custom model with specialized training data is planned to address tone appropriateness, consistency, and precise terminology, ensuring culturally relevant and contextually accurate translations.
This POC, which the engineering team decided to pursue, marks the first step toward refining the Niitive App AI language model to provide globally sensitive and reliable translations for diverse users.