

Supporting machine-learning advancement of underwriting processes using online surveys

How does online data collection help improve the insurance industry and premium calculations?



Project Background:

Traditionally, insurance companies rely on actuaries to calculate premiums. Actuaries are professionals who use mathematical and statistical modeling to assess risk and estimate the likelihood of a claim based on various variables like age and gender. They use this information to create actuarial tables which are then used by the insurance company's underwriting department to calculate premiums. However, with the advancement of technology, specifically Machine Learning (ML) and dedicated software, insurance companies can now calculate and write programs more efficiently.

Artificial Intelligence (AI), particularly ML, is a promising technology that has been successful in various aspects of the insurance process such as image detection, sentiment analysis, and anomaly detection. However, these models require accurate input to train them effectively. The common problem is that a large portion of the input comes from the applicant, making it susceptible to errors due to the declarative nature of the data. Therefore, using validation input that is different from user-stated data would provide significant value in improving the quality of future premium calculation models.

One of our clients approached us with a brief, asking us to help obtain real image-based data to train these models and improve the calculation of insurance premiums. We understood that commonly used crowdsourcing platforms such as Amazon Mechanical Turk (MTurk) were not an option because of their low-quality output and standardized approach. We therefore proposed an alternative solution that would provide higher-quality and more relevant training data.


Applying algorithms trained on images of fair skin can lead to inaccuracies for people of color, because certain lifestyle factors may manifest differently on other skin tones. As ML models in this area are still in their infancy, we have the opportunity to improve them by providing training data that makes this lifestyle recognition technology inclusive of people of all ethnic and racial backgrounds, so that it can be deployed in markets with diverse populations.

The markets we selected presented a significant challenge from a fieldwork perspective, as the mobile devices used varied greatly. This brought a number of technical challenges in providing the quality of input needed to meet modeling expectations.

The project took place in five diverse markets (Japan, Hong Kong, South Africa, India, and Taiwan).
We also had to consider ethical concerns related to the use of data. It was crucial to provide clear instructions and take measures to mitigate legal risks to ensure the responsible use of data.

Project Objectives:

The objective was to train an AI model designed for the insurance industry and premium calculations using human input from five markets across Asia and Africa (Taiwan, Japan, Hong Kong, India and South Africa).


The project methodology consisted of two parts:
  • Countries: Taiwan, Japan, South Africa, India, Hong Kong
  • Sample size: N = 2500 online interviews (N = 500 per country)
  • Part 1: Have participants answer a 15-minute questionnaire about their health and daily habits (food consumption, BMI, smoking or drinking behaviors, exercise, diseases, etc.).
  • Part 2: Collect high-quality selfie photos from the panelists (research participants).
Project organisation chart.
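
The sample design above (five countries, N = 500 each) implies per-country quotas during fieldwork. The following is a minimal sketch of how such quotas could be tracked. The function name and data layout are assumptions made for illustration and do not describe TGM Research's actual fieldwork system.

```python
from collections import Counter

# Countries and per-country target taken from the sample design above.
COUNTRIES = ["Taiwan", "Japan", "South Africa", "India", "Hong Kong"]
QUOTA_PER_COUNTRY = 500

# Overall target follows from the design: 5 countries x 500 interviews.
TOTAL_TARGET = QUOTA_PER_COUNTRY * len(COUNTRIES)


def remaining_quota(completed_countries):
    """Return the number of valid completes each country still needs.

    `completed_countries` is a hypothetical list with one country name
    per valid completed interview.
    """
    counts = Counter(completed_countries)
    return {
        country: max(QUOTA_PER_COUNTRY - counts[country], 0)
        for country in COUNTRIES
    }


# Example: three valid completes so far, two of them from Japan.
status = remaining_quota(["Japan", "India", "Japan"])
```

Capping the remainder at zero means a country that overshoots its quota is simply reported as full.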

Survey questions:
  • Did you use makeup in this picture?
  • Do you currently smoke?
  • Have you ever been a smoker and smoked for more than a pack a month?
  • What is your current level of daily physical activity?
  • How many biological parents, grandparents, aunts, uncles, and siblings do you have that are aged 85 and older (including deceased relatives)?
Scope of work: recruitment, localization and translation of materials, data collection, and data cleaning and processing.


At TGM Research, we understand the importance of accurate and reliable data when it comes to Machine Learning (ML) systems. That's why we've developed a unique technology specifically for this custom research project, integrating a validation module that utilizes participants' webcams to capture high-quality selfies.

This validation process ensures that only participants whose selfies meet the client's specific requirements (such as removing glasses and hats and ensuring proper lighting) can participate in the survey. This allows us to map a wide range of data points, including facial features and skin characteristics, alongside information about participants' lifestyles, such as whether they smoke.
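
The gating logic described above can be sketched as a simple rule check. This is an illustrative sketch only: the `SelfieChecks` fields, the brightness thresholds, and the function names are assumptions, and the detectors that would produce these values (glasses, hat, lighting) are not shown. It is not the actual TGM Research validation module.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SelfieChecks:
    """Hypothetical outputs of upstream image checks for one selfie."""
    wearing_glasses: bool
    wearing_hat: bool
    mean_brightness: float  # 0.0 (black) to 1.0 (white)


# Illustrative lighting thresholds, chosen arbitrarily for this sketch.
MIN_BRIGHTNESS = 0.25
MAX_BRIGHTNESS = 0.85


def rejection_reasons(checks: SelfieChecks) -> List[str]:
    """List every requirement the selfie fails; empty means it passes."""
    reasons = []
    if checks.wearing_glasses:
        reasons.append("remove glasses")
    if checks.wearing_hat:
        reasons.append("remove hat")
    if not MIN_BRIGHTNESS <= checks.mean_brightness <= MAX_BRIGHTNESS:
        reasons.append("adjust lighting")
    return reasons


def may_enter_survey(checks: SelfieChecks) -> bool:
    """Only participants with a fully valid selfie proceed to the survey."""
    return not rejection_reasons(checks)
```

Returning the full list of reasons, rather than a single pass or fail flag, would let the survey tell a participant what to fix before retaking the photo.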

We take privacy and data protection seriously at TGM Research. All data is anonymous and follows ESOMAR principles, with clear instructions about the use of submitted information. We ensure that no sensitive data is processed alongside information that could potentially identify participants, in compliance with all codes of conduct.
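
One common way to keep identifying details apart from research data is to split each raw record into two rows linked only by a random key. The sketch below illustrates that general idea under stated assumptions: the field names and the `split_record` helper are hypothetical and do not describe TGM Research's actual data pipeline.

```python
import secrets

# Hypothetical identifying fields, for illustration only.
IDENTIFYING_FIELDS = {"name", "email", "phone"}


def split_record(raw):
    """Split one raw response into (identity_row, research_row).

    Both rows share a random participant key, so the research row can be
    processed without any of the identifying fields listed above.
    """
    participant_key = secrets.token_hex(8)  # 16 hex characters
    identity_row = {"participant_key": participant_key}
    research_row = {"participant_key": participant_key}
    for field, value in raw.items():
        target = identity_row if field in IDENTIFYING_FIELDS else research_row
        target[field] = value
    return identity_row, research_row


# Example usage with made-up values.
identity, research = split_record(
    {"email": "panelist@example.com", "smokes": False, "country": "India"}
)
```

In such a design the identity rows would be stored and access-controlled separately from the research rows.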
Exhibits displaying part of the content collected during the research process.
With this methodology and unique validation approach, we rolled out the project in 5 countries, including South Africa and India, where the quality of smartphones and camera resolution is a challenge.

We developed a fast, smart custom solution for this client, one that no other panel company has, and delivered a total of N = 2500 valid anonymized selfies (500 per country) with accompanying questionnaire data to be used for model training.

Client's feedback:

“We were looking for a company to provide us with visual and questionnaire data in multiple regions around the world, and TGM delivered. TGM took the time to understand our difficult and challenging project and was flexible in meeting our dynamic goals. They stayed in communication with us through the process, and always returned data in a timely manner. We enjoyed working with the team at TGM and recommend them for your data collection needs.”
*Client information is not presented in this case study as per request. TGM received approval for publishing this project information from the Client.

As the leading online data collection agency, TGM Research has conducted multiple market research projects in Asia and Africa. To learn more about our other projects and expertise, please contact us.

We hope you have found this short case study useful.

If you have any further questions about how best to set up your online research study in Asia or the MENA region, please don’t hesitate to get in touch with us.
