Type of site | Crowdsourcing, Microwork |
---|---|
Available in | English, Russian, Spanish, French, Arabic etc.[1] |
Founded | 2014 |
Country of origin | Russia, Switzerland[2][3] |
Owner | Yandex Inc |
Founder(s) | Olga Megorskaya |
URL | toloka |
Toloka is a crowdsourcing platform and microtasking project launched by Yandex in 2014[2] to quickly mark up large amounts of data, which are then used for machine learning and improving search algorithms.[4] The tasks are usually simple and require no special training from the performer.[2] Most tasks are designed to improve the algorithms behind modern technologies such as self-driving vehicles, smart web search, advanced voice assistants, and e-commerce. Upon completion of each task, the performer receives a reward based on the volume of images, videos, and unstructured text processed.[3] The service has two app versions, one for Android and one for iOS.
About Toloka
Origin of the platform's name
A toloka was a traditional form of mutual assistance among villagers in Russia, Ukraine, Belarus, Estonia, Latvia, and Lithuania. It was organized in villages to carry out urgent work requiring a large number of workers, such as harvesting, logging, and building houses. Sometimes a toloka was used for community work (building churches, schools, roads, etc.).[3]
Types of tasks and scope of results
Data labeling helps to improve search quality and to tune the result-ranking algorithms of search engines effectively.[3]
Machine learning
Training a machine learning algorithm requires labeling large volumes of data with positive and negative examples. Toloka performers receive tasks to determine the presence or absence of objects defined by a computer in a content item.[3][5] In another type of task, performers are given the context of a dialogue and a scale on which to assess whether a chatbot's answer in that context is appropriate, interesting, and so on.[6] Another group of tasks in Toloka is translation verification, performed by collecting examples of translations from different performers.[7]
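As an illustration of how presence/absence labels from several performers might be combined into a single training label, the following is a minimal sketch of simple majority voting. It is an assumption made for illustration only: the sources cited here do not describe how Toloka aggregates answers, and the function name `aggregate_majority`, the item identifiers, and the performer identifiers are all hypothetical.

```python
from collections import Counter, defaultdict


def aggregate_majority(answers):
    """Combine per-performer labels into one label per item by majority vote.

    `answers` is an iterable of (item_id, performer_id, label) tuples.
    Returns {item_id: (winning_label, share_of_votes)}.
    Hypothetical helper for illustration; not Toloka's documented method.
    """
    votes = defaultdict(Counter)
    for item_id, _performer_id, label in answers:
        votes[item_id][label] += 1

    result = {}
    for item_id, counts in votes.items():
        # most_common(1) returns the label with the highest count;
        # on a tie, the label that was seen first wins.
        label, top = counts.most_common(1)[0]
        result[item_id] = (label, top / sum(counts.values()))
    return result


# Made-up example data: three performers say whether an object is present.
answers = [
    ("img1", "p1", True), ("img1", "p2", True), ("img1", "p3", False),
    ("img2", "p1", False), ("img2", "p2", False), ("img2", "p3", False),
]
aggregated = aggregate_majority(answers)
```

The share of votes gives a rough confidence signal: items with a low share could be sent to additional performers before being used as positive or negative training examples.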
Audit and marketing research
These tasks include checking the quality of online stores and delivery services and writing reviews of products and services. Such audits make it possible to monitor service quality and identify weaknesses, which can then be addressed in future improvement work.[8][9]
Payment and money withdrawal
Completed tasks in Toloka are paid. Funds can be withdrawn through several payment systems: Payoneer, Qiwi (only for tolokers from some CIS countries), and Papara (only for tolokers from Turkey). Self-employed users can also withdraw money via YooMoney and SBP. Withdrawals usually take from several hours to several days but can sometimes take longer. The maximum transfer time is 30 days.[10]
Users
Toloka users, also known as performers or tolokers, are people who earn money by completing system testing and improvement tasks on the Toloka crowdsourcing platform. In 2018, more than a million people participated in Toloka projects. Most performers are young people under 35 (usually engineering students or mothers on maternity leave). Performers mainly see Toloka as an additional source of income, but many of them note that they like doing meaningful work and cleaning up the internet. As of March 2022, Toloka has 245,000 monthly active performers in 123 countries. Tolokers generate over 15 million labels per day.[1][11]
Requesters
All tasks in Toloka are placed by requesters. The main uses of Toloka are data collection and processing for machine learning, speech technology, computer vision, smart search algorithms, and other projects, as well as content moderation, field tasks, and optimization of internal business processes.[3]
Toloka Research
In May 2019, the service's team started publishing datasets for non-commercial and academic purposes to support the scientific community and attract researchers to Toloka. These datasets are aimed at researchers in areas such as linguistics, computer vision, testing of result aggregation models, and chatbot training.[12] Toloka research has been showcased at a range of conferences, including the Conference on Neural Information Processing Systems (NeurIPS),[13] the International Conference on Machine Learning (ICML),[14] and the International Conference on Very Large Data Bases (VLDB).[15]
References
1. "It helps me learn and earn: Toloka reports results of a global survey of Tolokers in 2022". toloka.ai. 2022-03-23. Retrieved 2022-09-16.
2. "Toloka rolls out 20000 new jobs opportunities for Ghanaians". Ghana Education News. 2021-06-15. Retrieved 2022-09-17.
3. Alex Woodie (2021-04-27). "Toloka Expands Data Labeling Service". Datanami. Retrieved 2022-09-17.
4. Daria Baidakova (2021-09-29). "Data-Labeling Instructions: Gateway to Success in Crowdsourcing and Enduring Impact on AI". Data Science Central. Retrieved 2022-09-17.
5. Frederik Bussler (2021-12-07). "Data labeling will fuel the AI revolution". VentureBeat. Retrieved 2022-09-17.
6. Kumar Gandharv (2021-04-29). "Why Are Data Labelling Firms Eyeing Indian Market?". Analytics India Magazine. Retrieved 2022-09-17.
7. Magdalena Konkiewicz (2021-12-16). "Human in the loop in Machine Translation systems". Towards Data Science. Retrieved 2022-09-16.
8. Magdalena Konkiewicz (2022-03-29). "Evaluating search relevance on-demand with crowdsourcing". Towards Data Science. Retrieved 2022-09-17.
9. "Guest post: Data Labeling and Its Role in E-commerce Today – Recent Use Cases". TheSequence. 2022-01-16. Retrieved 2022-09-16.
10. "Withdrawal methods". toloka.ai. Retrieved 2022-09-16.
11. "Olga Megorskaya/Toloka: Practical Lessons About Data Labeling". TheSequence. 2021-10-27. Retrieved 2022-09-16.
12. "Toloka to present new dataset at prestigious Data-Centric AI workshop launched by Andrew Ng". The AI Journal. Retrieved 2022-09-17.
13. "Toloka to present new dataset at prestigious Data-Centric AI workshop launched by Andrew Ng". FE News. 2021-11-18. Retrieved 2022-02-10.
14. "Toloka". icml.cc. Retrieved 2022-02-10.
15. "VLDB 2021 Challenge". crowdscience.ai. Retrieved 2022-02-10.