Targeted Crowdsourcing Using App/Interest Categories Of Users
Part of my research is on crowdsourcing. Basically, crowdsourcing means performing micro-collaborations with many people to complete a task. You divide the work into microtasks and outsource them to people. They provide solutions to the microtasks, and you aggregate those solutions to ultimately obtain the solution to your original task.
Aggregating the responses from the crowd is a challenge in itself. If the questions are asked as open-ended questions, the answers come in a variety of forms, and you are not able to aggregate them automatically with a computer. (You may use human intelligence once more to aggregate them, but then how are you going to aggregate/validate these next-level aggregators?)
To simplify the aggregation process, we use multiple-choice question answering (MCQA). When the answers are provided as choices (a, b, c, or d), they are unambiguous and easy to aggregate with a computer. The simplest aggregation method for MCQA is majority voting: whichever choice was chosen most often is returned as the final answer.
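As a concrete illustration, here is a minimal Python sketch of majority-vote aggregation (the data layout is invented for the example; the post does not specify one):

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequently chosen option among 'a'-'d'.

    answers: list of option letters submitted by the crowd,
    e.g. ['a', 'c', 'a', 'b'].  Ties are broken arbitrarily,
    which is good enough for a sketch.
    """
    choice, _count = Counter(answers).most_common(1)[0]
    return choice

# Example: 5 participants answer a question; 'b' wins with 3 votes.
print(majority_vote(['b', 'a', 'b', 'd', 'b']))  # -> 'b'
```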
Recently, we started investigating MCQA-based crowdsourcing in more depth. What are the dynamics of MCQA? Is majority voting good enough for all questions? If not, how can we do better?
To investigate these questions, we designed a gamified experiment. We developed an Android app to let the crowd answer questions with their smartphones as they watch the Who Wants To Be A Millionaire (WWTBAM) quiz show on a Turkish TV channel. When the show is on the air in Turkey, our smartphone app signals the participants to pick up their phones. When a question is read by the show host, my PhD students type the question and its answer choices, which are transmitted via Google Cloud Messaging (GCM) to the app users. App users play the game and enjoy competing with other app users, and we get a chance to collect precious data about MCQA dynamics in crowdsourcing.
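For context, the (now retired) GCM server API was a plain HTTP POST, so pushing a question could look roughly like the sketch below. The payload field names and the API key are hypothetical; the post does not describe the app's actual message format.

```python
import json
import urllib.request

GCM_URL = "https://android.googleapis.com/gcm/send"  # legacy GCM endpoint
API_KEY = "YOUR_SERVER_API_KEY"  # placeholder

def push_question(registration_ids, question, options):
    # The 'question'/'options' field names are illustrative assumptions.
    body = {
        "registration_ids": registration_ids,
        "data": {"question": question, "options": json.dumps(options)},
    }
    req = urllib.request.Request(
        GCM_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": "key=" + API_KEY,
                 "Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req).read()
```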
Our WWTBAM app has been downloaded and installed more than 300,000 times and has enabled us to collect large-scale real data about MCQA dynamics. Over a period of nine months, we collected over 3 GB of MCQA data. In our dataset, there are about 2,000 live quiz-show questions and more than 200,000 answers to those questions from the participants.
When we analyzed the data we collected, we found that majority voting is not good enough for all questions. Although majority voting does well on the easy questions (the first 5 questions) and achieves an accuracy rate of more than 90%, as the questions get harder, the accuracy of majority voting plummets quickly to 40%. (There are 12 questions in WWTBAM. The question difficulty increases with each question. Questions 10, 11, and 12 are rarely reached by the quiz contestants.)
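One simple way to see this effect is to compute majority-vote accuracy per question number. A sketch, assuming a record layout of (question number, ground-truth option, list of crowd answers):

```python
from collections import Counter, defaultdict

def accuracy_by_question_number(records):
    """records: iterable of (question_no, correct_option, crowd_answers).

    Returns {question_no: fraction of questions for which the
    majority vote matched the ground-truth option}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for qno, correct, answers in records:
        majority = Counter(answers).most_common(1)[0][0]
        totals[qno] += 1
        hits[qno] += int(majority == correct)
    return {qno: hits[qno] / totals[qno] for qno in totals}

# Toy data: majority voting nails the easy question but not the hard one.
records = [
    (1, 'a', ['a', 'a', 'a', 'b']),
    (10, 'c', ['a', 'a', 'c', 'b']),
]
print(accuracy_by_question_number(records))  # {1: 1.0, 10: 0.0}
```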
We then focused on how to improve the accuracy of aggregation. How can we weigh the options to give more weight to correct answers and let them win even when they are in the minority?
As expected, we found that previous correct answers by a participant indicate a higher likelihood of being correct on the current question. By collaborating with colleagues in data mining, we came up with a PageRank-like solution for history-based aggregation. This solution was able to raise the accuracy of answers to 90% even for the harder questions.
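The PageRank-like algorithm itself is not spelled out in this post. As a much simpler stand-in that captures the same intuition, one can weight each participant's vote by their accuracy on previous questions:

```python
from collections import defaultdict

def history_weighted_vote(votes, past_accuracy, prior=0.5):
    """votes: {participant_id: option}.
    past_accuracy: {participant_id: fraction of previous questions
    answered correctly}.

    Participants with a better track record contribute more weight;
    unknown participants fall back to a neutral prior.
    """
    scores = defaultdict(float)
    for pid, option in votes.items():
        scores[option] += past_accuracy.get(pid, prior)
    return max(scores, key=scores.get)

# A minority option can win if it is backed by historically strong players.
votes = {'u1': 'b', 'u2': 'b', 'u3': 'd', 'u4': 'd', 'u5': 'd'}
past = {'u1': 0.95, 'u2': 0.9, 'u3': 0.3, 'u4': 0.35, 'u5': 0.4}
print(history_weighted_vote(votes, past))  # -> 'b', despite being the minority
```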
We also observed some unexpected findings in the data collected by our app. Our app recorded the response times of the participants, and we saw that response time has some correlation with correct responses. But the relation is funny. For the easier questions (the first 5), earlier responses are more likely to be correct. But for the harder questions, delayed responses are more likely to be correct. We are still trying to see how we can put this observation to good use.
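One way to probe such a relation (a rough sketch, not the paper's actual analysis) is to compare mean response times of correct versus wrong answers, separately for easy and hard questions:

```python
def mean_response_times(answers, easy_cutoff=5):
    """answers: iterable of (question_no, response_time_sec, is_correct).

    Returns mean response times keyed by (difficulty, correctness).
    """
    buckets = {}
    for qno, rt, ok in answers:
        key = ('easy' if qno <= easy_cutoff else 'hard',
               'correct' if ok else 'wrong')
        buckets.setdefault(key, []).append(rt)
    return {k: sum(v) / len(v) for k, v in buckets.items()}
```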
Another surprising result came recently. One of my PhD students, Yavuz Selim Yilmaz, proposed a simple approach which turned out to be as effective as the sophisticated history-based solution. This approach does not even use the history of the participants, which makes it more widely applicable. Yavuz's approach was to /use the interests of participants to weigh their answers/.
In order to obtain the interests of the participants, Yavuz had a very nice idea. He proposed to use the categories of the apps installed on a participant's phone. Surprised, I asked him how he planned to learn which other apps are installed on the participants' phones. It turns out this is one of the basic permissions Android gives to an installed app (like our WWTBAM app): it can query and learn about the other apps installed on the user's phone. (That it is this easy is telling about Android privacy and security. We didn't collect/maintain any identifying information on users, but this permission can potentially be used for bad.)
Yavuz assigned interest categories to participants using the Google Play Store's 32 predefined app categories (e.g., Books and Reference, Business, Comics, Communication, Education, Entertainment, Finance). If a participant has more than 5 apps installed from one of these categories, the participant is marked as having an interest in that category. We used half the data as the training set and found which interest categories yield the highest accuracy for a given question number. Then, on the testing set, the algorithm is simply to use majority voting restricted to the category deemed most successful for the given question number. Is this too simplistic an approach?
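A sketch of this train/test procedure (the exact procedure in the paper may differ; the data layout here is assumed for illustration):

```python
from collections import Counter

def majority(answers):
    return Counter(answers).most_common(1)[0][0] if answers else None

def best_category_per_qno(train, interests, categories):
    """train: list of (question_no, correct_option, {pid: option}).
    interests: {pid: set of category names}, where a participant gets a
    category if they have more than 5 apps installed in it.

    Returns {question_no: category with the most accurate
    category-restricted majority vote on the training half}.
    """
    best = {}  # question_no -> (category, accuracy)
    for cat in categories:
        stats = {}  # question_no -> (hits, totals)
        for qno, correct, votes in train:
            cat_votes = [opt for pid, opt in votes.items()
                         if cat in interests.get(pid, set())]
            guess = majority(cat_votes)
            if guess is None:
                continue  # nobody with this interest answered
            hit, tot = stats.get(qno, (0, 0))
            stats[qno] = (hit + (guess == correct), tot + 1)
        for qno, (hit, tot) in stats.items():
            if hit / tot > best.get(qno, (None, -1.0))[1]:
                best[qno] = (cat, hit / tot)
    return {qno: cat for qno, (cat, _) in best.items()}

def answer(qno, votes, interests, best_cat):
    """Test time: majority-vote only within the learned best category,
    falling back to plain majority voting if no category was learned."""
    cat = best_cat.get(qno)
    pool = [opt for pid, opt in votes.items()
            if cat is None or cat in interests.get(pid, set())]
    return majority(pool) or majority(list(votes.values()))
```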
Lo and behold, this approach lifted the accuracy to roughly 90% across all levels of questions. (This paper received the Outstanding Paper Award at the Collaboration Technologies and Systems (CTS 2014) conference.)
Ultimately, we want to adopt the MCQA-crowdsourcing lessons we learned from WWTBAM in order to develop crowdsourcing apps for location-based recommendation services.
Another application area for MCQA-crowdsourcing would be market research. A lot of people in industry, consumer goods, music, and politics are interested in market research. But market research is hard to get right, because you are trying to predict whether a product can get traction by asking about it to a small subset of people, which may not be very relevant or representative. The context and interests of the people surveyed are important in weighing their responses. (I hope this blog post will be used in the future to kill some stupid patents proposed on this topic ;-)