|
Post by rushil on Apr 24, 2016 15:29:15 GMT
Title: The Polymath Project: Lessons from a successful online collaboration in mathematics
Authors: Justin Cranshaw and Aniket Kittur
Summary:
The Polymath project is an online collaboration aimed at advancing mathematics. There have been suggestions that fostering collaboration in mathematics can accelerate advances in the field.
Gowers outlined three potential advantages of large-scale collaboration in mathematics: (1) more people working on the same problem increases the odds of it being solved; (2) different areas of expertise bring different approaches to a problem; (3) different people can assume different roles in a collaboration.
There are 5 different Polymath projects, but the paper focuses entirely on Polymath1. The authors looked at the participants' demographics, identified several of them via their linked online identities, and gauged their seniority by number of academic publications. The results showed a mix of established and early-career mathematicians.
The authors also explored the role of leadership. While the leaders were responsible for administrative duties such as setting the group rules and summarizing threads and the main contributions, they were also among the main contributors to solving the problem.
Another theme explored by the authors was collaboration. Each thread was limited to 100 comments, which opened up the possibility of parallelizing the work and exploring different approaches in multiple threads.
However, steady advances from a large group of collaborators made it harder for newcomers to join later and get up to speed with the current progress.
The paper concludes with design recommendations to foster more collaboration, incentivize participation by mathematicians at different stages of their careers, and encourage newcomers to join an ongoing project.
Discussion Points:
Q1: The advantages outlined by Gowers are applicable to sciences and other fields as well. Why do you think we don't see online collaborations for other academic fields?
Q2: Is it true collaboration? The threads are limited to 100 comments, which forces the participants to build from previous comments (also evidenced by referencing of earlier comments). Contrast this with the Linux project, where a larger group contributes to the project without necessarily having to follow the ongoing thread (or active branch in development terms).
Q3: Entry barrier for newcomers is not a new challenge. Have you seen such a problem in your research area? If so, what are some common approaches to solving such a problem?
Q4: If you had to design the Polymath project, what would you do differently?
|
|
|
Post by mrivera on Apr 24, 2016 16:21:19 GMT
RE: Q2: Is it true collaboration? The threads are limited to 100 comments, which forces the participants to build from previous comments (also evidenced by referencing of earlier comments). In contrast with the Linux project where a larger group contributes to the project without necessarily having to follow the ongoing thread (or active branch in development terms).
What is "true" collaboration? If we define it as collaboration in which people build on each other's ideas and work together, then the comment limit still fits within that definition. I'd argue that this is a more focused, directed form of collaboration. Larger projects like the Linux project often produce redundant topics or rehashed ideas that don't necessarily contribute to the discussion or direction of the project. If an idea was disregarded at an earlier point, the limit forces newcomers to read before jumping in, and it spares the active collaborators the grief of having to reiterate the same thing repeatedly.
|
|
|
Post by julian on Apr 24, 2016 17:25:31 GMT
Q4: If you had to design the Polymath project, what would you do differently?
I would not put two Fields Medalists and very famous mathematicians on this; it likely had a very strong effect on the whole project. It would be even more interesting to see whether mathematicians would participate at all in a project started by anonymous people. For example, even though what are basically rockstars in math backed and ran this effort, only 39 people participated in Polymath1. How many people would be attracted to work with lesser-known or unknown mathematicians?
Q2: Is it true collaboration? The threads are limited to 100 comments, which forces the participants to build from previous comments (also evidenced by referencing of earlier comments). In contrast with the Linux project where a larger group contributes to the project without necessarily having to follow the ongoing thread (or active branch in development terms).
I think it is. It is constrained for a reason, but yes, it is collaboration. This constraint is actually good because after the 100 comments a summary comes up for the next thread. This means that a newcomer could read this summary alone and, if necessary, follow the highlighted information in it to learn more, without having to go through all of the threads.
Random thoughts:
I wonder what was, or could be, the role of bias in an effort like the Polymath project. For instance, does bias in this context mean that people start discussing and then only the proofs that fit everyone's knowledge move forward? Could this mean that more exotic yet interesting paths get completely overlooked, hurting discovery?
To what degree were people in the Polymath project independent? What I mean by this is, would anyone just stand up to Terence Tao or Gowers and say "hey, let's pay attention to this instead"? How much was this a project of the participants and not a Tim Gowers and Terence Tao test bed?
|
|
|
Post by sciutoalex on Apr 24, 2016 19:37:21 GMT
The Polymath project reused the blog format as the basis for the community. This had a number of implications for how the community organized itself. For example, the materials are organized chronologically: comments are ordered by time, and so are blog posts. Tags and categories offer non-chronological orderings, but people don't use them as much. Another example: all content is treated equally, so comments and posts all look the same visually. Another example: authorship is reduced to usernames, without conveying seniority or experience. The point is that by choosing a blog as the platform, the organizers had already made a huge number of decisions.
I wonder what would have happened if they had chosen a subreddit, a Slack channel, a GitHub repo, or a Wikipedia page as the basis for their community project. What would have been gained or lost? I'd like to hear others' thoughts.
Reddit (disclaimer, I'm not a redditor):
+ Voting would allow important content to bubble up
- Not having a focus on blog posts may lead to less substantial progress and more comments
Slack:
+ Real-time chatting would allow people to understand more quickly who the important collaborators were
+ Ability to link documents in conversation would help catch novices up more quickly
- Too much chatter? A profusion of channels would be hard for novices to explore.
GitHub:
+ GitHub's issues would focus users on areas that need concrete improvement
+ A repository storing the raw materials would help experienced contributors see the state of the project without considering surrounding commentary
- Very difficult for novices
Wikipedia:
+ Focus on a single document would make it easy for novices to come and see the state of the project
+ Discussion pages would allow users to argue without complicating the main page
- Less structure, and authorship is very hidden
|
|
|
Post by JoselynMcD on Apr 25, 2016 15:19:45 GMT
Q3: Entry barrier for newcomers is not a new challenge. Have you seen such a problem in your research area? If so, what are some common approaches to solving such a problem?
Barriers to entry for newcomers are certainly not an issue unique to Polymath. Many platforms struggle to onboard new members and get them familiar with the platform and acculturated. I think the problem is more complex than even this paper, and others, suggest. In my observations of Wikipedia, the trouble is often a people problem that design interventions could potentially minimize.
Often long-term members refuse to shepherd new members because they feel somewhat territorial about the space. Newcomers feel rejected, embarrassed, and uncomfortable asking questions, and some even experience severe public shaming. One way to reduce this sort of experience for newcomers would be to identify the more senior members of the community who have demonstrated an ability to mentor new users and give them incentives (perhaps recognition, or senior-level status) to assist new members as they learn the ropes.
Another design intervention would include having newcomers receive targeted emails that encourage them to participate in small tasks that are designed to teach them about a variety of the protocols - essentially breaking down the onboarding process into bite-size chunks. If you are familiar with Quora, they do a really good job of encouraging and validating new users' participation in the forum space through follow-up and scheduled email communications.
|
|
|
Post by nhahn on Apr 25, 2016 15:51:05 GMT
One thing that I think this paper starts to get at, but doesn't primarily explore, is the costs and benefits of collaboration. Putting on my design mini hat, I think collaboration is really just one solution to a more "wicked" problem: how do you get individuals to produce high-quality, innovative work? Collaboration allows you to leverage multiple perspectives (as we've seen with some of the collaborative innovation work) to get a more diverse set of possible solutions and a more complete evaluation of potential solutions. But as with all of these solutions, collaboration has its tradeoffs. As noted in this paper, there are some serious coordination costs to larger collaborations (threading, etc.). Often this requires a leader to adequately summarize the information and take charge of an issue (this is also seen on Wikipedia). With less collaborative solutions, you reduce these coordination costs, which can lead to a faster turnaround time, but the solution might not be as good. So collaboration can be seen as a tension between overall individual work hours and the "innovativeness" of a solution. I would like to see collaboration discussed more in this framing, rather than just having it be the solution to everything. I definitely think systems can help reduce the coordination costs, but I don't believe they can be completely erased.
|
|
|
Post by jseering on Apr 26, 2016 1:04:30 GMT
"Putting on my design mini hat"
Putting on my Cloak of Design Mini +5, why are we trying to create more collaborative work online in the first place? Innovation is cool I guess. Socializing new members in online communities is a thing that is relevant to my work. In particular, I see lots of problems when newcomers don't understand the values of a particular community and make an error for which they are strongly rebuked. We know from plenty of research that getting yelled at for a mistake in your first post is a strong predictor of leaving and never coming back. On the other hand, it's reasonably difficult to differentiate between this and people who are aware of the rules and choose to break them in specific ways; see early trolling in Usenet newsgroups for example. One way to combat this seems to be to make it difficult for newcomers to post until they've absorbed the values, or to show obvious examples of what counts as good or bad behavior.
|
|
|
Post by xiangchen on Apr 26, 2016 1:22:32 GMT
I have a few problems. The spirit of qualitative (and perhaps also quantitative) study is to ask for alternative explanations. So here are two.
- Is it possible that it is the sum of the mathematicians' wisdom that led to the success? Is it possible that any basic forum like this one could have replaced the Polymath project and achieved the same outcome?
- The findings/recommendations seem quite generic. If we applied the same methodology to other successful online collaborative communities outside math research, would we arrive at the same findings/recommendations?
|
|
|
Post by toby on Apr 26, 2016 2:09:11 GMT
On the topic of the costs and benefits of collaboration, I doubt that the "added wisdom" of the crowd can really offset all the overhead cost of collaboration. Though it would be very hard to have a baseline condition in the experiment, I suspect that expert mathematicians like Gowers and Tao should be able to solve the problem in less time than this experiment took. Since Gowers and Tao contributed 42% of the total posts, it's very possible that they were simply trying to "shepherd" the crowd despite the fact that they could have solved the problem by themselves easily.
|
|
|
Post by judy on Apr 26, 2016 2:38:54 GMT
I want to point out how very different the power dynamics are between a requester and a worker on MTurk and the leaders and contributors here. Because of the leaders' prestige, contributors were motivated to contribute (not that that was the only motivation).
They mentioned a reading group as a stop-gap measure for getting new members up to speed on the necessary research. I think this is actually a great idea. Any way to make the knowledge of the experts visible and theoretically accessible to newcomers is awesome. Thinking of the Infotopia paper, I wonder if there is knowledge or perspective or something that a newcomer could contribute initially to the project that would be valuable to the group.
Also, on why collaboration in the first place: I think that most of what we do as academics is already collaborative. Of course, working with an advisor and co-authors is collaborative. But also, when we cite someone's work or build off of their study, we are, in a way, collaborating asynchronously. Polymath to me is 1) speeding up that collaboration and 2) drawing more attention to the collaboration, which I think is a good thing.
|
|
|
Post by mmadaio on Apr 26, 2016 3:04:32 GMT
Even if Gowers and Tao, contributing 42% of the comments, could have solved the problem on their own, by involving everyone else they created a community of people that are still solving problems (up to 11 now!), some of which G&T likely could not have solved on their own. In addition, those 48 other people, though it is a small number, may well go on to apply the skills and knowledge they learned as part of the Polymath community to their individual research, or to their collaborations with colleagues at their home institutions. I keep thinking here of what distributed, collaborative, large-scale research looks like in high-energy physics (e.g. CERN), and how much resistance they ran into in the mid-'80s, when they were starting to gain traction. There were discussions about funding, publication, and attribution similar to those discussed here, but amplified because of the multi-billion dollar technological infrastructure required for the experiments, rather than the low overhead of the blog. Though, in that case, they often have very discrete experiments that they're using the collider for, so attribution isn't an issue in the same way as in Polymath. Also, fyi: SCIENCE ALERT: www.sciencealert.com/cern-just-dropped-300-tb-of-large-hadron-collider-data-online-for-free
|
|
|
Post by francesx on Apr 26, 2016 3:58:43 GMT
Q4: If you had to design the Polymath project, what would you do differently?
Very interesting paper and ideas here. This reminds me a little bit of a discussion we had with mmadaio previously: what if, instead of using random collaborators, we connect ones that have complementary "types" of knowledge (citing Miki Chi here)? The odds of creating new knowledge or innovation increase (I do not have research to say by how much, though). I cannot find the picture she showed about this, but imagine having two individuals, one with knowledge X and the other with knowledge Y. If the first person shares X, the second person can use it with Y to create new "knowledge" Z and give it back to the first person, who, if they have some other knowledge, say W, can create new knowledge by combining Z and W, and so on.
|
|
|
Post by fannie on Apr 26, 2016 4:11:16 GMT
For Q1 other fields - the authors point out that with math there was no need for additional tools, whereas sciences might require lab equipment so I think some fields might just be limited in the online sense there. While they could use simulation tools, it’s still different from going to a lab and running an experiment. But, I could still see a StackOverflow-esque community where people ask questions and get answers or suggestions maybe more so than a problem being posted for everyone to solve. Potentially one way to reduce barriers to newcomers then could be by increasing the problem space - if there are several types of problems to solve they might be more willing to explore more problems, since the community built around each problem is different and they could show stronger and weaker contributions in different areas.
|
|
|
Post by bttaylor on Apr 26, 2016 4:20:31 GMT
Q1: The advantages outlined by Gowers are applicable to sciences and other fields as well. Why do you think we don't see online collaborations for other academic fields?
I think one of the big problems in other fields is data access. I remember reading something about how little of the data and code from papers published at computer science (I think) conferences was actually made available at publication (I tried finding the paper, but failed). CS is obviously much more conducive to the sharing of data than other fields, and even here that is not the norm. Obviously, some data/fields have issues with privacy, etc., but I think there are a lot of other factors (value, competition, etc.) that prevent people from really engaging in open collaboration.
|
|
|
Post by cgleason on Apr 26, 2016 4:30:40 GMT
This is something I am very interested in, as I have wondered how I might get involved in research in other fields as a side hobby over the internet. There are two frames I am torn between when it comes to innovation, however. One is that highly focused groups with a wealth of knowledge are able to make steady progress towards a goal. This is how academia normally works, how Polymath works, and how interesting projects out of niche subreddits work. The other frame that I have come to adopt more recently is that many innovations come from serendipity of people who cross two fields. They are able to aggregate knowledge in ways that more focused members are unable to. I don't know which frame is more true, but I wonder how successful these focused groups will be over the long-term. If serendipity really provides more innovation, then groups should solely focus on seeking and integrating new information from others at all costs. This means fixing the onboarding issues ASAP.
On the other hand, there is the man-hour myth. Adding more people may not help, might drive collaboration costs up, and might cause adverse group biases (see Infotopia). In this case, is it more important to just connect the most diligent, hard-working, heads-down people and let them work for a decade? I'm not sure. The serendipity idea appeals to me, but I worry that I might just like it out of romantic thoughts of innovation.
Paul Graham came to Pittsburgh a few weeks ago to speak at an innovation summit and said something like "You don't get innovation by hosting innovation summits." I wonder if that is true online as well.
|
|