Sister blog of Physicists of the Caribbean in which I babble about non-astronomy stuff, because everyone needs a hobby

Wednesday, 9 May 2018

An intriguing review system in need of further testing and refinement

You may remember that a few weeks ago I described this UAE "Space Settlement Challenge" seed grant. I was co-I on a proposal which was not successful - we were ranked 74th out of 260, which isn't bad going, but not enough to make the cut. There are still quite a lot of things I like about this funding system, and a few I don't.

The basic idea is that, after some initial rejection of ineligible proposals (e.g. those not written in English or not relating to the topic at hand), everyone who submits a proposal is also a reviewer. Each proposal is randomly assigned three reviewers in a double-blind system (neither side knows who the others are - all they see is the proposal and the reviewer comments). Proposals have to follow a template, answering each section in 200 words (this is only a very small grant, and not a big deal at the end of the day). The sections are as follows :

- Summary : Write a brief non-technical description of the project or research, work to be done, and significance of your idea.
- Previous work : Describe what is known about the topic and what research has been conducted previously, if applicable.
- Impact : Describe what is new and innovative about this project or research, what is the broader impact, who and how it will benefit.
- Aims : Describe in detail the specific objectives of your project or research.
- Methods : Provide a detailed work plan of the activities you plan to carry out.
- Budget.

All three reviewers assign a score of 1-6 for each section (6 is the highest) along with comments. Then the scores for all proposals are sorted, the very highest are straightaway accepted, the worst are rejected, and the middle ground get a chance to re-submit a refined version, accounting for the feedback they received. Then those proposals get reviewed again.
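To make the procedure concrete, here is a minimal Python sketch of that triage step. It rests on assumptions of mine rather than anything the organisers have stated: I assume each section's score is averaged over the three reviewers and the section averages are then summed, and that the accept and refine groups are defined by simple counts. The names `total_score`, `triage`, `accept_n` and `refine_n` are all hypothetical.

```python
from statistics import mean


def total_score(reviews):
    """Combine one proposal's reviews into a single number.

    `reviews` is a list with one entry per reviewer; each entry is a list
    of section scores (1-6). ASSUMPTION: sections are averaged across
    reviewers and then summed. The real combination rule is not public.
    """
    sections = zip(*reviews)  # regroup the scores section by section
    return sum(mean(section) for section in sections)


def triage(proposals, accept_n, refine_n):
    """Sort proposals by score and split them into three groups.

    `accept_n` and `refine_n` are hypothetical group sizes; ties are left
    in whatever order the (stable) sort happens to produce.
    """
    ranked = sorted(proposals, key=lambda name: total_score(proposals[name]),
                    reverse=True)
    return {
        "accepted": ranked[:accept_n],
        "refine": ranked[accept_n:accept_n + refine_n],
        "rejected": ranked[accept_n + refine_n:],
    }


# Three made-up proposals, each with three reviewers and six sections.
example = {
    "A": [[6] * 6, [5] * 6, [6] * 6],
    "B": [[3] * 6, [4] * 6, [3] * 6],
    "C": [[1] * 6, [2] * 6, [1] * 6],
}
result = triage(example, accept_n=1, refine_n=1)
```

Note that a sort like this says nothing about what to do with tied scores at a group boundary, which is something a real system would have to decide explicitly.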

This is an extremely fast way to evaluate a large number of proposals, and makes some attempt to reduce reviewer bias thanks to anonymity. The refinement stage in particular is an excellent idea, because it's usually extremely frustrating to receive reviewers' comments without being able to respond to them.

What needs some considerable work here, however, is clarity. For example, in this case there was enough money available for about 30 fully-funded proposals up to the maximum budget, but presumably a few more if the best proposals asked for less than the maximum permitted funds. But 260 >> 30, so is there any need for refinement ? There were surely more than 30 highly scored proposals here, so you could select enough to consume all the available funding anyway. Also, inside the system there's a "cutoff" value (37), but it's not clear what it means. My guess would be that it means 37 proposals were funded. Perhaps they anticipated a much smaller competition pool and would fund all proposals above a certain score (after refinement) and reject any below this even if they still had available funds.

Also, the description of what's expected is far too short and open to misinterpretation. For example, we presumed that the methods section referred to logistics - how we'd collaborate as a group, communicate, etc. Two of the referees basically agreed, but the third interpreted this to mean scientific methods - and justifiably gave us a very low score.

The other issue is the amount, if any, of external oversight. It's implied that there must be some, since proposals are first checked for eligibility before proceeding to the review section. Yet one of the proposals we reviewed wasn't about space at all (it didn't even mention the word !) and had references to figures that didn't exist. It was clearly garbage and never should have made the initial cut. And while getting everyone to review each other is an interesting idea, is anyone reviewing the reviewers ? Two of ours were overall positive but the third wholly negative - even on the extremely clear and detailed budget section. Someone ought to check if some reviewers only give low or high scores.
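A check on the reviewers needn't be complicated. As a rough sketch (my own illustration, not anything the platform is known to do), one could compare each reviewer's scores with those given by the other reviewers of the same proposals, and flag anyone whose average offset is large. The function name `flag_outlier_reviewers`, the data layout and the threshold are all hypothetical, and with only three reviews per person the result would be noisy at best.

```python
from statistics import mean


def flag_outlier_reviewers(scores, threshold):
    """Flag reviewers who score consistently above or below their peers.

    `scores` is a list of (reviewer, proposal, score) tuples, where the
    reviewer is an anonymous identifier. For every review we take the
    difference between that score and the mean score the *other*
    reviewers gave the same proposal. A reviewer is flagged if the mean
    of these differences is at least `threshold` in absolute value.
    ASSUMPTION: the layout and the threshold are illustrative only.
    """
    by_proposal = {}
    for reviewer, proposal, score in scores:
        by_proposal.setdefault(proposal, []).append((reviewer, score))

    offsets = {}
    for entries in by_proposal.values():
        for reviewer, score in entries:
            others = [s for r, s in entries if r != reviewer]
            if others:  # skip proposals with only one review
                offsets.setdefault(reviewer, []).append(score - mean(others))

    return {r: mean(d) for r, d in offsets.items()
            if abs(mean(d)) >= threshold}


# Made-up data: r3 scores far below r1 and r2 on both proposals.
example_scores = [
    ("r1", "p1", 5), ("r2", "p1", 5), ("r3", "p1", 1),
    ("r1", "p2", 4), ("r2", "p2", 5), ("r3", "p2", 1),
]
flagged = flag_outlier_reviewers(example_scores, threshold=3.0)
```

A flag like this would only be a prompt for a human to look at the reviews, since a consistently low scorer might simply have been handed weak proposals.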

Oh well, there's tonnes of other projects to be getting on with anyways.
https://www.guaana.com/funding/grants/mbrchallenge/details

17 comments:

  1. I'm curious. I also applied, and got an initial email after the 1st review period that said, essentially, "sorry, not funded, good luck". But then there was another email that said if your ranking was higher than the cutoff, you are in for the 2nd reviewing round. My ranking was 52, and the cutoff was 37. But since then nothing has appeared on the site nor have I gotten any more emails. Perhaps by "higher" they mean "lower", like "1==perfect". Their communications have been very confusing.

  2. Since the number of available grants was ~30, depending on the amounts requested, I'd guess they mean anything below the cutoff. Interesting that there still even is a cutoff, given the number of proposals... I wonder how high/low you have to rank to be guaranteed funding ? It's odd that they don't mention these basic numbers.

    I get the impression that this approach is still itself experimental... nice website, but missing some important details. Still, good luck Dean Calahan !

  3. Rhys Taylor it didn't help that two of my reviews were actually stupid [1]. Like, wanting me to include info (inappropriately, IMO) in one section that had already been placed, appropriately, in another. I would have liked to be able to dispute reviewer comments [2].

    [1] Yes, I know it's traditional to think that about one's reviewers; also, it often seems true.

    [2] This is also traditional. Yes, I do know better than to say "…evidently the reviewer did not actually read the proposal…".

  4. Dean Calahan
    Hi Dean,
    My name is Ain and I'm one of the people behind the platform and funding system. Your criticism is fair; our communication has much room for improvement.

    To answer your question and clarify how our platform works: the ranking is not a score but a position in the overall standing. 37 was the cut-off point, meaning that the available grant funding is exhausted at that position. Your proposal ranked 52, meaning it is positioned 15 places below the cut-off point. Unfortunately this means that you did not continue to the refine round.

    I would be happy to hear your preference in terms of communicating the ranking - should there be more information, or was it confusing because it was next to your proposal score?

    We are using the ranking or position for the cut-off, because proposals with the same score can (and in the case of the MBR Challenge did) populate this marker.

  5. Rhys Taylor

    > It's odd that they don't mention these basic numbers.

    Our current plan is to make all the scores and rankings public after Peer-Review 2 has been completed. But you are right, maybe a pseudo ranking table (e.g. without proposal titles) would help to clarify how your proposal ranked compared to others and how subsequent roles are defined.

  6. Dean Calahan Rhys Taylor

    Thank you for opening this discussion and pushing us to do better.

  7. Thanks very much for the feedback Ain Kuuseok ! That helps a lot. I have a few other questions if you don't mind... I do appreciate that it may not be possible or permitted for you to answer these just yet.

    1) If I understand it correctly, some of the proposals equal to or better than rank 37 are either guaranteed funding already or have a chance at refinement, while those ranked worse than 37 are not funded. I wonder, is it possible to state how many proposals are guaranteed and how many now enter the refinement stage ?

    2) Is there any check on the reviewers to see if anyone is trying to game the system, e.g. always giving positive or negative reviews regardless of the quality of the proposals ?

    3) Can you give any more details on the initial decision to accept a proposal into the system ? I imagine the check that they're in English can be done automatically, but is there anyone reading the proposals to confirm they're related to the topic (if only the titles) ?

    Thanks again... it really does help to know that someone is considering this. :)

  8. Happy to help.

    1) This is how the ranking table would look at the moment. Numbers represent the position.

    Winners: 1
    Reviewers in PR2: 2 - 27
    Refiners: 28 - 40

    The logic here is as follows: for every proposal below the cut-off to be included in the refine group, we have to match it with a proposal above the cut-off. Otherwise there would be no budget left for the refine group. For every refiner, there should be two reviewers, meaning the PR2 group has to be twice the size of the Refiners group.

    Since all of you are very sharp, you will have noticed that the numbers we have provided don't match the above formula. The actual cut-off is 35. This discrepancy is due to 7 proposals having the same score at the cut-off marker and the way our interface calculates this. Once again I have to humbly admit our shortcoming. This is our first real-life pilot and I promise we will sort these issues out for the next funding call.

    I will come back to answer the two remaining points later today.

  9. 2) Is there any check on the reviewers to see if anyone is trying to game the system, e.g. always giving positive or negative reviews regardless of the quality of the proposals ?

    When we set out to design our funding model, we started with a conviction that the world's smartest people can collectively and autonomously decide amongst themselves which projects should be funded. Hence our core principle: no outside influence. But this also means that we as an intermediary remain neutral.

    Achieving this goal won't come without mishaps, as the first pilot has demonstrated. We anticipate that there will always be participants who neglect best conduct. To counter this, all reviews can be made public and meta-evaluated by the public, including the author of the proposal. If someone deliberately gives unjust reviews, they can be called out publicly, their reviews flagged and ultimately their public profile becomes tainted, making them ineligible for future funding. So in essence, our model is built on the principle of public accountability.

  10. 3) Can you give any more details on the initial decision to accept a proposal into the system ? I imagine the check that they're in English can be done automatically, but is there anyone reading the proposals to confirm they're related to the topic (if only the titles) ?

    Yes, all proposals were screened (read) by people. We have also learned our lessons here and will make the guidelines more explicit for new funding calls.

  11. Ain Kuuseok, I was initially confused by the term higher ranking. To me that meant that if the ranking score was higher than some cutoff, I was included. It was not initially apparent that a higher ranking number meant a lower ranking.

    I also felt that allowing extra time for people who didn't get their work in on time penalized the people who did get their work in on time. To me that was a big mistake. It really sent the wrong message and generated quite negative feelings from me, to the point that I have requested that my profile be deleted. If there's going to be a double standard, I don't want to participate.

    I think I've already mentioned that not being able to dispute reviewer comments is also a mistake. Perhaps the initial review period could somehow take that into account, and all proposals go through two reviews, allowing revisions to adapt to reviewer feedback, and some oversight on your part to adjudicate differences.

  12. Ain Kuuseok Wow, thanks ! I was really just expecting this to be a small rant about some quirks in an interesting proposal system... I really didn't think anyone was actually listening :) I give you serious kudos for responding thoroughly, promptly and honestly to my comments. I also appreciate that you're thinking carefully about the issues involved.

    For me the best feature of this is the chance of refinement - we almost never get to do this in astronomy at all. If that could be extended, that would be great. Of course this places an extra burden on the reviewers though. One final question : does the refined review get sent to the same reviewers or different ones ?

  13. Rhys Taylor

    Happy to be part of the conversation.

    > One final question : does the refined review get sent to the same reviewers or different ones ?

    Different reviewers. Although, the idea of re-evaluating has been considered. But for this pilot at least we want applicants to get a wider spectrum of feedback.

    Thank you for sharing your experience. We will definitely look for a way to bring refinement to a wider group.

  14. Dean Calahan Yes, there were others who also considered the ranking to be a score. So it is clear that it was a shortcoming of our interface and we are already working on improvements.

    Thank you for sharing your experience, Dean. Nothing I say will change this for you. But I am grateful that you took the time to express your concerns. We do take everything you have said seriously and the only thing I can offer you is assurance that we are looking for the best possible ways to facilitate your suggestions. Even though it sounds like a cliché, we are building this funding model for you, to serve you, the scientists.

    I understand your wish to delete your account. But you are welcome back anytime and I do hope that you will give us another chance in the future. I will keep you posted here on how we plan to solve these issues in the future.

  15. Rhys Taylor Dean Calahan One easy way to allow for a larger refiner group would be to ask the reviewers to conduct more evaluations. Currently reviewers in PR2 are asked to provide 1 more evaluation. In your experience, would you be willing to do 2 reviews, given the time and effort that goes into giving constructive feedback?

  16. If I see another call that I feel interested in, I will definitely consider it. I find the time spent trying to come up with a good proposal with such a low probability of success to be a luxury. Although in a sense proposal writing time is never exactly wasted. Plus, I finally got to use the ORCID account I had forgotten I even had.

    The main reason I wanted my account deleted is that I have so many accounts "out there" anyway; even the active ones are a psychological burden and inactive ones make it worse. No doubt there is a bit of unwarranted petulance involved, but even if I didn't have somewhat hard feelings I would probably have asked for it to be deleted anyway.

  17. Hi Ain Kuuseok Sorry this one slipped off my radar for a bit.

    I took over from the PI of our proposal while he was travelling, so I wasn't involved in the reviewing stage. However, I've seen the other reviews and the feedback forms, so in my opinion a couple of extra reviews is not an enormous extra burden.


