Post by tinfoilhat on Jun 7, 2008 10:20:18 GMT 10
Well, I might as well try again.
What do people think about adjudicator feedback at: a) IVs, and b) school competitions?
There are several forms of adjudicator feedback I'd be interested in. The first is adjudicator feedback forms (anonymous ones, obviously) with enough teeth to provide a real mechanism for feedback. The second would be something like the AUDC system of 360-degree feedback: feedback from the other adjudicators on your panel as well as from both teams.
The AUDC system then requires this to be used as the basis for breaking adjudicators. I am lukewarm on some aspects of this, but it seems like it would be a very good idea to institute in future. Too much of the current system seems to be based on backroom whispering, and I think the overall quality is probably being compromised.
Post by Hornblower on Jun 9, 2008 9:00:16 GMT 10
Hi tinfoil,
(Just want to stress I'm speaking here with my own voice, not as convener of the Monash 2009 bid)
I'm not sure if you've been to the last few Australs and Worlds or not, but I think that there has been a very strong move towards rigorous adjudicator testing and feedback that ensures judges are promoted on merit.
Both Australs and Worlds now feature blind-marked adjudicator tests at the start of the tournament, to ensure that talented new adjudicators are not overlooked. The test is of course only a single debate, and so is coupled with a c.v. assessment. This seems to be the fairest way to begin. Within the tournament, both teams are given scored feedback sheets to fill out on their judges, and the chair judge fills one out on the panelists; these results are tracked in the adjudication database.
Certainly I believe in a system of comprehensive adjudicator feedback, because that is the only way that the adjudication core can really find out about the talent of the judges that they have working for them.
However, there are some issues in basing the adjudicator break entirely upon a numerical value, not least because some teams will occasionally feel irrationally hard done by and slam the judges. Thus, we believe that discretion is still required in the hands of the adjudication core to weigh up and best utilise the feedback that is coming in.
So basically, maximum feedback and testing are a crucial part of identifying the best adjudicators, but the reason we have an adjudication core is precisely so that they can be trusted to make crucial decisions on a holistic range of factors.
Cheers,
--Victor
Post by smartarse on Jun 11, 2008 14:44:52 GMT 10
Since tinfoilhat has tried so diligently to engage the members of this board in a discussion on these issues, I will give my 2 cents.
I generally agree with the sentiments of Victor's post. Assessment is good and important, but the adj core really only has two roles: topics and adjudicator allocation. Those are its primary purposes, and we should choose individuals and bids that have sufficiently strong adj cores to perform those roles.
As someone who has been a DCA at both Australs and Worlds, I can speak with some experience about the pros and cons of intensive assessment of judges.
New Zealand Australs (of which I was not a DCA) adopted a very structured and comprehensive set of assessment protocols and relied on them very heavily to determine the break and judge allocation.
The UT Mara adj core (of which I was a member) reviewed the materials used in NZ and decided in the end to modify both the content and the usage of the assessment. This was for two reasons:
1) Written assessments of judges' skills (tests) are very difficult to construct fairly once they stray beyond simple assessment of the rules and a discussion of winning and losing a test debate. In NZ the adj core made a valiant effort to test judges' higher-order reasoning skills by posing scenarios and asking them how they thought they would likely award the debate.
I was strongly opposed to this approach because adjudication is always the sum of the whole debate and rarely comes down to one issue (such as how a team responded to a contradiction, or a definitional challenge), and it completely excluded manner from the considerations. For those reasons we did not include such questions in our test.
2) Assessing whether a judge is fit for the finals is a very difficult job, and requires more than a 360 feedback form.
At Vancouver Worlds the adj core (of which I was a member) adopted a policy of cross-review of all potential breaking candidates (identified by feedback scores and comments made to us). Every judge who scored over a certain threshold was placed on a panel with one of the DCAs and/or a very senior judge, and then, if their feedback remained strong, we repeated the placement with a different DCA, so that when it came time to decide the break we had the views of at least two DCAs, and often the written feedback of senior judges, to inform our decision.
It was an extremely effective and constructive process, and it allowed us to make quite difficult choices about people who either had strong reputations, had broken previously, or for whom there could conceivably have been personal conflicts of interest.
In the end I think we had a very impressive and diverse pool of breaking judges. But had we simply relied on the feedback and treated the break as a mathematical equation, I don't believe the outcome would have been as good.
So while I respect the motives of those who would like a more predictable and seemingly transparent approach, I think if the DCAs are as good as they should be, then we should trust them.
That raises the question of how adj cores are constructed, but unfortunately I don't have time to discuss that at the moment.
Cheers
Post by jb on Jun 17, 2008 13:25:17 GMT 10
Hi all,
I haven't read this forum (or any debating forum!) for a long time. The discussion here is quite interesting.
The redoubtable TFH first asked some questions about adjudicator feedback in schools debating. I am well placed to comment here, as the person who's in charge of this in the DAV at the moment.
We rely a little on questionnaire-type feedback, but not much. Since about 2001 we've experimented with various methods, but now we distribute a sheet of brief questions to adjudicators at random. (NB that our competition is much too large to get feedback on everyone - if we did this, this would be the only task the staff would get to do!) This has several disadvantages:
1. Not everyone is given feedback (and adjudicators can choose not to hand the sheets to the debaters);
2. most feedback is given to more prolific adjudicators, whom we normally know fairly well;
3. the debaters are often very ill-placed to comment on adjudicators, and their comments are based more on the style of the adjudication than on its content.
However, the questionnaires are useful as a means of making the debaters and teachers feel a little more empowered to comment on adjudication, and in various cases they are useful in promoting adjudicators.
Our second means of adjudicator feedback is through a formal complaint system. This is effective in pointing out "basket case" adjudicators (which is normally obvious anyway!), but it provides less useful feedback than one might imagine. Often the targeted adjudicators are very new adjudicators who make technical errors, or more experienced adjudicators who adjudicate in a way unsuited to schools debating. We know these trends occur already, so the complaints don't really help us very much. Of course a number of the complaints are purely vexatious, and normally the adjudicator is less biased and more competent than the complainer anyway.
We experimented with 360-style feedback forms one year in finals, where we have panel adjudications: panels commented on the chair and vice versa. We didn't get very much useful information. This doesn't really work outside the hothouse environment of a tournament.
Our best means of feedback is when adjudicators watch other adjudicators. We have had a "piggybacking" system in place since at least 1998, when I did the DAV training. Gradually we have formalised this. There are formal reports from senior adjudicators, and we keep comments on new adjudicators in a (Google) spreadsheet, which allows us to follow up problems more effectively than before. Unfortunately, we can only do this relatively infrequently, but it's very useful for adjudicators who have just been trained.
I know this has been long-winded, but I just wanted to re-emphasise some of the points that Tim's made about IVs:
- numerical feedback can only be so effective;
- complaint-based feedback has many problems;
- qualitative/personal/anecdotal feedback is far more useful than quantitative feedback.
Hopefully the DAV, which is a really big competition, can give some empirical pointers to IV adjudication administrators.
As Tim also points out, the question of formation of an "adjudication core" is important. In an ongoing competition (as opposed to a tournament) we have the advantage that we can use previous feedback to change that core; that is, by reranking adjudicators. Our biggest quandary has been what type of experience we should take into account. We have to balance the importance of non-DAV adjudication experience with the need to tailor adjudication to the needs of schools debaters (not being too harsh, making it intelligible to them, without too much jargon!). We are gradually veering towards taking external experience more and more into account (this caused some bad blood in the past, but we are continuing to work on it), but it's still an issue for debate within the organisation. IVs encounter similar issues, I am sure. Past history is not always very useful.
TFH appears to be from Sydney, so it would be interesting to hear what s/he has to say about adj feedback in the smaller, but better resourced, elite competitions there.
Cheers,
Jonathan Benney
VP (Adj & Training), DAV
Post by tinfoilhat on Jun 17, 2008 14:31:37 GMT 10
I'm not sure how I feel about 360 outside of a tournament environment. There are probably too many problems with it. In theory it's a very good idea, but when everyone knows everyone else, at least to some degree, the chances of 360 being useful are small. It's particularly difficult if you're piggybacking at the same time.
I don't agree so much that you can't give feedback forms at all the rounds for something like a schools comp (or even a large IV). I remember past Worlds using feedback forms in every round, and they have more debates and adjudicators than any schools comp. I really doubt there isn't time for the local school officers to go over the answers, when DCAs can do it in a day or two.
I agree that a written test or written feedback will only be so useful (weeding out nuts, telling who is really good and really bad, etc.), but it can be used to pick up broad trends, and it also seems a good way of identifying who is doing well and who is doing poorly. That, in conjunction with piggybacking, seems like a good way of testing quality.
Does the DAV comp have an independent third party look over the feedback forms? I'd think one would need that, to prevent fears of retaliation against teams who give bad feedback. Even if it is done anonymously, it wouldn't be hard for someone getting one round's results to deduce who the complainers were. As I think I've made clear, anonymity is good generally.
The bigger the IV, the more credible the adjudicators who break seem to be, but there still seem to be serious problems of credibility. I won't name names, obviously, but I think everyone can think of some adjudicators who broke, and some who didn't, and were disappointed in the decision. It seems to be a product of three unfortunate factors: 1) status, 2) longevity on the IV circuit, and 3) tokenism. I understand the motives behind all three, but I think if you haven't debated on the IV circuit for a certain number of years, maybe you should have to do more than just turn up to break. Maybe some sort of score-based mechanism built on feedback to justify the good standing of the tenured adjudicators? Likewise, I understand why the break needs to be balanced and representative, but my general feeling is that it has gone too far.
Any suggestions for fixing this? I agree a solution is difficult, and I think the AUDC's method is an attempt to be fair about it. It seems to work pretty well for them, so would a modified version of it really be problematic for other IVs? I'm not sure it would be. There could be several benchmarks for breaking, and one could be numerical: if an adjudicator doesn't rank in a certain percentage of the adjudicators, that could be a disqualifying factor. Alternatively, maybe every institution could send a preferential list of adjudicators from their region (their own adjudicators excluded), which could then go towards working out whether these people really do have standing. If it was done anonymously (given to independent DCAs, maybe), and submitted anonymously by each head of contingent into a locked box, it could work.
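To make that numerical benchmark concrete, here's a rough sketch of how a percentage cut-off on averaged feedback scores might be worked out. It's purely illustrative, not any tournament's actual system; the adjudicator names, the scores and the 50% cut-off are all made up.

```python
# Illustrative only: average each adjudicator's per-round feedback scores,
# rank the field, and treat falling outside the top fraction as a
# disqualifying factor for the break.
from statistics import mean


def eligible_for_break(scores_by_adjudicator, cutoff_fraction=0.5):
    """Return the adjudicators whose average feedback score places them
    in the top `cutoff_fraction` of the field."""
    averages = {adj: mean(scores) for adj, scores in scores_by_adjudicator.items()}
    ranked = sorted(averages, key=averages.get, reverse=True)
    keep = max(1, round(len(ranked) * cutoff_fraction))
    return set(ranked[:keep])


if __name__ == "__main__":
    # Hypothetical per-round feedback scores out of 10.
    feedback = {
        "Adjudicator A": [8, 9, 7, 8],
        "Adjudicator B": [6, 5, 7, 6],
        "Adjudicator C": [9, 8, 9, 9],
        "Adjudicator D": [4, 5, 3, 6],
    }
    # With a 50% cut-off, only the top two averages stay eligible.
    print(eligible_for_break(feedback, cutoff_fraction=0.5))
```

Obviously the final decision would still sit with the adjudication core, as others have said; the point is just that a benchmark like this is easy to compute and to publish.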
Maybe there should be an approval process for DCAs similar to the AUDC one as well? I don't really see a problem with this, though it veers into my other topic.