Google has been accused of using novices to verify Gemini’s AI answers



There’s no arguing that AI still has its fair share of unreliable moments, but one would hope that its evaluations, at least, would be accurate. However, last week Google allegedly instructed contract workers evaluating Gemini not to skip any prompts, regardless of their expertise, TechCrunch reports based on internal guidance it viewed. Google shared a preview of Gemini 2.0 earlier this month.

Google reportedly instructed GlobalLogic, an outsourcing firm whose contractors evaluate AI-generated output, that reviewers should no longer skip prompts outside their scope of expertise. Previously, contractors could choose to skip any prompt that fell well outside their expertise — such as asking a doctor about law. The guidelines had stated: “If you do not have critical expertise (e.g. coding, math) to rate this prompt, please skip this task.”

Now, contractors have allegedly been instructed, “You should not skip prompts that require specialized domain knowledge” and that they should “rate the parts of the prompt you understand,” adding a note that the prompt falls outside their area of knowledge. Apparently, the only times contractors can skip prompts now are if a big chunk of information is missing or if the prompt contains harmful content that requires specific consent forms to evaluate.

One contractor responded aptly to the changes, saying, “I thought the point of skipping was to increase accuracy by giving it to someone better?”

Shortly after this article was first published, Google provided Engadget with the following statement: “Raters perform a wide range of tasks across many different Google products and platforms. They provide valuable feedback on more than just the content of answers, but also on style, formatting and other factors. The ratings they provide do not directly impact our algorithms, but when taken in aggregate, are a helpful data point to help us measure how well our systems are working.”

A Google spokesperson also noted that the new language shouldn’t necessarily lead to changes in Gemini’s accuracy, because raters are being asked specifically to rate the parts of the prompts they understand — they could provide feedback on things like formatting issues even without specific expertise in the topic. The company also pointed to this week’s release of FACTS Grounding, which can evaluate LLM responses to ensure they are “not only factually accurate with respect to given inputs, but also sufficiently detailed to provide satisfactory answers to user queries.”

Updated, December 19, 2024, 11:23 a.m. ET: This story has been updated with a statement from Google and more details about how its rating system works.
