Google is reportedly comparing the outputs of its models with those of Anthropic’s Claude as part of its ongoing effort to improve Gemini’s capabilities. The practice, described in internal correspondence obtained by TechCrunch, highlights the competitive and careful approach the tech giant takes when improving its AI systems.
Gemini’s contractors are tasked with comparing the accuracy and quality of the model’s responses with those of Claude, evaluating them on criteria such as truthfulness and comprehensiveness. Each evaluation can take up to 30 minutes, underscoring the detailed nature of this benchmarking process. According to the correspondence, some contractors recently noticed explicit references to Claude in the internal platform they use for these comparisons, with one output reading: “I am Claude, created by Anthropic.”
Claude has differentiated itself through strict safety practices, frequently refusing to respond to prompts it considers unsafe. Gemini, by contrast, has been criticized for producing responses flagged as severe safety violations, including inappropriate content. Contractors noted that Claude’s cautious handling of potentially harmful requests reflects its emphasis on user safety.
Whether Google’s use of Claude in these evaluations complies with Anthropic’s terms of service has been questioned. Those terms forbid customers from using Claude to build competing products or train rival AI models without explicit permission. While Google is a major investor in Anthropic, neither Google nor Anthropic has confirmed whether such permission was granted for this specific use case.
Shira McNamara, a Google DeepMind spokesperson, stated that Gemini is not being trained using Anthropic’s models. “We compare model outputs as part of our evaluation process,” McNamara explained, noting that this procedure is consistent with industry standards.
The disclosures come shortly after TechCrunch reported that Google contractors were concerned about being asked to rate Gemini’s responses on sensitive topics outside their areas of expertise, raising the risk of misinformation in critical areas such as healthcare.