Google has expanded its vulnerability rewards program (VRP) to incorporate assault situations particular to generative AI.
In an announcement shared with information.killnetswitch forward of publication, Google stated: “We imagine increasing the VRP will incentivize analysis round AI security and security and produce potential points to mild that can finally make AI safer for everybody,”
Google’s vulnerability rewards program (or bug bounty) pays moral hackers for locating and responsibly disclosing security flaws.
On condition that generative AI brings to mild new security points, such because the potential for unfair bias or mannequin manipulation, Google stated it sought to rethink how bugs it receives needs to be categorized and reported.
The tech large says it’s doing this by utilizing findings from its newly fashioned AI Purple Workforce, a gaggle of hackers that simulate a wide range of adversaries, starting from nation-states and government-backed teams to hacktivists and malicious insiders to search out security weaknesses in know-how. The group just lately carried out an train to find out the most important threats to the know-how behind generative AI merchandise like ChatGPT and Google Bard.
The group discovered that enormous language fashions (or LLMs) are weak to immediate injection assaults, for instance, whereby a hacker crafts adversarial prompts that may affect the conduct of the mannequin. An attacker may use one of these assault to generate textual content that’s dangerous or offensive or to leak delicate info. In addition they warned of one other sort of assault referred to as training-data extraction, which permits hackers to reconstruct verbatim coaching examples to extract personally identifiable info or passwords from the information.
Each of all these assaults are lined within the scope of Google’s expanded VRP, together with mannequin manipulation and mannequin theft assaults, however Google says it won’t provide rewards to researchers who uncover bugs associated to copyright points or knowledge extraction that reconstructs non-sensitive or public info.
The financial rewards will differ on the severity of the vulnerability found. Researchers can presently earn $31,337 in the event that they discover command injection assaults and deserialization bugs in extremely delicate functions, corresponding to Google Search or Google Play. If the failings have an effect on apps which have a decrease precedence, the utmost reward is $5,000.
Google says that it paid out greater than $12 million in rewards to security researchers in 2022.