OpenAI Red Teaming Network

Q: What will joining the network entail?

A: Being part of the network means you may be contacted about opportunities to test a new model, or to test an area of interest on a model that is already deployed. Work conducted as part of the network is covered by a non-disclosure agreement (NDA), though we have historically published many of our red teaming findings in System Cards and blog posts. You will be compensated for time spent on red teaming projects.

Q: What is the expected time commitment for being part of the network?

A: The time you decide to commit can be adjusted depending on your schedule. Note that not everyone in the network will be contacted for every opportunity: OpenAI will make selections based on the right fit for a particular red teaming project and will emphasize new perspectives in subsequent red teaming campaigns. Even as little as 5 hours in one year would still be valuable to us, so don’t hesitate to apply if you are interested but your time is limited.

Q: When will candidates be notified of their acceptance?

A: OpenAI will be selecting members of the network on a rolling basis, and you can apply until December 1, 2023. After this application period, we will re-evaluate opening future opportunities to apply again.

Q: Does being part of the network mean that I will be asked to red team every new model?

A: No. OpenAI will make selections based on the right fit for a particular red teaming project, and you should not expect to test every new model.

Q: What are some criteria you’re looking for in network members?

A: Some criteria we’re looking for are:

Demonstrated expertise or experience in a particular domain relevant to red teaming
Passionate about improving AI safety
No conflicts of interest
Diverse backgrounds and traditionally underrepresented groups
Diverse geographic representation
Fluency in more than one language
Technical capability (not required)

Q: What are other collaborative safety opportunities?

A: Beyond joining the network, there are other collaborative opportunities to contribute to AI safety. For instance, one option is to create or conduct safety evaluations on AI systems and analyze the results.

OpenAI’s open-source Evals repository (released as part of the GPT-4 launch) offers user-friendly templates and sample methods to jump-start this process.

Evaluations can range from simple Q&A tests to more complex simulations. As concrete examples, here are sample evaluations developed by OpenAI for evaluating AI behaviors from a number of angles:

Persuasion

MakeMeSay: How well can an AI system trick another AI system into saying a secret word?
MakeMePay: How well can an AI system convince another AI system to donate money?
Ballot Proposal: How well can an AI system influence another AI system’s support of a political proposition?

Steganography (hidden messaging)

Steganography: How well can an AI system pass secret messages without being caught by another AI system?
Text Compression: How well can an AI system compress and decompress messages, to enable hiding secret messages?
Schelling Point: How well can an AI system coordinate with another AI system without direct communication?

We encourage creativity and experimentation in evaluating AI systems. Once completed, we welcome you to contribute your evaluation to the open-source Evals repo for use by the broader AI community.
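
To illustrate the simple Q&A end of the range mentioned above, here is a minimal sketch of an exact-match evaluation in Python. It is not the Evals repository's API or template format: `run_exact_match_eval`, `toy_model`, and the sample questions are hypothetical names and data invented for this illustration. The model is passed in as a plain function so that any real model call could be swapped in.

```python
from typing import Callable, Dict, List


def run_exact_match_eval(
    model: Callable[[str], str],
    samples: List[Dict[str, str]],
) -> float:
    """Return the fraction of samples answered with the ideal answer.

    Comparison is case-insensitive and ignores surrounding whitespace.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    correct = 0
    for sample in samples:
        answer = model(sample["input"]).strip().lower()
        if answer == sample["ideal"].strip().lower():
            correct += 1
    return correct / len(samples)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call (a canned lookup table)."""
    canned = {
        "What is the capital of France?": "Paris",
        "What is 2 + 2?": "5",  # deliberately wrong, to show a failed sample
    }
    return canned.get(prompt, "")


# Hypothetical samples: each pairs a prompt with the ideal answer.
SAMPLES = [
    {"input": "What is the capital of France?", "ideal": "Paris"},
    {"input": "What is 2 + 2?", "ideal": "4"},
]

if __name__ == "__main__":
    accuracy = run_exact_match_eval(toy_model, SAMPLES)
    print(f"accuracy: {accuracy:.2f}")
```

Exact matching suits questions with one short correct answer. Evaluations of open-ended behavior, such as the persuasion examples above, generally need a different scoring approach.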

You can also apply to our Researcher Access Program, which provides credits to support researchers using our products to study areas related to the responsible deployment of AI and mitigating associated risks.