

OpenAI’s latest AI models have a new safeguard to prevent biorisks


OpenAI says it has deployed a new system to monitor its latest AI reasoning models, o3 and o4-mini, for prompts related to biological and chemical threats. The system aims to prevent the models from offering advice that could instruct someone on carrying out potentially harmful attacks, according to OpenAI’s safety report.

O3 and o4-mini represent a meaningful capability increase over OpenAI’s previous models, the company says, and therefore pose new risks in the hands of bad actors. According to OpenAI’s internal benchmarks, o3 in particular is more skilled at answering questions about creating certain types of biological threats. For this reason, and to mitigate other risks, OpenAI created the new monitoring system, which the company describes as a “safety-focused reasoning monitor.”

The monitor, custom-trained to reason about OpenAI’s content policies, runs on top of o3 and o4-mini. It’s designed to identify prompts related to biological and chemical risk and instruct the models to refuse to offer advice on those topics.

To establish a baseline, OpenAI had red teamers spend around 1,000 hours flagging “unsafe” biorisk-related conversations from o3 and o4-mini. During a test in which OpenAI simulated the “blocking logic” of its safety monitor, the models declined to respond to risky prompts 98.7% of the time, OpenAI said.

OpenAI acknowledges that its test didn’t account for people who might try new prompts after being blocked by the monitor, which is why the company says it will continue to rely in part on human monitoring.

O3 and o4-mini don’t cross OpenAI’s “high risk” threshold for biorisks, according to the company. However, compared to o1 and GPT-4, OpenAI says early versions of o3 and o4-mini proved more helpful at answering questions about developing biological weapons.

Chart from the o3 and o4-mini system card (Screenshot: OpenAI)

The company is actively tracking how its models could make it easier for malicious users to develop chemical and biological threats, according to OpenAI’s recently updated Preparedness Framework.

OpenAI is increasingly relying on automated systems to mitigate risks from its models. For example, to prevent GPT-4o’s native image generator from creating child sexual abuse material (CSAM), OpenAI says it uses a reasoning monitor similar to the one the company deployed for o3 and o4-mini.

Still, several researchers have raised concerns that OpenAI isn’t prioritizing safety as much as it should. One of the company’s red-teaming partners, Metr, said it had relatively little time to test o3 on a benchmark for deceptive behavior. Meanwhile, OpenAI decided not to release a safety report for its GPT-4.1 model, which launched earlier this week.


