i only briefly started playing with this (so nice to be able to run this locally on my framework lol), but the one thing that i am impressed with is that this is a thinking model, so getting to see the reasoning for why the model believes content is/isn't violative is great
RE:
ROOST (@roost.tools)
in case you missed it, gpt-oss-safeguard also came with a technical report that goes into the details of
- limitations
- multilingual performance
...