Core Guardian

Core Guardian is DumGum’s built-in safety system. It monitors every conversation and automatically handles unwanted behavior. A great AI Persona lives or dies by consistency: the moment it breaks character, gets manipulated, starts sounding like a bot, or engages with bad actors, the experience falls apart. Core Guardian runs silently in the background to prevent exactly that, keeping the Persona focused, in character, and engaged with real Users who are genuinely there for the experience.

Core Guardian is available on V2 model projects only.

Default vs. Custom Settings

Every project ships with Default Settings: Core Guardian’s essential protections are active from day one, no setup required.

If you need finer control, switch to Custom Settings. This unlocks individual toggles for each guardrail so you can enable or disable them per project.

Guardrails

Underage Detection

Checks whether the User appears to be under 18 based on their messages. If confirmed, the conversation is permanently stopped and the Persona tells the User to leave. Enabled by default.

AI Suspicious

Detects when a User is probing whether the Persona is an AI through direct or indirect questions that don’t fit the flow of a normal conversation. Enabled by default.

Unknown Language

Fires when the User writes in a language the Persona isn’t set up to speak. The Persona signals it doesn’t understand, and if the User keeps going, the conversation is stopped. Enabled by default.

Message Repetition

Spots Users who keep sending the exact same short message over and over, a common pattern for testing whether the Persona gives scripted responses. The Persona calls it out, and repeat offenders receive a temporary ban. Enabled by default.

Malicious Content

Detects messages involving severe illegal content. Disabled by default.

Jailbreak Attempt

Catches attempts to override the Persona’s instructions or break its character. This guardrail is always on and cannot be disabled. It’s what keeps all your other Persona settings intact.
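As a quick reference, the default states described above can be written down as a small lookup table. This is only an illustrative sketch: the key names, the `ALWAYS_ON` set, and the `apply_custom_settings` helper are hypothetical and are not DumGum API identifiers. It assumes Custom Settings behave like per-guardrail overrides on top of the defaults, with Jailbreak Attempt locked on.

```python
# Default states as listed in the guardrail descriptions above.
# All names here are hypothetical, chosen for illustration only.
DEFAULT_GUARDRAILS = {
    "underage_detection": True,
    "ai_suspicious": True,
    "unknown_language": True,
    "message_repetition": True,
    "malicious_content": False,  # disabled by default
    "jailbreak_attempt": True,   # always on
}

# Guardrails that cannot be disabled.
ALWAYS_ON = frozenset({"jailbreak_attempt"})


def apply_custom_settings(overrides):
    """Return the effective guardrail states after applying overrides.

    Raises KeyError for unknown guardrail names and ValueError when an
    override tries to disable an always-on guardrail.
    """
    unknown = set(overrides) - set(DEFAULT_GUARDRAILS)
    if unknown:
        raise KeyError(f"unknown guardrail(s): {sorted(unknown)}")

    locked = sorted(name for name in ALWAYS_ON if overrides.get(name) is False)
    if locked:
        raise ValueError(f"cannot disable always-on guardrail(s): {locked}")

    # Overrides win over defaults; untouched guardrails keep their default.
    return {**DEFAULT_GUARDRAILS, **overrides}
```

For example, `apply_custom_settings({"malicious_content": True})` turns on Malicious Content and leaves every other guardrail at its default, while `apply_custom_settings({"jailbreak_attempt": False})` raises a `ValueError`.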

Configuring Guardrails

With Custom Settings, you’re in full control. Each guardrail except Jailbreak Attempt, which is always on, can be switched on or off independently with its toggle. You can also fine-tune what happens when a detection fires: whether the Persona sends a final reply before the ban takes effect (Answer), and how long each ban lasts (Ban Durations: up to five escalating durations in hours, e.g. 1, 3, 24).

Bans are applied at the conversation level only. The User’s account is never blocked, and they can start a new conversation normally.
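To make the escalating Ban Durations concrete, here is a minimal sketch of how a list such as 1, 3, 24 could map to repeat offenses. The function names are hypothetical, and two behaviors are assumptions rather than documented facts: that "escalating" means each duration is strictly longer than the previous one, and that offenses beyond the end of the list reuse the last duration.

```python
MAX_BAN_STEPS = 5  # "up to five escalating durations"


def validate_ban_durations(hours):
    """Check a list of ban durations (in hours) and return a copy of it."""
    if not 1 <= len(hours) <= MAX_BAN_STEPS:
        raise ValueError(f"expected 1 to {MAX_BAN_STEPS} durations, got {len(hours)}")
    if any(h <= 0 for h in hours):
        raise ValueError("durations must be positive")
    # Assumption: "escalating" is read as strictly increasing.
    if any(later <= earlier for earlier, later in zip(hours, hours[1:])):
        raise ValueError("durations must be strictly increasing")
    return list(hours)


def ban_hours_for_offense(hours, offense_number):
    """Return the ban length in hours for the nth offense (1-based).

    Assumption: once the list is exhausted, the last duration repeats.
    """
    if offense_number < 1:
        raise ValueError("offense_number starts at 1")
    durations = validate_ban_durations(hours)
    index = min(offense_number, len(durations)) - 1
    return durations[index]
```

Under these assumptions, `ban_hours_for_offense([1, 3, 24], 2)` returns `3`, and any offense from the third onward returns `24`.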

Each toggle’s color shows its current state at a glance:

Grey: deactivated
Pink: fully activated
Yellow: active with custom configuration (sub-options have been adjusted)
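The color legend above boils down to a two-flag decision. The sketch below uses a hypothetical `toggle_color` helper and assumes that a deactivated guardrail is shown grey even if its sub-options were adjusted earlier, which the legend does not state explicitly.

```python
def toggle_color(enabled, customized):
    """Map a guardrail's state to the toggle color from the legend.

    enabled:    whether the guardrail is switched on
    customized: whether its sub-options differ from the defaults
    """
    if not enabled:
        # Assumption: grey takes precedence over any custom configuration.
        return "grey"
    return "yellow" if customized else "pink"
```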

Advanced Configuration

For API-level control over guardrails, including disabling safety features programmatically, see the API reference.
