---
title: "Custom Rules"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Custom Rules}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>", eval = TRUE)
```
```{css, echo = FALSE, eval = TRUE}
.llmshieldr-info-box {
  border-left: 4px solid #2f80ed;
  background: #f3f8ff;
  padding: 1rem 1.15rem;
  margin: 1.5rem 0;
  border-radius: 0.35rem;
}
.llmshieldr-info-box h2,
.llmshieldr-info-box h3,
.llmshieldr-info-box h4 {
  margin-top: 0;
}
.llmshieldr-info-box p:last-child,
.llmshieldr-info-box ul:last-child,
.llmshieldr-info-box ol:last-child {
  margin-bottom: 0;
}
```
A policy is a list of `shieldr_rule` objects plus a set of thresholds. You can
start with a built-in policy and append domain-specific rules.
```{r}
library(llmshieldr)
```
For the source model behind the built-in policies, see
`vignette("policy-design", package = "llmshieldr")`.
## Rule Fields
Every rule has the same shape:
- `id`: unique rule identifier. The recommended convention is
`llmXX.category.name`, such as `llm02.ticket_id`.
- `pattern`: regex pattern, or `NULL`
- `fn`: R predicate function, or `NULL`
- `owasp`: OWASP LLM category such as `llm02`
- `severity`: `low`, `medium`, `high`, or `critical`
- `action`: `allow`, `redact`, or `block`
- `description`: human-readable explanation
Exactly one of `pattern` or `fn` must be supplied. Regex rules produce match
spans that can be redacted. Function rules are useful when the condition is
easier to express in R.
Function rules may return:
- `TRUE` or `FALSE`
- one finding list
- a list of finding lists
- a data frame of findings
Finding lists can include `rule_id`, `owasp`, `severity`, `action`,
`description`, `match`, `start`, `end`, and `source`. Include `start` and
`end` when you want custom function findings to participate in redaction.
## Numbers and Thresholds
Severity maps to risk score contributions:
| Severity | Contribution |
| --- | ---: |
| `low` | 0.1 |
| `medium` | 0.3 |
| `high` | 0.6 |
| `critical` | 1.0 |
Findings are deduplicated, and overlapping spans from the same evidence are
scored once. The contributions of distinct findings are then summed, and the
total is capped at `1.0`. Synthetic context findings are capped separately. A
policy's thresholds then decide the final action. The defaults are
`redact_at = 0.4` and `block_at = 0.75`.
```{r}
guardrails <- policy()
guardrails$thresholds
```
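The arithmetic above can be sketched in a few lines of base R. This only illustrates the documented weights and default thresholds; it is not the package's scoring code. `sketch_score()` and `sketch_action()` are hypothetical helpers, the input is assumed to be already deduplicated, the separate cap for synthetic context findings is left out, and treating a score equal to a threshold as crossing it is an assumption. The sketch also ignores each rule's own `action`, whose interaction with the thresholds is not covered here.

```{r}
# Severity weights taken from the table above.
severity_weight <- c(low = 0.1, medium = 0.3, high = 0.6, critical = 1.0)

# Hypothetical helper: sum the weights of distinct findings, capped at 1.0.
sketch_score <- function(severities) {
  min(1, sum(severity_weight[severities]))
}

# Hypothetical helper: map a score to an action using the default thresholds.
sketch_action <- function(score, redact_at = 0.4, block_at = 0.75) {
  if (score >= block_at) {
    "block"
  } else if (score >= redact_at) {
    "redact"
  } else {
    "allow"
  }
}

sketch_action(sketch_score("medium"))
sketch_action(sketch_score(c("medium", "medium")))
sketch_action(sketch_score(c("high", "critical")))
```

Under these assumptions, one medium finding stays below `redact_at`, two medium findings sum to 0.6 and cross it, and a high plus a critical finding is capped at 1.0, which exceeds `block_at`.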
## Regex Rules
Regex rules are the simplest way to redact or block recognizable text.
```{r}
guardrails <- add_rule(
  guardrails,
  id = "llm02.ticket_id",
  pattern = "\\bTICKET-[0-9]{6}\\b",
  owasp = "llm02",
  severity = "medium",
  action = "redact",
  description = "Internal support ticket identifier."
)
scan_prompt("Summarize TICKET-123456 for the support team.", guardrails)
```
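Before relying on a pattern, it can help to try it against a few nearby strings. The sketch below uses base R's `grepl()` with `perl = TRUE`. Whether llmshieldr evaluates patterns with the same regex engine is an assumption, and the `cases` table is made up for illustration.

```{r}
ticket_pattern <- "\\bTICKET-[0-9]{6}\\b"

# Illustrative cases: one positive and several near misses.
cases <- data.frame(
  text = c(
    "Summarize TICKET-123456 for the support team.",
    "TICKET-12345 has too few digits.",
    "TICKET-1234567 has too many digits.",
    "ticket-123456 is lower case."
  ),
  expected = c(TRUE, FALSE, FALSE, FALSE)
)

cases$matched <- grepl(ticket_pattern, cases$text, perl = TRUE)
cases
stopifnot(identical(cases$matched, cases$expected))
```

The pattern is case-sensitive, so `ticket-123456` does not match. Prefix the pattern with `(?i)` if lower-case identifiers should be caught as well.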
## Function Rules
Function rules let you express checks that are easier to write in R than in a
single regular expression.
```{r}
# Predicate: TRUE only when both phrases occur somewhere in the text.
contains_student_address <- function(text) {
  grepl("\\bstudent\\b", text, ignore.case = TRUE) &&
    grepl("\\bhome address\\b", text, ignore.case = TRUE)
}
education <- policy("education_safe")
education <- add_rule(
  education,
  id = "llm02.student.address",
  fn = contains_student_address,
  owasp = "llm02",
  severity = "high",
  action = "redact",
  description = "Student home address reference."
)
scan_prompt("The student home address appears in the form.", education)
```
Function rules can also return span-aware findings:
```{r}
ticket_span_rule <- function(text) {
  # regexpr() returns -1 when there is no match.
  hit <- regexpr("\\bTICKET-[0-9]{6}\\b", text, perl = TRUE)
  if (identical(as.integer(hit[[1]]), -1L)) {
    return(FALSE)
  }
  # Convert the match position and length into an inclusive span.
  start <- as.integer(hit[[1]])
  end <- start + as.integer(attr(hit, "match.length")) - 1L
  list(
    rule_id = "llm02.ticket_id.fn",
    owasp = "llm02",
    severity = "medium",
    action = "redact",
    description = "Internal support ticket identifier.",
    match = substr(text, start, end),
    start = start,
    end = end
  )
}
```
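`regexpr()` reports only the first match. When a prompt may contain several identifiers, a function rule can return a list of finding lists, one per match. The following is a sketch that assumes each finding uses the fields listed under Rule Fields. `ticket_spans_all()` and the rule id `llm02.ticket_id.fn_all` are hypothetical names, and returning `FALSE` when nothing matches mirrors the single-match example.

```{r}
ticket_spans_all <- function(text) {
  # gregexpr() finds every match; [[1]] selects the result for the one input.
  hits <- gregexpr("\\bTICKET-[0-9]{6}\\b", text, perl = TRUE)[[1]]
  if (identical(as.integer(hits[1]), -1L)) {
    return(FALSE)
  }
  starts <- as.integer(hits)
  ends <- starts + as.integer(attr(hits, "match.length")) - 1L
  # Build one finding list per match.
  Map(
    function(start, end) {
      list(
        rule_id = "llm02.ticket_id.fn_all",
        owasp = "llm02",
        severity = "medium",
        action = "redact",
        description = "Internal support ticket identifier.",
        match = substr(text, start, end),
        start = start,
        end = end
      )
    },
    starts,
    ends
  )
}

findings <- ticket_spans_all("Merge TICKET-123456 into TICKET-654321.")
vapply(findings, function(f) f$match, character(1))
```

Calling the function directly like this, before attaching it to a policy with `fn =`, makes it easy to confirm that the spans line up with the text.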
## Industry Examples
Healthcare and life sciences often add identifiers beyond generic PII.
```{r}
pharma <- policy("pharma_gxp")
pharma <- add_rule(
  pharma,
  id = "llm02.site_id",
  pattern = "\\bSITE-[0-9]{3}\\b",
  owasp = "llm02",
  severity = "medium",
  action = "redact",
  description = "Clinical trial site identifier."
)
```
Finance workflows often tighten language around recommendations and promises.
```{r}
finance <- policy("finance_strict")
finance <- add_rule(
  finance,
  id = "llm09.promissory_return",
  pattern = "(?i)guaranteed\\s+(alpha|profit|return)",
  owasp = "llm09",
  severity = "critical",
  action = "block",
  description = "Promissory investment performance claim."
)
```
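In the finance pattern, the `(?i)` prefix makes matching case-insensitive and `\\s+` requires whitespace between the two words. A quick base R check shows which phrasings it catches. It assumes PCRE-style matching via `perl = TRUE`, and the sample sentences are invented.

```{r}
promissory_pattern <- "(?i)guaranteed\\s+(alpha|profit|return)"

# Invented samples: one promissory claim and two benign disclaimers.
samples <- c(
  "This fund offers GUARANTEED returns of 12%.",
  "Returns are not guaranteed.",
  "Past profit does not mean guaranteed future profit."
)

grepl(promissory_pattern, samples, perl = TRUE)
```

The pattern leaves the two disclaimers alone, but it would still match inside a negated phrase such as "no guaranteed return". A blocking rule like this one deserves several benign cases in its test set.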
## Rule Inventory
Use `list_rules()` to inspect a policy before deployment.
```{r}
list_rules(guardrails)
```
The resulting table includes `has_pattern` and `has_fn`, which make it easy to
audit whether a policy is mostly regex-based, function-based, or mixed.
Custom rule ids that do not follow the `llmXX.` naming convention still work,
but `shieldr_rule()` warns because OWASP risk summaries are clearest when rule
ids carry the category prefix.
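A quick way to audit ids before building rules is to test them against the prefix. The regular expression below is an assumption about what `llmXX.` means (the literal `llm`, two digits, then a dot). The check that `shieldr_rule()` actually performs may differ, and `follows_llm_prefix()` is a hypothetical helper.

```{r}
# Hypothetical helper: TRUE when an id starts with "llm", two digits, and a dot.
follows_llm_prefix <- function(ids) {
  grepl("^llm[0-9]{2}\\.", ids)
}

follows_llm_prefix(c("llm02.ticket_id", "llm09.promissory_return", "ticket_id"))
```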
::: {.llmshieldr-info-box}
## Rule Test Checklist
For every new rule, keep at least:
- one positive case that should trigger the rule,
- one nearby negative case that should not trigger,
- one redaction assertion when the rule should redact,
- one policy-level assertion when the rule should block,
- one domain-specific benign case if the rule targets clinical, finance,
education, developer, or other specialized text.
The packaged evaluation corpus at `inst/extdata/security_eval_cases.csv` is a
small starting point for these cases. Add application-specific corpora outside
the package when examples contain real or sensitive data.
:::