---
title: "Custom Rules"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Custom Rules}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>", eval = TRUE)
```

```{css, echo = FALSE, eval = TRUE}
.llmshieldr-info-box {
  border-left: 4px solid #2f80ed;
  background: #f3f8ff;
  padding: 1rem 1.15rem;
  margin: 1.5rem 0;
  border-radius: 0.35rem;
}

.llmshieldr-info-box h2,
.llmshieldr-info-box h3,
.llmshieldr-info-box h4 {
  margin-top: 0;
}

.llmshieldr-info-box p:last-child,
.llmshieldr-info-box ul:last-child,
.llmshieldr-info-box ol:last-child {
  margin-bottom: 0;
}
```

Policies are lists of `shieldr_rule` objects plus thresholds. You can start
with a built-in policy and append domain-specific rules.

```{r}
library(llmshieldr)
```

For the source model behind the built-in policies, see
`vignette("policy-design", package = "llmshieldr")`.

## Rule Fields

Every rule has the same shape:

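- `id`: unique rule identifier. The recommended convention is
  `llmXX.category.name`, such as `llm02.student.address`. Shorter ids that keep
  the `llmXX.` prefix, such as `llm02.ticket_id`, are also used in this
  vignette.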
- `pattern`: regex pattern, or `NULL`
- `fn`: R predicate function, or `NULL`
- `owasp`: OWASP LLM category such as `llm02`
- `severity`: `low`, `medium`, `high`, or `critical`
- `action`: `allow`, `redact`, or `block`
- `description`: human-readable explanation

Exactly one of `pattern` or `fn` must be supplied. Regex rules produce match
spans that can be redacted. Function rules are useful when the condition is
easier to express in R.

Function rules may return:

- `TRUE` or `FALSE`
- one finding list
- a list of finding lists
- a data frame of findings

Finding lists can include `rule_id`, `owasp`, `severity`, `action`,
`description`, `match`, `start`, `end`, and `source`. Include `start` and
`end` when you want custom function findings to participate in redaction.
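
As a sketch of the data frame return shape, the function below finds every
ticket-like identifier in a string and returns one row per match. It uses only
base R. The column names mirror the finding fields listed above. Whether
llmshieldr requires all of them, and how it treats the `FALSE` fallback
alongside a data frame return, are assumptions to check against the package
documentation. The names `find_ticket_ids` and `llm02.ticket_id.df` are
hypothetical.

```{r}
# Hypothetical function rule: one finding row per ticket identifier.
find_ticket_ids <- function(text) {
  hits <- gregexpr("\\bTICKET-[0-9]{6}\\b", text, perl = TRUE)[[1]]
  if (identical(as.integer(hits[[1]]), -1L)) {
    return(FALSE)  # no match: the rule did not fire
  }
  # 1-based, inclusive character offsets for each match.
  start <- as.integer(hits)
  end <- start + as.integer(attr(hits, "match.length")) - 1L
  data.frame(
    rule_id = "llm02.ticket_id.df",
    owasp = "llm02",
    severity = "medium",
    action = "redact",
    description = "Internal support ticket identifier.",
    match = substring(text, start, end),
    start = start,
    end = end,
    stringsAsFactors = FALSE
  )
}

findings <- find_ticket_ids("Merge TICKET-123456 into TICKET-654321.")
findings[, c("match", "start", "end")]
```

Returning one row per match keeps each span separate, so each identifier can be
handled on its own.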

## Numbers and Thresholds

Severity maps to risk score contributions:

| Severity | Contribution |
| --- | ---: |
| `low` | 0.1 |
| `medium` | 0.3 |
| `high` | 0.6 |
| `critical` | 1.0 |

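Scoring proceeds in steps: findings are deduplicated, overlapping spans from
the same evidence are scored once, the contributions of distinct findings are
summed, and the total is capped at `1.0`. Synthetic context findings are capped
separately. A policy's thresholds then decide the final action; the defaults
are `redact_at = 0.4` and `block_at = 0.75`. For example, two distinct `medium`
findings contribute 0.3 each for a total of 0.6, which lies between the two
defaults.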

```{r}
guardrails <- policy()
guardrails$thresholds
```
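
The arithmetic above can be sketched in base R. This is a simplified
illustration, not the package's implementation. It assumes the findings are
already deduplicated, ignores the separate cap on synthetic context findings
and any per-rule `action`, and assumes that a score at or above a threshold
triggers that action (the boundary behavior is a guess). `severity_weight`,
`score_findings`, and `action_for` are hypothetical names.

```{r}
# Contributions from the severity table above.
severity_weight <- c(low = 0.1, medium = 0.3, high = 0.6, critical = 1.0)

# Sum the contributions of distinct findings and cap the total at 1.0.
score_findings <- function(severities) {
  min(1, sum(severity_weight[severities]))
}

# Map a score to an action using the default thresholds.
action_for <- function(score, redact_at = 0.4, block_at = 0.75) {
  if (score >= block_at) {
    "block"
  } else if (score >= redact_at) {
    "redact"
  } else {
    "allow"
  }
}

action_for(score_findings("medium"))
action_for(score_findings(c("medium", "medium")))
action_for(score_findings(c("high", "medium")))
score_findings(c("critical", "high"))
```

The last call shows the cap: the raw sum is 1.6, but the score never exceeds 1.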

## Regex Rules

Regex rules are the simplest way to redact or block recognizable text.

```{r}
guardrails <- add_rule(
  guardrails,
  id = "llm02.ticket_id",
  pattern = "\\bTICKET-[0-9]{6}\\b",
  owasp = "llm02",
  severity = "medium",
  action = "redact",
  description = "Internal support ticket identifier."
)

scan_prompt("Summarize TICKET-123456 for the support team.", guardrails)
```

## Function Rules

Function rules let you express checks that are easier to write in R than in a
single regular expression.

```{r}
contains_student_address <- function(text) {
  grepl("\\bstudent\\b", text, ignore.case = TRUE) &&
    grepl("\\bhome address\\b", text, ignore.case = TRUE)
}

education <- policy("education_safe")
education <- add_rule(
  education,
  id = "llm02.student.address",
  fn = contains_student_address,
  owasp = "llm02",
  severity = "high",
  action = "redact",
  description = "Student home address reference."
)

scan_prompt("The student home address appears in the form.", education)
```

Function rules can also return span-aware findings:

```{r}
ticket_span_rule <- function(text) {
  # regexpr() returns -1 when the pattern does not match.
  hit <- regexpr("\\bTICKET-[0-9]{6}\\b", text, perl = TRUE)
  if (identical(as.integer(hit[[1]]), -1L)) {
    return(FALSE)
  }
  # Convert match position and length into 1-based, inclusive offsets.
  start <- as.integer(hit[[1]])
  end <- start + as.integer(attr(hit, "match.length")) - 1L
  # Return one finding list; start and end let it participate in redaction.
  list(
    rule_id = "llm02.ticket_id.fn",
    owasp = "llm02",
    severity = "medium",
    action = "redact",
    description = "Internal support ticket identifier.",
    match = substr(text, start, end),
    start = start,
    end = end
  )
}
```
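
To see why `start` and `end` matter, the sketch below replaces a reported span
with a placeholder using base R only. It is not how llmshieldr performs
redaction internally. `redact_span` and the `[REDACTED]` placeholder are
hypothetical, and positions are assumed to be 1-based, inclusive character
offsets, as returned by `regexpr()`.

```{r}
# Hypothetical helper: replace characters start..end with a placeholder.
redact_span <- function(text, start, end, replacement = "[REDACTED]") {
  paste0(
    substr(text, 1L, start - 1L),
    replacement,
    substr(text, end + 1L, nchar(text))
  )
}

text <- "Summarize TICKET-123456 for the support team."
hit <- regexpr("\\bTICKET-[0-9]{6}\\b", text, perl = TRUE)
start <- as.integer(hit)
end <- start + attr(hit, "match.length") - 1L

redact_span(text, start, end)
```

A function rule that returns only `TRUE` carries no offsets, so a helper like
this would have nothing to replace.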

## Industry Examples

Healthcare and life sciences often add identifiers beyond generic PII.

```{r}
pharma <- policy("pharma_gxp")
pharma <- add_rule(
  pharma,
  id = "llm02.site_id",
  pattern = "\\bSITE-[0-9]{3}\\b",
  owasp = "llm02",
  severity = "medium",
  action = "redact",
  description = "Clinical trial site identifier."
)
```

Finance workflows often tighten language around recommendations and promises.

```{r}
finance <- policy("finance_strict")
finance <- add_rule(
  finance,
  id = "llm09.promissory_return",
  pattern = "(?i)guaranteed\\s+(alpha|profit|return)",
  owasp = "llm09",
  severity = "critical",
  action = "block",
  description = "Promissory investment performance claim."
)
```

## Rule Inventory

Use `list_rules()` to inspect a policy before deployment.

```{r}
list_rules(guardrails)
```

The resulting table includes `has_pattern` and `has_fn`, which make it easy to
audit whether a policy is mostly regex-based, function-based, or mixed.

Custom rule ids that do not follow the `llmXX.` naming convention still work,
but `shieldr_rule()` warns because OWASP risk summaries are clearest when rule
ids carry the category prefix.

::: {.llmshieldr-info-box}
## Rule Test Checklist

For every new rule, keep at least:

- one positive case that should trigger the rule,
- one nearby negative case that should not trigger,
- one redaction assertion when the rule should redact,
- one policy-level assertion when the rule should block,
- one domain-specific benign case if the rule targets clinical, finance,
  education, developer, or other specialized text.

The packaged evaluation corpus at `inst/extdata/security_eval_cases.csv` is a
small starting point for these cases. Add application-specific corpora outside
the package when examples contain real or sensitive data.
:::
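
A minimal sketch of the first two checklist items, using base R only. It pairs
the ticket and promissory-return patterns from this vignette with one positive
and one nearby negative case each. It assumes the patterns behave under
`grepl(..., perl = TRUE)` the way they do inside llmshieldr; a real test suite
would more likely assert on `scan_prompt()` results. The `cases` table is
illustrative and is not taken from the packaged corpus.

```{r}
ticket_pattern <- "\\bTICKET-[0-9]{6}\\b"
promise_pattern <- "(?i)guaranteed\\s+(alpha|profit|return)"

cases <- data.frame(
  pattern = c(ticket_pattern, ticket_pattern, promise_pattern, promise_pattern),
  text = c(
    "Summarize TICKET-123456 for the support team.",  # positive
    "Summarize TICKET-12345 for the support team.",   # negative: five digits
    "This fund offers a Guaranteed Return.",          # positive, mixed case
    "Returns are not guaranteed."                     # negative: no claim
  ),
  expected = c(TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Evaluate each pattern against its paired text.
cases$actual <- mapply(
  function(p, x) grepl(p, x, perl = TRUE),
  cases$pattern,
  cases$text,
  USE.NAMES = FALSE
)

stopifnot(identical(cases$actual, cases$expected))
```

Keeping the cases in a table makes it cheap to add a benign domain-specific
example each time a rule produces a false positive.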