Validators
mlforge.validators
Validators for feature data quality checks.
Validators are functions that check column values and return a ValidationResult indicating whether the check passed or failed. They run on the output of feature functions before metrics are applied.
Example
    @feature(
        source="data/transactions.parquet",
        keys=["user_id"],
        validators={
            "amount": [not_null(), greater_than(0)],
            "user_id": [not_null()],
        },
    )
    def user_transactions(df): ...
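To illustrate how a column-to-validators mapping like the one above could be evaluated, here is a minimal sketch. It does not use mlforge: `ValidationResult` is a stand-in with the documented attributes, `not_null_check`, `positive_check`, and `run_validators` are hypothetical names, and columns are plain Python lists rather than whatever Series type mlforge passes in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValidationResult:
    # Stand-in mirroring the documented attributes; not the mlforge class.
    passed: bool
    message: Optional[str] = None
    failed_count: Optional[int] = None


def not_null_check(values: list) -> ValidationResult:
    # Hypothetical check: count None entries in a plain list.
    bad = sum(v is None for v in values)
    if bad:
        return ValidationResult(False, f"{bad} null value(s)", bad)
    return ValidationResult(True)


def positive_check(values: list) -> ValidationResult:
    # Hypothetical check: every non-null value must be > 0.
    bad = sum(v is not None and v <= 0 for v in values)
    if bad:
        return ValidationResult(False, f"{bad} value(s) <= 0", bad)
    return ValidationResult(True)


def run_validators(columns: dict, validators: dict) -> list:
    """Run each column's checks and collect (column, message) failures."""
    failures = []
    for column, checks in validators.items():
        for check in checks:
            result = check(columns[column])
            if not result.passed:
                failures.append((column, result.message))
    return failures


columns = {"user_id": [1, 2, None], "amount": [10.0, -5.0, 3.0]}
failures = run_validators(
    columns,
    {"amount": [not_null_check, positive_check], "user_id": [not_null_check]},
)
print(failures)
```

Every check on a column runs independently, so one column can report several failures in a single pass.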
ValidationResult (dataclass)
Result of running a validator on a column.
Attributes:
| Name | Type | Description |
|---|---|---|
| passed | bool | Whether the validation check passed |
| message | str \| None | Human-readable description of failure (None if passed) |
| failed_count | int \| None | Number of rows that failed validation (None if passed) |
Source code in src/mlforge/validators.py
Validator (dataclass)
Container for a validator function with metadata.
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | Display name for the validator (used in error messages) |
| fn | ValidatorFunc | The validation function that checks a Series |
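The two dataclasses suggest how a custom validator might be packaged: a factory returns a `Validator` whose `fn` produces a `ValidationResult`. The sketch below uses stand-in dataclasses with the documented attributes only. The `even_only` factory is hypothetical, and `fn` takes a plain sequence here because the exact `ValidatorFunc` signature and Series type are not shown in this page.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class ValidationResult:
    # Stand-in with the three documented attributes.
    passed: bool
    message: Optional[str] = None
    failed_count: Optional[int] = None


@dataclass
class Validator:
    # Stand-in with the two documented attributes.
    name: str
    fn: Callable[[Sequence], ValidationResult]


def even_only() -> Validator:
    """Hypothetical custom validator: every value must be even."""

    def check(values: Sequence) -> ValidationResult:
        failed = sum(1 for v in values if v % 2 != 0)
        if failed:
            return ValidationResult(False, f"{failed} odd value(s) found", failed)
        # On success, message and failed_count stay None, as documented.
        return ValidationResult(True)

    return Validator(name="even_only", fn=check)


validator = even_only()
ok = validator.fn([2, 4, 6])
bad = validator.fn([2, 3, 5])
print(validator.name, ok.passed, bad.passed, bad.failed_count)
```

Returning a `Validator` from a factory function matches the call style of the built-ins, such as `not_null()` and `greater_than(0)`.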
greater_than
Validate that all values are strictly greater than a threshold.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | float | The threshold value (exclusive lower bound) | required |
Returns:
| Type | Description |
|---|---|
| Validator | Validator that fails if any values are <= threshold |
Example
validators={"amount": [greater_than(0)]}
greater_than_or_equal
Validate that all values are greater than or equal to a threshold.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | float | The threshold value (inclusive lower bound) | required |
Returns:
| Type | Description |
|---|---|
| Validator | Validator that fails if any values are < threshold |
Example
validators={"quantity": [greater_than_or_equal(0)]}
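The only difference between the strict and inclusive lower-bound checks is how a value equal to the threshold is treated. The sketch below shows this with plain Python comparisons. `count_failures` is a hypothetical helper, not part of mlforge.

```python
def count_failures(values, predicate):
    """Count the values for which the predicate does not hold."""
    return sum(1 for v in values if not predicate(v))


values = [-1, 0, 1, 2]
threshold = 0

# Strict bound (greater_than semantics): a value equal to the threshold fails.
strict_failures = count_failures(values, lambda v: v > threshold)

# Inclusive bound (greater_than_or_equal semantics): the threshold itself passes.
inclusive_failures = count_failures(values, lambda v: v >= threshold)

print(strict_failures, inclusive_failures)
```

The same distinction applies in mirror image to the strict and inclusive upper-bound checks.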
in_range
Validate that all values fall within a specified range.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| min_value | float | Lower bound of the range | required |
| max_value | float | Upper bound of the range | required |
| inclusive | bool | If True, bounds are inclusive. Defaults to True. | True |
Returns:
| Type | Description |
|---|---|
| Validator | Validator that fails if any values are outside the range |
Example
    validators={
        "age": [in_range(0, 120)],
        "score": [in_range(0, 1, inclusive=True)],
    }
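The effect of the `inclusive` flag can be sketched with plain comparisons. This assumes that `inclusive=False` makes both bounds exclusive, which the page does not state explicitly. `out_of_range` is a hypothetical helper operating on a list.

```python
def out_of_range(values, min_value, max_value, inclusive=True):
    """Return the values that fall outside the range."""
    if inclusive:
        return [v for v in values if not (min_value <= v <= max_value)]
    # Assumption: both bounds become exclusive when inclusive is False.
    return [v for v in values if not (min_value < v < max_value)]


scores = [0.0, 0.5, 1.0, 1.2]

inclusive_failures = out_of_range(scores, 0, 1)
exclusive_failures = out_of_range(scores, 0, 1, inclusive=False)

print(inclusive_failures)
print(exclusive_failures)
```

With inclusive bounds only 1.2 is rejected. With exclusive bounds the endpoints 0.0 and 1.0 are rejected as well.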
is_in
Validate that all values are in a set of allowed values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| allowed_values | list | List of valid values | required |
Returns:
| Type | Description |
|---|---|
| Validator | Validator that fails if any values are not in the allowed set |
Example
validators={"status": [is_in(["pending", "approved", "rejected"])]}
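A membership check of this kind amounts to comparing each value against a set of allowed values. The sketch below uses plain Python and a hypothetical helper `not_allowed`. It assumes exact, case-sensitive equality, which may or may not match how mlforge compares values.

```python
def not_allowed(values, allowed_values):
    """Return the values that are missing from the allowed set."""
    allowed = set(allowed_values)  # set lookup keeps each check O(1)
    return [v for v in values if v not in allowed]


statuses = ["pending", "approved", "archived", "rejected", "Pending"]
rejected_values = not_allowed(statuses, ["pending", "approved", "rejected"])

print(rejected_values)
```

Under exact matching, "Pending" fails because of its capital letter, so normalizing case in the feature function before validation may be worthwhile.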
less_than
Validate that all values are strictly less than a threshold.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | float | The threshold value (exclusive upper bound) | required |
Returns:
| Type | Description |
|---|---|
| Validator | Validator that fails if any values are >= threshold |
Example
validators={"discount_rate": [less_than(1.0)]}
less_than_or_equal
Validate that all values are less than or equal to a threshold.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | float | The threshold value (inclusive upper bound) | required |
Returns:
| Type | Description |
|---|---|
| Validator | Validator that fails if any values are > threshold |
Example
validators={"percentage": [less_than_or_equal(100)]}
matches_regex
Validate that all string values match a regex pattern.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pattern | str | Regular expression pattern to match | required |
Returns:
| Type | Description |
|---|---|
| Validator | Validator that fails if any values don't match the pattern |
Example
validators={"email": [matches_regex(r"^[\w.-]+@[\w.-]+\.\w+$")]}
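Patterns are worth testing against known good and bad values before they are attached to a feature. The sketch below uses Python's standard `re` module, not mlforge, and assumes the pattern is anchored with `^` and `$`. Whether mlforge applies patterns as a search or a full match is not stated here. It also shows why the dot before the top-level domain should be escaped.

```python
import re

# Escaped dot: a literal "." must separate the host from the final label.
STRICT = re.compile(r"^[\w.-]+@[\w.-]+\.\w+$")
# Unescaped dot: "." matches any character, so the pattern is looser.
LOOSE = re.compile(r"^[\w.-]+@[\w.-]+.\w+$")

values = ["a.b@example.com", "no-at-sign.example.com", "user@host", "x@y.io"]

strict_failures = [v for v in values if STRICT.search(v) is None]
loose_failures = [v for v in values if LOOSE.search(v) is None]

print(strict_failures)
print(loose_failures)
```

The loose pattern accepts "user@host" even though it contains no dot after the "@", which is rarely what an email check intends.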
not_null
Validate that a column contains no null values.
Returns:
| Type | Description |
|---|---|
| Validator | Validator that fails if any null values are found |
Example
validators={"user_id": [not_null()]}
unique
Validate that all values in a column are unique (no duplicates).
Returns:
| Type | Description |
|---|---|
| Validator | Validator that fails if any duplicate values are found |
Example
validators={"transaction_id": [unique()]}
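When a uniqueness check fails, it usually helps to know which values repeat. The sketch below finds them with the standard library's `collections.Counter`. `duplicated_values` is a hypothetical helper, and counting only the surplus occurrences as failed rows is one possible convention, not necessarily how mlforge computes `failed_count`.

```python
from collections import Counter


def duplicated_values(values):
    """Map each value that occurs more than once to its occurrence count."""
    counts = Counter(values)
    return {value: n for value, n in counts.items() if n > 1}


transaction_ids = ["t1", "t2", "t2", "t3", "t1", "t1"]

duplicates = duplicated_values(transaction_ids)
# Assumption: every occurrence after the first counts as one failed row.
surplus_rows = sum(n - 1 for n in duplicates.values())

print(duplicates, surplus_rows)
```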