API documentation

Python bindings for the YARA scanner boreal.

Modules:

Name Description
boreal

Python bindings for the YARA scanner boreal.

Classes:

Name Description
AddRuleError

Raised when failing to compile a rule

CompilerProfile

Profile to use when compiling rules.

Error

Generic boreal error

Match

Details about a matching rule.

Rule

Details about a rule contained in the Scanner object.

RuleString

Details about a string.

RulesIter

Iterator over the rules of a Scanner object.

ScanError

Raised when a scan fails

Scanner

Holds a list of rules, and provides methods to run them on files or bytes.

StringMatchInstance

Details about a single match instance of a string.

StringMatches

Details about the matches of a string.

SyntaxError

Raised when failing to compile a rule

TimeoutError

Raised when a scan times out

Functions:

Name Description
compile

Compile YARA rules and generate a Scanner object.

load

Load rules from a serialized scanner object.

set_config

Modify some global parameters

Attributes:

Name Type Description
CALLBACK_ABORT int

Return value used in callbacks to abort the scan.

CALLBACK_ALL int

Call the match callback after a rule is evaluated.

CALLBACK_CONTINUE int

Return value used in callbacks to signal the scan must continue.

CALLBACK_MATCHES int

Call the match callback when a rule matches.

CALLBACK_NON_MATCHES int

Call the match callback when a rule does not match.

CALLBACK_TOO_MANY_MATCHES int

A string has had too many matches.

__version__ str

Version of the boreal-py library

modules list[str]

List of available modules

Attributes

CALLBACK_ABORT module

CALLBACK_ABORT: int = 1

Return value used in callbacks to abort the scan.

Callbacks used in the match method should return this value to abort the scan. If the scan is aborted, the match method will not raise any exception but will end immediately, returning the results it has computed so far.

CALLBACK_ALL module

CALLBACK_ALL: int = 3

Call the match callback after a rule is evaluated.

If specified in the which_callbacks parameter of the match method, the callback will be called after a rule is evaluated, regardless of whether it has matched or not. The matches attribute of the passed rule can be used to know if the rule has matched or not.

CALLBACK_CONTINUE module

CALLBACK_CONTINUE: int = 0

Return value used in callbacks to signal the scan must continue.

Callbacks used in the match method should return this value to keep the scan going.

CALLBACK_MATCHES module

CALLBACK_MATCHES: int = 1

Call the match callback when a rule matches.

If specified in the which_callbacks parameter of the match method, the callback will be called when a rule matches.

CALLBACK_NON_MATCHES module

CALLBACK_NON_MATCHES: int = 2

Call the match callback when a rule does not match.

If specified in the which_callbacks parameter of the match method, the callback will be called when a rule does not match.

CALLBACK_TOO_MANY_MATCHES module

CALLBACK_TOO_MANY_MATCHES: int = 6

A string has had too many matches.

This is used in the warnings_callback of the match method to indicate the warning kind.

CallbackResult module-attribute

CallbackResult: TypeAlias = int

Return status that can be returned by a callback.

This must be one of:

  • CALLBACK_CONTINUE: keep the scan going.
  • CALLBACK_ABORT: abort the scan.

ConsoleCallback module-attribute

ConsoleCallback: TypeAlias = Callable[[str], None]

Callback handling uses of the console module in rules.

It receives the log as the lone argument.

ExternalValue module-attribute

ExternalValue: TypeAlias = str | bytes | int | float | bool

The value of an external symbol usable in a rule condition.

IncludeCallback module-attribute

IncludeCallback: TypeAlias = Callable[
    [str, str | None, str], str
]

Callback used to resolve include directives.

Receive three arguments:

  • The path being included.

  • The path of the current document. Can be None if the current document was specified as a string, such as when using the source or sources parameter.

  • The current namespace.

Must return a string which is the included document.
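As an illustration, an include callback can serve included documents from an in-memory table rather than from the filesystem. The sketch below follows the three-argument signature described above; the INCLUDED_DOCUMENTS table, its contents and the function name are hypothetical.

```python
# Hypothetical in-memory store of rule documents, keyed by include path.
INCLUDED_DOCUMENTS = {
    "common.yar": 'rule common_marker { strings: $a = "marker" condition: $a }',
}


def resolve_include(path, current_path, namespace):
    """Return the text of the document named by an include directive.

    current_path is None when the including document was given as a string.
    """
    try:
        return INCLUDED_DOCUMENTS[path]
    except KeyError:
        origin = current_path if current_path is not None else "<string source>"
        raise LookupError(
            f"cannot include {path!r} from {origin} (namespace {namespace!r})"
        ) from None
```

A function of this shape would be passed as the include_callback argument of compile.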

MatchCallback module-attribute

MatchCallback: TypeAlias = Callable[
    [RuleDetails], CallbackResult
]

Callback called when rules are evaluated.
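A match callback is a plain function that receives a RuleDetails dictionary and returns a CallbackResult. The following is a minimal sketch of a callback that records matching rules and aborts the scan once a limit is reached. The helper name and the limit parameter are hypothetical, and the two constants are copied from the values documented above so that the sketch runs on its own.

```python
# Values mirrored from the constants documented above so the sketch runs on
# its own; real code would use boreal.CALLBACK_CONTINUE and
# boreal.CALLBACK_ABORT instead.
CALLBACK_CONTINUE = 0
CALLBACK_ABORT = 1


def make_limited_collector(limit):
    """Build a match callback that aborts the scan after `limit` matches.

    Returns the callback and the list it fills with (namespace, rule) pairs.
    """
    collected = []

    def on_rule(rule_details):
        # rule_details is the RuleDetails dictionary passed by the scanner.
        if rule_details["matches"]:
            collected.append((rule_details["namespace"], rule_details["rule"]))
        if len(collected) >= limit:
            return CALLBACK_ABORT
        return CALLBACK_CONTINUE

    return on_rule, collected
```

The returned callback would be passed as the callback argument of the match method, while the returned list can be inspected after the scan.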

MetadataValue module-attribute

MetadataValue: TypeAlias = bytes | int | bool

The value of a metadata key declared in a rule.

ModulesCallback module-attribute

ModulesCallback: TypeAlias = Callable[
    [dict[str, Any]], CallbackResult
]

Callback called when a module is evaluated.

The callback receives the dynamic values of the module as the first argument. The name of the module is accessible with the "module" key.
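A modules callback could, for instance, store the values of each evaluated module under the module's name. This is only a sketch: the helper name is hypothetical, and the constant mirrors the documented value of CALLBACK_CONTINUE.

```python
CALLBACK_CONTINUE = 0  # mirrors boreal.CALLBACK_CONTINUE as documented above


def make_module_recorder():
    """Build a modules callback that stores each module's values by name."""
    recorded = {}

    def on_module(values):
        # The module name is stored under the "module" key.
        name = values["module"]
        recorded[name] = {k: v for k, v in values.items() if k != "module"}
        return CALLBACK_CONTINUE

    return on_module, recorded
```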

WarningCallback module-attribute

WarningCallback: TypeAlias = Callable[
    [WarningType, RuleString], CallbackResult
]

Callback called when a warning is emitted during a scan.
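As a rough sketch, a warning callback can turn each warning into a readable message using the RuleString attributes. The helper name and the message format are hypothetical; the constants mirror the values documented above.

```python
# Mirrored from the documented constants; real code would use the values
# exposed by the boreal module.
CALLBACK_CONTINUE = 0
CALLBACK_TOO_MANY_MATCHES = 6


def make_warning_logger():
    """Build a warning callback that collects human-readable messages."""
    messages = []

    def on_warning(warning_type, rule_string):
        location = f"{rule_string.namespace}:{rule_string.rule}"
        if warning_type == CALLBACK_TOO_MANY_MATCHES:
            messages.append(
                f"too many matches for string {rule_string.string} in {location}"
            )
        else:
            messages.append(f"warning {warning_type} in {location}")
        return CALLBACK_CONTINUE  # keep scanning after logging

    return on_warning, messages
```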

WarningType module-attribute

WarningType: TypeAlias = int

Type of warning passed to the warning callback.

This can be one of:

  • CALLBACK_TOO_MANY_MATCHES: a string has had too many matches.

__version__ module

__version__: str = '1.0.0'

Version of the boreal-py library

modules module

modules: list[str]

List of available modules

Classes

AddRuleError

Bases: Error

Raised when failing to compile a rule

CompilerProfile

Profile to use when compiling rules.

Attributes:

Name Type Description
Memory CompilerProfile

Profile to use when compiling rules.

Speed CompilerProfile

Profile to use when compiling rules.

Attributes

Memory instance-attribute

Profile to use when compiling rules.

Speed instance-attribute

Profile to use when compiling rules.

Error

Bases: Exception

Generic boreal error

Match

Details about a matching rule.

Methods:

Name Description
__eq__

Return self==value.

__ge__

Return self>=value.

__gt__

Return self>value.

__hash__

Return hash(self).

__le__

Return self<=value.

__ne__

Return self!=value.

Attributes:

Name Type Description
meta dict[str, MetadataValue]

Dictionary with metadata associated to the rule

namespace str

Namespace of the matching rule

rule str

Name of the matching rule

strings list[StringMatches]

Details about the string matches of the rule.

tags list[str]

List of tags associated to the rule

Attributes

meta instance-attribute
meta: dict[str, MetadataValue]

Dictionary with metadata associated to the rule

namespace instance-attribute
namespace: str

Namespace of the matching rule

rule instance-attribute
rule: str

Name of the matching rule

strings instance-attribute
strings: list[StringMatches]

Details about the string matches of the rule.

tags instance-attribute
tags: list[str]

List of tags associated to the rule

Functions

__eq__
__eq__(other: object) -> bool

Return self==value.

__ge__
__ge__(other: object) -> bool

Return self>=value.

__gt__
__gt__(other: object) -> bool

Return self>value.

__hash__
__hash__() -> int

Return hash(self).

__le__
__le__(other: object) -> bool

Return self<=value.

__ne__
__ne__(other: object) -> bool

Return self!=value.

Readable

Bases: Protocol

A readable object

Rule

Details about a rule contained in the Scanner object.

Attributes:

Name Type Description
identifier str

Name of the rule

is_global bool

Is the rule global

is_private bool

Is the rule private

meta dict[str, MetadataValue]

Dictionary with metadata associated with the rule

namespace str

Namespace of the rule

tags list[str]

List of tags associated with the rule

Attributes

identifier instance-attribute
identifier: str

Name of the rule

is_global instance-attribute
is_global: bool

Is the rule global

is_private instance-attribute
is_private: bool

Is the rule private

meta instance-attribute
meta: dict[str, MetadataValue]

Dictionary with metadata associated with the rule

namespace instance-attribute
namespace: str

Namespace of the rule

tags instance-attribute
tags: list[str]

List of tags associated with the rule

RuleDetails

Bases: TypedDict

Details about a rule passed to the match callback.

Attributes:

Name Type Description
matches bool

Did the rule match

meta dict[str, MetadataValue]

Dictionary with metadata associated to the rule

namespace str

Namespace of the matching rule

rule str

Name of the matching rule

strings list[StringMatches]

Details about the string matches of the rule

tags list[str]

List of tags associated to the rule

Attributes

matches instance-attribute
matches: bool

Did the rule match

meta instance-attribute
meta: dict[str, MetadataValue]

Dictionary with metadata associated to the rule

namespace instance-attribute
namespace: str

Namespace of the matching rule

rule instance-attribute
rule: str

Name of the matching rule

strings instance-attribute
strings: list[StringMatches]

Details about the string matches of the rule

tags instance-attribute
tags: list[str]

List of tags associated to the rule

RuleString

Details about a string.

Attributes:

Name Type Description
namespace str

Namespace of the rule containing the string.

rule str

Name of the rule containing the string.

string str

Name of the string.

Attributes

namespace instance-attribute
namespace: str

Namespace of the rule containing the string.

rule instance-attribute
rule: str

Name of the rule containing the string.

string instance-attribute
string: str

Name of the string.

RulesIter

Bases: Iterator[Rule]

Iterator over the rules of a Scanner object.

Methods:

Name Description
__iter__

Implement iter(self).

__next__

Implement next(self).

Functions

__iter__
__iter__() -> RulesIter

Implement iter(self).

__next__
__next__() -> Rule

Implement next(self).

ScanError

Bases: Error

Raised when a scan fails

Scanner

Bases: Iterable[Rule]

Holds a list of rules, and provides methods to run them on files or bytes.

Methods:

Name Description
__iter__

Implement iter(self).

match

Scan data against the compiled rules.

save

Save the Scanner object into a bytestring.

set_params

Modify scan parameters.

Attributes:

Name Type Description
warnings list[str]

List of warnings generated when compiling rules.

Attributes

warnings instance-attribute
warnings: list[str]

List of warnings generated when compiling rules.

Functions

__iter__
__iter__() -> RulesIter

Implement iter(self).
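Since a Scanner is iterable, its rules can be listed with a regular for loop. The sketch below uses only the Rule attributes documented above; the function name and the include_private flag are hypothetical.

```python
def describe_rules(scanner, include_private=False):
    """Return "namespace:identifier" for each rule held by a scanner.

    `scanner` is expected to be iterable and to yield objects exposing the
    Rule attributes documented above.
    """
    names = []
    for rule in scanner:
        if rule.is_private and not include_private:
            continue
        names.append(f"{rule.namespace}:{rule.identifier}")
    return names
```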

match
match(
    filepath: str | None = None,
    data: str | bytes | None = None,
    pid: int | None = None,
    externals: dict[str, ExternalValue] | None = None,
    callback: MatchCallback | None = None,
    which_callbacks: int | None = None,
    fast: bool | None = None,
    timeout: int | None = None,
    modules_data: dict[str, Any] | None = None,
    modules_callback: ModulesCallback | None = None,
    warnings_callback: WarningCallback | None = None,
    console_callback: ConsoleCallback | None = None,
    allow_duplicate_metadata: bool | None = False,
) -> list[Match]

Scan data against the compiled rules.

By default, this function will scan the provided input and return a list of the matching rules. However, this behavior can be customized greatly with different parameters.

One of filepath, data or pid must be specified.

Parameters:

Name Type Description Default
filepath str | None

Path to the file to scan.

None
data str | bytes | None

Data to scan.

None
pid int | None

The pid of the process to scan.

None
externals dict[str, ExternalValue] | None

A dictionary specifying values for external symbols. The keys are the names of the symbols, and the values are used during the scan in place of the default values specified during compilation. All symbols must have been declared during compilation, see the externals argument in compile().

None
callback MatchCallback | None

Callback called when a rule is evaluated. The which_callbacks argument is used to specify which rules are passed to this callback.

None
which_callbacks int | None

Specify which rules to pass to the callback. This must be one of:

  • CALLBACK_MATCHES: the callback is called when a rule matches.
  • CALLBACK_NON_MATCHES: the callback is called when a rule does not match.
  • CALLBACK_ALL: the callback is called in both cases.

The default value depends on the compatibility mode: it is CALLBACK_ALL if in compat mode, CALLBACK_MATCHES otherwise.

Note that enabling non matching rules disables fast mode.

None
fast bool | None

Enable or disable fast mode. If fast mode is enabled, strings may not be scanned if rules can be evaluated without them. That is, matching rules are not guaranteed to contain details about string matches. The default value depends on the compatibility mode: it is False if in compat mode, and True otherwise.

None
timeout int | None

Specify the number of seconds after which the scan times out.

None
modules_data dict[str, Any] | None

Specify data to pass to modules. This is a dictionary mapping the module name to its data. Only the cuckoo module is supported, and the library must have been built with cuckoo support.

None
modules_callback ModulesCallback | None

Callback called when a module is evaluated. The callback will receive the dynamic values of the module.

None
warnings_callback WarningCallback | None

Callback called when the scan emits a warning.

None
console_callback ConsoleCallback | None

Callback called when the console module is used.

None
allow_duplicate_metadata bool | None

If true, the metadata returned with matching rules will be a dictionary that maps the metadata keys to a list of all values associated with this key. This can be used when multiple metadata with the same key are specified in the same rule.

False

Returns: A list of all the rules that matched.

Raises:

Type Description
TypeError

A provided argument has the wrong type, or none of the input arguments were provided.

ScanError

An error happened during the scan.

TimeoutError

The scan timed out.
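To show how the returned Match objects can be consumed, here is a small sketch that flattens the results of a scan into (rule, string identifier, offset) triples. The function name is hypothetical. Fast mode is disabled explicitly because, as noted above, string details are not guaranteed when it is enabled.

```python
def summarize_matches(scanner, data, timeout=None):
    """Scan `data` and return (rule, string identifier, offset) triples."""
    # fast=False asks for string details to be computed.
    kwargs = {"data": data, "fast": False}
    if timeout is not None:
        kwargs["timeout"] = timeout  # in seconds

    hits = []
    for match in scanner.match(**kwargs):
        for string_matches in match.strings:
            for instance in string_matches.instances:
                hits.append(
                    (match.rule, string_matches.identifier, instance.offset)
                )
    return hits
```

A caller that needs to handle failures would typically wrap the call in a try block catching ScanError and TimeoutError.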

save
save(
    filepath: str | None = None,
    file: Writable | None = None,
    to_bytes: bool = False,
) -> bytes | None

Save the Scanner object into a bytestring.

This method allows serializing the object into a bytestring that can then be reloaded at a later date or on another machine using the load function.

See the boreal documentation for more details about this feature and its limitations.

One of filepath, file or to_bytes must be provided.

Parameters:

Name Type Description Default
filepath str | None

The path to the file where the serialized scanner will be written.

None
file Writable | None

An opened file where the serialization will be written. This can be any object that exposes a write and a flush method, as long as the write method accepts bytes.

None
to_bytes bool

If true, return a bytestring containing the serialization.

False

Returns: The serialized bytestring if to_bytes is true, None otherwise.

Raises:

Type Description
TypeError

A provided argument has the wrong type, or none of the input arguments were provided.

Error

The serialization failed.
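Because the file parameter accepts any object with write and flush methods, an in-memory buffer should be usable as a target. The following sketch assumes that io.BytesIO satisfies this requirement; the function name is hypothetical.

```python
import io


def save_to_buffer(scanner):
    """Serialize a scanner through its `file` parameter and return the bytes."""
    buffer = io.BytesIO()  # exposes write() and flush(), and accepts bytes
    scanner.save(file=buffer)
    return buffer.getvalue()
```

When only the bytes are needed, passing to_bytes=True is the more direct route.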

set_params
set_params(
    use_mmap: bool | None = None,
    string_max_nb_matches: int | None = None,
    fragmented_scan_mode: str | None = None,
    process_memory: bool | None = None,
    max_fetched_region_size: int | None = None,
    memory_chunk_size: int | None = None,
) -> None

Modify scan parameters.

These parameters are documented in detail in the boreal documentation.

Parameters:

Name Type Description Default
use_mmap bool | None

If true, use mmap to scan files specified by the filepath argument in the match method.

None
string_max_nb_matches int | None

Maximum number of matches for a given string. If this limit is reached, matches are no longer counted nor reported.

None
fragmented_scan_mode str | None

Scan mode to use on fragmented memory, notably during process scanning. See the boreal documentation for more details. This must be one of legacy, fast or single_pass.

None
process_memory bool | None

Scanned bytes are part of the memory of a process.

None
max_fetched_region_size int | None

Maximum size of a fetched region, used during process scanning.

None
memory_chunk_size int | None

Size of memory chunks to scan, used during process scanning.

None

Raises:

Type Description
TypeError

A provided argument has the wrong type
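As a hedged example, a small wrapper can validate fragmented_scan_mode before forwarding it to set_params. The wrapper name, its default mode and the max_matches parameter are hypothetical choices made for illustration only.

```python
# The three values listed for fragmented_scan_mode above.
FRAGMENTED_SCAN_MODES = ("legacy", "fast", "single_pass")


def configure_process_scan(scanner, mode="fast", max_matches=None):
    """Set the fragmented scan mode, optionally capping matches per string.

    Returns the keyword arguments that were forwarded to set_params.
    """
    if mode not in FRAGMENTED_SCAN_MODES:
        raise ValueError(f"unknown fragmented scan mode: {mode!r}")

    params = {"fragmented_scan_mode": mode}
    if max_matches is not None:
        params["string_max_nb_matches"] = max_matches
    scanner.set_params(**params)
    return params
```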

StringMatchInstance

Details about a single match instance of a string.

Methods:

Name Description
__hash__

Return hash(self).

plaintext

The matched data after application of the xor operation.

Attributes:

Name Type Description
matched_data bytes

The matched data.

matched_length int

Length of the entire match before truncation.

offset int

Offset of the match.

xor_key int

Xor key used in the match.

Attributes

matched_data instance-attribute
matched_data: bytes

The matched data.

If the match exceeded the max_match_data limit specified in the set_config function, the data is truncated.

matched_length instance-attribute
matched_length: int

Length of the entire match before truncation.

This is the actual length of the matched data, which can be different from the length of the matched_data field, since this field can be truncated.
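The relationship between the two fields can be checked directly. A minimal sketch, with a hypothetical function name:

```python
def is_truncated(instance):
    """Tell whether matched_data holds fewer bytes than the full match."""
    return len(instance.matched_data) < instance.matched_length
```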

offset instance-attribute
offset: int

Offset of the match.

xor_key instance-attribute
xor_key: int

Xor key used in the match.

Functions

__hash__
__hash__() -> int

Return hash(self).

plaintext
plaintext() -> bytes

The matched data after application of the xor operation.

If the string had a xor modifier, this method can be used to get the matched data after application of the xor key.
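To illustrate the idea, the sketch below applies a single-byte XOR key to some data. It assumes that the xor operation is a byte-wise XOR of matched_data with xor_key, as the YARA xor modifier suggests; it is an illustration of that relationship, not the library's implementation, and the function name is hypothetical.

```python
def xor_bytes(data, key):
    """XOR every byte of `data` with a single-byte key (0-255)."""
    return bytes(byte ^ key for byte in data)
```

Applying the same key twice restores the original bytes, which is why the key found during the match is enough to recover the plaintext.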

StringMatches

Details about the matches of a string.

Methods:

Name Description
__hash__

Return hash(self).

is_xor

Does the string have the xor modifier.

Attributes:

Name Type Description
identifier str

Name of the string.

instances list[StringMatchInstance]

List of matches for the string.

Attributes

identifier instance-attribute
identifier: str

Name of the string.

instances instance-attribute
instances: list[StringMatchInstance]

List of matches for the string.

Functions

__hash__
__hash__() -> int

Return hash(self).

is_xor
is_xor() -> bool

Does the string have the xor modifier.

SyntaxError

Bases: Error

Raised when failing to compile a rule

TimeoutError

Bases: Error

Raised when a scan times out

Writable

Bases: Protocol

A writable object

Functions

compile builtin

compile(
    filepath: str | None = None,
    filepaths: dict[str, str] | None = None,
    source: str | None = None,
    sources: dict[str, str] | None = None,
    file: Readable | None = None,
    externals: dict[str, ExternalValue] | None = None,
    includes: bool = True,
    error_on_warning: bool = False,
    include_callback: IncludeCallback | None = None,
    strict_escape: bool | None = None,
    profile: CompilerProfile | None = None,
) -> Scanner

Compile YARA rules and generate a Scanner object.

One of filepath, filepaths, source, sources or file must be passed.

Parameters:

Name Type Description Default
filepath str | None

Path to a file containing the rules to compile.

None
filepaths dict[str, str] | None

Dictionary where the value is a path to a file, containing rules to compile, and the key is the name of the namespace that will contain those rules.

None
source str | None

String containing the rules to compile.

None
sources dict[str, str] | None

Dictionary where the value is a string containing the rules to compile, and the key is the name of the namespace that will contain those rules.

None
file Readable | None

An opened file containing the rules to compile. This can be any object that exposes a read method.

None
externals dict[str, ExternalValue] | None

Dictionary of external symbols to make available during compilation. The key is the name of the external symbol, and the value is the original value to assign to this symbol. This original value can be replaced during scanning by specifying an externals dictionary, see the Scanner::match method.

None
includes bool

Allow rules to use the include directive. If set to False, any use of the include directive will result in a compilation error.

True
error_on_warning bool

If true, make the compilation fail when a warning is emitted. If false, warnings can be found in the resulting Scanner object, see Scanner::warnings.

False
include_callback IncludeCallback | None

If specified, this callback is used to resolve include directives. The callback will receive three arguments:

  • The path being included.

  • The path of the current document. Can be None if the current document was specified as a string, such as when using the source or sources parameter.

  • The current namespace.

The callback must return a string which is the included document.

None
strict_escape bool | None

If true, invalid escape sequences in regexes will generate warnings. The default value depends on the yara compatibility mode: it is False if in compat mode, or True otherwise.

None
profile CompilerProfile | None

Profile to use when compiling the rules. If not specified, CompilerProfile::Speed is used.

None

Returns:

Type Description
Scanner

a Scanner object that holds the compiled rules.

Raises:

Type Description
TypeError

A provided argument has the wrong type, or none of the input arguments were provided.

AddRuleError

A rule failed to compile.
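A typical call compiles a rule string and then inspects the warnings attribute of the resulting Scanner. The sketch below wraps this in a helper that also catches AddRuleError. The helper name and its boreal_module parameter are hypothetical; the parameter exists only so that the sketch can be exercised with a stand-in object when the library is not installed.

```python
def compile_source(source, externals=None, boreal_module=None):
    """Compile a rule string and return (scanner, messages).

    On failure the scanner is None and messages holds the error text.
    """
    if boreal_module is None:
        import boreal as boreal_module  # deferred so the sketch loads without it

    try:
        scanner = boreal_module.compile(source=source, externals=externals)
    except boreal_module.AddRuleError as exc:
        return None, [str(exc)]
    # Warnings do not fail the compilation unless error_on_warning is set.
    return scanner, list(scanner.warnings)
```

In ordinary use the call would simply be compile_source('rule a { condition: true }'), leaving boreal_module unset.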

load builtin

load(
    filepath: str | None = None,
    file: Readable | None = None,
    data: bytes | None = None,
) -> Scanner

Load rules from a serialized scanner object.

A scanner can be serialized into a bytestring and reloaded using this function.

See the boreal documentation for more details about this feature and its limitations.

One of filepath, file or data must be provided.

Parameters:

Name Type Description Default
filepath str | None

The path to the file containing the serialized scanner.

None
file Readable | None

An opened file containing the serialized scanner. This can be any object that exposes a read method, as long as this read method returns bytes.

None
data bytes | None

The serialized bytes.

None

Returns:

Type Description
Scanner

a Scanner object.

Raises:

Type Description
TypeError

A provided argument has the wrong type, or none of the input arguments were provided.

Error

The deserialization failed.
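Combined with the save method, this allows a scanner to be round-tripped through its serialized form. A minimal sketch follows, in which the function name is hypothetical and the load parameter stands for the loading function, normally boreal.load.

```python
def clone_scanner(scanner, load):
    """Round-trip a scanner through its serialized form.

    `load` is the loading function, normally boreal.load.
    """
    serialized = scanner.save(to_bytes=True)  # bytes, since to_bytes is true
    return load(data=serialized)
```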

set_config builtin

set_config(
    max_strings_per_rule: int | None = None,
    max_match_data: int | None = None,
    stack_size: int | None = None,
    yara_compatibility: bool | None = None,
) -> None

Modify some global parameters

Parameters:

Name Type Description Default
max_strings_per_rule int | None

Maximum number of strings allowed in a single rule. If a rule has more strings than this limit, its compilation will fail.

None
max_match_data int | None

Maximum length for the match data returned in match results. The match details returned in results will be truncated if they exceed this limit. Default value is 512.

None
stack_size int | None

Unused, this is accepted purely for compatibility with yara.

None
yara_compatibility bool | None

Enable or disable full YARA compatibility. See the global documentation of this library for more details.

None

Raises:

Type Description
TypeError

A provided argument has the wrong type