This file is part of Logtalk <https://logtalk.org/>
SPDX-FileCopyrightText: 2016 Barry Evans <barryevans@kyndi.com>
SPDX-FileCopyrightText: 2016 Paulo Moura <pmoura@logtalk.org>
SPDX-License-Identifier: Apache-2.0
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
# dead_code_scanner

This tool detects *likely* dead code in Logtalk entities and in Prolog modules compiled as objects. Predicates (and non-terminals) are classified as dead code when:

- they are local (non-scoped) predicates that are never called, directly or indirectly;
- they are resources listed in `uses/2` or `use_module/2` directives that are never actually called.
Given the possibility of false positives, care must be taken before deleting reported dead code to ensure that it is, in fact, code that is not used. A common cause of false positives is the use of conditional compilation directives to provide implementations for predicates missing in some systems, or different predicate implementations per operating system.
The dead_code_scanner.lgt source file implements the scanning predicates for
finding dead code in entities, libraries, and directories. The source file
dead_code_scanner_messages.lgt defines the default translations for the
messages printed when scanning for dead code. These messages can be intercepted
to customize the output, e.g., to make it less verbose, or to integrate the
tool with GUI IDEs and continuous integration servers.
The tool API documentation is available at:
[../../apis/library_index.html#dead-code-scanner](../../apis/library_index.html#dead-code-scanner)
For sample queries, please see the SCRIPT.txt file in the tool directory.
This tool can be loaded using the query:

```logtalk
| ?- logtalk_load(dead_code_scanner(loader)).
```
To test this tool, load the `tester.lgt` file:

```logtalk
| ?- logtalk_load(dead_code_scanner(tester)).
```
This tool provides a set of predicates that allows scanning entities,
libraries, files, and directories. See the tool API documentation for
details. The source code to be analyzed should be loaded with the
`source_data` and `optimize` flags turned on (possibly set from a
loader file).
For machine-readable integrations, the `diagnostics/2-3`, `diagnostic/2-3`,
`diagnostics_summary/2-3`, and `diagnostics_preflight/2-3` predicates can be
used to collect dead-code results and post-filter counts without relying on
printed messages. Diagnostics are returned as terms of the form:

```logtalk
diagnostic(RuleId, Severity, Confidence, Message, Context, File, Lines, Properties)
```

where `Severity` is one of `error`, `warning`, or `note` and `Confidence` is
one of `high`, `medium`, or `low`. The `export/3-4` predicates serialize
scan results in SARIF format using the `json` library.
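A minimal usage sketch, assuming the two-argument `diagnostics/2` version takes a scan target (the `library(my_app)` target form and the argument order shown are assumptions):

```logtalk
% enumerate and print the collected diagnostic/8 terms for a library
| ?- dead_code_scanner::diagnostics(library(my_app), Diagnostics),
     list::member(diagnostic(RuleId, Severity, Confidence, _, _, File, Lines, _), Diagnostics),
     write(RuleId-Severity-Confidence), write(' at '), write(File-Lines), nl,
     fail; true.
```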
As an example, assume that we want to scan an application with a library
alias `my_app`. The following goals could be used:
```logtalk
| ?- set_logtalk_flag(source_data, on),
     set_logtalk_flag(optimize, on).
yes

| ?- logtalk_load(my_app(loader)).
...
yes

| ?- dead_code_scanner::library(my_app).
...
```
For complex applications that make use of sub-libraries, there are also
`rlibrary/1-2` predicates that perform a recursive scan of a library and
all its sub-libraries. Conversely, we may be interested in scanning a single
entity:
```logtalk
| ?- dead_code_scanner::entity(some_object).
...
```
For other usage examples, see the SCRIPT.txt file in the tool directory.
A set of options is available to specify code that should be excluded when looking for unused predicates (and non-terminals):

- `exclude_directories(Directories)` - directories to exclude (default `[]`); all sub-directories of the excluded directories are also excluded; directories may be listed by full or relative path
- `exclude_files(Files)` - source files to exclude (default `[]`); files may be listed by full path or basename, with or without extension
- `exclude_libraries(Libraries)` - libraries to exclude (default `[startup, scratch_directory]`)
- `exclude_entities(Entities)` - entities to exclude (default `[]`)
- `exclude_predicates(Predicates)` - predicates (and non-terminals) to exclude (default `[]`); supports local indicators such as `foo/1` and `bar//2` plus qualified indicators such as `object::baz/2` and `module:qux/3`
- `ignore_findings(Findings)` - findings to ignore (default `[]`); intended for reusable CI baselines and supports partially instantiated structured finding terms such as `dead_predicate(local_dead_code, medium, _, object, some_object, helper/2, _, _)`
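For instance, a scan that skips test code and a stray development file might use the exclusion options like this (the library, file, and option values are hypothetical):

```logtalk
% scan my_app but ignore its tests library and a scaffolding file
| ?- dead_code_scanner::library(my_app, [
         exclude_libraries([startup, scratch_directory, my_app_tests]),
         exclude_files(['dev_scaffolding.pl'])
     ]).
```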
The `diagnostics_summary/2-3` predicates return a summary term aggregating:

- `RuleCounts` containing `rule_count(RuleId, Count)` terms,
- `SeverityCounts` containing `severity_count(Severity, Count)` terms, and
- `ConfidenceCounts` containing `confidence_count(Confidence, Count)` terms,

plus `ContextSummaries`, a list of per-context summary terms.
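A minimal sketch, assuming `diagnostics_summary/2` takes the same scan targets as the scanning predicates (the `library(my_app)` target is an assumption):

```logtalk
% collect aggregated counts for a library target and print them
| ?- dead_code_scanner::diagnostics_summary(library(my_app), Summary),
     write(Summary), nl.
```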
The `diagnostics_preflight/2-3` predicates return an ordered set of warning
terms describing analysis prerequisites that affect result quality for a
target, such as files that were not loaded with `source_data(on)` or
`optimize(on)`.
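A triage sketch, again assuming `diagnostics_preflight/2` accepts the same scan targets as the other predicates:

```logtalk
% check analysis prerequisites before trusting the scan results
| ?- dead_code_scanner::diagnostics_preflight(library(my_app), Warnings),
     (   Warnings == [] ->
         write('all analysis prerequisites satisfied'), nl
     ;   write(Warnings), nl
     ).
```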
The current finding classes are:

- `local_dead_code`
- `unused_uses_resource`
- `unused_use_module_resource`
Confidence is a triage-oriented rating attached to every finding. The JSON
schema allows the values `high`, `medium`, and `low`, but the tool currently
emits only `high` and `medium`:

- `high` - used when the tool has stronger evidence that the reported code is really unused
- `medium` - used when the result is still useful but can be affected by missing compilation information, such as optimized call graph data
- `low` - reserved for future use; it is accepted by the export schema but is not currently generated by the scanner
For `local_dead_code`, the confidence depends on the effective compilation mode
of the analyzed file:

- When the file was compiled without `optimize(on)`, the confidence is `medium`.
- When the file was compiled with `optimize(on)`, the confidence is upgraded to `high` because the compiler can provide more complete call graph information, reducing the risk of false positives caused by missed meta-calls.
The effective mode is inferred from the loaded file itself, not from the current scanner defaults. Therefore, results reflect how the analyzed code was actually compiled.
Unused `uses/2` and `use_module/2` resources are always reported with `high`
confidence because they are derived from the generated linking clauses and the
recorded call graph rather than from heuristic reachability alone.
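To illustrate, in the following sketch the `reverse/2` entry would be reported as an unused `uses/2` resource, as the object only ever calls `list::append/3` (the object and predicate names are illustrative):

```logtalk
:- object(shopping).

    :- public(combine/3).

    % reverse/2 is declared here but never called below, so the
    % scanner reports it as an unused uses/2 resource
    :- uses(list, [append/3, reverse/2]).

    combine(List1, List2, Combined) :-
        append(List1, List2, Combined).

:- end_object.
```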
When scanning using the text-based predicates such as `entity/1-2`, `file/1-2`,
and `all/0-1`, the tool emits preflight warnings whenever the analyzed files
were not loaded with `source_data(on)` or `optimize(on)`, so that potentially
less reliable results can be triaged more carefully.
This tool is a diagnostics producer. To generate SARIF reports from its
diagnostics, load the standalone `sarif` tool and call
`sarif::generate(dead_code_scanner, Target, Sink, Options)`, where `Target`
and `Options` are the same ones accepted by the diagnostics predicates.
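For example, assuming a file sink and the same `library/1` target form used by the scanning predicates (the sink term and file name are assumptions):

```logtalk
% load the sarif tool and write a report for the my_app library
| ?- logtalk_load(sarif(loader)),
     sarif::generate(dead_code_scanner, library(my_app), file('my_app.sarif'), []).
```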
The generated report uses the SARIF 2.1.0 JSON format and is suitable for
code scanning integrations that consume SARIF reports. The shared generator
includes deterministic result fingerprints derived from the canonical finding
and preserves the rule descriptors and triage-oriented severity mapping
described below.
SARIF findings are exported using one rule descriptor per finding class:

- `local_dead_code`
- `unused_uses_resource`
- `unused_use_module_resource`
Result severities are also triage-aware instead of using a single static SARIF
level for all findings:

- `local_dead_code` with `high` confidence is exported as `warning`
- `local_dead_code` with `medium` or `low` confidence is exported as `note`
- `unused_uses_resource` with `high` confidence is exported as `error`
- `unused_use_module_resource` with `high` confidence is exported as `error`
This mapping allows CI consumers to distinguish advisory dead-code findings
from stronger unused-resource findings without reinterpreting the custom
class and confidence properties.
SARIF reports include the same preflight information as invocation-level
toolExecutionNotifications, allowing consumers to distinguish prerequisite
warnings from dead code findings.
When the analyzed code is inside a git repository, the shared generator also
includes the current branch and commit hash as optional run properties and a
versionControlProvenance entry when the repository URI can be derived.
## Integration with the `make` tool

The `loader.lgt` file sets a make target action that will call the
`dead_code_scanner::all` goal whenever the `logtalk_make(check)` goal
(or its top-level abbreviation, `{?}`) is called.
Use of local meta-calls with goal arguments only known at runtime can result
in false positives. When using library or user-defined meta-predicates,
compiling the source files with the `optimize` flag turned on may allow
meta-calls to be resolved at compile time, thus allowing calling information
for the meta-arguments to be recorded and avoiding false positives for
predicates that are only meta-called.
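For example, in the following sketch `print_square/1` is only reachable through a `meta::map/2` meta-call; when the `optimize` flag is off, that call may not be recorded and the predicate can be wrongly reported as dead code (the object is illustrative):

```logtalk
:- object(squares).

    :- public(print_all/1).

    % print_square/1 is only meta-called via meta::map/2; compiling
    % with optimize(on) lets the compiler record this call
    print_all(Numbers) :-
        meta::map(print_square, Numbers).

    print_square(N) :-
        Square is N * N,
        write(Square), nl.

:- end_object.
```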
This tool can also be applied to Prolog modules that Logtalk is able to compile
as objects. For example, if the Prolog module file is named `module.pl`, try:

```logtalk
| ?- logtalk_load(module, [source_data(on)]).
```
Due to the lack of standardization of module systems and the abundance of proprietary extensions, this solution is not expected to work for all cases.
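After loading, the compiled module can be scanned like any other entity; assuming the module is also named `module`:

```logtalk
| ?- dead_code_scanner::entity(module).
```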
This tool can also be applied to plain Prolog code. For example, if the Prolog
file is named `code.pl`, simply define an object including its code:

```logtalk
:- object(code).

    :- include('code.pl').

:- end_object.
```

Save the object to e.g. a `code.lgt` file in the same directory as the
Prolog file and then load it with the required flags turned on:

```logtalk
| ?- logtalk_load(code, [source_data(on), optimize(on)]).
```
Alternatively, use the `object_wrapper_hook` provided by the `hook_objects`
library:

```logtalk
| ?- logtalk_load(hook_objects(loader)).
...

| ?- logtalk_load(code, [hook(object_wrapper_hook), source_data(on), optimize(on)]).
```
With either wrapping solution, pay special attention to any compilation warnings, which may signal issues that could prevent the plain Prolog code from being fully analyzed when wrapped by an object.
Some tests fail when using the ECLiPSe and Trealla Prolog backends due to their implementation of the `multifile/1` directive, specifically when the multifile predicates are also dynamic, with clauses defined in a source file: reconsulting the file adds the new clauses to the old ones instead of replacing them.