| Pack logtalk -- logtalk-3.98.0/library/stemming/NOTES.md |
This file is part of Logtalk https://logtalk.org/
SPDX-FileCopyrightText: 1998-2026 Paulo Moura <pmoura@logtalk.org>
SPDX-License-Identifier: Apache-2.0
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
stemming

This library provides word stemming predicates for English text, with support for different word representations: atoms, character lists, or character code lists.
The library includes implementations of two well-known stemming algorithms: the Porter stemmer and the Lovins stemmer.
For the API documentation, open the [../../apis/library_index.html#stemming](../../apis/library_index.html#stemming) link in a web browser.
To load all entities in this library, load the loader.lgt file:
| ?- logtalk_load(stemming(loader)).
To test this library's predicates, load the tester.lgt file:
| ?- logtalk_load(stemming(tester)).
The stemming predicates are defined in parametric objects where the parameter specifies the word representation:
- atom - words are represented as atoms
- chars - words are represented as lists of characters
- codes - words are represented as lists of character codes

The parameter must be bound when sending messages to the objects.

To stem a single word using atoms with the Porter stemmer:
| ?- porter_stemmer(atom)::stem(running, Stem).
Stem = run
yes
To stem a list of words:
| ?- porter_stemmer(atom)::stems([running, walks, easily], Stems).
Stems = [run, walk, easili]
yes
Using character lists:
| ?- porter_stemmer(chars)::stem([r,u,n,n,i,n,g], Stem).
Stem = [r,u,n]
yes
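The codes representation has no example of its own here. The following is a minimal sketch, assuming the library is loaded and that the `porter_stemmer(codes)` object takes and returns lists of character codes in the same way the `chars` object handles character lists. The standard `atom_codes/2` built-in predicate is used only to convert to and from codes:

```logtalk
| ?- logtalk_load(stemming(loader)).

| ?- atom_codes(running, Codes),
     porter_stemmer(codes)::stem(Codes, StemCodes),
     atom_codes(Stem, StemCodes).
```

Given the atom and character list results shown above, `Stem` would be expected to be bound to `run`.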
To stem a single word using atoms with the Lovins stemmer:
| ?- lovins_stemmer(atom)::stem(running, Stem).
Stem = run
yes
To stem a list of words:
| ?- lovins_stemmer(atom)::stems([running, walks, easily], Stems).
Stems = [run, walk, eas]
yes
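Because both objects answer the same `stems/2` message, the two algorithms can be compared on the same input in a single query. This sketch reuses only the calls and results already listed above; the variable names are arbitrary and the exact layout of the answer depends on the backend top-level:

```logtalk
| ?- Words = [running, walks, easily],
     porter_stemmer(atom)::stems(Words, Porter),
     lovins_stemmer(atom)::stems(Words, Lovins).
Words = [running, walks, easily]
Porter = [run, walk, easili]
Lovins = [run, walk, eas]
yes
```

The two stemmers agree on the first two words and differ only on `easily`.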
The Porter stemming algorithm, developed by Martin Porter in 1980, is one of the most widely used stemming algorithms for the English language. It operates through a series of steps that progressively remove suffixes from words.
Reference: Porter, M.F. (1980). An algorithm for suffix stripping. Program, 14(3), 130-137.
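To give a feel for what a single Porter step looks like, here is a minimal sketch of the plural-handling rules from Step 1a of the paper (SSES to SS, IES to I, SS to SS, S removed). It is not the code used by this library: the object name `step1a_sketch` and the predicates `strip/2` and `rule/2` are hypothetical, and the remaining steps and their conditions are left out.

```logtalk
:- object(step1a_sketch).

	:- public(strip/2).

	% strip(+Word, -Stem): apply the first matching rule;
	% a word that matches no rule is returned unchanged
	strip(Word, Stem) :-
		rule(Suffix, Replacement),
		atom_concat(Base, Suffix, Word),
		!,
		atom_concat(Base, Replacement, Stem).
	strip(Word, Word).

	% rule(Suffix, Replacement): tried in order, longest suffix first
	rule(sses, ss).
	rule(ies,  i).
	rule(ss,   ss).
	rule(s,    '').

:- end_object.
```

After loading a file containing this object, a query such as `step1a_sketch::strip(caresses, Stem)` binds `Stem` to `caress`, and `step1a_sketch::strip(ponies, Stem)` binds it to `poni`. Listing the `ss` rule before the `s` rule is what keeps a word such as `caress` intact.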
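The Lovins stemming algorithm, developed by Julie Beth Lovins in 1968, was one of the earliest stemming algorithms. It takes a different approach from Porter: rather than stripping suffixes over several steps, it removes the single longest matching ending from a fixed list and then applies recoding rules to the resulting stem.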
The Lovins algorithm tends to be more aggressive than Porter, sometimes producing stems that are not actual words but are consistent across related word forms.
Reference: Lovins, J.B. (1968). Development of a stemming algorithm. Mechanical Translation and Computational Linguistics, 11(1-2), 22-31.
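The longest-match idea behind the Lovins approach can be sketched as follows. This is an illustration only and not the library implementation. The object name `longest_match_sketch` and its predicates are hypothetical, the four endings are a hand-picked sample, and a uniform minimum stem length of two characters stands in for the per-ending conditions of the published algorithm. The recoding phase is omitted.

```logtalk
:- object(longest_match_sketch).

	:- public(strip/2).

	% strip(+Word, -Stem): remove the longest sample ending that leaves
	% a stem of at least two characters; otherwise return Word unchanged
	strip(Word, Stem) :-
		findall(Length-Base, candidate(Word, Length, Base), [First| Rest]),
		longest(Rest, First, _-Stem),
		!.
	strip(Word, Word).

	% candidate(+Word, -Length, -Base): Base is Word minus a sample
	% ending of the given Length, and Base has two or more characters
	candidate(Word, Length, Base) :-
		ending(Ending),
		atom_concat(Base, Ending, Word),
		atom_length(Base, BaseLength),
		BaseLength >= 2,
		atom_length(Ending, Length).

	% longest(+Candidates, +BestSoFar, -Best): keep the longest ending
	longest([], Best, Best).
	longest([Length-Base| Rest], BestLength-_, Best) :-
		Length > BestLength,
		!,
		longest(Rest, Length-Base, Best).
	longest([_| Rest], Best0, Best) :-
		longest(Rest, Best0, Best).

	% illustrative sample of endings (assumption, not the full list)
	ending(ational).
	ending(al).
	ending(ing).
	ending(s).

:- end_object.
```

For the word `relational`, both `ational` and `al` match, and the longer one wins, so `longest_match_sketch::strip(relational, Stem)` binds `Stem` to `rel`. For `sing`, removing `ing` would leave a one-character stem, so the word is returned unchanged.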
Both algorithms are designed for English text only.