EmacsWiki: Icicles - Completion Methods and Styles

Icicles provides different methods to complete your minibuffer input, dividing these between keys ‘TAB’ and ‘S-TAB’ (these are the keys by default, but you can use any keys). Icicles calls the methods provided by ‘TAB’ “prefix” completion methods, and it calls the methods provided by ‘S-TAB’ “apropos” completion methods.

Some Icicles Commands Hard-Code Completion Methods

Some Icicles commands that allow multi-completion input ignore your current choices of ‘TAB’ and ‘S-TAB’ completion method.

In particular, this is the case for commands ‘icicle-file’ and ‘icicle-buffer’ and similar. Such commands have their own way of matching the multi-completion parts.

If you want to use alternative completion methods for completion of file and buffer names then use a different command. In that case, consider customizing option ‘icicle-top-level-key-bindings’ to remove the default key bindings for such commands.

Vanilla Emacs Styles and Option `completion-styles'

Starting with Emacs 23, Emacs provides completion styles, which, like Icicles completion methods, are different ways to complete your minibuffer input. The available styles are defined by non-option variable ‘completion-styles-alist’. They include ‘basic’, which was the original vanilla completion behavior; ‘partial-completion’; ‘initials’; and (for Emacs 24 and later) ‘substring’. They also include ‘emacs21’ and ‘emacs22’, for the vanilla completion behavior from those Emacs releases. See the Emacs doc for an explanation of completion styles.

In vanilla Emacs there is only one set of completion styles that is ever in effect, defined by option ‘completion-styles’. It is a list of different ways to match your input. Each style in the list is tried, in turn, until one of them successfully completes your input.

All completion candidates you see come from the same style. You have no control over which style will actually be used for any given input, other than ordering the list ahead of time. And you have no way of knowing which style was actually used to produce a given set of candidates. The relation between your input pattern and the matches is thus sometimes not so clear. There is no way to know, for example, that initial matching failed and partial matching succeeded.
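The try-each-style-in-turn behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Emacs implements it: the two matcher functions are simplified stand-ins for styles ‘basic’ and ‘substring’, and the names `complete_with_styles`, `STYLES`, and `COMMANDS` are hypothetical.

```python
def basic(user_input, candidates):
    """Simplified stand-in for style 'basic': plain prefix matching."""
    return [c for c in candidates if c.startswith(user_input)]


def substring(user_input, candidates):
    """Simplified stand-in for style 'substring': match anywhere."""
    return [c for c in candidates if user_input in c]


def complete_with_styles(styles, user_input, candidates):
    """Try each style in order and return the first non-empty result.

    All returned candidates come from a single style; later styles
    are never consulted once one of them succeeds.
    """
    for name, matcher in styles:
        matches = matcher(user_input, candidates)
        if matches:
            return name, matches
    return None, []


COMMANDS = ["find-file", "dired-find-file", "forward-char"]
STYLES = [("basic", basic), ("substring", substring)]

# 'basic' succeeds, so 'dired-find-file' is never offered,
# even though 'substring' would have matched it.
print(complete_with_styles(STYLES, "find", COMMANDS))
# 'basic' fails here, so 'substring' gets its turn.
print(complete_with_styles(STYLES, "red", COMMANDS))
```

The sketch returns the style name only to make the point visible; in vanilla Emacs you are not told which style produced the candidates.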

In vanilla Emacs the styles of ‘completion-styles’ can only be used together – all or none; they are never alternatives that you can choose at runtime.

Icicles completion methods are instead alternatives – only one is used at a time to complete your input, and you can switch from one method to another easily. For prefix completion (‘TAB’) you switch methods using **‘C-(’**. For apropos completion you switch using **‘M-(’**.

Prefix Completion Method `vanilla'

When you choose Icicles prefix completion method ‘vanilla’ you get essentially the behavior of vanilla Emacs completion, that is, completion according to a list of styles, which are tried one after the other.

But rather than limiting you to a single styles list (option ‘completion-styles’), you can choose anytime from any of several lists that you define using option **‘icicle-completion-style-sets’**. Command **‘icicle-choose-completion-style-set’** does this – it sets the current style set and the value of option ‘completion-styles’ to whichever set you choose (using completion). With a prefix argument, it also saves the new value of ‘completion-styles’ for future Emacs sessions.

And just as it is quick and easy to flip, during completion, from one Icicles completion method to another (using ‘C-(’ or ‘M-(’), so it is with the completion style sets of method ‘vanilla’. For the duration of the current command, you can change to the next style set using **‘C-M-(’** (command ‘icicle-next-completion-style-set’).

Among other things, this means that you can try completing using one style set and, if that does not succeed, switch to another. Any of the sets in ‘icicle-completion-style-sets’ can contain any number of styles, in any order. In particular, a set can be a singleton, which means that you can selectively try to complete using different individual styles.

Completion method ‘vanilla’ is the only method that is subdivided into styles.

Note too this difference between the use of ‘vanilla’ completion in Icicles and completion in vanilla Emacs: In Icicles your entire minibuffer input is matched – the position of the cursor is irrelevant. In vanilla Emacs you can get different matches depending on where the cursor is.

Icicles Completion Methods

The completion methods available for cycling via ‘C-(’ or ‘M-(’ are defined by options **‘icicle-TAB-completion-methods’** and **‘icicle-S-TAB-completion-methods-alist’**, respectively. The first method in each list is the default (initial) method.

By default, the prefix completion methods (‘TAB’) include ‘vanilla’ (see Prefix Completion Method `vanilla'), ‘basic’ (which is the same as vanilla completion style ‘basic’), and the following methods, which provide different kinds of what might be called “fuzzy” matching:

‘fuzzy’ – This method uses a fairly sophisticated matching algorithm that seems to account for various typing mistakes. This algorithm is provided by library fuzzy-match.el, so I call its use in Icicles “**fuzzy completion**”. You must have library fuzzy-match.el to use this.
‘swank’ – This method completes (only) symbols, using the algorithm of el-swank-fuzzy.el – see that library for details.

By default, the apropos completion methods (‘S-TAB’) include ‘apropos’ (regexp matching) and the following methods, which also provide different kinds of what might be called “fuzzy” matching. See Fuzzy Completion for further descriptions of each.

‘scatter’ – This is a simple, poor man’s fuzzy matching method that I call “**scatter matching**”. Ido calls it “flex” matching. The TextMate editor has the same thing for file-name matching (only), without naming it. It matches your input characters, in order, against completion candidates, but possibly with intervening (non-newline) characters. It amounts to matching input ‘abc’ as if it were the regexp ‘a.*b.*c’.
‘SPC scatter’ – This is another poor man’s fuzzy method, which is used by some Emacs packages such as Ivy. It matches the parts of your input that are separated by ‘SPC’ characters, matching arbitrary text at the separations between those parts.
‘Levenshtein’ – This method checks whether two strings differ by at most a given number of character operations, the so-called “Levenshtein distance”. You must have library levenshtein.el to use this.
‘Levenshtein strict’ – Like ‘Levenshtein’, but instead of checking whether your input is within a given distance of some substring of a candidate, it checks whether it is within that distance of the entire candidate. Library levenshtein.el is required.
‘Jaro-Winkler’ – This method gives matching weight to having both (a) more characters that match in the right positions (Jaro) and (b) a longer exact prefix within the first four characters (Winkler). Library fuzzy.el, which is part of package ‘auto-complete’, is required.

If you have your own method of matching then you can use that too, by adding it to option ‘icicle-S-TAB-completion-methods-alist’ for use by ‘S-TAB’.

My own opinion about the relative usefulness of the various completion methods, in order from the most useful: ‘apropos’, ‘basic’, ‘vanilla’, ‘scatter’, ‘SPC scatter’, ‘fuzzy’, ‘Levenshtein’, ‘Jaro-Winkler’, and ‘swank’. YMMV.

Besides all of these completion methods, remember that you can get ordinary substring matching with ‘S-TAB’ by using **‘C-`’** to toggle escaping of regexp special characters. With special characters escaped, ‘S-TAB’ does literal substring completion. (You can also get substring completion via completion style ‘substring’.)
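To see why escaping special characters turns regexp matching into literal substring matching, here is a small Python sketch. It uses Python’s `re` module, whose regexp syntax differs in details from Emacs regexps, and the function name `apropos_matches` and the sample names are made up for the illustration.

```python
import re


def apropos_matches(user_input, candidates, escape_specials=False):
    """Return candidates that contain a match for USER_INPUT.

    With ESCAPE_SPECIALS, regexp special characters in the input are
    quoted, so the input can only match itself, as a literal substring.
    """
    pattern = re.escape(user_input) if escape_specials else user_input
    return [c for c in candidates if re.search(pattern, c)]


NAMES = ["a.b", "axb", "a.bc", "ab"]

# As a regexp, '.' matches any single character.
print(apropos_matches("a.b", NAMES))
# Escaped, '.' matches only a literal period.
print(apropos_matches("a.b", NAMES, escape_specials=True))
```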

Changing Completion Method

You can change completion methods easily at any time, by hitting a key in the minibuffer:

**‘C-(’** (command **‘icicle-next-TAB-completion-method’**) to cycle among ‘TAB’ completion methods: ‘vanilla’, ‘basic’, ‘fuzzy’, and ‘swank’ (‘vanilla’ only for Emacs 23 and later; ‘fuzzy’ only if you have library fuzzy-match.el; ‘swank’ only if you have library el-swank-fuzzy.el).
**‘M-(’** (command **‘icicle-next-S-TAB-completion-method’**) to cycle among ‘S-TAB’ completion methods: ‘apropos’, ‘scatter’, ‘SPC scatter’, ‘Levenshtein’, ‘Levenshtein strict’, and ‘Jaro-Winkler’ (‘Jaro-Winkler’ only if you have library fuzzy.el, which is part of package ‘auto-complete’).

Repeating ‘C-(’ and ‘TAB’ or ‘M-(’ and ‘S-TAB’ on the fly for the same input can be a good way to learn the differences between the various completion methods.

If you provide a prefix argument to ‘C-(’ or ‘M-(’, then the newly chosen method is used only for the current command. More precisely, the previously active method is restored as soon as you return to the top level.

Note this difference when cycling completion style sets using ‘C-M-(’: the effect is only for the current command. For method cycling you need to use a prefix argument to affect only the current command. With no prefix argument, ‘C-(’ and ‘M-(’ affect both the current command and subsequent behavior.

Command-Specific Completion Methods

Sometimes you might want to make a different set of completion methods available during input. You can use options **‘icicle-TAB-completion-methods-per-command’** and **‘icicle-S-TAB-completion-methods-per-command’** to do this. These define the methods to be made available during specific commands that read input with completion. That is, they give you command-specific control over ‘C-(’ and ‘M-(’.

The per-command control is provided by advising (‘defadvice’) the particular commands. You can also do this interactively, using commands ‘icicle-set-TAB-methods-for-command’ and ‘icicle-set-S-TAB-methods-for-command’. Invoking one of these with a negative prefix argument removes the advice, restoring the default choice of methods for the target command.

For example, the following interaction sets the available ‘TAB’ methods for command ‘icicle-read-color-WYSIWYG’ to fuzzy and basic:

  M-x icicle-set-TAB-methods-for-command RET
  Command: icicle-read-color-WYSIWYG RET
  TAB methods: fuzzy RET
  TAB methods: basic RET
  TAB methods: RET

Fuzzy will be the default method for this command, since it is first.

And the following interaction removes the special treatment for ‘C-(’ during ‘icicle-read-color-WYSIWYG’, restoring the default ‘TAB’ methods that are defined by option ‘icicle-TAB-completion-methods’:

  C-- M-x icicle-set-TAB-methods-for-command RET
  Command: icicle-read-color-WYSIWYG RET

Fuzzy Completion

This section presents details about the Icicles completion methods that might be called “fuzzy”.

“Fuzzy” is itself a fuzzy term. The effect of ‘apropos’ (regexp) matching or matching using completion style ‘partial-completion’ can sometimes be thought of as fuzzy. In fact, the same could be said of any matching that ignores some of your input. For example, ‘partial-completion’ can be similar to ‘scatter’ completion, but it requires you to explicitly mark where to skip ahead (using ‘*’, ‘ ’ (space), or ‘-’).

Scatter-Match (Flex) Completion

What Icicles calls “scatter-match” completion (‘S-TAB’ completion method ‘scatter’) is sometimes called “flex” completion (for Ido, for example).

The idea is very simple: input characters are matched in order against completion candidates, but possibly with intervening (non-newline) characters. That is, your input scatter-matches a completion candidate if each character is also in the candidate, and the character order is respected.

What this really amounts to is matching input ‘abc’ as if it were the regexp ‘a.*b.*c’. That’s all.

You can use Icicles scatter matching in place of apropos (regexp) matching. Unlike the cases of swank and fuzzy-match completion (see below), you can use it to complete file names also.
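The translation from input to regexp is short enough to sketch directly. The following Python is an illustration only, not the Icicles code: the function names are hypothetical, and quoting each input character with `re.escape` is an assumption made here so that every character is matched literally.

```python
import re


def scatter_regexp(user_input):
    """Build the scatter regexp: input 'abc' becomes 'a.*b.*c'.

    Each input character is quoted (an assumption of this sketch).
    """
    return ".*".join(re.escape(ch) for ch in user_input)


def scatter_match(user_input, candidate):
    """True if the input characters occur in CANDIDATE, in order.

    '.' does not match a newline by default, which corresponds to
    the "non-newline" restriction on intervening characters.
    """
    return re.search(scatter_regexp(user_input), candidate) is not None


print(scatter_regexp("abc"))                  # a.*b.*c
print(scatter_match("ffap", "find-file-at-point"))
print(scatter_match("paff", "find-file-at-point"))
```

The last call fails because, although all four characters are present, they do not appear in the order typed.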

`SPC' Scatter-Match Completion

This is the method that library Ivy uses for its “fuzzy” matching.

This is like method ‘scatter’, except that instead of matching each input character, allowing matching of arbitrary (non-newline) text between them, it matches each sequence of non-‘SPC’ characters, allowing matching of arbitrary (non-newline) text between them.

More precisely, this acts as if a single ‘SPC’ character of a sequence of ‘SPC’ characters in your input were ‘.*’, leaving the other ‘SPC’ characters in that sequence to be matched literally. It amounts to matching input ‘abc def gh i’ (this cannot be shown on Emacs Wiki – imagine four ‘SPC’ characters between ‘abc’ and ‘def’, two between ‘def’ and ‘gh’, and one between ‘gh’ and ‘i’) as if it were the regexp ‘abc .*def .*gh.*i’ (imagine three ‘SPC’ characters between ‘abc’ and ‘def’, one between ‘def’ and ‘gh’, and none between ‘gh’ and ‘i’).
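Because the wiki cannot show runs of spaces, a small Python sketch may make the rule easier to see. It is an illustration under two assumptions that the text above does not spell out: the last ‘SPC’ of each run is the one that acts as ‘.*’, and the non-space parts are matched literally. The function name is hypothetical.

```python
import re


def spc_scatter_regexp(user_input):
    """Turn each run of N spaces into N-1 literal spaces plus '.*'."""
    pieces = re.split(r"( +)", user_input)  # keeps the runs of spaces
    out = []
    for piece in pieces:
        if piece.startswith(" "):
            out.append(" " * (len(piece) - 1) + ".*")
        else:
            out.append(re.escape(piece))
    return "".join(out)


# Four spaces, then two, then one, as in the example above.
pattern = spc_scatter_regexp("abc    def  gh i")
print(repr(pattern))
print(bool(re.search(pattern, "abc   then def and gh, i")))
print(bool(re.search(pattern, "abc def gh i")))
```

The second search fails because the regexp still requires three literal spaces after ‘abc’.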

Swank (Fuzzy Symbol) Completion

Library el-swank-fuzzy.el is required for Icicles to use fuzzy-symbol completion.

If you choose ‘swank’ ‘TAB’ completion, what you get in Icicles is fuzzy-match completion, but only for symbols. Symbols are completed using the algorithm of el-swank-fuzzy.el. See that library for details.

Icicles options **‘icicle-swank-timeout’** and **‘icicle-swank-prefix-length’** give you some control over the behavior. When the ‘TAB’ completion method is ‘swank’, you can use **‘C-x 1’** (‘icicle-doremi-increment-swank-timeout+’) and **‘C-x 2’** (‘icicle-doremi-increment-swank-prefix-length+’) in the minibuffer to increment these options on the fly using the arrow keys ‘up’ and ‘down’.

Swank symbol completion uses heuristics that relate to supposedly typical patterns found in symbol names. It also uses a timeout that can limit the number of matches. It is generally quite a bit slower than fuzzy completion, and it sometimes does not provide all candidates that you might think should match, even when all of your input is a prefix (or even when it is already complete!).

If swank completion produces no match when you think it should, remember that you can use ‘C-(’ on the fly to change the completion method.

I do not necessarily recommend swank symbol completion, but it is available for those who appreciate it.

Like fuzzy-match completion, swank completion always sorts candidate symbols according to its own scoring, putting what it thinks are the best matches first. This means that using ‘C-,’ in the minibuffer to sort candidates differently has no effect.

Fuzzy-Match Completion

Library fuzzy-match.el is required for Icicles to use fuzzy-match completion.

Fuzzy-match completion (‘TAB’ completion method ‘fuzzy’) takes more explaining. It is described in detail in the commentary of library fuzzy-match.el. Here are some things to keep in mind when you use Icicles fuzzy-match completion:

It reverts to basic prefix completion for file names. That is, file-name completion is never fuzzy.
It is always case-sensitive. This means that ‘C-A’ in the minibuffer (to toggle case sensitivity) has no effect on ‘fuzzy’ completion.
It always takes a space prefix in your input into account. This means that ‘M-_’ in the minibuffer has no effect on ‘fuzzy’ completion.
Completion candidates are always sorted by decreasing match strength. This means that using ‘C-,’ in the minibuffer to sort candidates differently has no effect.

Fuzzy-match completion is a form of prefix completion in which some input characters might not be present in a matched candidate. Matching finds the candidates that have the most characters in common with your input, in the same order and with a minimum of non-matching characters. It can skip over non-matching characters, as long as the number of characters skipped in the candidate is less than the number of matching characters that follow them. After the matching candidates are found, they are sorted by skip length and then candidate length.

Here are some examples:

Input	Completion Domain	Matches (Candidates)
abc	{xxabcxx, xabcxxx, xabx}	{xabcxxx, xxabcxx}
point-mx	Emacs variables	{point-max, point-max-marker}
begining-of-l	Emacs commands	{beginning-of-line, beginning-of-line-text, move-beginning-of-line, widget-beginning-of-line}

The last example shows that although ‘fuzzy’ completion is a kind of prefix completion, your input is not necessarily a prefix of each matching candidate. It tries to match your input starting at its beginning. This input prefix is matched against candidate substrings, not necessarily candidate prefixes, but the non-matching part (if any) preceding the matched substring must not be longer than the matching part. That is, non-matching substrings can be skipped over, but they must be no longer than the matching substrings that follow them. If an input prefix does not match under these conditions, it is skipped over.

After matching an input prefix this way, the same process is repeated, recursively, for input text following that prefix and for match positions following the matches found. That is, after each such prefix match, the process starts again where it left off in both the input and the candidates. The resulting matches contain one or more substrings of your input that are each at least as long as the non-matching parts that immediately precede them. Only matches with the highest number of matching characters are retained. They are sorted by two criteria: (1) nearness of matches to the start of the candidate and (2) candidate length.

The fuzzy-match algorithm is detailed in library fuzzy-match.el. However, it is easier to get a feel for what it does by trying it than by reading any description. Just give it a try. Do not expect it to rival apropos completion in power or expressivity, however. Instead, think of it as prefix completion for lazy or inaccurate typists! If that sounds like you, then you might find it useful. ;-)

Here are a couple of screenshots of buffer ‘*Completions*’. The first shows command-name matches for the input ‘fo’. The second shows command-name matches for the input ‘fol’.

[Screenshot: Command-Name Input `fo']

[Screenshot: Command-Name Input `fol']

The first thing to notice is the distribution of candidates for input ‘fo’. Candidates are in decreasing order of match fit:

The nearer the match to the start of the candidate, the better the fit.
The greater the ratio of matched text to unmatched text, the better the fit.

Note too the candidate ‘ifconfig’. First, note that it has no strict match for substring ‘fo’. Its match is in fact in two parts: ‘f’, then ‘o’. Second, note that it is considered a better fuzzy match than the candidate ‘info’. This is because its match (‘f’) is nearer to the start of the candidate (second character, versus third).

The second thing to notice is that when you type the third input character, ‘l’, the candidates are not a subset of the original set that matches ‘fo’. The candidates in the second screenshot all match ‘fol’ in a fuzzy way, even though one of them, ‘mh-folder-mode’, does not match ‘fo’ sufficiently well to be included as a candidate. Why? Because in the ‘fo’ case, the match is only two characters long and it starts after three non-matching characters.

For both screenshots: If all input prefixes are fair game for matching, why doesn’t ‘*Completions*’ also include other command names that match only the prefix ‘f’ and nothing else? Because there is at least one match that matches more than that – only the best matches are retained. In this case, the best matches for input ‘fo’ match both the ‘f’ and the ‘o’, and the best matches for input ‘fol’ match all three of those characters.

Refer to fuzzy-match.el for a precise description of fuzzy matching. It refers to “matchiness” for how many characters match and “closeness” for the ratio of number of characters matched to candidate length.

Note: It is not practical to try to highlight the exact candidate portions that match different parts of your input. Because fuzzy-match input does not function as a literal string for matching purposes, it is more akin to substring matching than to basic prefix matching. For this reason, regexp-match highlighting is used for fuzzy matching. That is why you see the input ‘fo’ highlighted in ‘*Completions*’ candidates in other than just the prefix position. It is also why the matching ‘f’ and ‘o’ in candidate ‘ifconfig’ are not highlighted: for highlighting purposes, your input is treated as a regexp.

One takeaway here is that fuzzy-match completion is complicated. Rather than try to understand how it works and think ahead in those terms, you just need to get a feel for it – learn by doing. Have fun!

Levenshtein Completion

Library levenshtein.el is required for Icicles to use Levenshtein completion.

The “Levenshtein distance” is the minimum number of character insertions, deletions, or replacements that are needed to transform one string to another. The more similar two strings are, the smaller their Levenshtein distance.

When this kind of ‘S-TAB’ completion is used, Icicles considers your input to match a completion candidate if their Levenshtein distance is no greater than the value of option **‘icicle-levenshtein-distance’**. The default value of the option is 1, meaning that the difference is at most one character operation.

Using a strict definition of the distance, this also requires the length of your input to be within the Levenshtein distance of the length of a completion candidate, for it to match. That is quite restrictive.

It is more flexible to consider your input to match a candidate if it is within ‘icicle-levenshtein-distance’ of some substring of the candidate. Because candidate substrings are tested, the length of your input need not be nearly the same as the candidate length.

When you cycle among ‘S-TAB’ completion methods using **‘M-(’**, there are thus two choices for Levenshtein completion: ‘Levenshtein’ and ‘Levenshtein strict’. The former is generally more useful.
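The difference between the two choices can be sketched with the textbook dynamic-programming algorithm for edit distance. This Python is a generic illustration, not the code of levenshtein.el or of Icicles, and the function names are hypothetical. The substring variant differs only in letting a match start anywhere in the candidate for free and end anywhere.

```python
def levenshtein(a, b):
    """Edit distance between the whole of A and the whole of B."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from A
                           cur[j - 1] + 1,             # insert into A
                           prev[j - 1] + (ca != cb)))  # replace or keep
        prev = cur
    return prev[-1]


def substring_distance(user_input, candidate):
    """Smallest edit distance from USER_INPUT to any substring of CANDIDATE."""
    prev = [0] * (len(candidate) + 1)  # a match may start anywhere
    for i, ci in enumerate(user_input, 1):
        cur = [i]
        for j, cc in enumerate(candidate, 1):
            cur.append(min(prev[j] + 1,
                           cur[j - 1] + 1,
                           prev[j - 1] + (ci != cc)))
        prev = cur
    return min(prev)                   # a match may end anywhere


# 'forwrd' is one edit away from the substring 'forward', so it would
# match under the non-strict test with a distance limit of 1.
print(substring_distance("forwrd", "forward-char"))
# The strict test compares against the whole candidate and fails.
print(levenshtein("forwrd", "forward-char"))
```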

The larger the value of ‘icicle-levenshtein-distance’, the slower Levenshtein completion becomes, since it must test more possibilities. Also, when the value is 1 (except for ‘Levenshtein strict’), Icicles uses a fast, special-case algorithm, and it highlights the matching parts of candidates in buffer ‘*Completions*’. 1 is the most useful value.

If the value is other than 1 (or if it is 1 with ‘Levenshtein strict’), then you must also use library levenshtein.el, and Levenshtein completion can be quite slow. In that case, you will no doubt want to turn off incremental completion (‘C-#’).

Jaro-Winkler Completion

Library fuzzy.el, from package ‘auto-complete’, is required for Icicles to use Jaro-Winkler completion.

The Jaro-Winkler ‘S-TAB’ completion method was originally developed for comparing names for the U.S. census. It tends to take into account some typical spelling mistakes, and it is best suited for use with short candidates.

When checking whether two strings match, higher matching weight results when there are more characters in each string that are also present in the other, and in approximately the same positions.

Looking only at those characters that nearly match in this sense (same character in about the same position), the more exact matches there are (same character in exactly the same position), the higher the matching weight. That is, weight is reduced for characters that nearly match but are not quite in the right position.

So far, this describes Jaro matching. The Jaro matching weight is the average of three values: (a) the ratio of the first string’s near matches to its length, (b) the same ratio for the second string, and (c) the ratio of exact matches to total matches (near and exact).

The Winkler part of the method comes from giving additional weight for prefixes that match exactly. The longer the exact prefix match (up to 4 characters) the greater the weight.
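For readers who prefer code to prose, here is a Python sketch of the commonly published Jaro-Winkler formulation. It is not taken from fuzzy.el, which may differ in details, and it leaves out the length-difference check. In this formulation the third Jaro value is (m - t) / m, where m is the number of matched characters and t is half the number of matched characters that are out of order. The function names and the scaling factor 0.1 are the conventional textbook choices, not Icicles settings.

```python
def jaro(s1, s2):
    """Jaro similarity in [0.0, 1.0]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only if they are this close in position.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Walk both strings' matched characters in order; count mismatches.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1, max_prefix=4):
    """Boost the Jaro score for an exact common prefix of up to 4 characters."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1.0 - score)


print(round(jaro("MARTHA", "MARHTA"), 4))
print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
```

For ‘MARTHA’ and ‘MARHTA’ all six characters match, two of them out of order, and the three-character common prefix ‘MAR’ raises the Jaro score further.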

Unlike the other matching methods, for Jaro-Winkler to complete your input it must have the same number of characters as the candidate to be matched, plus or minus two (actually ‘fuzzy-accept-length-difference’). In particular, this means that you cannot hit ‘S-TAB’ with an empty minibuffer to see all of the candidates.

See Also:

Icicles Multi M-x for completion of command abbreviations
Icicles - Apropos Completions for completion with regexp matching
Icicles - Multi-Completions
WikiPedia:Jaro-Winkler distance for information about Jaro-Winkler matching

DrewsElispLibraries referenced here: Lisp:icicles.el

CategoryCommands CategoryCompletion CategoryRegexp CategoryDocumentation CategoryHelp CategoryProgrammerUtils CategoryCode