Reintroduce header patterns for filetype detection by dmaluka · Pull Request #3208 · micro-editor/micro
added 11 commits
The original meaning of foundDef was: "we already found the final syntax definition in a user's custom syntax file". After introducing signatures its meaning became: "we found some potential syntax definition in a user's custom syntax file, but we don't know yet if it's the final one". This makes the code confusing and actually buggy.
At least one bug is that if we find some potential filename matches in the user's custom syntax files, we don't search for more matches in the built-in syntax files. That is wrong: we should keep searching for as many potential matches as possible, in both the user's and the built-in syntax files, and then select the best one among them.
Fix that by restoring the original meaning of foundDef and updating the logic accordingly.
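A minimal sketch of the intended search behavior, not the actual micro code: candidates are collected from both the user's and the built-in syntax files, and finding a match among the user's files does not stop the search. The `syntaxFile` type, the `collectMatches` function, and the use of a plain filename suffix in place of a filename regex are all assumptions made for illustration.

```go
package main

import "strings"

// syntaxFile is a hypothetical, simplified stand-in for a syntax definition.
// A real definition would carry a filename regex; a suffix keeps the sketch short.
type syntaxFile struct {
	name   string
	suffix string
}

// collectMatches scans the user's files first and the built-in files second,
// and returns every candidate whose pattern matches the path. It deliberately
// does not return early when a user file matches, so the caller can pick the
// best candidate among all of them.
func collectMatches(userFiles, builtinFiles []syntaxFile, path string) []syntaxFile {
	var matches []syntaxFile
	for _, group := range [][]syntaxFile{userFiles, builtinFiles} {
		for _, f := range group {
			if strings.HasSuffix(path, f.suffix) {
				matches = append(matches, f)
			}
		}
	}
	return matches
}
```

With one user file and two built-in files that all claim `.h`, the call returns all three candidates, user file first, instead of only the user's one.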
No need to parse a syntax YAML file if we are not going to use it; that is a waste of CPU cycles.
As a preparation for reintroducing header matches.
Replacing header patterns with signature patterns was a mistake, since both have their own uses. So restore support for header regex, while keeping support for signature regex as well.
Replacing header patterns with signature patterns was a mistake, since the two are quite different from each other and both have their uses. In fact, this caused a serious regression: for files such as shell scripts without a *.sh extension but with #!/bin/sh inside, filetype detection no longer works at all.
Since both header and signature patterns are useful, reintroduce support for header patterns while keeping support for signature patterns as well and make both work nicely together.
Also, unlike in the old implementation (before signatures were introduced), ensure that filename matches take precedence over header matches, i.e. if there is at least one filename match found, all header matches are ignored. This makes the behavior more deterministic and prevents previously observed issues like micro-editor#2894 and micro-editor#3054: wrongly detected filetypes caused by some overly general header patterns.
Precisely, the new behavior is:
- if there is at least one filename match, use filename matches only
- if there are no filename matches, use header matches
- in both cases, try to use signatures to find the best match among multiple filename or header matches
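The three rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: the `candidate` type, the `detect` function, the `sampleCandidates` helper, and all regexes in it are hypothetical.

```go
package main

import "regexp"

// candidate is a hypothetical, simplified view of one syntax definition.
// Any of the three patterns may be nil.
type candidate struct {
	name      string
	filename  *regexp.Regexp // matched against the file path
	header    *regexp.Regexp // matched against the first line only
	signature *regexp.Regexp // used only to choose among several matches
}

// detect returns the name of the chosen filetype, or "" if nothing matches.
// head is the beginning of the file content that signatures are checked against.
func detect(cands []candidate, path, firstLine, head string) string {
	var byName, byHeader []candidate
	for _, c := range cands {
		if c.filename != nil && c.filename.MatchString(path) {
			byName = append(byName, c)
		} else if c.header != nil && c.header.MatchString(firstLine) {
			byHeader = append(byHeader, c)
		}
	}

	// Rule 1 and 2: header matches are considered only if no filename matched.
	matches := byName
	if len(matches) == 0 {
		matches = byHeader
	}
	if len(matches) == 0 {
		return ""
	}

	// Rule 3: with several matches, prefer the first one whose signature matches.
	if len(matches) > 1 {
		for _, c := range matches {
			if c.signature != nil && c.signature.MatchString(head) {
				return c.name
			}
		}
	}
	return matches[0].name
}

// sampleCandidates builds illustrative definitions; the patterns are made up.
func sampleCandidates() []candidate {
	return []candidate{
		{name: "c", filename: regexp.MustCompile(`\.h$`)},
		{name: "c++", filename: regexp.MustCompile(`\.h$`),
			signature: regexp.MustCompile(`\bnamespace\b|\bclass\b`)},
		{name: "shell", filename: regexp.MustCompile(`\.sh$`),
			header: regexp.MustCompile(`^#!.*/(env +)?(ba)?sh`)},
		{name: "python", filename: regexp.MustCompile(`\.py$`),
			header: regexp.MustCompile(`^#!.*python`)},
	}
}
```

In this sketch, a file named `script` that starts with `#!/bin/sh` is detected as shell through its header, a file named `weird.py` with the same first line is detected as python because the filename match wins, and `foo.h` containing a `namespace` is resolved to C++ by the signature.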
Turning header patterns into signature patterns in all syntax files
was a mistake. The two are different things. In almost all syntax files
those patterns are things like shebangs or or
i.e. things that:
- can be (and should be) used for detecting the filetype when there is no filename match (and that is actually the purpose of those patterns, so it is a regression that this no longer works).
- should only occur in the first line of the file, not in the first 100 lines or so.
In other words, the old header semantics was exactly what was needed
for those filetypes, while the new signature semantics makes little
sense for them.
So replace signature back with header in most syntax files. Keep
signature only in C++ and Objective-C syntax files, for which it was
actually introduced.
To make it more clear. Why Buffer?
Purely cosmetic change: make the code a bit more readable by reducing its visual "density".