Commit 5e33670

docs(man): write about the way to use regular expressions to choose a parser
Signed-off-by: Masatake YAMATO <[email protected]>
1 parent f05ad97 commit 5e33670

4 files changed: +158 -42 lines changed

docs/man/ctags-optlib.7.rst

Lines changed: 2 additions & 2 deletions
@@ -31,7 +31,7 @@ readers should read :ref:`ctags(1) <ctags(1)>` of Universal Ctags first.
3131
Following options are for defining (or customizing) a parser:
3232

3333
* ``--langdef=<name>``
34-
* ``--map-<LANG>=[+|-]<extension>|<pattern>``
34+
* ``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``
3535
* ``--kinddef-<LANG>=<letter>,<name>,<description>``
3636
* ``--regex-<LANG>=/<line_pattern>/<name_pattern>/<kind-spec>/[<flags>]``
3737
* ``--mline-regex-<LANG>=/<line_pattern>/<name_pattern>/<kind-spec>/{mgroup=<N>}[<flags>]``
@@ -103,7 +103,7 @@ Overview for defining a parser
103103

104104
3. Give a file pattern or file extension for activating the parser
105105

106-
Use ``--map-<LANG>=[+|-]<extension>|<pattern>``.
106+
Use ``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``.
107107

108108
4. Define kinds
109109

docs/man/ctags.1.rst

Lines changed: 77 additions & 19 deletions
@@ -499,26 +499,71 @@ Language Selection and Mapping Options
499499
Exuberant Ctags. See :ref:`ctags-incompatibilities(7) <ctags-incompatibilities(7)>` for the background of
500500
this incompatible change.
501501

502-
``--map-<LANG>=[+|-]<extension>|<pattern>``
502+
Unlike ``--map-<LANG>`` option, you cannot specify relative-path regular
503+
expressions to ``--langmap`` option.
504+
505+
``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``
503506
This option provides the way to control mapping(s) of file names to
504507
languages in a more fine-grained way than ``--langmap`` option.
505508

506509
In ctags, more than one language can map to a
507-
file name *<pattern>* or file *<extension>* (*N:1 map*). Alternatively,
508-
``--langmap`` option handle only *1:1 map*, only one language
509-
mapping to one file name *<pattern>* or file *<extension>*. A typical N:1
510-
map is seen in C++ and ObjectiveC language; both languages have
511-
a map to ``.h`` as a file extension.
512-
513-
A file extension is specified by preceding the extension with a period (e.g. ``.c``).
514-
A file name pattern is specified by enclosing the pattern in parentheses (e.g.
515-
``([Mm]akefile)``). A prefixed plus ('``+``') sign is for adding, and
510+
relative-path regular expression (*<rexpr>*), file name *<pattern>*, or
511+
file *<extension>* (*N:1 map*). Alternatively, ``--langmap``
512+
option handles only *1:1 map*: only one language mapping to one
513+
file name *<pattern>* or file *<extension>*. A typical N:1 map is
514+
seen in C++ and ObjectiveC language; both languages have a map to
515+
``.h`` as a file extension.
516+
517+
A file extension is specified by preceding the extension with a period
518+
(e.g. ``.c``). A file name pattern is specified by enclosing the pattern in
519+
parentheses (e.g. ``([Mm]akefile)``). A relative-path regular expression is
520+
specified by enclosing the expression in percent signs '``%``'
521+
(e.g. ``%include/.*\.h%``). To include a literal percent sign
522+
inside the regular expression, escape it as ``\%``.
523+
524+
A prefixed plus ('``+``') sign is for adding, and
516525
minus ('``-``') is for removing. No prefix means replacing the map of *<LANG>*.
517526

518-
Unlike ``--langmap``, *<extension>* (or *<pattern>*) is not a list.
519-
``--map-<LANG>`` takes one extension (or pattern). However,
520-
the option can be specified with different arguments multiple times
521-
in a command line.
527+
Unlike ``--langmap``, ``--map-<LANG>`` does not take a list; ``--map-<LANG>``
528+
takes one extension, one pattern, or one regular expression. However, the
529+
option can be specified with different arguments multiple times in a command
530+
line.
531+
532+
For file extensions and file name patterns, the match is performed
533+
with a base file name, a file without any directory components.
534+
For relative-path regular expressions, the match is performed with
535+
a relative-path incorporating the directory components. A
536+
relative-path is relative to the directory where ctags is launched.
537+
538+
Assume your shell is in ``/project/x`` directory and you have the following
539+
source tree under the directory.
540+
541+
.. code-block::
542+
543+
src
544+
└── lib
545+
    ├── data.c
546+
    └── logic.c
547+
548+
If you run ctags with ``ctags -R src``,
549+
the match is performed with ``src/lib/data.c`` and ``src/lib/logic.c``. If you
550+
give ``--map-YourParser='%src/lib/.*\.c%'``, ctags
551+
chooses ``YourParser`` parser for processing ``data.c`` and ``logic.c`` in the
552+
tree.
553+
554+
If your shell is in ``/project/x/src`` and you run
555+
``ctags -R lib``, ctags may not choose
556+
``YourParser`` because the match is performed with ``lib/data.c`` and
557+
``lib/logic.c``.
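
The difference between base-name matching and relative-path matching can be sketched in Python. This is only an illustration of the behaviour described above, not the ctags implementation: the helper names ``matches_pattern`` and ``matches_rexpr`` are hypothetical, Python's ``re`` and ``fnmatch`` stand in for whatever matching engines ctags uses, and an unanchored search is assumed.

```python
import fnmatch
import posixpath
import re


def matches_pattern(path, pattern):
    """File name patterns are tested against the base name only."""
    return fnmatch.fnmatchcase(posixpath.basename(path), pattern)


def matches_rexpr(path, rexpr):
    """Relative-path regular expressions also see the directory components.

    Assumption: the expression is searched for anywhere in the path.
    """
    return re.search(rexpr, path) is not None


# The expression from --map-YourParser='%src/lib/.*\.c%', without the percent signs.
rexpr = r"src/lib/.*\.c"

# Paths as seen when launching from /project/x with `ctags -R src`.
from_project_root = ["src/lib/data.c", "src/lib/logic.c"]
# Paths as seen when launching from /project/x/src with `ctags -R lib`.
from_src_dir = ["lib/data.c", "lib/logic.c"]

hits_root = [p for p in from_project_root if matches_rexpr(p, rexpr)]
hits_src = [p for p in from_src_dir if matches_rexpr(p, rexpr)]
```

Under these assumptions ``hits_root`` holds both files while ``hits_src`` is empty, which is why the launch directory matters for relative-path regular expressions but not for patterns such as ``*.c``.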
558+
559+
A relative-path regular expression can take a flag controlling its testing.
560+
The flag comes after the last percent sign. Currently only one flag is available:
561+
562+
``{icase}`` (one-letter form '``i``')
563+
The regular expression is to be applied in a case-insensitive
564+
manner (e.g. ``%include/.*\.h%i`` or ``%include/.*\.h%{icase}``).
565+
566+
The relative-path regular expression is available since version 6.3.0.
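
As a rough sketch of the syntax described above, the following Python splits a ``%<rexpr>%[flag]`` argument into its regular expression and its case-insensitivity flag. The function names ``parse_rexpr_spec`` and ``compile_rexpr`` are hypothetical, the error handling is invented for the illustration, and Python's ``re`` dialect may differ from the one ctags uses.

```python
import re


def parse_rexpr_spec(spec):
    """Split '%<regex>%[flag]' into (regex, icase). Hypothetical helper."""
    if not spec.startswith("%"):
        raise ValueError("a relative-path regular expression starts with '%'")
    body = []
    i = 1
    while i < len(spec):
        ch = spec[i]
        if ch == "\\" and i + 1 < len(spec):
            nxt = spec[i + 1]
            # '\%' stands for a literal percent sign; other escapes
            # (such as '\.') are passed through to the regex engine.
            body.append("%" if nxt == "%" else ch + nxt)
            i += 2
        elif ch == "%":
            break  # unescaped percent sign closes the expression
        else:
            body.append(ch)
            i += 1
    else:
        raise ValueError("missing closing '%'")
    flag = spec[i + 1:]
    if flag not in ("", "i", "{icase}"):
        raise ValueError(f"unknown flag: {flag!r}")
    return "".join(body), flag != ""


def compile_rexpr(spec):
    """Compile the parsed expression, honouring the icase flag."""
    regex, icase = parse_rexpr_spec(spec)
    return re.compile(regex, re.IGNORECASE if icase else 0)
```

For instance, ``compile_rexpr(r"%include/.*\.h%i")`` would match ``Include/API.H``, whereas the same expression without the flag would not.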
522567

523568
.. _option_tags_file_contents:
524569

@@ -1243,14 +1288,24 @@ Listing Options
12431288
languages, and then exits.
12441289
``all`` is used as default value if the option argument is omitted.
12451290

1246-
``--list-maps[=(<language>|all)]``
1247-
Lists file name patterns and the file extensions which associate a file
1291+
``--list-map-rexprs[=(<language>|all)]``
1292+
Lists the relative-path regular expressions which associate a file
12481293
name with a language for either the specified *<language>* or ``all``
12491294
languages, and then exits.
12501295
``all`` is used as default value if the option argument is omitted.
12511296

1252-
To list the file extensions or file name patterns individually, use
1253-
``--list-map-extensions`` or ``--list-map-patterns`` option.
1297+
(since version 6.3.0)
1298+
1299+
``--list-maps[=(<language>|all)]``
1300+
Lists the file name patterns, the file extensions, and the relative-path
1301+
regular expressions which associate a file name with a language for either the
1302+
specified *<language>* or ``all`` languages, and then exits.
1303+
``all`` is used as default value if the option argument is omitted.
1304+
1305+
To list the file extensions, file name patterns, or relative-path regular
1306+
expressions individually, use ``--list-map-extensions``,
1307+
``--list-map-patterns``, or ``--list-map-rexprs`` option.
1308+
12541309
See the ``--langmap`` option, and "`Determining file language`_", above.
12551310

12561311
This option does not work with ``--machinable`` nor
@@ -1507,10 +1562,13 @@ are mapped to C++, C and ObjectiveC. These mappings can cause
15071562
issues. ctags tries to select the proper parser
15081563
for the source file by applying heuristics to its content, however
15091564
it is not perfect. In case of issues one can use ``--language-force=<language>``,
1510-
``--langmap=<map>[,<map>[...]]``, or the ``--map-<LANG>=[+|-]<extension>|<pattern>``
1565+
``--langmap=<map>[,<map>[...]]``, or the ``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``
15111566
options. (Some of the heuristics are applied whether ``--guess-language-eagerly``
15121567
is given or not.)
15131568

1569+
The order of testing is relative-path regular expressions (specified with
1570+
``--map-<LANG>=<rexpr>``), file name patterns, then file extensions.
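
The testing order can be pictured with a small Python sketch. It assumes that the first tier producing any candidate wins, which is a simplification: the text above only states the order, and when several languages remain, ctags still applies its heuristics. The ``MAPS`` table and the ``candidates`` function are hypothetical.

```python
import fnmatch
import posixpath
import re

# Hypothetical mapping table: (kind, spec, language).
MAPS = [
    ("extension", "h", "C++"),
    ("extension", "h", "ObjectiveC"),
    ("pattern", "[Mm]akefile", "Make"),
    ("rexpr", r"include/.*\.h", "YourParser"),
]


def candidates(path, maps=MAPS):
    """Return candidate languages from the first tier that matches."""
    base = posixpath.basename(path)
    ext = posixpath.splitext(base)[1].lstrip(".")
    tests = {
        # Relative-path regular expressions see the whole relative path.
        "rexpr": lambda spec: re.search(spec, path) is not None,
        # Patterns and extensions see only the base file name.
        "pattern": lambda spec: fnmatch.fnmatchcase(base, spec),
        "extension": lambda spec: ext == spec,
    }
    for kind in ("rexpr", "pattern", "extension"):
        found = [lang for k, spec, lang in maps if k == kind and tests[kind](spec)]
        if found:
            return found
    return []
```

With this table, ``include/api.h`` is claimed by the relative-path regular expression before the ``.h`` extension maps are consulted, while ``src/api.h`` falls through to the two extension candidates.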
1571+
15141572
.. TODO: all heuristics??? To be confirmed.
15151573
15161574
Heuristically guessing

man/ctags-optlib.7.rst.in

Lines changed: 2 additions & 2 deletions
@@ -31,7 +31,7 @@ readers should read ctags(1) of Universal Ctags first.
3131
Following options are for defining (or customizing) a parser:
3232

3333
* ``--langdef=<name>``
34-
* ``--map-<LANG>=[+|-]<extension>|<pattern>``
34+
* ``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``
3535
* ``--kinddef-<LANG>=<letter>,<name>,<description>``
3636
* ``--regex-<LANG>=/<line_pattern>/<name_pattern>/<kind-spec>/[<flags>]``
3737
* ``--mline-regex-<LANG>=/<line_pattern>/<name_pattern>/<kind-spec>/{mgroup=<N>}[<flags>]``
@@ -103,7 +103,7 @@ Overview for defining a parser
103103

104104
3. Give a file pattern or file extension for activating the parser
105105

106-
Use ``--map-<LANG>=[+|-]<extension>|<pattern>``.
106+
Use ``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``.
107107

108108
4. Define kinds
109109

man/ctags.1.rst.in

Lines changed: 77 additions & 19 deletions
@@ -499,26 +499,71 @@ Language Selection and Mapping Options
499499
Exuberant Ctags. See ctags-incompatibilities(7) for the background of
500500
this incompatible change.
501501

502-
``--map-<LANG>=[+|-]<extension>|<pattern>``
502+
Unlike ``--map-<LANG>`` option, you cannot specify relative-path regular
503+
expressions to ``--langmap`` option.
504+
505+
``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``
503506
This option provides the way to control mapping(s) of file names to
504507
languages in a more fine-grained way than ``--langmap`` option.
505508

506509
In @CTAGS_NAME_EXECUTABLE@, more than one language can map to a
507-
file name *<pattern>* or file *<extension>* (*N:1 map*). Alternatively,
508-
``--langmap`` option handle only *1:1 map*, only one language
509-
mapping to one file name *<pattern>* or file *<extension>*. A typical N:1
510-
map is seen in C++ and ObjectiveC language; both languages have
511-
a map to ``.h`` as a file extension.
512-
513-
A file extension is specified by preceding the extension with a period (e.g. ``.c``).
514-
A file name pattern is specified by enclosing the pattern in parentheses (e.g.
515-
``([Mm]akefile)``). A prefixed plus ('``+``') sign is for adding, and
510+
relative-path regular expression (*<rexpr>*), file name *<pattern>*, or
511+
file *<extension>* (*N:1 map*). Alternatively, ``--langmap``
512+
option handles only *1:1 map*: only one language mapping to one
513+
file name *<pattern>* or file *<extension>*. A typical N:1 map is
514+
seen in C++ and ObjectiveC language; both languages have a map to
515+
``.h`` as a file extension.
516+
517+
A file extension is specified by preceding the extension with a period
518+
(e.g. ``.c``). A file name pattern is specified by enclosing the pattern in
519+
parentheses (e.g. ``([Mm]akefile)``). A relative-path regular expression is
520+
specified by enclosing the expression in percent signs '``%``'
521+
(e.g. ``%include/.*\.h%``). To include a literal percent sign
522+
inside the regular expression, escape it as ``\%``.
523+
524+
A prefixed plus ('``+``') sign is for adding, and
516525
minus ('``-``') is for removing. No prefix means replacing the map of *<LANG>*.
517526

518-
Unlike ``--langmap``, *<extension>* (or *<pattern>*) is not a list.
519-
``--map-<LANG>`` takes one extension (or pattern). However,
520-
the option can be specified with different arguments multiple times
521-
in a command line.
527+
Unlike ``--langmap``, ``--map-<LANG>`` does not take a list; ``--map-<LANG>``
528+
takes one extension, one pattern, or one regular expression. However, the
529+
option can be specified with different arguments multiple times in a command
530+
line.
531+
532+
For file extensions and file name patterns, the match is performed
533+
with a base file name, a file without any directory components.
534+
For relative-path regular expressions, the match is performed with
535+
a relative-path incorporating the directory components. A
536+
relative-path is relative to the directory where ctags is launched.
537+
538+
Assume your shell is in ``/project/x`` directory and you have the following
539+
source tree under the directory.
540+
541+
.. code-block::
542+
543+
src
544+
└── lib
545+
    ├── data.c
546+
    └── logic.c
547+
548+
If you run @CTAGS_NAME_EXECUTABLE@ with ``@CTAGS_NAME_EXECUTABLE@ -R src``,
549+
the match is performed with ``src/lib/data.c`` and ``src/lib/logic.c``. If you
550+
give ``--map-YourParser='%src/lib/.*\.c%'``, @CTAGS_NAME_EXECUTABLE@
551+
chooses ``YourParser`` parser for processing ``data.c`` and ``logic.c`` in the
552+
tree.
553+
554+
If your shell is in ``/project/x/src`` and you run
555+
``@CTAGS_NAME_EXECUTABLE@ -R lib``, @CTAGS_NAME_EXECUTABLE@ may not choose
556+
``YourParser`` because the match is performed with ``lib/data.c`` and
557+
``lib/logic.c``.
558+
559+
A relative-path regular expression can take a flag controlling its testing.
560+
The flag comes after the last percent sign. Currently only one flag is available:
561+
562+
``{icase}`` (one-letter form '``i``')
563+
The regular expression is to be applied in a case-insensitive
564+
manner (e.g. ``%include/.*\.h%i`` or ``%include/.*\.h%{icase}``).
565+
566+
The relative-path regular expression is available since version 6.3.0.
522567

523568
.. _option_tags_file_contents:
524569

@@ -1243,14 +1288,24 @@ Listing Options
12431288
languages, and then exits.
12441289
``all`` is used as default value if the option argument is omitted.
12451290

1246-
``--list-maps[=(<language>|all)]``
1247-
Lists file name patterns and the file extensions which associate a file
1291+
``--list-map-rexprs[=(<language>|all)]``
1292+
Lists the relative-path regular expressions which associate a file
12481293
name with a language for either the specified *<language>* or ``all``
12491294
languages, and then exits.
12501295
``all`` is used as default value if the option argument is omitted.
12511296

1252-
To list the file extensions or file name patterns individually, use
1253-
``--list-map-extensions`` or ``--list-map-patterns`` option.
1297+
(since version 6.3.0)
1298+
1299+
``--list-maps[=(<language>|all)]``
1300+
Lists the file name patterns, the file extensions, and the relative-path
1301+
regular expressions which associate a file name with a language for either the
1302+
specified *<language>* or ``all`` languages, and then exits.
1303+
``all`` is used as default value if the option argument is omitted.
1304+
1305+
To list the file extensions, file name patterns, or relative-path regular
1306+
expressions individually, use ``--list-map-extensions``,
1307+
``--list-map-patterns``, or ``--list-map-rexprs`` option.
1308+
12541309
See the ``--langmap`` option, and "`Determining file language`_", above.
12551310

12561311
This option does not work with ``--machinable`` nor
@@ -1507,10 +1562,13 @@ are mapped to C++, C and ObjectiveC. These mappings can cause
15071562
issues. @CTAGS_NAME_EXECUTABLE@ tries to select the proper parser
15081563
for the source file by applying heuristics to its content, however
15091564
it is not perfect. In case of issues one can use ``--language-force=<language>``,
1510-
``--langmap=<map>[,<map>[...]]``, or the ``--map-<LANG>=[+|-]<extension>|<pattern>``
1565+
``--langmap=<map>[,<map>[...]]``, or the ``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``
15111566
options. (Some of the heuristics are applied whether ``--guess-language-eagerly``
15121567
is given or not.)
15131568

1569+
The order of testing is relative-path regular expressions (specified with
1570+
``--map-<LANG>=<rexpr>``), file name patterns, then file extensions.
1571+
15141572
.. TODO: all heuristics??? To be confirmed.
15151573

15161574
Heuristically guessing
