Skip to content

Latest commit

 

History

History
136 lines (74 loc) · 3.69 KB

README.org

File metadata and controls

136 lines (74 loc) · 3.69 KB

emt.el

Introduction

EMT stands for Emacs MacOS Tokenizer.

This package use macOS’s built-in NLP tokenizer to tokenize and operate on CJK words in Emacs.

Installation

Requirements

  • macOS 10.15 or later
  • Emacs 26.1 or later, built with dynamic module support (use --with-modules during compilation)

Build dynamic module

Pre-built (recommendation)

If you enable emt-mode and the module cannot be found, it will prompt whether to automatically download it from GitHub. Or you can manually retrieve the pre-built module from the releases section and place the dylib file in the emacs-macos-tokenizer-lib-path (by default, it is located at modules/libEMT.dylib within your personal configuration folder, normally ~/.emacs.d/modules/libEMT.dylib).

Current version of the dynamic module is v2.0.0, make sure you have updated to latest module.

Manually build

  • Install Xcode.
  • Build the module using emt-compile-module, which compiles and copies the module to emt-lib-path.

If you enconter the folloing error:

No such module “PackageDescription”

run the following command and try again:

sudo xcode-select --switch /Applications/Xcode.app/Contents/Developer

Install package

Install with straight and use-package:

(use-package emt
  :straight (:host github :repo "roife/emt"
                   :files ("*.el" "module/*" "module"))
  :hook (after-init . emt-mode))

Customization

emt-use-cache

Caches for results of tokenization if non-nil. Default is t.

emt-cache-lru-size

The size of LRU cache. Default is 50.

emt-lib-path

The path to the directory of dynamic library for emt. Default is ~/.emacs.d/modules/libEMT.dylib.

Usage

keymap: emt-mode-map

It remaps forward-word, backward-word, kill-word and backward-kill-word to use emt’s version.

Minor mode

It calls emt-ensure, which load dynamic modeuls and set emt-mode-map.

Functions

emt-word-at-point-or-forward

Return the word at point. If current point is at bound of a word, return the one forward.

emt-word-at-point-or-backward

Return the word at point. If current point is at bound of a word, return the one backward.

emt-compile-module

Compile and copy the module to emt-lib-path.

It takes an optional argument path, which is the path to the directory of dynamic library. By default, path is set to emt-lib-path.

emt-download-module

Download dynamic module from https://github.com/roife/emt/releases/download/<VERSION>/libEMT.dylib.

If PATH is non-nil, download the module to PATH.

emt-ensure

Load dynamic module.

emt-split

Split string into a list of words.

Return a list of word bounds (a cons of the beginning position and the ending position of a word)

emt-forward-word

CJK compatible version of forward-word.

emt-backward-word

CJK compatible version of backward-word.

emt-kill-word

CJK compatible version of kill-word.

emt-backward-kill-word

CJK compatible version of backward-kill-word.

emt-mark-word

CJK compatible version of mark-word.

Alternatives

Acknowledgements

This package is inspired by jieba.el which is a Chinese tokenizer for Emacs using jieba.

The dynamic module uses emacs-swift-module, which provides an interface for writing Emacs dynamic modules in Swift.