Commit 7d8159c

Refactor readme.md
1 parent f7f04d4 commit 7d8159c

1 file changed

+101
-85
lines changed

readme.md

Lines changed: 101 additions & 85 deletions
@@ -2,150 +2,166 @@
22

33
Natural language for human and machine.
44

5-
---
5+
**NLCST** exposes the parts of natural language as a concrete syntax
6+
tree. Concrete means all information is stored in this tree and an
7+
exact replica of the original document can be re-created.
68

7-
> Note: Several projects use this document. Do not make changes without consulting with [TextOM](https://github.com/wooorm/textom), [parse-latin](https://github.com/wooorm/parse-latin), and [retext](https://github.com/wooorm/retext).
9+
**NLCST** is a subset of [**Unist**][unist], and implemented by
10+
[**retext**][retext].
811

9-
## CST
10-
11-
### Node
12+
## Table of Contents
1213

13-
Node represents any unit in NLCST hierarchy.
14+
- [CST](#cst)
1415

15-
```
16-
interface Node {
17-
type: string;
18-
data: Data | null;
19-
}
20-
```
16+
- [Root](#root)
17+
- [Paragraph](#paragraph)
18+
- [Sentence](#sentence)
19+
- [Word](#word)
20+
- [Symbol](#symbol)
21+
- [Punctuation](#punctuation)
22+
- [WhiteSpace](#whitespace)
23+
- [Source](#source)
24+
- [TextNode](#textnode)
2125

22-
### Data
26+
- [List of Utilities](#list-of-utilities)
2327

24-
Data represents data associated with any node. Data is a scope for plug-ins to store any information. Its only limitation being that each property should by stringifyable: not throw when passed to `JSON.stringify()`.
28+
- [License](#license)
2529

26-
```
27-
interface Data { }
28-
```
30+
## CST
2931

30-
### Parent
32+
### `Root`
3133

32-
Parent ([Node](#node)) represents a unit in NLCST hierarchy which can have zero or more children.
34+
`Root` ([`Parent`][parent]) houses all nodes.
3335

34-
```
35-
interface Parent <: Node {
36-
children: [];
36+
```idl
37+
interface Root <: Parent {
38+
type: "RootNode";
3739
}
3840
```
3941

40-
### Text
42+
### `Paragraph`
4143

42-
Text ([Node](#node)) represents a unit in NLCST hierarchy which has value.
44+
`Paragraph` ([`Parent`][parent]) represents a self-contained unit of
45+
discourse in writing dealing with a particular point or idea.
4346

44-
```
45-
interface Text <: Node {
46-
value: string;
47+
```idl
48+
interface Paragraph <: Parent {
49+
type: "ParagraphNode";
4750
}
4851
```
4952

50-
### RootNode
53+
### `Sentence`
5154

52-
Root ([Parent](#parent)) represents a document.
55+
`Sentence` ([`Parent`][parent]) represents a grouping of grammatically
56+
linked words that, in principle, expresses a complete thought, although it
57+
may make little sense taken in isolation out of context.
5358

54-
```
55-
interface RootNode < Parent {
56-
type: "RootNode";
59+
```idl
60+
interface Sentence <: Parent {
61+
type: "SentenceNode";
5762
}
5863
```
5964

60-
### ParagraphNode
65+
### `Word`
6166

62-
Paragraph ([Parent](#parent)) represents a self-contained unit of discourse in writing dealing with a particular point or idea.
67+
`Word` ([`Parent`][parent]) represents the smallest element that may
68+
be uttered in isolation with semantic or pragmatic content.
6369

64-
```
65-
interface ParagraphNode < Parent {
66-
type: "ParagraphNode";
70+
```idl
71+
interface Word <: Parent {
72+
type: "WordNode";
6773
}
6874
```
6975

70-
### SentenceNode
76+
### `Symbol`
7177

72-
Sentence ([Parent](#parent)) represents grouping of grammatically linked words, that in principle tells a complete thought, although it may make little sense taken in isolation out of context.
78+
`Symbol` ([`Text`][text]) represents typographical devices like
79+
white space, punctuation, signs, and more, different from characters
80+
which represent sounds (like letters and numerals).
7381

74-
```
75-
interface SentenceNode < Parent {
76-
type: "SentenceNode";
82+
```idl
83+
interface Symbol <: Text {
84+
type: "SymbolNode";
7785
}
7886
```
7987

80-
### WordNode
88+
### `Punctuation`
8189

82-
Word ([Parent](#parent)) represents the smallest element that may be uttered in isolation with semantic or pragmatic content.
90+
`Punctuation` ([`Symbol`][symbol]) represents typographical devices
91+
which aid understanding and correct reading of other grammatical
92+
units.
8393

84-
```
85-
interface WordNode < Parent {
86-
type: "WordNode";
94+
```idl
95+
interface Punctuation <: Symbol {
96+
type: "PunctuationNode";
8797
}
8898
```
8999

90-
### SymbolNode
100+
### `WhiteSpace`
91101

92-
Symbol ([Text](#text)) represents typographical devices like white space, punctuation, signs, and more, different from characers which represent sounds (like letters and numerals).
102+
`WhiteSpace` ([`Symbol`][symbol]) represents typographical devices
103+
devoid of content, separating other grammatical units.
93104

94-
```
95-
interface SymbolNode < Text {
96-
type: "SymbolNode";
105+
```idl
106+
interface WhiteSpace <: Symbol {
107+
type: "WhiteSpaceNode";
97108
}
98109
```
99110

100-
### PunctuationNode
111+
### `Source`
101112

102-
Punctuation ([SymbolNode](#symbolnode)) represents typographical devices which aid understanding and correct reading of other grammatical units.
113+
`Source` ([`Text`][text]) represents an external (ungrammatical) value
114+
embedded into a grammatical unit: a hyperlink, a line, and such.
103115

104-
```
105-
interface PunctuationNode < SymbolNode {
106-
type: "PunctuationNode";
116+
```idl
117+
interface Source <: Text {
118+
type: "SourceNode";
107119
}
108120
```
109121

110-
### WhiteSpaceNode
122+
### `TextNode`
111123

112-
White Space ([SymbolNode](#symbolnode)) represents typographical devices devoid of content, separating other grammatical units.
124+
`TextNode` ([`Text`][text]) represents actual content in an NLCST
125+
document: one or more characters. Note that its `type` property
126+
is `TextNode`, but it is different from the abstract [`Text`][text]
127+
interface.
113128

114-
```
115-
interface WhiteSpaceNode < SymbolNode {
116-
type: "WhiteSpaceNode";
129+
```idl
130+
interface TextNode <: Text {
131+
type: "TextNode";
117132
}
118133
```
119134

120-
### SourceNode
135+
## List of Utilities
121136

122-
Source ([Text](#text)) represents an external (ungrammatical) value embedded into a grammatical unit: a hyperlink, a line, and such.
137+
<!--lint disable list-item-spacing-->
123138

124-
```
125-
interface SourceNode < Text {
126-
type: "SourceNode";
127-
}
128-
```
139+
- [`wooorm/nlcst-is-literal`](https://github.com/wooorm/nlcst-is-literal)
140+
— Check whether a node is meant literally;
141+
- [`wooorm/nlcst-normalize`](https://github.com/wooorm/nlcst-normalize)
142+
— Normalize a word for easier comparison;
143+
- [`wooorm/nlcst-search`](https://github.com/wooorm/nlcst-search)
144+
— Search for patterns in an NLCST tree;
145+
- [`wooorm/nlcst-to-string`](https://github.com/wooorm/nlcst-to-string)
146+
— Stringify a node;
147+
- [`wooorm/nlcst-test`](https://github.com/wooorm/nlcst-test)
148+
— Validate an NLCST node.
129149

130-
### TextNode
150+
In addition, see [**Unist**][unist] for other utilities which
151+
work with **retext** nodes.
131152

132-
Text ([Text](#text)) represents actual content in an NLCST document: one or more characters.
153+
## License
133154

134-
```
135-
interface TextNode < Text {
136-
type: "TextNode";
137-
}
138-
```
155+
MIT © Titus Wormer
139156

140-
## Related
157+
<!--Definitions-->
141158

142-
- [retext](https://github.com/wooorm/retext) — Analyse and Manipulate natural language, 20+ plug-ins.
143-
- [parse-latin](https://github.com/wooorm/parse-latin) — Transforms latin-script natural language into a CST;
144-
- [TextOM](https://github.com/wooorm/textom) — Provides an object-oriented manipulation interface to NLCST;
145-
- [nlcst-to-string](https://github.com/wooorm/nlcst-to-string) — Transforms a CST into a string;
146-
- [nlcst-to-textom](https://github.com/wooorm/nlcst-to-textom) — Transforms a CST into a [TextOM](https://github.com/wooorm/textom) object model;
147-
- [nlcst-test](https://github.com/wooorm/nlcst-test) — Validate an NLCST node.
159+
[unist]: https://github.com/wooorm/unist
148160

149-
## License
161+
[retext]: https://github.com/wooorm/retext
150162

151-
MIT © Titus Wormer
163+
[parent]: https://github.com/wooorm/unist#parent
164+
165+
[text]: https://github.com/wooorm/unist#text
166+
167+
[symbol]: #symbol
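
The readme's claim that an exact replica of the original document can be re-created from the tree can be illustrated with a small sketch. The TypeScript names below (`Node`, `Text`, `Parent`, `NlcstNode`, `toText`) are hypothetical and only mirror the interfaces in the diff; they are not the official typings, and `toText` is an assumption about how a stringifier might work, not the implementation of `nlcst-to-string`.

```typescript
// Hypothetical types mirroring the Unist-style interfaces referenced in the readme.
interface Node { type: string }
interface Text extends Node { value: string }
interface Parent extends Node { children: Array<Parent | Text> }

type NlcstNode = Parent | Text;

// Concatenate every text value in document order.
// Because the tree is concrete, nothing from the source is lost.
function toText(node: NlcstNode): string {
  if ("value" in node) return node.value;
  return node.children.map((child) => toText(child)).join("");
}

// A hand-written tree for the document "Hello, world!".
const tree: Parent = {
  type: "RootNode",
  children: [{
    type: "ParagraphNode",
    children: [{
      type: "SentenceNode",
      children: [
        { type: "WordNode", children: [{ type: "TextNode", value: "Hello" }] },
        { type: "PunctuationNode", value: "," },
        { type: "WhiteSpaceNode", value: " " },
        { type: "WordNode", children: [{ type: "TextNode", value: "world" }] },
        { type: "PunctuationNode", value: "!" },
      ],
    }],
  }],
};

const source: string = toText(tree);
```

Here white space and punctuation are ordinary nodes with a `value`, which is why joining the leaves in order reproduces the input text without any extra bookkeeping.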
