Draft
21 commits
7cb14ee
DEV: Replace String returns with Conversion/Parse Data types
claude May 5, 2026
fbb3e1e
DEV: Rename MediaWiki inline_tag_registry: kwarg to handlers:
claude May 5, 2026
58b3f2a
DEV: Add discourse_renderer factory and renderer: kwarg
claude May 5, 2026
58883b1
DEV: Add emit/EmissionBuffer for Tag side-data
claude May 5, 2026
105e1a5
DEV: Add overlay on BBCode/HTML/TextFormatter HandlerRegistry
claude May 5, 2026
c5453e3
DEV: Plumb processor: through TextFormatter handlers; accept lambdas
claude May 5, 2026
b4c008b
DEV: Track unknown HTML-like tags in MediaWiki parser
claude May 5, 2026
e7019e7
DEV: Add Markbridge.convert(format:) dispatcher and .render
claude May 5, 2026
7663b3c
DEV: Extract Markdown cleanup into Renderers::Discourse::Postprocessor
claude May 5, 2026
a3c7e5c
DEV: Make BBCode RawHandler tolerant of AST classes without language:
claude May 5, 2026
6b96dac
DEV: Add raise_on_error: kwarg and Conversion#errors
claude May 5, 2026
38514ac
DEV: Add UPGRADING.md and forum_migration example; refresh docs
claude May 5, 2026
7cad003
DEV: Reformat with syntax_tree to satisfy stree check
claude May 5, 2026
9b66d01
DEV: Simplify Renderer emit machinery; mutation-coverage prep
claude May 5, 2026
76e9053
DEV: Mutation coverage — Markbridge.* convenience + parse methods
claude May 5, 2026
30d781e
DEV: Mutation coverage — convert/render/discourse_renderer + emit infra
claude May 5, 2026
a0e9329
DEV: Reformat with syntax_tree (lint fix)
claude May 5, 2026
f6e6ba9
DEV: Mutation coverage — overlay, RawHandler, MediaWiki parse, TableTag
claude May 5, 2026
ec87adc
DEV: AST mutation between parse and render
claude May 6, 2026
0a86b3b
DEV: MarkdownEscaper#allow: + IdentityEscaper + escape: false sugar
claude May 6, 2026
bc84115
DEV: Mutation coverage — pin escape_hard_line_breaks default
claude May 6, 2026
255 changes: 255 additions & 0 deletions UPGRADING.md
@@ -0,0 +1,255 @@
# Upgrading Markbridge

## 0.x — migration-API redesign

This release reshapes the top-level API around `Conversion`/`Parse`
result types and a single `renderer:` kwarg for render-side
customization. There is no backwards-compatibility shim — the changes
are mechanical but every importer call site needs to be updated.

### Convenience methods now return a `Conversion`, not a `String`

```ruby
# Before
markdown = Markbridge.bbcode_to_markdown(input)
markdown.gsub(/.../, "...") # String operation

# After
result = Markbridge.bbcode_to_markdown(input)
result.markdown.gsub(/.../, "...") # explicit access

# Or, if you only need the string for puts/interpolation:
puts result # to_s delegates to markdown
"got #{result}" # works
```

`Conversion` carries `markdown`, `ast`, `format`, `unknown_tags`,
`diagnostics`, `emissions`, `errors`. It does *not* delegate other
String methods — `result.gsub(...)` will raise `NoMethodError`. Use
`result.markdown.gsub(...)`.
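
The delegation rule can be illustrated with a small stand-in. This is a sketch only, not Markbridge's implementation: `ConversionSketch` is a hypothetical name, and it models just the `markdown` field and `to_s`.

```ruby
# Hypothetical stand-in for Conversion; it models only #markdown and #to_s.
class ConversionSketch
  attr_reader :markdown

  def initialize(markdown)
    @markdown = markdown
  end

  # puts and string interpolation both go through #to_s.
  def to_s
    markdown
  end
end

result = ConversionSketch.new("**bold**")
puts result                               # prints the markdown string
puts result.respond_to?(:gsub)            # String methods are not forwarded
puts result.markdown.gsub("bold", "text") # explicit access works
```

Because nothing else is forwarded, a call such as `result.gsub(...)` fails loudly at the call site instead of silently discarding the other fields.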

### Singleton config and per-process default registries are gone

The following are removed:

- `Markbridge.configuration`
- `Markbridge.configure { |c| c.escape_hard_line_breaks = ... }`
- `Markbridge.reset_defaults!`
- `Markbridge.default_handlers`
- `Markbridge.default_html_handlers`
- `Markbridge.default_text_formatter_handlers`
- `Markbridge.default_tag_library`
- `Markbridge::Configuration` (the class)

To customize rendering, build a `Renderer` once via the new factory
and pass it through `renderer:`:

```ruby
# Before
Markbridge.configure { |c| c.escape_hard_line_breaks = true }
Markbridge.default_tag_library.register(MyAst::Bold, MyTag.new)
Markbridge.bbcode_to_markdown(input)

# After
RENDERER =
  Markbridge.discourse_renderer(
    tags: { MyAst::Bold => MyTag.new },
    escape_hard_line_breaks: true,
  )
Markbridge.bbcode_to_markdown(input, renderer: RENDERER)
```

Build the renderer once outside your migration loop and reuse it
across thousands of posts; the no-emit path adds zero overhead.

### `tags:`, `tag_library:`, `escaper:`, `escape_hard_line_breaks:` removed from per-call signature

All four options moved into `Markbridge.discourse_renderer(...)`. The four
`*_to_markdown` methods and `Markbridge.convert` now accept only these
keyword arguments:

- `handlers:` — parser handler registry
- `renderer:` — pre-built Renderer
- `raise_on_error:` — boolean (default `true`)

### MediaWiki kwarg renamed: `inline_tag_registry:` → `handlers:`

```ruby
# Before
Markbridge.parse_mediawiki(input, inline_tag_registry: my_registry)
Markbridge::Parsers::MediaWiki::Parser.new(inline_tag_registry: my_registry)

# After
Markbridge.parse_mediawiki(input, handlers: my_registry)
Markbridge::Parsers::MediaWiki::Parser.new(handlers: my_registry)
```

The accepted *type* is unchanged — still an `InlineTagRegistry`. Only
the parameter name moves, for parity with the BBCode/HTML/TextFormatter
parsers.

### TextFormatter handlers must accept `processor:`

`Parsers::TextFormatter::Handlers::BaseHandler#process` now takes
three keyword arguments:

```ruby
# Before
def process(element:, parent:)

# After
def process(element:, parent:, processor: nil)
```

Update every custom subclass under your importer's TextFormatter
handler tree. The `processor:` argument is the parser instance and
exposes `process_children(xml_element, ast_node)` for handlers that
want to recurse into children manually.

Lambda handlers now receive the same kwargs:

```ruby
registry.register("CUSTOM", ->(element:, parent:, processor:) { ... })
```
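
To make the recursion concrete, here is a minimal sketch of a handler calling back into the processor. All names in it (`XmlElementSketch`, `AstNodeSketch`, `ProcessorSketch`, `WRAP_HANDLER`) are hypothetical stand-ins; only the kwarg names and the `process_children(xml_element, ast_node)` shape come from the text above.

```ruby
# Hypothetical stand-ins for an XML element and an AST node.
XmlElementSketch = Struct.new(:name, :children)
AstNodeSketch = Struct.new(:label, :children)

# Plays the role of the parser instance that is passed as processor:.
class ProcessorSketch
  def initialize(handlers)
    @handlers = handlers
  end

  # Dispatch each child element to its handler, attaching results to ast_node.
  def process_children(xml_element, ast_node)
    xml_element.children.each do |child|
      handler = @handlers.fetch(child.name)
      handler.call(element: child, parent: ast_node, processor: self)
    end
  end
end

# A lambda handler taking the same three kwargs; it recurses manually.
WRAP_HANDLER = ->(element:, parent:, processor:) do
  node = AstNodeSketch.new(element.name.downcase, [])
  parent.children << node
  processor.process_children(element, node)
end
```

A handler that does not need to recurse can simply ignore `processor:`.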

### Tag side-data: use `interface.emit` instead of mutating ctor-injected hashes

A typical before/after for importer Tags that build placeholders:

```ruby
# Before
class UrlTag < Markbridge::Renderers::Discourse::Tag
  def initialize(placeholders:)
    @placeholders = placeholders
  end

  def render(element, interface)
    link = build_link(element)
    @placeholders[:links] << link # mutates ctor-injected array
    link[:placeholder]
  end
end
# Importer pre-allocates @placeholders, passes to Tag, reads it after.

# After
class UrlTag < Markbridge::Renderers::Discourse::Tag
  def render(element, interface)
    link = build_link(element)
    interface.emit(:link, link) # routed to Conversion#emissions
    link[:placeholder]
  end
end
# Importer reads: result.emitted(:link).each { |l| ... }
```

Pure lookup tables (`uploads:`, `repository:`) injected into Tag
constructors are still fine — only *mutation during render* migrates
to `emit`.
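
The emit pattern itself is small. The following sketch is an illustration under assumptions, not Markbridge's `EmissionBuffer`: `EmissionBufferSketch` and `UrlTagSketch` are hypothetical, and the placeholder format is made up for the example.

```ruby
# Hypothetical per-conversion buffer that groups emitted values by key.
class EmissionBufferSketch
  def initialize
    @by_key = Hash.new { |hash, key| hash[key] = [] }
  end

  # Called by a tag during render instead of mutating injected state.
  def emit(key, value)
    @by_key[key] << value
    value
  end

  # Everything emitted under one key, in emission order.
  def emitted(key)
    @by_key.fetch(key, [])
  end
end

# A stateless tag: side-data leaves through the interface, not through ivars.
class UrlTagSketch
  def render(url, interface)
    link = { url: url, placeholder: "[[link:#{url}]]" }
    interface.emit(:link, link)
    link[:placeholder]
  end
end
```

Since the tag holds no mutable state, one instance can be shared across conversions while each conversion gets its own buffer.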

### `RawHandler` no longer requires `language:` on the AST class

`Markbridge::Parsers::BBCode::Handlers::RawHandler` used to call
`@element_class.new(language:)` unconditionally. Custom AST classes
reused with `RawHandler` had to declare a `language:` kwarg even when
unused. Now the handler introspects the AST class once and only passes
`language:` when the class accepts it. No code action needed unless
you'd previously added a dummy `def initialize(language: nil); super(); end`
just to satisfy the handler — you can remove it.
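
One way such introspection can be done in plain Ruby is with `UnboundMethod#parameters`. The sketch below is an assumption about the approach, not the handler's actual code; `accepts_language?`, `build_element`, `CodeSketch`, and `PlainSketch` are hypothetical names, and unlike the real handler it does not cache the result.

```ruby
# Hypothetical helper: true when klass#initialize takes language: (or **opts).
def accepts_language?(klass)
  klass.instance_method(:initialize).parameters.any? do |type, name|
    type == :keyrest || (%i[key keyreq].include?(type) && name == :language)
  end
end

class CodeSketch
  attr_reader :language

  def initialize(language: nil)
    @language = language
  end
end

class PlainSketch
  def initialize
  end
end

# Pass language: only to classes that declare it.
def build_element(klass, language)
  accepts_language?(klass) ? klass.new(language: language) : klass.new
end
```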

### Selective Markdown escaping (`allow:`)

Importers that want list markers (or other block-level constructs)
to survive escaping no longer need to subclass `MarkdownEscaper`:

```ruby
# Before
class ListPermissiveEscaper < Markbridge::Renderers::Discourse::MarkdownEscaper
  private
  def escape_block_level(content, prev_was_paragraph)
    case content.getbyte(0)
    when 0x2D, 0x2A, 0x2B then return content, false if content.match?(/\A[-*+]\s/)
    when 0x30..0x39 then return content, false if content.match?(/\A\d+[.)]\s/)
    end
    super
  end
end
RENDERER = Markbridge.discourse_renderer(escaper: ListPermissiveEscaper.new)

# After
RENDERER = Markbridge.discourse_renderer(allow: :lists)
```

Recognised keys: `:bullet_list`, `:ordered_list`, `:atx_heading`,
`:block_quote`. Aliases: `:lists` → `[:bullet_list, :ordered_list]`.
Unknown keys raise `ArgumentError`. Thematic breaks (`---`, `***`)
and setext underlines (`===`) are still escaped — the kwarg
allow-lists specific block markers, not whole sections of the
escaper.
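
The key and alias rules above can be sketched as a small normalization step. This is illustrative only: `ALLOW_KEYS`, `ALLOW_ALIASES`, and `normalize_allow` are hypothetical names, and the library's internal handling may differ.

```ruby
# Key names and the :lists alias are taken from the text above.
ALLOW_KEYS = %i[bullet_list ordered_list atx_heading block_quote].freeze
ALLOW_ALIASES = { lists: %i[bullet_list ordered_list] }.freeze

# Expand aliases, reject unknown keys, and drop duplicates.
def normalize_allow(allow)
  keys = Array(allow).flat_map { |key| ALLOW_ALIASES.fetch(key, [key]) }
  unknown = keys - ALLOW_KEYS
  raise ArgumentError, "unknown allow: keys #{unknown.inspect}" unless unknown.empty?
  keys.uniq
end
```

Under this sketch, `normalize_allow(:lists)` expands to both list keys, while a key such as `:thematic_break` raises `ArgumentError`.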

### Disabling Markdown escaping wholesale

For migration paths where the source content is already trusted
Markdown:

```ruby
NO_ESCAPE = Markbridge.discourse_renderer(escape: false)
Markbridge.bbcode_to_markdown(input, renderer: NO_ESCAPE)
```

Internally this swaps in `Markbridge::Renderers::Discourse::IdentityEscaper`
(a tiny `#escape(text) → text || ""` class). `escape: false` is
mutually exclusive with `escape_hard_line_breaks:` / `allow:` —
those configure `MarkdownEscaper`, which `escape: false` replaces
wholesale. An explicit `escaper:` always wins over either.
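
The precedence described above can be sketched as follows. Everything here is a hypothetical stand-in (`IdentityEscaperSketch`, `MarkdownEscaperSketch`, `resolve_escaper`), and raising `ArgumentError` on the conflicting combination is an assumption about how "mutually exclusive" is enforced.

```ruby
# Stand-in for IdentityEscaper: returns text unchanged, nil becomes "".
class IdentityEscaperSketch
  def escape(text)
    text || ""
  end
end

# Stand-in for MarkdownEscaper; it only records its options.
class MarkdownEscaperSketch
  attr_reader :options

  def initialize(**options)
    @options = options
  end
end

def resolve_escaper(escaper: nil, escape: true, **markdown_options)
  return escaper if escaper # an explicit escaper: wins
  return MarkdownEscaperSketch.new(**markdown_options) if escape

  # Assumption: combining escape: false with MarkdownEscaper options is an error.
  unless markdown_options.empty?
    raise ArgumentError, "escape: false cannot be combined with #{markdown_options.keys.inspect}"
  end
  IdentityEscaperSketch.new
end
```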

For *per-AST-node* opt-out, `AST::MarkdownText` already exists and
bypasses the escaper for that node only.

### Modifying the AST between parse and render

Two new shapes let you mutate the parsed AST before rendering, e.g.
to append attachments that weren't in the source post:

```ruby
# Block form on every *_to_markdown / convert method
Markbridge.bbcode_to_markdown(input, renderer: RENDERER) do |ast|
  attachments.each { |a| ast << OrphanAttachment.new(source_id: a.id) }
end

# Or pass a Parse explicitly to .render
parse = Markbridge.parse_bbcode(input)
parse.ast << OrphanAttachment.new(source_id: 7)
result = Markbridge.render(parse, renderer: RENDERER, raise_on_error: false)
# result.unknown_tags / .diagnostics / .format are preserved from the Parse.
```

`Markbridge.render` accepts either a `Parse` (preferred — preserves
`unknown_tags`/`diagnostics`/source `format`) or a bare AST node
(fields default to empty / `:discourse`). Mutations made between
parse and render persist in `Conversion#ast`.

### Per-row failure isolation

For migration loops, set `raise_on_error: false` to surface render
exceptions on `Conversion#errors` instead of crashing the loop:

```ruby
posts.each do |post|
  result = Markbridge.bbcode_to_markdown(post.body, renderer: RENDERER, raise_on_error: false)
  if result.errors.any?
    log_failure(post, result.errors)
  else
    write_markdown(post, result.markdown)
  end
end
```

The default is still `raise_on_error: true`, preserving the prior
behavior of letting exceptions propagate.
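
The contract behind `raise_on_error:` can be sketched in a few lines. `ResultSketch` and `convert_sketch` are hypothetical, and returning an empty `markdown` string on failure is an assumption made for this example only.

```ruby
# Hypothetical result type with just the two fields this sketch needs.
ResultSketch = Struct.new(:markdown, :errors)

# The block stands in for the render step, which may raise.
def convert_sketch(input, raise_on_error: true)
  ResultSketch.new(yield(input), [])
rescue StandardError => error
  raise if raise_on_error # default: let the exception propagate
  ResultSketch.new("", [error]) # assumption: empty markdown on failure
end
```

With `raise_on_error: false`, a failing row yields a result whose `errors` is non-empty, so the surrounding loop can log it and move on.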

### See also

- `examples/forum_migration.rb` — canonical end-to-end importer shape
exercising every new path: `discourse_renderer` factory, `tags:`,
`unregister:`, custom escaper, `interface.emit`, `Conversion#emissions`,
`raise_on_error: false`, `Markbridge.convert(format:)` dispatch.
- `docs/extending.md` — how to add custom tags and handlers.
20 changes: 20 additions & 0 deletions docs/extending.md
@@ -194,6 +194,26 @@ def self.default
end
```

### Auto-passthrough for unregistered AST classes

A custom AST class that has *no* Tag bound to it doesn't need a
"passthrough" Tag — `Renderer#render` falls through to
`render_children` automatically (see `lib/markbridge/renderers/discourse/renderer.rb`).
You only need to register a Tag when the class needs a non-trivial
rendering. To remove a built-in binding so this passthrough kicks in,
use `TagLibrary#unregister`:

```ruby
library.unregister(AST::Color) # Color now renders as just its children
library.unregister(AST::Size) # Size too
```

Or, more concisely, via the `Markbridge.discourse_renderer` factory:

```ruby
Markbridge.discourse_renderer(unregister: [AST::Color, AST::Size])
```
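
The fall-through can be pictured with a toy renderer. This is a sketch under assumptions rather than the code in `renderer.rb`: every class name below is hypothetical, and the tag lookup and `unregister` are folded into one class for brevity.

```ruby
# Hypothetical AST stand-ins.
TextSketch = Struct.new(:text)
ColorSketch = Struct.new(:children)
BoldSketch = Struct.new(:children)

class RendererSketch
  def initialize(tags)
    @tags = tags # maps AST class => callable tag
  end

  def unregister(klass)
    @tags.delete(klass)
  end

  # Use the bound tag if there is one; otherwise render only the children.
  def render(node)
    return node.text if node.is_a?(TextSketch)
    tag = @tags[node.class]
    tag ? tag.call(node, self) : render_children(node)
  end

  def render_children(node)
    node.children.map { |child| render(child) }.join
  end
end
```

In this sketch, unregistering `ColorSketch` makes a color node render as its children alone, with no wrapper.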

### Step 6: Add Requires

**File:** `lib/markbridge/ast.rb`
2 changes: 1 addition & 1 deletion docs/parsers/mediawiki.md
@@ -99,7 +99,7 @@ ast = parser.parse("<mark>highlighted</mark>")
registry = Markbridge::Parsers::MediaWiki::InlineTagRegistry.build_from_default do |r|
  r.register("mark", :formatting, Markbridge::AST::Bold)
end
-parser = Markbridge::Parsers::MediaWiki::Parser.new(inline_tag_registry: registry)
parser = Markbridge::Parsers::MediaWiki::Parser.new(handlers: registry)
```

### Via Top-Level API
30 changes: 5 additions & 25 deletions docs/renderers/discourse.md
@@ -621,31 +621,11 @@ end

## Configuration

-### Global Configuration
-
-Use `Markbridge.configure` to set options that apply to all `*_to_markdown` convenience methods:
-
-```ruby
-Markbridge.configure do |config|
-  # Strip trailing spaces before newlines to prevent hard line breaks (<br/>).
-  # Defaults to false (Discourse has this disabled by default).
-  config.escape_hard_line_breaks = true
-end
-
-Markbridge.bbcode_to_markdown("[b]Hello[/b]") # uses configured settings
-```
-
-You can also read the current configuration:
-
-```ruby
-Markbridge.configuration.escape_hard_line_breaks # => false (default)
-```
-
-Available settings:
-
-| Setting | Default | Description |
-|---------|---------|-------------|
-| `escape_hard_line_breaks` | `false` | Strip trailing spaces before newlines to prevent `<br/>` |
Markbridge has no global configuration. Render-side options (custom
escaper, custom Tags, custom postprocessor) are passed per call via a
configured `Renderer`, built once (for example with
`Markbridge.discourse_renderer(...)`) and supplied through `renderer:`.
Convenience-method calls that omit `renderer:` use the default Renderer.

### Using Default Library
