fix: handle nested formatting markers in markdownToSignal #1

Merged
abdo merged 1 commit from fix/nested-formatting-markers into main 2026-04-10 02:49:06 -04:00

Summary

Fix markdownToSignal so nested formatting markers (e.g., inline code inside bold, italic inside bold, triple-asterisk bold+italic, inline code inside a header) no longer leak literal marker names (BOLD / ITALIC / MONO) and SOH/STX control bytes into the visible Signal message body.

Type of Change

  • [x] Bug fix
  • [ ] New feature
  • [ ] Refactoring
  • [ ] Documentation
  • [ ] Performance improvement
  • [ ] Test coverage

Changes Made

  • Replace the non-greedy regex marker extractor in src/delivery.ts with a stack-based walker that pushes a style frame on each opener and pops on each closer, producing overlapping BodyRanges that match Signal's protobuf semantics.
  • Drop stray markers silently instead of emitting them as control characters in the visible body.
  • Add four regression tests in tests/delivery.test.ts, each asserting the output contains no SOH (\u0001) or STX (\u0002) control characters.

Testing

  • [x] Tests pass locally (npm test: 63/63 passing)
  • [ ] Manual testing completed
  • [x] No new warnings (npm run lint clean)

Checklist

  • [x] Commits follow Conventional Commits specification
  • [x] Code follows existing patterns in the codebase
  • [x] Self-review completed

Notes

The previous regex pass was non-greedy and closed the outer marker on the first inner closer, leaving the inner marker name and control bytes in the visible body whenever formatting was nested. The stack-based walker handles arbitrary nesting depth while preserving the overlapping byte-offset ranges that Signal's textStyle protobuf semantics require.

The marker extractor was a non-greedy regex that closed the outer
marker on the first inner closer, leaving the inner marker name
(literal "BOLD" / "ITALIC" / "MONO") and SOH/STX control bytes in
the visible message body whenever formatting was nested. This hit
inline code inside bold (`**bold with `code` inside**`), italic
inside bold, triple-asterisk bold+italic, and inline code inside
a header.
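
The failure mode described above can be reproduced with a small sketch. The marker encoding (SOH + NAME + STX to open, SOH + "/" + NAME + STX to close), the `lazyMarker` pattern, and the `naiveStrip` helper are all assumptions made for illustration; the actual regex in src/delivery.ts is not shown in this PR.

```typescript
// Hypothetical marker encoding, assumed for illustration only:
// opener = SOH + NAME + STX, closer = SOH + "/" + NAME + STX.
const SOH = "\u0001";
const STX = "\u0002";
const open = (name: string): string => SOH + name + STX;
const close = (name: string): string => SOH + "/" + name + STX;

// A lazy pattern that pairs each opener with the FIRST closer it finds,
// regardless of which style that closer belongs to.
const lazyMarker = /\u0001(\w+)\u0002(.*?)\u0001\/\w+\u0002/g;

// Strips markers by keeping only the text between opener and closer.
function naiveStrip(input: string): string {
  return input.replace(
    lazyMarker,
    (_match: string, _name: string, inner: string) => inner
  );
}

// Inline code nested inside bold.
const nested =
  open("BOLD") + "bold with " + open("MONO") + "code" + close("MONO") +
  " inside" + close("BOLD");

// The BOLD opener is closed by the MONO closer, so the MONO opener and
// the BOLD closer both survive into the visible text.
const leaked = naiveStrip(nested);
```

Here `leaked` still contains the literal name MONO together with SOH and STX bytes, which is the symptom this commit addresses.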

Replace the regex pass with a stack-based walker that pushes a
style frame on each opener and pops on each closer, producing
overlapping BodyRanges that match Signal's protobuf semantics.
Stray markers are dropped silently rather than emitted as control
characters.
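
A minimal sketch of the stack-based approach, not the code from src/delivery.ts. It assumes the same hypothetical marker encoding (SOH + NAME + STX, with a leading "/" on closers); `walkMarkers`, `readMarker`, and the shape of `styleNameMap` are illustrative names, and offsets are plain JavaScript string indices.

```typescript
type Style = "BOLD" | "ITALIC" | "MONOSPACE";
interface BodyRange { start: number; length: number; style: Style; }
interface Formatted { text: string; ranges: BodyRange[]; }

const SOH = "\u0001";
const STX = "\u0002";
// Hypothetical mapping from marker name to style.
const styleNameMap: Record<string, Style> = {
  BOLD: "BOLD", ITALIC: "ITALIC", MONO: "MONOSPACE",
};

// Reads SOH ["/"] NAME STX starting at `at`; returns null if malformed.
function readMarker(input: string, at: number):
    { name: string; closing: boolean; next: number } | null {
  let j = at + 1;
  const closing = input.charAt(j) === "/";
  if (closing) j++;
  const nameStart = j;
  while (j < input.length && input.charAt(j) >= "A" && input.charAt(j) <= "Z") j++;
  if (j === nameStart || input.charAt(j) !== STX) return null;
  return { name: input.slice(nameStart, j), closing, next: j + 1 };
}

function walkMarkers(input: string): Formatted {
  let text = "";
  const ranges: BodyRange[] = [];
  const stack: { style: Style; start: number }[] = [];
  let i = 0;
  while (i < input.length) {
    const ch = input.charAt(i);
    if (ch === SOH) {
      const marker = readMarker(input, i);
      if (marker === null) { i++; continue; } // stray SOH: drop it
      const style = styleNameMap[marker.name];
      if (style !== undefined) {
        const top = stack[stack.length - 1];
        if (!marker.closing) {
          // Opener: remember where this style starts in the output text.
          stack.push({ style, start: text.length });
        } else if (top !== undefined && top.style === style) {
          // Matching closer: pop the frame and emit its range.
          stack.pop();
          if (text.length > top.start) {
            ranges.push({ start: top.start, length: text.length - top.start, style });
          }
        }
        // An unmatched closer falls through and is dropped silently.
      }
      i = marker.next; // well-formed markers never reach the output
      continue;
    }
    if (ch !== STX) text += ch; // stray STX: drop it
    i++;
  }
  // Frames still on the stack are unclosed openers; they emit no range.
  return { text, ranges };
}

const result = walkMarkers(
  SOH + "BOLD" + STX + "bold with " + SOH + "MONO" + STX + "code" +
  SOH + "/MONO" + STX + " inside" + SOH + "/BOLD" + STX
);
```

For this input `result.text` is "bold with code inside", and the MONOSPACE range (start 10, length 4) sits inside the BOLD range (start 0, length 21), which is the overlapping shape described above.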

Add four regression tests for the nested cases, each asserting
the output contains no SOH or STX control characters.
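
The control-character check each regression test performs might look like the sketch below. The helper names are hypothetical, and the test runner and the exact signature of markdownToSignal are not shown in this PR, so the helper only takes the already-rendered body string.

```typescript
const SOH = "\u0001";
const STX = "\u0002";

// Returns the index of the first SOH or STX in the body, or -1 if clean.
function firstControlCharIndex(body: string): number {
  for (let i = 0; i < body.length; i++) {
    const ch = body.charAt(i);
    if (ch === SOH || ch === STX) return i;
  }
  return -1;
}

// Throws if any marker control byte leaked into the visible body.
function assertNoControlChars(body: string): void {
  const at = firstControlCharIndex(body);
  if (at !== -1) {
    throw new Error("control character leaked at index " + at);
  }
}

// A clean body passes silently.
assertNoControlChars("bold with code inside");
```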

Address PR review feedback:
- Replace magic number 6 with maxMarkerLen computed from styleNameMap
  keys so the bound stays correct if a marker name is ever added.
- Batch contiguous plain-text runs into single output.push(slice)
  calls instead of pushing one character at a time.
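
Both review points can be sketched together, under the same assumed marker encoding (SOH + optional "/" + NAME + STX). `tokenize` and the `Token` type are hypothetical; only the names `styleNameMap` and `maxMarkerLen` come from this commit, and their real definitions may differ.

```typescript
const SOH = "\u0001";
const STX = "\u0002";
// Hypothetical contents; the real map lives in src/delivery.ts.
const styleNameMap: Record<string, string> = {
  BOLD: "BOLD", ITALIC: "ITALIC", MONO: "MONOSPACE",
};

// Derived from the keys rather than hard-coded, so adding a longer
// marker name automatically widens the search window below.
const maxMarkerLen = Math.max(
  ...Object.keys(styleNameMap).map((name) => name.length)
);

type Token =
  | { kind: "text"; value: string }
  | { kind: "marker"; name: string; closing: boolean };

function tokenize(input: string): Token[] {
  const tokens: Token[] = [];
  let i = 0;
  while (i < input.length) {
    if (input.charAt(i) === SOH) {
      // STX must appear within: optional "/" + up to maxMarkerLen name chars.
      const limit = Math.min(input.length, i + 1 + 1 + maxMarkerLen + 1);
      let end = -1;
      for (let j = i + 1; j < limit; j++) {
        if (input.charAt(j) === STX) { end = j; break; }
      }
      if (end === -1) { i++; continue; } // stray SOH: drop it
      const raw = input.slice(i + 1, end);
      const closing = raw.charAt(0) === "/";
      const name = closing ? raw.slice(1) : raw;
      if (Object.prototype.hasOwnProperty.call(styleNameMap, name)) {
        tokens.push({ kind: "marker", name, closing });
      }
      i = end + 1; // markers with unknown names are dropped whole
      continue;
    }
    // Batch the whole plain run into one slice instead of one push per char.
    let runEnd = i;
    while (runEnd < input.length &&
           input.charAt(runEnd) !== SOH && input.charAt(runEnd) !== STX) {
      runEnd++;
    }
    if (runEnd > i) {
      tokens.push({ kind: "text", value: input.slice(i, runEnd) });
      i = runEnd;
    } else {
      i++; // stray STX: drop it
    }
  }
  return tokens;
}

const tokens = tokenize(
  "plain " + SOH + "BOLD" + STX + "bold" + SOH + "/BOLD" + STX + " tail"
);
```

With the three assumed keys, `maxMarkerLen` evaluates to 6 (the length of ITALIC), matching the constant it replaces.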

test: add chunk-boundary regression for nested overlapping styles
3c16e3e5f9 (ci/woodpecker/push/woodpecker: all checks were successful)

Exercises distributeStyles() clipping when overlapping ranges from
nested formatting (BOLD wrapping MONOSPACE) span a chunk split point.
Also removes unused FormattedText type import.
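
The clipping behaviour this test exercises can be sketched as follows. This is not the body of distributeStyles(), whose signature is not shown here; `clipRangesToChunk` and the half-open `[chunkStart, chunkEnd)` convention are assumptions made for illustration.

```typescript
interface BodyRange { start: number; length: number; style: string; }

// Clips every range to the chunk [chunkStart, chunkEnd) and rebases the
// surviving part so its offset is relative to the start of the chunk.
function clipRangesToChunk(
  ranges: BodyRange[],
  chunkStart: number,
  chunkEnd: number
): BodyRange[] {
  const clipped: BodyRange[] = [];
  for (const range of ranges) {
    const from = Math.max(range.start, chunkStart);
    const to = Math.min(range.start + range.length, chunkEnd);
    // Ranges that do not intersect the chunk are skipped entirely.
    if (to > from) {
      clipped.push({
        start: from - chunkStart,
        length: to - from,
        style: range.style,
      });
    }
  }
  return clipped;
}

// "bold with code inside": BOLD wraps everything, MONOSPACE covers "code".
const ranges: BodyRange[] = [
  { start: 0, length: 21, style: "BOLD" },
  { start: 10, length: 4, style: "MONOSPACE" },
];

// Split in the middle of "code", so both ranges straddle the boundary.
const firstChunk = clipRangesToChunk(ranges, 0, 12);
const secondChunk = clipRangesToChunk(ranges, 12, 21);
```

Each chunk keeps its share of both styles: the first gets BOLD (0, 12) and MONOSPACE (10, 2), the second gets BOLD (0, 9) and MONOSPACE (0, 2).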
abdo force-pushed fix/nested-formatting-markers from 3c16e3e5f9 to 94376f436f 2026-04-10 01:29:34 -04:00
abdo force-pushed fix/nested-formatting-markers from 94376f436f to bcb5ceefbe 2026-04-10 01:37:12 -04:00
abdo force-pushed fix/nested-formatting-markers from bcb5ceefbe to 7b97edd43f 2026-04-10 01:37:47 -04:00
abdo merged commit 7b97edd43f into main 2026-04-10 02:49:06 -04:00
abdo deleted branch fix/nested-formatting-markers 2026-04-10 02:49:06 -04:00
Reference
abdo/signal-mcp-bridge!1