
That is actually a remarkably good idea. Now if only we had a tool that could cope with such a format and parse sentences and paragraphs out of it. When I ran my thesis through Grammarly, I had to remove all the superfluous line feeds before it could correctly determine the actual sentence structure (which it then did well enough for me to do this semi-automatically for 200 pages).


There's the usual Markdown/reST/et al. approach of stripping newlines unless they are doubled up at a "paragraph" boundary.
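
A minimal sketch of that rule in Python, assuming plain text with no code blocks, lists, or other markup that needs its line breaks kept (the function name `unwrap` is made up for illustration):

```python
import re


def unwrap(text: str) -> list[str]:
    """Return the paragraphs of `text`, each joined onto a single line."""
    # A blank line (two newlines, possibly with whitespace between them)
    # marks a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, single newlines and other whitespace runs
    # collapse to one space.
    return [" ".join(p.split()) for p in paragraphs if p.strip()]
```

Something like this could be run over a document before handing it to a grammar checker, so that each paragraph arrives as one line.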

When converting to/from Inform 7, where newlines are sometimes syntactically important, I still want the version-controlled (and sometimes hand-edited) source to be a bit more "semantic newline", or at least "72-character wrapped". So I have a tool that converts "hard" newlines to/from pilcrows [0] (¶). I think they look reasonable in the source document and are an appropriate mark for the meaning (denoting a new paragraph).

An example: https://github.com/WorldMaker/APrincessOfMoons/blob/master/_...

The bones of my converter: https://github.com/WorldMaker/APrincessOfMoons/blob/master/i...

[0] https://en.wikipedia.org/wiki/Pilcrow
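
For anyone who wants to try the idea without following the links, here is a rough Python sketch of the round trip. It is not the linked converter; the function names are made up, and it assumes single-spaced text where leading and trailing whitespace on a line does not matter and the source contains no pilcrows of its own.

```python
import textwrap

PILCROW = "\u00b6"  # the ¶ character


def to_wrapped(source: str, width: int = 72) -> str:
    """Mark every hard newline with a pilcrow, then wrap freely."""
    out = []
    for line in source.split("\n"):
        # Words and hyphenated compounds are kept whole so that
        # rejoining with a space restores the original line.
        out.extend(
            textwrap.wrap(
                line + PILCROW,
                width=width,
                break_long_words=False,
                break_on_hyphens=False,
            )
        )
    return "\n".join(out)


def from_wrapped(wrapped: str) -> str:
    """Undo to_wrapped: only pilcrows become newlines again."""
    parts = " ".join(wrapped.split("\n")).split(PILCROW)
    # The text after the final pilcrow is empty; drop it.
    if parts and not parts[-1].strip():
        parts.pop()
    return "\n".join(p.strip() for p in parts)
```

Under those assumptions, `from_wrapped(to_wrapped(text))` gives back `text`, and the wrapped form can be re-flowed or edited by hand as long as the pilcrows stay where the hard newlines belong.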


SGML (= superset of XML and HTML, from 1986) does this via "short reference delimiters": context-dependent rules for interpreting custom character sequences, such as double newlines, as markup, for example as end-paragraph tags.



