HACKER Q&A
📣 tacone

Shouldn't we start ti syntax highlight natural language?


We skim over walls and walls of text every day. Why not use syntax coloring to give the reader more visual cues?

Would it be a positive thing? How would you do that? Would any dark patterns emerge?


  👤 PaulHoule Accepted Answer ✓
Personal Information Protection systems can highlight things that are likely to be people's names or credit card numbers.
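A minimal sketch of the credit card case, assuming a regular expression for candidate digit runs plus a Luhn checksum to cut false positives. The names `CARD_RE`, `luhn_ok`, and `highlight_cards` and the `[[ ]]` markers are made up for illustration; a real PII system would do far more than this.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by
# single spaces or hyphens. (Illustrative pattern, not a standard.)
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def highlight_cards(text: str, open_tag: str = "[[", close_tag: str = "]]") -> str:
    """Wrap digit runs that look like card numbers in highlight markers."""

    def wrap(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group(0))
        if luhn_ok(digits):
            return f"{open_tag}{match.group(0)}{close_tag}"
        return match.group(0)  # fails the checksum: leave it unmarked

    return CARD_RE.sub(wrap, text)


print(highlight_cards("Card 4111 1111 1111 1111, order 1234567890123."))
# Card [[4111 1111 1111 1111]], order 1234567890123.
```

The checksum step is the design choice worth noting: the order number has a plausible length but fails Luhn, so it stays uncolored.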

You could color statements that have been approved by the board (w/ metadata) or parts where it seemed the writer was angry, or something lewd/indecent, or written in Dutch, etc.

You can't put "parts of speech" tags on all words and show it to people because the whole Chomsky/Pinker approach to linguistics is a failure in terms of engineering computer systems that understand text.

A typical English sentence has thousands of valid parses according to that framework; the "most likely parse" could be wrong more than half the time. "Squad Helps Dog Bite Victim" is a good joke, but it's not funny when it happens to you. Putting the wrong coloring on a sentence like that makes you look like a dork and makes the reader lose empathy with you.

For a long time (since the 1970s) there have been "magic magic marker" methods such as hidden Markov models and conditional random fields. The modern neural methods are even better than the old methods.
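To give a feel for what a hidden Markov model "marker" does, here is a toy sketch using Viterbi decoding over two tags. Everything in it is an assumption for illustration: the tag set, the hand-picked probabilities, and the capitalization-only emission model. A real tagger would estimate these numbers from marked-up sentences.

```python
import math

STATES = ["NAME", "OTHER"]

# Hand-picked toy probabilities, purely illustrative (not learned from data).
START = {"NAME": 0.2, "OTHER": 0.8}
TRANS = {
    "NAME": {"NAME": 0.5, "OTHER": 0.5},
    "OTHER": {"NAME": 0.2, "OTHER": 0.8},
}


def emit(state: str, word: str) -> float:
    """Toy emission model: names are usually capitalized, other words usually not."""
    capitalized = word[:1].isupper()
    if state == "NAME":
        return 0.9 if capitalized else 0.1
    return 0.2 if capitalized else 0.8


def viterbi(words: list[str]) -> list[str]:
    """Return the most probable tag sequence for the words under the toy HMM."""
    if not words:
        return []
    # best[s] = (log probability, path) of the best sequence ending in state s
    best = {
        s: (math.log(START[s]) + math.log(emit(s, words[0])), [s]) for s in STATES
    }
    for word in words[1:]:
        nxt = {}
        for s in STATES:
            # Pick the best previous state to transition from into s.
            score, path = max(
                (best[p][0] + math.log(TRANS[p][s]), best[p][1]) for p in STATES
            )
            nxt[s] = (score + math.log(emit(s, word)), path + [s])
        best = nxt
    return max(best.values())[1]


words = "yesterday Ada Lovelace wrote code".split()
print(list(zip(words, viterbi(words))))
# "Ada" and "Lovelace" come out as NAME, the rest as OTHER
```

The tags could then drive the coloring. Note that with these toy numbers a capitalized word at the start of a sentence is a much closer call, which is the kind of ambiguity that training data has to settle.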

If you have a realistic goal and the faith and determination to mark up 20,000 or so sentences, you can train a model of that sort to mark up text the way you do, in a way that people might accept. Having the training set is more essential than having the latest algorithm.


👤 mrspeaker
Ha ha, you'd get a squiggly line under that "ti" in your title!

I messed around with the idea of a "syntax highlighter for writers" a long time ago. https://www.mrspeaker.net/2012/03/24/syntax-highlighting-for... - Grammar highlighting didn't seem that interesting in the end though. Here's a "demo" of it: https://www.mrspeaker.net/dev/syntx/

Maybe smarter highlighting would be more useful.


👤 ktpsns
We already do semantic highlighting. It is called "markup": italic, bold, or underlined text. This idea is much older than syntax highlighting, and I would guess it's where computer language highlighting originates from.

👤 muzani
Sarcasm would be nice, but I don't think that's possible.

One thing I'd like to see is the 'pillar' of the text highlighted. People often write a paragraph, page, or chapter around one proposition or argument. Whatever search engines do, it's working, but it's not used here.

Kindle does it well by informing you of commonly highlighted text.


👤 codingdave
What problem are you trying to solve? This question sounds like a solution looking for a problem... but if there is some specific problem with reading that you are trying to correct, could you let us know what that problem is?