
This seems way more error-prone than just using an HTML parser and removing script tags.

If you use a syntactic parser, you can guarantee your output is safe, all the time. If you're using non-syntactic techniques, you have to go and find all the edge cases yourself, and those are inputs you should have failed on in the first place.



Just for anyone reading this: there is no safe way to defang HTML with a blacklist. The only safe approach is to allow a specific set of tags and attributes and drop everything else. Many people have fallen on this sword.
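
To make the allowlist idea concrete, here is a rough sketch using Python's standard-library html.parser. The tag list, the scheme list, and the names ALLOWED, SAFE_SCHEMES, and sanitize are all made up for illustration, not taken from any library. The point is that output is rebuilt only from things that were explicitly allowed, rather than by deleting things known to be bad.

```python
import re
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration: tag -> attributes permitted on it.
ALLOWED = {
    "a": {"href"},
    "b": set(), "i": set(), "em": set(), "strong": set(),
    "p": set(), "br": set(), "code": set(), "pre": set(),
    "ul": set(), "ol": set(), "li": set(),
}
VOID_TAGS = {"br"}                        # tags that never get a closing tag
SAFE_SCHEMES = {"http", "https", "mailto"}
DROP_CONTENT = {"script", "style"}        # drop these tags and their text


def _safe_url(url):
    """Accept relative URLs and URLs whose scheme is in SAFE_SCHEMES."""
    # Browsers ignore tabs/newlines inside URLs, so strip them before checking.
    cleaned = "".join(ch for ch in url if ch > " ").lower()
    head = re.split(r"[/?#]", cleaned, maxsplit=1)[0]
    if ":" not in head:
        return True  # no scheme, so it is a relative URL
    return head.split(":", 1)[0] in SAFE_SCHEMES


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED:
            return  # unknown tag: emit nothing, keep its text content
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue
            if name == "href" and not _safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            if self.skip_depth:
                self.skip_depth -= 1
            return
        if self.skip_depth or tag not in ALLOWED or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Text is always re-escaped, never passed through raw.
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Comments and doctype declarations vanish for free because the sketch never defines handlers that emit them. It does not balance unclosed tags or reproduce browser parsing quirks, so treat it as an illustration of the approach, not a drop-in sanitizer.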


In a browser context, the built-in DOMParser (perhaps wrapped in DOMPurify) is the way to do this. No need to look elsewhere.



