sasquire's comments

A useful overview. Most categorical data is ignored or, at best, encoded.


Almost all SurveyMonkey-type forms wind up collecting a shedload of categorical data, which is then pounded with jackhammers into specious numerical form to theorise about.

Managing categorical data is pretty much about how long you've been playing with it. I can think of a lot of past SQL schemas which used typed fields, and our entire working days could be filled by agonising conversations about when to change from a restricted set of categories to a freeform field.

Dewey and the like have surprising outcomes. The Name of the Rose models the complexity of Borges' Library in 3-space, but everyone knows that there are alternative L-space forms of relationship between works, which defy simple typing.


Should be working now.


Log data processing is a pretty generic descriptor. Calling out specific workloads would be great, as would including 3rd-party benchmarks.



Third party eval with specific job


Never again a run to Fry's for an obscure cable or connector that turns into a full-blown spree! A fun way to waste a few hours, gone.

