Good examples. Let's do another one, more text-oriented.
Say you want to classify whether a text is spam or not. The basic features could be the individual words. Then you can build additional features on top of those words, like:
- is the header all caps?
- does the text contain particular keywords (viagra, enlargement...)?
- are there typos?
- does it link to weird domains?
and much more. You can create tons of these features. Some will be helpful, some neutral, and some will decrease the quality of your model.
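To make the feature idea concrete, here is a rough sketch of turning one message into features. The function name `extract_features`, the keyword set, and the trusted-domain list are all made up for illustration; the typo check is left out because it would need a dictionary.

```python
import re

# Hypothetical lists, for illustration only.
SPAM_KEYWORDS = {"viagra", "enlargement"}
TRUSTED_DOMAINS = {"example.com"}

def extract_features(header, body):
    """Turn one message into a dict of named features."""
    text = body.lower()
    words = re.findall(r"[a-z']+", text)

    # Base features: one entry per distinct word in the body.
    features = {f"word={w}": 1 for w in words}

    # Hand-built features on top of the words.
    letters = [c for c in header if c.isalpha()]
    features["header_all_caps"] = int(bool(letters) and all(c.isupper() for c in letters))
    features["has_spam_keyword"] = int(any(w in SPAM_KEYWORDS for w in words))

    # "Weird domain" is approximated here as "not on a trusted list".
    domains = re.findall(r"https?://([^/\s]+)", text)
    features["has_untrusted_link"] = int(any(d not in TRUSTED_DOMAINS for d in domains))
    return features

example = extract_features("BUY NOW", "Cheap viagra at http://pills.example.ru/offer")
```

Each message becomes a dict like this, and a real project would convert those dicts into whatever input format the chosen algorithm expects.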
The core of machine learning is creating those features, feeding them to an algorithm, and measuring the impact. Choosing the algorithm is actually the smallest part of your job (on most projects).
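The "measure the impact" step can be sketched like this: score the same algorithm with and without a candidate feature and compare. Everything here is an assumption for illustration: the messages are made up, the classifier is a stripped-down naive Bayes that only looks at features present in a message, and with this little data the numbers only demonstrate the workflow.

```python
import math
from collections import Counter

def train(rows):
    """rows: list of (feature_set, label), label 1 = spam, 0 = ham."""
    counts = {0: Counter(), 1: Counter()}
    totals = Counter()
    for feats, label in rows:
        counts[label].update(feats)
        totals[label] += 1
    return counts, totals

def predict(model, feats):
    counts, totals = model
    # Prior log-odds (assumes both classes occur in the training data).
    score = math.log(totals[1] / totals[0])
    for f in feats:
        # Laplace-smoothed chance of seeing this feature in each class.
        p_spam = (counts[1][f] + 1) / (totals[1] + 2)
        p_ham = (counts[0][f] + 1) / (totals[0] + 2)
        score += math.log(p_spam / p_ham)
    return int(score > 0)

def accuracy(featurize, train_msgs, test_msgs):
    model = train([(featurize(m), m["label"]) for m in train_msgs])
    hits = sum(predict(model, featurize(m)) == m["label"] for m in test_msgs)
    return hits / len(test_msgs)

def words_only(msg):
    return set(msg["body"].split())

def words_plus_caps(msg):
    feats = words_only(msg)
    if msg["header"].isupper():
        feats.add("header_all_caps")
    return feats

# Made-up toy messages.
train_msgs = [
    {"header": "BUY NOW", "body": "cheap pills now", "label": 1},
    {"header": "WINNER", "body": "win money fast", "label": 1},
    {"header": "Agenda", "body": "meeting at noon", "label": 0},
    {"header": "Lunch", "body": "lunch tomorrow maybe", "label": 0},
]
test_msgs = [
    {"header": "FREE GIFT", "body": "free prize inside", "label": 1},
    {"header": "Status", "body": "project update attached", "label": 0},
]

baseline = accuracy(words_only, train_msgs, test_msgs)          # 0.5
with_caps = accuracy(words_plus_caps, train_msgs, test_msgs)    # 1.0
```

The algorithm stays the same in both runs; only the feature set changes. A feature that lowers the score on held-out data would be dropped the same way this one is kept.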