
> Rule 5. Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.

That one hits me in the feels, because a lot of folks (myself included) focus on algorithms and code patterns before their data, and as a result a lot of things end up harder than they need to be. I've always liked this quote from Torvalds on the subject, speaking about git's design (the first line is for context):

> … git actually has a simple design, with stable and reasonably well-documented data structures.

then continues:

> In fact, I'm a huge proponent of designing your code around the data, rather than the other way around, and I think it's one of the reasons git has been fairly successful […] I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships.

When I have good data structures, most things just sort of fall into place. I honestly can't think of a time when I've figuratively (or literally) said "my data structure really whips the llama's ass" and then immediately said "it's going to be horrible to use." On the contrary, I have written code so beautiful and esoteric that its pedantry would have been lauded for the ages, had only I glanced over at my data model during my madness. No, instead, I awaken to find I spent my time quite aptly digging a marvelous hole, filling said hole with shit, and then hopping in hoping to not get shitty.

One thing that has really helped me make better data structures and models is taking advanced courses on things like multivariate linear regression analysis, specifically ones covering how to identify multicollinearity and heteroskedasticity. Statistical tools are incredibly powerful in this field, even if you aren't doing statistical analysis every day. Making good data models isn't necessarily easy, nor obvious, and I've watched a lot of experienced folks make silly mistakes simply because they didn't want something asinine like two fields instead of one.
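As a sketch of what "identifying multicollinearity" can look like in practice: a cheap first pass is to flag pairs of columns that are nearly linear functions of each other via pairwise Pearson correlation. (Proper multicollinearity detection uses variance inflation factors, which also catch combinations of three or more columns; this is just the screen. All function and field names below are illustrative, not from any library.)

```typescript
// Pearson correlation coefficient between two equal-length samples.
function pearson(x: number[], y: number[]): number {
  const n = x.length;
  const mx = x.reduce((a, b) => a + b, 0) / n;
  const my = y.reduce((a, b) => a + b, 0) / n;
  let cov = 0, vx = 0, vy = 0;
  for (let i = 0; i < n; i++) {
    cov += (x[i] - mx) * (y[i] - my);
    vx += (x[i] - mx) ** 2;
    vy += (y[i] - my) ** 2;
  }
  return cov / Math.sqrt(vx * vy);
}

// Flag column pairs that are nearly linear functions of each other --
// candidates for collapsing into one field in the data model.
function collinearPairs(
  cols: Record<string, number[]>,
  threshold = 0.95,
): [string, string][] {
  const names = Object.keys(cols);
  const flagged: [string, string][] = [];
  for (let i = 0; i < names.length; i++) {
    for (let j = i + 1; j < names.length; j++) {
      if (Math.abs(pearson(cols[names[i]], cols[names[j]])) >= threshold) {
        flagged.push([names[i], names[j]]);
      }
    }
  }
  return flagged;
}

// Example: price_cents is just price * 100, so that pair gets flagged.
const flagged = collinearPairs({
  price: [1, 2, 3, 4, 5],
  price_cents: [100, 200, 300, 400, 500],
  rating: [3, 1, 4, 1, 5],
});
console.log(flagged); // → [ [ 'price', 'price_cents' ] ]
```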



The counter argument would be that git is the poster-child of poor UX, which could be blamed on the fact that it exposes too much of its internal data structure and general inner-workings to the user.

I.e. too much focus has been put on data structures and not enough on the rest of the tool.

A less efficient data structure but more focus on UX could have saved millions of man-hours by this point.


It's a difficult question, because git's exposure of its data structures enables you to use it in ways that would not otherwise be possible.

I think git is more of a power tool than people sometimes want it to be. It's more like vi than it is like MS Word, but its ubiquity makes people wish it had an MS Word mode.

So I think it's hard to fault git's developers for where it is today. It's a faithful implementation of its mission.

FWIW, I have never used a tool with better documentation than git in 2020 (it hasn't always had good --help documentation, but it absolutely does today).


Or perhaps learning Git just requires a different approach: you understand the model first, not the interface. Once you understand the model (which is quite simple), the interface is easy.
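The "quite simple" model referred to here is content-addressed storage: git stores every object (blob, tree, commit) under the SHA-1 of a short header plus its contents, so the whole history forms an immutable Merkle DAG. A minimal sketch (Node/TypeScript, standard `crypto` module only) reproducing what `git hash-object` computes for a blob:

```typescript
import { createHash } from "node:crypto";

// git stores a blob as "blob <size>\0<content>" and addresses it by the
// SHA-1 of that byte string. Trees and commits use the same scheme, just
// with different headers and payloads.
function gitBlobId(content: Buffer): string {
  const store = Buffer.concat([
    Buffer.from(`blob ${content.length}\0`),
    content,
  ]);
  return createHash("sha1").update(store).digest("hex");
}

// Matches `echo 'hello world' | git hash-object --stdin`
console.log(gitBlobId(Buffer.from("hello world\n")));
// → 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Once you see commands like `commit`, `branch`, and `merge` as operations on this object graph, the interface stops looking arbitrary (or at least, its quirks become navigable).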


People keep repeating this, but it's not true. The interface has so many "this flag in this case" but "this other flag in that case" and "that command doesn't support this flag like that" exceptions. There's no composability or orthogonality or suggestiveness. It's nonsensical and capricious and unmemorable, even though I understand the "simple" underlying model and have for years.


Sorry, I was replying to this:

> The counter argument would be that git is the poster-child of poor UX, which could be blamed on the fact that it exposes too much of its internal data structure and general inner-workings to the user.

I agree with you that the UI is inconsistent; however, I don't agree that it's the result of git exposing too much of the internal data structure.


Has anyone attempted to re-engineer a superior UX on top of the git data structure? Would it even be possible?


Yes, I think so. There are many git clients which offer superior UX already, but they only provide a subset of the functionality available with the data structure. I'd personally love to experiment with showing and editing the data structure more 'directly', instead of relying on a battery of CLI commands and options.


Magit with Emacs solves git's UX problem IMO. Discoverability is/was git's real problem.


This is true, but the trouble is that you need to know what git will do before the magit commands and options make sense.


It makes sense when we bring in another aphorism: "code is data". It's easier to write good code with good libraries, and it's easier to write good data models that extend good data models. The main distinction is that code is very dynamic, flexible, and malleable, whereas data models need not be.

Data models are the "bones" of an application, as much a part of the application as code is. Data models fundamentally limit the application's growth, but if they're well placed, they can allow you to do things that are really powerful.

You always want to have good bones. But the Anna Karenina Principle is a thing [0].

So, applying this, I think baby ideas should not have many constraints on the bones, to allow them to move around in the future. Instead, there should be a ton of crap code implementing the idea's constraints, because they change every week, month, quarter, and the implementer is still learning the domain.

Once the implementer reaches a certain point of maturity in the domain, all of the lessons learned writing that crap code can be compressed into a very clever data model that minimizes the amount of "code" necessary, and simultaneously makes the project more maintainable, interface-stable, and extensible: in other words, making it an excellent platform to build on. The crap code can be thrown out, because it was designed to halfway-ensure invariants that the database can now take care of.

I think most software we consider "good" these days followed this development cycle: multics -> unix, earlier version control systems -> git, ed -> vi -> vim.


It's worth noting the same holds true for UI: data dominates. Design your widgets, layout, and workflow around the data.


> It's worth noting the same holds true for UI: data dominates. Design your widgets, layout, and workflow around the data.

I couldn't agree more.

I think the current state of UI programming is like the pathological case to be honest. Too often folks are concerned with representing their database 1-to-1 in their UI instead of representing their view.

If you're suffering from brittle UI code, where caching issues and stale data somehow keep affecting your application, this is very likely why. You have muddled your persistence and view concerns together, and it's neither manageable nor pretty. What this means for folks using something like React: don't directly use your persistence models in your views; create "view models" which directly represent whatever the hell it is you're trying to display. Bind your data in your view models, not your views, and then pass the view model in as props.
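The split the parent describes can be sketched in a few lines of TypeScript. All type and field names here are made up for illustration; the point is that the component only ever receives the view model, never the database row:

```typescript
// Persistence model: mirrors the database row, timestamps and all.
interface UserRow {
  id: number;
  first_name: string;
  last_name: string;
  created_at: string; // ISO timestamp as stored
}

// View model: exactly what one widget needs to render, nothing more.
interface UserCardVM {
  displayName: string;
  memberSince: string; // already formatted for display
}

// The mapping function is the only place where persistence and view
// concerns meet; the component takes UserCardVM as props and never
// touches UserRow, so schema changes stay out of the render code.
function toUserCardVM(row: UserRow): UserCardVM {
  return {
    displayName: `${row.first_name} ${row.last_name}`,
    memberSince: new Date(row.created_at).getUTCFullYear().toString(),
  };
}

console.log(toUserCardVM({
  id: 7,
  first_name: "Ada",
  last_name: "Lovelace",
  created_at: "2020-03-01T00:00:00Z",
}));
// → { displayName: 'Ada Lovelace', memberSince: '2020' }
```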


Amen. My 23 years of experience in webdev says React (the paradigm, not the lib per se) is dominating web UI precisely because it is all about unidirectional data flow.



