The paper describes it pretty well in appendix C. A matrix of integrators is constructed with a bunch of opamps, RC time constants (using digital potentiometers, presumably) and a multichannel ADC/DAC interface to the PC. Essentially a dedicated differential-equation solver.
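To make "dedicated differential-equation solver" a bit more concrete, here is a minimal digital sketch of what a bank of coupled integrators does when left to settle. It is not the circuit from the paper: the dynamics dx/dt = b - A x, the function name `settle`, the step size, and the example matrix are all assumptions chosen for illustration. For a symmetric positive-definite A the state relaxes to the solution of A x = b, which is the kind of answer you would read back through the ADC.

```python
def settle(A, b, dt=0.01, steps=5000):
    """Forward-Euler simulation of dx/dt = b - A x (hypothetical dynamics).

    Each component of x plays the role of one integrator's output voltage;
    the entries of A play the role of the coupling gains between integrators.
    """
    n = len(b)
    x = [0.0] * n  # integrators start discharged
    for _ in range(steps):
        # Input to integrator i: its drive b[i] minus the weighted feedback.
        dx = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + dt * dx[i] for i in range(n)]
    return x


# Example system (made up): 2x + y = 1, x + 3y = 2, with solution x = 0.2, y = 0.6.
A = [[2.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = settle(A, b)
print(x)
```

The digital version needs thousands of small steps to mimic what the analog hardware does continuously, which is roughly the trade-off being discussed here.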
So it's a combination of old-school analog computation and modern GPU-based code. Takes longer in practice due to the overhead of interfacing with the hardware and waiting for the integrators to settle, but the authors are claiming that an optimized implementation could outperform a purely-digital solution, as I understand it, by accelerating convergence.
The core idea being that conventional gradient descent is a linear operation at heart, while the gradients actually being traversed are curved surfaces that have to be approximated with multiple unnecessary steps if everything is done in the digital domain.
The trouble, as everybody from Seymour Cray onward has learned the hard way, is that CMOS always wins in the end, simply because the financial power of an entire industry goes into optimizing it.
First author of the paper here. That's it indeed! One thing is that this is entirely CMOS-compatible. You could also do something similar with optics or other platforms, but we chose electronic circuits for this reason specifically.
By that remark I meant "digital CMOS," in the sense of elements that store state information discretely in flip-flops or gate insulators rather than continuously with analog integrators.
Very cool work in any event, though! Best of luck with the ongoing R&D.
I didn't realize they included details about the hardware. Like you said, these just look like analog computers, compute-in-memory, or analog arrays, which have also made a resurgence with deep learning.
Yes, digital wins over analog because of all the money that went into digital. I am wondering if one could create a digital analog computer by using PWM instead of analog signals.
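As a rough illustration of the PWM idea, here is a small sketch, assuming a value in [0, 1] is encoded as a duty cycle and recovered with a first-order low-pass (RC-like) filter. The names `pwm_wave` and `rc_filter`, the period, and the filter constant are all made up for the example; it says nothing about whether this would work well for the integrator array in the paper.

```python
def pwm_wave(duty, period=100, cycles=50):
    """Return a 0/1 square wave whose fraction of high samples is `duty`."""
    high = round(duty * period)  # number of high samples per period
    return [1.0 if (t % period) < high else 0.0 for t in range(period * cycles)]


def rc_filter(samples, alpha=0.002):
    """Discrete first-order low-pass filter: y += alpha * (s - y) per sample."""
    y = 0.0
    for s in samples:
        y += alpha * (s - y)
    return y


# Encode 0.3 as a duty cycle, then recover it (approximately) by averaging.
recovered = rc_filter(pwm_wave(0.3))
print(recovered)
```

The recovered value carries some ripple, and shrinking it means a slower filter or a faster PWM clock, so the precision/speed trade-off of analog does not go away.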
The whole point is to leverage the laws of nature to train AI models, overcoming the limitations and scaling challenges of digital hardware and existing training methods.
I believe one example would be quantum annealers, where "programming" involves setting the right initial conditions and allowing thermodynamics to bring you to an optimum via relaxation.
This could be attractive if they can build a product along those lines: tens, if not hundreds, of billions of dollars are spent yearly on numerical optimization worldwide, and if this can significantly accelerate it, it could be very profitable.
Analog computers have a lot of history. You can Google "analog" with "neural network" or "differential equations" to get many results. They are fast and low-power, but can have precision issues and require custom chip design.
Mixed-signal ASICs often combine digital and analog blocks to get the benefits of analog. It's especially helpful for anything that eats lots of power, or where power draw has to be kept down (e.g. mobile).
Hard to beat the string algorithm for finding shortest paths on a positive weights network (e.g. build the network out of string where topologies match and link lengths are link weights, find the origin and destination nodes/knots of interest, grab the two nodes and pull until taut).
Or the spaghetti approach to finding the largest value from a list of positive values (e.g. cut dry spaghetti noodles to length for each value, bundle them together and tap the bundle on a table, the one visually sticking out the most is the largest valued element).
Of course, we already need to be working in Spaghetti ints, or prepping them will be as complex as a linear scan.
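For comparison, the usual digital counterpart of the string trick is Dijkstra's algorithm, which handles the same positive-weights case. A minimal sketch, with a made-up four-node network and a hypothetical function name `shortest_path_length`:

```python
import heapq


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph with positive weights."""
    dist = {source: 0.0}
    heap = [(0.0, source)]  # (distance so far, node)
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")  # target not reachable


# Example network (made up): string lengths are the link weights.
graph = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"b": 5, "c": 1},
}
length = shortest_path_length(graph, "a", "d")
print(length)
```

Pulling "a" and "d" apart on the string model would leave the a-b-c-d strands taut, which is the same route the code finds.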
Can't wait for spaghetti arithmetic.
Do we have a better algo than log(n) for locating the min?
I'm thinking: spaghetti-max align them, lay them across an arm at the midway point, sweep the shorts that fell, and repeat until all remaining are the same length.
> requires an analog thermodynamic computer
Wait. What?
Perhaps a trained physicist can comment on that. Thanks.