The decision reasoning is very helpful for putting things in context. The arbitrary binary decision to "accept" vs. "reject", especially given the snooty "high bar for acceptance at ICLR", is laughable in a world of free information access.
That sounds like a degenerate organisation. But if we're sharing anecdata: My pay in this Berlin startup is very decent, I have respectable equity and it's a friendly & rational environment. I'm scared of US startup work culture, rather.
I remember one Deep Learning startup in Berlin, run by ex-Googlers, that was offering an 85k salary and minuscule equity, yet expected a top Stanford graduate and an interview demonstrating complete mastery of all Stanford Deep Learning courses plus probabilistic graphical models... I hope it's not your employer; otherwise you are likely undervaluing yourself 10x.
For a graduate that would be a huge salary in Germany. Most engineers with 10 years of experience won't receive that, so having high expectations of those candidates seems right.
(That still doesn't mean I find the salary situation for engineers in Germany great. Just that it seems to fit the general scheme.)
You probably need to factor in healthcare, rent, etc. for a fair comparison to, say, the Bay Area -- 85k affords a nice upper-upper-middle-class lifestyle (it scratches the 99th gross income percentile in Germany). So 10x sounds like a stretch. What profile would you suggest someone who takes 200-800k € might have? (Tangentially, out of curiosity: are you in a position to share what the ex-Googlers worked on?)
That's too naive. There's next to nothing known about how activity in individual neurons and their synapses relates to mental contents, in particular "higher level" concepts and thought patterns that you're concerned with in adult learning. In other words, the network dynamics are complex and unknown, and so the suppression / deactivation of certain synapses may just as well be a normal and necessary part of learning as it may be a part of forgetting, or neither. There is currently no contender for a "neuroscience standard model" that would bridge between this kind of neural dynamics and cognitive functions. I hope to live to see one.
Amen to that. We are starting to get some of the groundwork, but we are far, far away from a “standard model”. Neuroscientists, psychiatrists and AI developers oversell their understanding in order to keep their jobs and funding.
Then, in June 2004, Steve Jobs announced that Apple was releasing its new operating system, called “Tiger.” And inside Microsoft, jaws dropped. Tiger did much of what was planned for Longhorn—except that it worked.
E-mails flew around Microsoft, expressing dismay about the quality of Tiger. To executives’ disbelief, it contained functional equivalents of Avalon and WinFS.
“It was fucking amazing,” wrote Lenn Pryor, part of the Longhorn team. “It is like I just got a free pass to Longhorn land today.”
Vic Gundotra, another member of the group, tried out Tiger. “Their Avalon competitor (core video, core image) was hot,” he wrote. “I have the cool widgets (dashboard) running on my MAC right now with all the effects [Jobs] showed on stage. I’ve had no crashes in 5 hours.”
Somehow they managed to turn it around enough that there are discussions in this thread comparing a Windows Surface and an iPad Pro. Which is something in itself.
They use evolutionary search to discover spiking neural networks whose response dynamics can solve a control task. This is a fascinating approach, but one that I've only ever seen as a means to do theoretical neuroscience: A way to obtain interesting spiking networks whose dynamics we can study in the hope of developing mathematical tools that will help understand biological networks.
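(Not the authors' method, and nothing like their spiking networks -- just to make the search-loop idea concrete, here is a toy, greedy (1+λ)-style evolution strategy evolving two feedback gains for a trivial 1-D plant; every constant here is my own assumption:)

```python
import random

random.seed(1)

def rollout_cost(k1, k2, steps=200, dt=0.05):
    # Toy plant x'' = u with linear feedback u = -k1*x - k2*x_dot;
    # the cost is the accumulated distance from the setpoint x = 0.
    x, x_dot, cost = 1.0, 0.0, 0.0
    for _ in range(steps):
        u = -k1 * x - k2 * x_dot
        x_dot += dt * u
        x += dt * x_dot
        cost += abs(x)
    return cost

# Greedy (1+lambda)-style evolution strategy over the two gains:
# mutate the current best, keep any child that improves the cost.
best = (0.0, 0.0)
best_cost = rollout_cost(*best)
for generation in range(200):
    children = [(best[0] + random.gauss(0, 0.3),
                 best[1] + random.gauss(0, 0.3)) for _ in range(8)]
    for child in children:
        cost = rollout_cost(*child)
        if cost < best_cost:
            best, best_cost = child, cost

print(best_cost < rollout_cost(0.0, 0.0))  # prints True: evolved gains beat doing nothing
```

Their search space (spiking-network parameters) and fitness (task performance of the network's response dynamics) are of course far richer, but the outer loop has this shape.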
But here, from the claims in the post and the lab website, it sounds as if the goal is in application: Creating better, more efficient controllers. This comes across as a little detached from the applied machine learning literature. At the least, I missed a comparison to reinforcement learning (which has a history of learning to solve this exact task with simpler controller designs and most likely shorter search times) and also to non-bio-inspired recurrent networks.
One more point: Even if I follow along with the claim that 'deep learning' approaches don't have memory (implying recurrent networks aren't included in that label), I want to point out that this particular task setup, with positions/angles as well as their rates of change provided, can be solved by a memoryless controller. It would have done more to highlight the strengths of the recurrent network approach if a partially observable benchmark task had been used, e.g. feeding positions and angles only. Much more difficult high-dimensional tasks e.g. in robotic control are tackled in the (deep) reinforcement learning literature among others.
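To illustrate the "memoryless controller suffices" point: below is a toy sketch using the classic cart-pole equations (Barto et al. style; the dynamics constants and the hand-tuned gains are my assumptions, not taken from the post). Because the angle and its rate of change are both observed, a stateless linear feedback law keeps the pole up:

```python
import math

# Classic cart-pole constants (Barto et al. style).
GRAVITY, M_CART, M_POLE, LENGTH, DT = 9.8, 1.0, 0.1, 0.5, 0.02

def step(x, x_dot, theta, theta_dot, force):
    # One Euler step of the cart-pole dynamics.
    total = M_CART + M_POLE
    costh, sinth = math.cos(theta), math.sin(theta)
    temp = (force + M_POLE * LENGTH * theta_dot ** 2 * sinth) / total
    theta_acc = (GRAVITY * sinth - costh * temp) / (
        LENGTH * (4.0 / 3.0 - M_POLE * costh ** 2 / total))
    x_acc = temp - M_POLE * LENGTH * theta_acc * costh / total
    return (x + DT * x_dot, x_dot + DT * x_acc,
            theta + DT * theta_dot, theta_dot + DT * theta_acc)

def policy(x, x_dot, theta, theta_dot):
    # Memoryless: the force depends only on the current observation.
    return 30.0 * theta + 5.0 * theta_dot

state = (0.0, 0.0, 0.05, 0.0)  # pole starts slightly tilted
for _ in range(1000):
    state = step(*state, policy(*state))

print(abs(state[2]) < 0.01)  # prints True: pole is back near upright after 20 simulated s
```

(The cart drifts, since these gains ignore position, but the pole stays balanced.) Drop `theta_dot` from the observation and no such stateless law works; the controller has to estimate the velocity from its own history, which is exactly where recurrence earns its keep.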
Can you explain that a bit more? I'm fascinated but don't quite get the rationale behind having not even the authors of the paper know the results. What is their process?
It means you have to develop your experimental method prior to seeing the data, so you can't try a dozen different ways of slicing that data up until you get the result you want.
Yeah. This is what's absolutely missing in many scientific fields, and seems to be the reason so many results turn out to be utter bullshit (hello, psychology). There's a strong push coming from the research community to make this approach, known as 'study pre-registration', mandatory.
Presumably pre-registration could also help with positive results bias.
Assuming the registration was made public, one could mandate that something be produced from the study, even if only a statement that nothing was found.
Wouldn't solve the problem but would presumably aid meta-studies?
It's not really feasible for most research (although there are plenty of needed improvements along these lines). Big physics projects are unusual in that they are enormous undertakings of 100s or 1000s of PhDs, and few or no other experiments will take data that can cross-check the results. (For instance, no other machine will produce the conditions at the LHC for several decades at least, which is why they go to the fantastic expense of building two completely separate general purpose detectors.) Devoting ~5 full-time PhDs for the sake of super-duper methodological rigor is doable for LIGO, but not for smaller experiments.
Well, keep in mind that if hundreds of scientists take their chances on the same data using different methods, we basically have the same problem. One of them will be the lucky one whose approach shows favorable results, whether the effect is truly there or not.
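A quick Monte Carlo makes the size of that effect concrete (my own toy construction: pure-noise data, a crude 2-sigma test, twenty candidate slicings -- all numbers are illustrative assumptions):

```python
import random

random.seed(0)

def significant(sample, threshold=0.28):
    # Crude 2-sigma test for "the mean differs from zero"
    # (n = 50 unit-variance points, so the mean's std is ~0.14).
    return abs(sum(sample) / len(sample)) > threshold

def run_study(n_slices=20, n=50):
    # Pure-noise data: there is no real effect anywhere in it.
    slices = [[random.gauss(0, 1) for _ in range(n)] for _ in range(n_slices)]
    preregistered = significant(slices[0])        # one analysis, chosen in advance
    hunted = any(significant(s) for s in slices)  # try all slicings, report any hit
    return preregistered, hunted

results = [run_study() for _ in range(1000)]
prereg_rate = sum(p for p, _ in results) / 1000
hunted_rate = sum(h for _, h in results) / 1000
print(prereg_rate < 0.15 < hunted_rate)  # prints True: hunting inflates false positives
```

The pre-registered analysis fires at roughly the nominal ~5% rate, while "report whichever of the twenty slicings looks significant" fires more than half the time, on data containing no effect at all.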
Are there any major conceptual differences to Theano? Not that I wouldn't appreciate a more polished, well funded competitor in the same space.
It looks like using TensorFlow from Python will feel quite familiar to a Theano user, starting with the separation of graph building and graph running, but also down into the details of how variables, inputs and 'givens' (called feed dicts in tensorflow) are handled.
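The shared core idea can be sketched in a dozen lines of plain Python (a toy of my own, not either library's actual machinery): arithmetic on symbolic nodes only records a graph, and a separate "run" step evaluates it against a feed dict of concrete inputs.

```python
class Node:
    # A symbolic node: arithmetic on nodes records the graph,
    # nothing is computed at this point.
    def __init__(self, op=None, inputs=(), name=None):
        self.op, self.inputs, self.name = op, inputs, name

    def __add__(self, other):
        return Node(op=lambda a, b: a + b, inputs=(self, other))

    def __mul__(self, other):
        return Node(op=lambda a, b: a * b, inputs=(self, other))

def run(node, feed):
    # The "session": recursively evaluate the graph, looking up
    # placeholder nodes in the feed dict of concrete values.
    if node in feed:
        return feed[node]
    return node.op(*(run(i, feed) for i in node.inputs))

x, y = Node(name="x"), Node(name="y")
z = x * y + x                # graph building: no arithmetic happens here
print(run(z, {x: 3, y: 4}))  # graph running: prints 15
```

Both libraries layer a lot on top of this (compilation, placement, gradients), but the build-then-run split is the piece that feels identical from Python.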
I think this loop actually still only builds the graph -- what `scan` would do. The computation still happens outside of python. That is, in tensorflow they perhaps don't need `scan` because a loop with repeated assignments "just works"... Let's try this:
It seems like in TensorFlow you can say:
import tensorflow as tf

sess = tf.InteractiveSession()  # magic incantation
state = init_state = tf.Variable(1)  # initialise a scalar variable
states = []
for step in range(10):
    # this seems to define a graph that updates `state`:
    state = tf.add(state, state)
    states.append(state)
sess.run(tf.initialize_all_variables())
At this point, `states` is a list of symbolic tensors, and if you query for their values with `sess.run(states)` you should get the doubling sequence 2, 4, ..., 1024. The equivalent in Theano:
>>> import theano
>>> import theano.tensor as T
>>> state = theano.shared(1.0)
>>> states = []
>>> for step in range(10):
...     state = state + state
...     states.append(state)
...
>>> f = theano.function([], states)
>>> f()
[array(2.0),
array(4.0),
array(8.0),
array(16.0),
array(32.0),
array(64.0),
array(128.0),
array(256.0),
array(512.0),
array(1024.0)]
Thanks! When I tried this before, I thought compilation was stuck in an infinite loop and gave up after about a minute. But you're right, it works. Though on my machine, this took two and a half minutes to compile (ten times as long as compiling a small convnet). For 10 recurrence steps, that's weird, right? And the TensorFlow thing above runs instantly.
One more piece of helpful criticism: the 'epic' music in your video, for me at least, evokes the opposite of a zen-like, concentrated state. It brings up images of action movies, war scenes, or documentaries about bridges.
Thanks. You got us there, and you are completely right. Truth is, we at SmartCodeHQ are complete suckers for dramatic video. What can I say, this was one of the examples we looked at: https://www.youtube.com/watch?v=hSye3T6FQZs