The decision reasoning is very helpful for putting things in context. The arbitrary binary decision to "accept" vs. "reject", especially given the snooty "high bar for acceptance at ICLR", is laughable in a world of free information access.
That sounds like a degenerate organisation. But if we're sharing anecdata: My pay in this Berlin startup is very decent, I have respectable equity and it's a friendly & rational environment. I'm scared of US startup work culture, rather.
I remember one Deep Learning startup in Berlin, run by ex-Googlers, that was offering an 85k salary and minuscule equity, yet expected a top Stanford graduate and an interview demonstrating complete mastery of all Stanford Deep Learning courses plus probabilistic graphical models... I hope it's not your employer; otherwise you are likely undervaluing yourself 10x.
For a graduate that would be a huge salary in Germany. Most engineers with 10 years of experience won't receive that, so having high expectations of those candidates seems right.
(That still doesn't mean I find the salary situation for engineers in Germany great. Just that it seems to fit the general scheme.)
You probably need to factor in healthcare, rent, etc. for a fair comparison to, say, the Bay Area -- 85k affords a nice upper-upper-middle-class lifestyle (it scratches the 99th gross income percentile in Germany). So 10x sounds like a stretch. What profile would you suggest someone who takes 200-800k € might have? (Tangentially, out of curiosity: are you in a position to share what the ex-Googlers worked on?)
That's too naive. There's next to nothing known about how activity in individual neurons and their synapses relates to mental contents, in particular "higher level" concepts and thought patterns that you're concerned with in adult learning. In other words, the network dynamics are complex and unknown, and so the suppression / deactivation of certain synapses may just as well be a normal and necessary part of learning as it may be a part of forgetting, or neither. There is currently no contender for a "neuroscience standard model" that would bridge between this kind of neural dynamics and cognitive functions. I hope to live to see one.
Amen to that. We are starting to get some of the groundwork, but we are far, far away from a “standard model”. Neuroscientists, psychiatrists and AI developers oversell their understanding in order to keep their jobs and funding.
Then, in June 2004, Steve Jobs announced that Apple was releasing its new operating system, called “Tiger.” And inside Microsoft, jaws dropped. Tiger did much of what was planned for Longhorn—except that it worked.
E-mails flew around Microsoft, expressing dismay about the quality of Tiger. To executives’ disbelief, it contained functional equivalents of Avalon and WinFS.
“It was fucking amazing,” wrote Lenn Pryor, part of the Longhorn team. “It is like I just got a free pass to Longhorn land today.”
Vic Gundotra, another member of the group, tried out Tiger. “Their Avalon competitor (core video, core image) was hot,” he wrote. “I have the cool widgets (dashboard) running on my MAC right now with all the effects [Jobs] showed on stage. I’ve had no crashes in 5 hours.”
Somehow they managed to turn it around enough that there are discussions in this thread comparing a Windows Surface and an iPad Pro. Which is something in itself.
They use evolutionary search to discover spiking neural networks whose response dynamics can solve a control task. This is a fascinating approach, but one that I've only ever seen as a means to do theoretical neuroscience: A way to obtain interesting spiking networks whose dynamics we can study in the hope of developing mathematical tools that will help understand biological networks.
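(Not the authors' method, and nothing like their spiking networks -- just to make the search-loop idea concrete, here is a toy, greedy (1+λ)-style evolution strategy evolving two feedback gains for a trivial 1-D plant; every constant here is my own assumption:)

```python
import random

random.seed(1)

def rollout_cost(k1, k2, steps=200, dt=0.05):
    # Toy plant x'' = u with linear feedback u = -k1*x - k2*x_dot;
    # the cost is the accumulated distance from the setpoint x = 0.
    x, x_dot, cost = 1.0, 0.0, 0.0
    for _ in range(steps):
        u = -k1 * x - k2 * x_dot
        x_dot += dt * u
        x += dt * x_dot
        cost += abs(x)
    return cost

# Greedy (1+lambda)-style evolution strategy over the two gains:
# mutate the current best, keep any child that improves the cost.
best = (0.0, 0.0)
best_cost = rollout_cost(*best)
for generation in range(200):
    children = [(best[0] + random.gauss(0, 0.3),
                 best[1] + random.gauss(0, 0.3)) for _ in range(8)]
    for child in children:
        cost = rollout_cost(*child)
        if cost < best_cost:
            best, best_cost = child, cost

print(best_cost < rollout_cost(0.0, 0.0))  # prints True: evolved gains beat doing nothing
```

Their search space (spiking-network parameters) and fitness (task performance of the network's response dynamics) are of course far richer, but the outer loop has this shape.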
But here, from the claims in the post and the lab website, it sounds as if the goal is in application: Creating better, more efficient controllers. This comes across as a little detached from the applied machine learning literature. At the least, I missed a comparison to reinforcement learning (which has a history of learning to solve this exact task with simpler controller designs and most likely shorter search times) and also to non-bio-inspired recurrent networks.
One more point: Even if I follow along with the claim that 'deep learning' approaches don't have memory (implying recurrent networks aren't included in that label), I want to point out that this particular task setup, with positions/angles as well as their rates of change provided, can be solved by a memoryless controller. It would have done more to highlight the strengths of the recurrent network approach if a partially observable benchmark task had been used, e.g. feeding positions and angles only. Much more difficult high-dimensional tasks e.g. in robotic control are tackled in the (deep) reinforcement learning literature among others.
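To illustrate the "memoryless controller suffices" point: below is a toy sketch using the classic cart-pole equations (Barto et al. style; the dynamics constants and the hand-tuned gains are my assumptions, not taken from the post). Because the angle and its rate of change are both observed, a stateless linear feedback law keeps the pole up:

```python
import math

# Classic cart-pole constants (Barto et al. style).
GRAVITY, M_CART, M_POLE, LENGTH, DT = 9.8, 1.0, 0.1, 0.5, 0.02

def step(x, x_dot, theta, theta_dot, force):
    # One Euler step of the cart-pole dynamics.
    total = M_CART + M_POLE
    costh, sinth = math.cos(theta), math.sin(theta)
    temp = (force + M_POLE * LENGTH * theta_dot ** 2 * sinth) / total
    theta_acc = (GRAVITY * sinth - costh * temp) / (
        LENGTH * (4.0 / 3.0 - M_POLE * costh ** 2 / total))
    x_acc = temp - M_POLE * LENGTH * theta_acc * costh / total
    return (x + DT * x_dot, x_dot + DT * x_acc,
            theta + DT * theta_dot, theta_dot + DT * theta_acc)

def policy(x, x_dot, theta, theta_dot):
    # Memoryless: the force depends only on the current observation.
    return 30.0 * theta + 5.0 * theta_dot

state = (0.0, 0.0, 0.05, 0.0)  # pole starts slightly tilted
for _ in range(1000):
    state = step(*state, policy(*state))

print(abs(state[2]) < 0.01)  # prints True: pole is back near upright after 20 simulated s
```

(The cart drifts, since these gains ignore position, but the pole stays balanced.) Drop `theta_dot` from the observation and no such stateless law works; the controller has to estimate the velocity from its own history, which is exactly where recurrence earns its keep.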
Can you explain that a bit more? I'm fascinated but don't quite get the rationale behind having not even the authors of the paper know the results. What is their process?
It means you have to develop your experimental method prior to seeing the data, so you can't try a dozen different ways of slicing that data up until you get the result you want.
Yeah. This is what's absolutely missing in many scientific fields, and seems to be the reason so many results turn out to be utter bullshit (hello, psychology). There's a strong push coming from the research community to make this approach, known as 'study pre-registration', mandatory.
Presumably pre-registration could also help with positive results bias.
Assuming the registration was made public, one could mandate that something be produced from the study, even if only a statement that nothing was found.
Wouldn't solve the problem but would presumably aid meta-studies?
It's not really feasible for most research (although there are plenty of needed improvements along these lines). Big physics projects are unusual in that they are enormous undertakings of 100s or 1000s of PhDs, and few or no other experiments will take data that can cross-check the results. (For instance, no other machine will produce the conditions at the LHC for several decades at least, which is why they go to the fantastic expense of building two completely separate general purpose detectors.) Devoting ~5 full-time PhDs for the sake of super-duper methodological rigor is doable for LIGO, but not for smaller experiments.
Well, keep in mind that if hundreds of scientists take their chances on the same data using different methods, we basically have the same problem. One of them will be the lucky one whose approach shows favorable results, whether the effect is truly there or not.
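A quick Monte Carlo makes the size of that effect concrete (my own toy construction: pure-noise data, a crude 2-sigma test, twenty candidate slicings -- all numbers are illustrative assumptions):

```python
import random

random.seed(0)

def significant(sample, threshold=0.28):
    # Crude 2-sigma test for "the mean differs from zero"
    # (n = 50 unit-variance points, so the mean's std is ~0.14).
    return abs(sum(sample) / len(sample)) > threshold

def run_study(n_slices=20, n=50):
    # Pure-noise data: there is no real effect anywhere in it.
    slices = [[random.gauss(0, 1) for _ in range(n)] for _ in range(n_slices)]
    preregistered = significant(slices[0])        # one analysis, chosen in advance
    hunted = any(significant(s) for s in slices)  # try all slicings, report any hit
    return preregistered, hunted

results = [run_study() for _ in range(1000)]
prereg_rate = sum(p for p, _ in results) / 1000
hunted_rate = sum(h for _, h in results) / 1000
print(prereg_rate < 0.15 < hunted_rate)  # prints True: hunting inflates false positives
```

The pre-registered analysis fires at roughly the nominal ~5% rate, while "report whichever of the twenty slicings looks significant" fires more than half the time, on data containing no effect at all.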
Are there any major conceptual differences to Theano? Not that I wouldn't appreciate a more polished, well funded competitor in the same space.
It looks like using TensorFlow from Python will feel quite familiar to a Theano user, starting with the separation of graph building and graph running, but also down into the details of how variables, inputs and 'givens' (called feed dicts in tensorflow) are handled.
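The shared core idea can be sketched in a dozen lines of plain Python (a toy of my own, not either library's actual machinery): arithmetic on symbolic nodes only records a graph, and a separate "run" step evaluates it against a feed dict of concrete inputs.

```python
class Node:
    # A symbolic node: arithmetic on nodes records the graph,
    # nothing is computed at this point.
    def __init__(self, op=None, inputs=(), name=None):
        self.op, self.inputs, self.name = op, inputs, name

    def __add__(self, other):
        return Node(op=lambda a, b: a + b, inputs=(self, other))

    def __mul__(self, other):
        return Node(op=lambda a, b: a * b, inputs=(self, other))

def run(node, feed):
    # The "session": recursively evaluate the graph, looking up
    # placeholder nodes in the feed dict of concrete values.
    if node in feed:
        return feed[node]
    return node.op(*(run(i, feed) for i in node.inputs))

x, y = Node(name="x"), Node(name="y")
z = x * y + x                # graph building: no arithmetic happens here
print(run(z, {x: 3, y: 4}))  # graph running: prints 15
```

Both libraries layer a lot on top of this (compilation, placement, gradients), but the build-then-run split is the piece that feels identical from Python.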
I think this loop actually still only builds the graph -- what `scan` would do. The computation still happens outside of python. That is, in tensorflow they perhaps don't need `scan` because a loop with repeated assignments "just works"... Let's try this:
It seems like in TensorFlow you can say:
import tensorflow as tf

sess = tf.InteractiveSession()  # magic incantation
state = init_state = tf.Variable(1)  # initialise a scalar variable
states = []
for step in range(10):
    # this seems to define a graph that updates `state`:
    state = tf.add(state, state)
    states.append(state)
sess.run(tf.initialize_all_variables())
At this point, `states` is a list of symbolic tensors, and if you query for their values with `sess.run(states)` you should get the doubling sequence 2, 4, ..., 1024. The equivalent in Theano:
>>> import theano
>>> import theano.tensor as T
>>> state = theano.shared(1.0)
>>> states = []
>>> for step in range(10):
...     state = state + state
...     states.append(state)
...
>>> f = theano.function([], states)
>>> f()
[array(2.0),
array(4.0),
array(8.0),
array(16.0),
array(32.0),
array(64.0),
array(128.0),
array(256.0),
array(512.0),
array(1024.0)]
Thanks! When I tried this before, I thought compilation was stuck in an infinite loop and gave up after about a minute. But you're right, it works. Though on my machine, this took two and a half minutes to compile (ten times as long as compiling a small convnet). For 10 recurrence steps, that's weird, right? And the TensorFlow thing above runs instantly.
One more piece of helpful criticism: the 'epic' music in your video, for me at least, evokes the opposite of a zen-like, concentrated state. It brings up images of action movies, war scenes, or documentaries about bridges.
Thanks. You got us there, and you are completely right. Truth is, we at SmartCodeHQ are complete suckers for dramatic video. What can I say, this was one of the examples we looked at: https://www.youtube.com/watch?v=hSye3T6FQZs