
Yeah, there are definitely still places the samples fall short! Keep in mind we're still using very naive sampling techniques.

RE Winograd: WNLI is different, see https://arxiv.org/pdf/1804.07461.pdf



Amazing results, how excited are you? :)

You're right. I had also noted that the comparison isn't direct, but even so I wasn't justified in calling the gap claim wrong, so sorry for that. I do think it would be nice, however, to have it undergo an external or more neutral test of performance. I say this without at all doubting the quality of the results.



