Hacker News

The thing is, I've worked extensively with object detection and segmentation neural nets for the past three years: they would barf all over the place in this scenario. You can't train for enough of the types of terrain obstacles encountered in such a wide-open range of outdoor spaces, and you certainly couldn't do it fast enough with any degree of accuracy, even on an RTX 3080 GPU, let alone an embedded Xavier-class Nvidia module. A dinky int8-quantized "tree/no-tree" classifier wouldn't stand a chance.

This is why I asked for more details. They seem to indicate training, which today means a neural net, not the expert systems of yore. But that would be one hell of an object detector! My question still stands.



Well, my understanding is that they are not training object detectors or segmentation models. Those wouldn't be very useful anyway: you still need a 3D understanding of the scene, and a 2D->3D mapping wouldn't cut it. What they do instead is use stereo cameras to estimate a depth map of the field: their model takes two RGB images and produces a depth map. They combine this depth information with the drone's onboard sensors (acceleration etc.) and try to predict what an expert drone agent, trained on perfect information in simulation, would do.

They train in simulation, restrict the student agent to only the stereo cameras and the sensors it would have in the real world, and have it mimic the 'privileged' expert agent. Computation-wise, if you can run the depth estimation network on the hardware, the remaining steps (given the depth map and sensor information, predict the privileged agent's path/vector) should be trivial and are most likely shallow networks.
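To make the "shallow student mimics privileged teacher" idea concrete, here is a minimal numpy sketch of one behaviour-cloning step. Everything here is an assumption for illustration: the feature shapes, the 3D action vector, and the fixed `teacher_action` standing in for the privileged expert's output in simulation.

```python
# Hypothetical sketch of the privileged-teacher / student setup described
# above. Shapes, names, and the teacher's action are all illustrative.
import numpy as np

rng = np.random.default_rng(0)

def shallow_policy(features, w1, b1, w2, b2):
    """Tiny MLP: depth + IMU features -> predicted 3D flight vector."""
    h = np.tanh(features @ w1 + b1)
    return h @ w2 + b2

# Pretend inputs: a downsampled depth map (the depth network's output)
# plus IMU readings (e.g. 3-axis accel + 3-axis gyro).
depth_map = rng.random((16, 16))
imu = rng.random(6)
features = np.concatenate([depth_map.ravel(), imu])  # student's whole state

# In simulation the privileged expert sees ground truth; here its action
# is just a fixed target vector for illustration.
teacher_action = np.array([1.0, 0.0, 0.5])

# Random init for the shallow student network.
d_in, d_h = features.size, 32
w1 = rng.normal(0, 0.1, (d_in, d_h)); b1 = np.zeros(d_h)
w2 = rng.normal(0, 0.1, (d_h, 3));    b2 = np.zeros(3)

def loss(w2, b2):
    pred = shallow_policy(features, w1, b1, w2, b2)
    return float(np.mean((pred - teacher_action) ** 2))

init_loss = loss(w2, b2)

# Behaviour cloning: regress the student's output toward the teacher's
# action (MSE), updating only the output layer for simplicity.
lr = 0.01
for _ in range(200):
    h = np.tanh(features @ w1 + b1)
    grad = 2.0 * ((h @ w2 + b2) - teacher_action) / 3.0  # dL/dpred
    w2 -= lr * np.outer(h, grad)
    b2 -= lr * grad

final_loss = loss(w2, b2)
print(init_loss, final_loss)
```

The point of the sketch is the cheapness of the student: once the depth map exists, mapping it (plus sensor state) to the expert's action is a small regression problem, which is why the heavy lifting sits in the depth estimation network.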



