But I considered you'd use both images for a disparity map, where OpenCV tries to find the horizontal offset between similar fragments in the left and right images.
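That horizontal search is exactly what OpenCV's block matcher does. A minimal sketch, assuming a rectified stereo pair; the file names and matcher parameters here are my own, just for illustration:

Code: Select all

import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# StereoBM slides a small window along each scanline and records the
# horizontal shift (disparity) where left and right patches match best.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right)  # fixed-point result, scaled by 16

norm = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX)
cv2.imwrite("disparity.png", norm.astype("uint8"))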
I was at first. But then I started to realize that it would mean blending two techniques at the same time, which certainly isn't supported by OpenCV.
I had an idea to use the H264 encoder, as it compares 16x16 fragments between successive images. You might know about the motion vectors calculated as part of H264 encoding by the Pi's hardware (here and here on YouTube). Unfortunately, it has some limitations and will not work for either a depth map or optical flow. But the bonus is just a couple of percent of CPU load. Maybe I will come back to these experiments.
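If you want to play with those vectors yourself, the picamera library exposes them through its motion_output hook. A minimal sketch; the resolution, duration, and the printed statistic are just illustrative:

Code: Select all

import numpy as np
import picamera
import picamera.array

class FlowStats(picamera.array.PiMotionAnalysis):
    def analyse(self, a):
        # 'a' holds one record per 16x16 macroblock, with fields x, y, sad.
        mag = np.sqrt(a['x'].astype(np.float32) ** 2 +
                      a['y'].astype(np.float32) ** 2)
        print('mean motion magnitude: %.2f' % mag.mean())

with picamera.PiCamera(resolution=(640, 480), framerate=30) as camera:
    with FlowStats(camera) as output:
        # The H264 stream itself is discarded; we only keep the vectors.
        camera.start_recording('/dev/null', format='h264',
                               motion_output=output)
        camera.wait_recording(10)
        camera.stop_recording()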
1280x720 is too high a load for the Pi to get a real-time depth map. But in my experiment with the "2D lidar-like mode" I used not the whole frame, but just a strip in the middle of the 640x480 image, ~30% of its height (instead of the full 320x240 frame). And got a good FPS. It means you can use a very small strip of the 1280 image to get this.
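Roughly like this, assuming an already-rectified pair as NumPy arrays (the strip ratio and matcher settings are my guesses, not the exact values from that experiment):

Code: Select all

import cv2

def strip_disparity(left, right, strip_ratio=0.3):
    # Keep only a horizontal band around the vertical centre of the frame.
    h = left.shape[0]
    band = int(h * strip_ratio)
    top = (h - band) // 2
    stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    return stereo.compute(left[top:top + band],
                          right[top:top + band])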
I think you are onto something here. It's too bad the 16x16 chunking doesn't work natively with the depth map or optical flow. But it gave me an idea with respect to the flight realm (this might be a different topic).
A group is working on generating a real-time flight path vector from ardupilot (https://diydrones.com/profiles/blogs/ai ... 3#comments).
Since the flight path vector represents the future (linear*) flight path of the aircraft, we know which part of the FPV image is the most important place to look. Additionally, we know the aircraft's velocity and responsiveness.
*not perfectly linear, but safe to approximate as such
By cropping the image around the flight path vector, with the crop mask coupled to aircraft velocity and responsiveness, we can basically remove all of the extraneous background noise from "non-threat" obstacles. As the aircraft's velocity increases, the cropped image gets smaller and tighter around the flight path vector. As the velocity decreases, the cropped image grows until it is 100% of the original image.
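A hedged sketch of that coupling; the linear mapping, v_max, and the flight-path-vector pixel coordinates are all assumptions of mine, not anything the group has published:

Code: Select all

import numpy as np

def crop_around_fpv(frame, fpv_xy, velocity, v_max, min_ratio=0.25):
    """Shrink the crop window around the flight-path-vector pixel as
    speed rises; grow back to the full frame as the aircraft slows."""
    h, w = frame.shape[:2]
    # Linear coupling: full frame at standstill, min_ratio at v_max.
    ratio = 1.0 - (1.0 - min_ratio) * min(velocity / v_max, 1.0)
    ch, cw = int(h * ratio), int(w * ratio)
    cx, cy = fpv_xy
    x0 = int(np.clip(cx - cw // 2, 0, w - cw))
    y0 = int(np.clip(cy - ch // 2, 0, h - ch))
    return frame[y0:y0 + ch, x0:x0 + cw]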
Biologically we do the same thing when riding a bike: at low speed we focus a meter in front, at high speed we focus out toward the horizon.
Computationally this might be too expensive for just one Pi, but if you 'stream' the images from one single-board computer to another, there might be a slight advantage. When slowing back down, you might run into problems, since the small cropped images won't compare directly against the large ones.
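For the streaming part, something as simple as length-prefixed JPEGs over TCP would do for a first test; the host, port, and camera index below are placeholders:

Code: Select all

import socket
import struct
import cv2

def stream_frames(host="192.168.1.2", port=5000, device=0):
    cap = cv2.VideoCapture(device)
    sock = socket.create_connection((host, port))
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            ok, buf = cv2.imencode(".jpg", frame)
            if not ok:
                continue
            data = buf.tobytes()
            # Length-prefix each JPEG so the receiver can re-frame the stream.
            sock.sendall(struct.pack(">I", len(data)) + data)
    finally:
        cap.release()
        sock.close()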
I'm sure this has been implemented elsewhere since it is pretty intuitive, but your mention of the H264 encoder made me think: why not target the most important part of the image and dump the rest, since processing power is the limiting factor?