“Optimal Peanut Butter and Banana Sandwiches”, 2020-08-25:
So, how do we make optimal peanut butter and banana sandwiches? It’s really quite simple. You take a picture of your banana and bread, pass the image through a deep learning model to locate said items, do some nonlinear curve fitting to the banana, transform to polar coordinates and “slice” the banana along the fitted curve, turn those slices into elliptical polygons, and feed the polygons and bread “box” into a 2D nesting algorithm.
…I used a pretrained Mask R-CNN torchvision model with a ResNet backbone. The model was pretrained on the MS-COCO dataset, and thankfully the dataset has “banana” as a segmentation category, along with “sandwich” and “cake”, which were close enough categories for suitable detection of most slices of bread.
…Because there could be multiple bananas and slices of bread in the image, I pick out the banana and slice of bread with the highest score.
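A minimal sketch of that selection step, assuming the detections have already been unpacked into plain dictionaries. The `label` and `score` keys and the `pick_best` helper are hypothetical names for illustration, not the package's actual API.

```python
def pick_best(detections, wanted_labels):
    """Return the highest-scoring detection whose label is in wanted_labels.

    `detections` is assumed to be a list of dicts with "label" and "score"
    keys (a hypothetical layout). Returns None if nothing matches.
    """
    candidates = [d for d in detections if d["label"] in wanted_labels]
    return max(candidates, key=lambda d: d["score"], default=None)


detections = [
    {"label": "banana", "score": 0.91},
    {"label": "banana", "score": 0.97},
    {"label": "sandwich", "score": 0.62},
    {"label": "cake", "score": 0.71},
]

# Bread is matched through the stand-in categories mentioned above.
best_banana = pick_best(detections, {"banana"})
best_bread = pick_best(detections, {"sandwich", "cake"})
```

Treating “sandwich” and “cake” as one pool means the bread is whichever of the two stand-in categories scored higher.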
…Using the wonderful scikit-image library, I first calculate the skeleton of the banana segmentation mask. This reduces the mask to a one-pixel-wide representation which effectively creates a curve that runs along the long axis of the banana…I then fit a circle to the banana skeleton using a nice scipy-based least squares optimization.
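To make the circle fit concrete, here is a dependency-free sketch. Note that it swaps in a closed-form algebraic least-squares fit rather than the scipy-based optimization described above, and the `fit_circle` name and `(x, y)` point layout are assumptions for illustration. An algebraic fit like this is also a common way to get a starting guess for an iterative optimizer.

```python
import math


def fit_circle(points):
    """Algebraic least-squares circle fit; returns (cx, cy, r).

    `points` is a list of (x, y) skeleton coordinates (hypothetical layout).
    Working in coordinates centred on the mean reduces the problem
    to a 2x2 linear system for the circle centre.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n

    suu = suv = svv = suz = svz = sz = 0.0
    for x, y in points:
        u, v = x - mx, y - my
        z = u * u + v * v
        suu += u * u
        suv += u * v
        svv += v * v
        suz += u * z
        svz += v * z
        sz += z

    det = suu * svv - suv * suv
    if abs(det) < 1e-12:
        raise ValueError("points are collinear; no unique circle")

    # Solve [[suu, suv], [suv, svv]] @ [uc, vc] = [suz / 2, svz / 2].
    uc = 0.5 * (suz * svv - svz * suv) / det
    vc = 0.5 * (svz * suu - suz * suv) / det
    r = math.sqrt(uc * uc + vc * vc + sz / n)
    return mx + uc, my + vc, r


# Synthetic "skeleton": an arc of a circle centred at (3, -2) with radius 5.
arc = [
    (3 + 5 * math.cos(0.2 + 0.07 * k), -2 + 5 * math.sin(0.2 + 0.07 * k))
    for k in range(20)
]
cx, cy, r = fit_circle(arc)
```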
…Rad Coordinate Transformations: With the circle fit to the banana, the goal is to now draw radial lines out from the center of the circle to the banana and have each radial line correspond to the slice of a knife…We are now able to orient ourselves angularly with respect to the center of the banana and radially in terms of the start and end of the banana along the radial line. The last step is to find the angular start and end of the banana, where the angular start will correspond to the angle pointing to the stem of the banana.
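One way to picture the radial lines is to walk outward from the circle center at a fixed angle and record where the mask begins and ends. The sketch below does this on a nested-list binary mask; the `radial_extent` helper, the nearest-pixel sampling, and the `step` size are all assumptions for illustration rather than the original implementation.

```python
import math


def radial_extent(mask, center, theta, r_max, step=0.5):
    """Walk outward from `center` along angle `theta` (radians).

    `mask` is a list of rows of 0/1 values and `center` is (x, y) in pixel
    coordinates. Returns (r_start, r_end) for the first and last radius that
    lands on the mask, or None if the ray never touches it.
    """
    cx, cy = center
    n_rows, n_cols = len(mask), len(mask[0])
    hits = []
    r = 0.0
    while r <= r_max:
        col = int(round(cx + r * math.cos(theta)))
        row = int(round(cy + r * math.sin(theta)))
        if 0 <= row < n_rows and 0 <= col < n_cols and mask[row][col]:
            hits.append(r)
        r += step
    return (hits[0], hits[-1]) if hits else None


# Toy "banana": the right half of a ring with radii 10..14 around (20, 20).
size = 41
mask = [
    [
        1 if col >= 20 and 10 <= math.hypot(col - 20, row - 20) <= 14 else 0
        for col in range(size)
    ]
    for row in range(size)
]

extent_right = radial_extent(mask, (20, 20), 0.0, r_max=20)     # crosses the ring
extent_left = radial_extent(mask, (20, 20), math.pi, r_max=20)  # misses it
```

Sweeping `theta` over a range of angles gives one (start, end) pair per candidate knife cut, and the angles where the result flips from `None` to a pair mark the angular ends of the banana.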
…Finally, with this odd matrix above that represents this polar world warped onto a cartesian plot, we can identify both the banana stem and the opposite end of the banana which houses its seed. I find the two ends of the banana using a similar method to earlier for finding the radial start and end of the banana. I then find the average mask intensity in a region around either end of the banana and assume that the stem has a smaller average intensity. Finally, I virtually “chop off” the stem using the knowledge that the seed side of the banana should have similar average intensity to the stem side sans stem.
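The stem heuristic can be sketched as a comparison of mean mask intensity near each end. In this illustration, `profile` is a hypothetical 1D list of average mask intensities ordered along the banana, and the `window` size is an arbitrary assumption.

```python
def find_stem_end(profile, window=3):
    """Return "start" or "end": whichever end has the lower mean intensity.

    `profile` is assumed to hold average mask intensity per angular step
    along the banana. The thinner stem covers fewer pixels, so its end
    should average lower.
    """
    if len(profile) < 2 * window:
        raise ValueError("profile too short for the chosen window")
    head = sum(profile[:window]) / window
    tail = sum(profile[-window:]) / window
    return "start" if head < tail else "end"


# Thin stem at the start, blunt end at the finish.
profile = [0.2, 0.3, 0.4, 1.0, 1.0, 1.0, 0.9, 0.8, 0.7]
stem = find_stem_end(profile)
```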
…We now have to make two assumptions about the banana slices. Firstly, we know that the banana slices will be smaller than the ones shown above because the peel has finite thickness. Secondly, bananas are not perfectly circular, and the slices will come out as ellipses. Based on a couple of poor measurements with a tape measure (I don’t have calipers), I assume that the actual banana slices are 20% smaller than the image above with the banana peel. I also take the slices in the image above to represent the major axis of the banana slice ellipse, and assume that the minor axis is 85% of the size of the major axis.
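As a worked example of those two assumptions, the sketch below turns a measured slice length (in pixels, peel included) into an elliptical polygon. The 0.8 and 0.85 factors come from the text; the `slice_polygon` name, the vertex count, and the axis-aligned orientation are assumptions for illustration.

```python
import math


def slice_polygon(center, slice_length_px, n_vertices=24,
                  peel_scale=0.8, aspect=0.85):
    """Approximate one banana slice as an axis-aligned elliptical polygon.

    slice_length_px: slice length measured in the image, peel included.
    peel_scale:      0.8 shrinks the slice by 20% to account for the peel.
    aspect:          minor axis as a fraction of the major axis (85%).
    Returns a list of (x, y) vertices in counter-clockwise order.
    """
    cx, cy = center
    semi_major = 0.5 * slice_length_px * peel_scale
    semi_minor = semi_major * aspect
    return [
        (
            cx + semi_major * math.cos(2 * math.pi * k / n_vertices),
            cy + semi_minor * math.sin(2 * math.pi * k / n_vertices),
        )
        for k in range(n_vertices)
    ]


# A 10 px slice becomes an ellipse with an 8 px major and 6.8 px minor axis.
poly = slice_polygon((0.0, 0.0), 10.0, n_vertices=4)
```

Polygons like these, plus a rectangle for the bread, are the kind of input a 2D nesting routine expects.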
…By the time I finally got to the point of having polygonal, ellipsoidal banana slices extracted from an image and a nice bread box, I thought I would be home free…It turns out that this type of problem, commonly called “nesting” or “packing”, is extremely hard. Like, NP-hard. Surprisingly, this is a popular research area because there are a whole bunch of applications…In the end, I got about halfway to a solution before stumbling upon nest2D, which provides Python bindings for a C++ library.
…nannernest: As mentioned at the beginning, I built a package called nannernest for you to make your own optimal peanut butter and banana sandwiches. Once you get the package installed, you can generate your own optimal sandwiches on the command line with:
$ nannernest pic_of_my_bread_and_banana.jpg