At Nervana, we wanted to share with you some of our work in the agriculture industry to help optimize crop yields, and overall operations. Improving the success of these crops will help solve the food shortage crisis as the human population grows exponentially. This is something that we can all be grateful for.
Blue River Technology is a leader in the emerging field of precision agriculture. They use cutting-edge computer vision and robotics to accurately measure and characterize crops and thereby allow farmers to make decisions plant-by-plant (Figure 1). This in-turn increases overall yield and promotes more sustainable farming practices. Blue River works on a wide range of applications like precision thinning that uses a real-time robotic platform to kill unwanted plants. In each of these applications, locating, identifying, and characterizing each crop is critical.1 When Blue River approached Nervana to help with phenotyping corn plants, they had an in-house system based on 3-D imaging that performed well but was thrown off by certain features such as weeds. The challenge for Nervana’s Deep Learning team was to bring the power of deep learning methods such as convolutional neural networks to improve the performance of the overall system.
Figure 1: Blue River robots at work
Convolutional neural networks have been around for almost three decades and have recently achieved a steep increase in performance thanks to the availability of larger datasets, faster computing resources enabling larger (deeper) models, and algorithmic innovations. Nervana has developed the world’s fastest implementations of deep learning models (benchmarks from Facebook and Baidu) and packaged them into an open source framework called neon. Nervana’s cloud platform (released publicly earlier this week) allows developers to quickly build, train and deploy deep learning models without needing to build our their own hardware and software infrastructure. Our deep learning team has been using the Nervana Cloud to help solve customers’ problems and has seen considerable gains in speed and ease of use when compared to tackling the same problems using other frameworks and cloud services.
While the academic literature has been on overdrive these last few years with state of the art results on academic datasets, less is known in the community about industrial applications of deep learning. Blue River provided us with 812×612 pixel images which contained zero, one or more corn plants, along with locations of points where the plant made contact with the ground (annotated using mechanical Turk; see Figure 3, left panel). Although any part of the plant could have served the counting objective, the precise locations of the contact points were needed for follow up robotic tasks. Contrary to standard object localization tasks such as PASCAL-VOC and ILSVRC where the target location tends to be a bounding box, the target locations in this case were a single pixel per plant (of which there could be zero or more in an image).
After some initial exploration, the Nervana team settled on using a modified version of a convolutional auto encoder2 algorithm for solving this problem. Convolutional autoencoders are a relatively recent development where a deconvolutional network attempts to reconstruct the original image from the output of one of the top layers of a standard convolutional network (Figure 2). These networks have an interesting property where the activations of the bottleneck layer can be tweaked to reconstruct stylistically meaningful variations of the input image. Researchers have used this type of approach to generate chairs3 and faces4, and for object localization.5 In our approach, we attempted to construct a target mask (same size as the input image, but with only the contact points colored in; Figure 3 right panel). We also used a simplified architecture where instead of pooling and unpooling layers as in we used strided convolution and deconvolution.
Figure 2: Architecture of the convolutional autoencoder. A series of convolution layers followed by deconvolution layers are used. In our neon implementation of the above model, we used RMSProp as the learning rule and Glorot Uniform as the weight initialization algorithm.
Figure 3. Left: Example of the input image with the mask applied. The mask is a square around the stem-ground contact point.
Right: Example output of the convolutional autoencoder with predictions of the corresponding stem-ground contact points
While detailed results will be shared in an academic publication at a later date, the Deep Learning system performed remarkably well in detecting the stem-ground contact points (see figure above for qualitative examples). Lee Redden, CTO and co-founder of Blue River summarizes the engagement and results, “Blue River operates robots across hundreds of different fields. Incorporating Nervana’s neon deep learning system into our pipeline allowed us to make use of the quantity of data that we’ve collected and increase reliability of our computer vision algorithms. When Blue River was starting to use deep learning for computer vision applications, we hired Nervana to help implement a system helping our internal researchers get up to speed quickly with a working platform customized for our application. Incorporating Nervana’s Neon into the system helped improve accuracy and overcome known failure cases.”
A common criticism against neural network approaches is that they are too black-box like and that it is hard to understand what is going on inside them. Neon has implemented recent approaches6 using which we could visualize what the autoencoder was learning (Figure 4). Reassuringly, the units in the top-most convolutional layer learned to respond most to an assortment of patterns corresponding to the stem-ground contact points.
Figure 4: Visualization of the parts of the images that most activated 4 feature maps in the top-most convolutional layer
If you would like to apply convolutional autoencoders to your object localization problem check out this example where we applied a similar network on the recent Kaggle right whale detection competition. Some other resources are below:
By helping Blue River with this use case, Nervana is advancing its mission of helping humanity. Human population is slated to hit 9 billion by 2040, and new solutions will be required to address fundamental needs such as food, clothing, shelter, education, healthcare, climate & energy, governance, and security. Check out nervanasys.com/products for more information about our solutions.
All information and pictures that are from Blue River Technology used with permission. This post incorporates work done by Anil Thomas, Augustus Odena, Anthony Ndirango and the entire algorithms team at Nervana. If you are interested in leveraging the power of deep learning for your company contact us at firstname.lastname@example.org.
 http://arxiv.org/abs/1412.6583 (this paper was published by Nervana and Berkeley researchers)