# Polarr photo editor without internet connection

Deep learning and computer vision on the edge are the technologies that enable Polarr's photography applications for both consumers and enterprises. Back in 2015, when we first started Polarr, there was already a buzz on the Stanford campus over the excitement of deep learning and its endless applications in visual perception and natural language processing. At the time, however, deep learning was a computationally intensive and power-hungry endeavor, one that was only suitable to run on large-scale Graphics Processing Units (GPUs) housed in server farms and expensive mainframes. To do image recognition, object detection, and segmentation, input frames first had to be uploaded to the compute cloud over a 3G network, processed in GPU farms, and then transmitted back to the user for interaction. This was in fact how the majority of photo applications' AI capabilities were powered in the early days. When a user went on Google Photos to search for a photo of her pet, the actual image processing was all done in the cloud.

This cloud-centric approach had three drawbacks. Firstly, there was the concern over privacy. Large amounts of personal information, photos, and videos needed to be uploaded in order to be used for deep learning, and the transmission, storage, and processing of this personal data had to be done in a secure and private manner. Moreover, one had to take the leap of faith that the service provider would not sell the personal data to bad actors or make an illicit profit at her loss. Secondly, there was the issue of latency. For real-time applications such as live video rendering, the delay caused by uploading, processing, and downloading the very data that was captured would be unacceptable. Finally, cost was a big deterrent for this type of use case. The most economical option, the oldest generation of cloud GPU instances on AWS, still cost at least a dollar per compute hour. This cost would be intolerable for startups in the AI space with limited funding: before a model could even be deployed, thousands of dollars had to be spent on cloud services for training, not to mention the cost of actual inference.

Starting in 2016, semiconductor and cellphone makers began putting more and more dedicated compute units for neural networks, such as Neural Processing Units (NPUs), Tensor Processing Units (TPUs), and other data-path-optimized blocks, onto the System-on-Chip of cellphones and IoT devices. This coined the concept of edge compute, where deep learning applications are carried out directly on consumer hardware, and it was a market opportunity we needed to exploit. Applications were limited and frameworks were buggy and crash-prone, but our team here at Polarr jumped straight into this opportunity to look for ways to build novel photography and videography use cases that could run directly on a cellphone without an internet connection, consumed very little memory, and did not drain the battery through power draw. Over time, our researchers built a plethora of lightweight mobile deep learning models in the areas of real-time capturing, post-processing, curation, and recommendation.

For instance, when the camera application boots up on a Samsung Galaxy S10, the Shot Suggestion feature guides the user to take the best-composed frame out of the field of view. Our deep learning model takes in the feed from the ultra-wide-angle lens and performs a series of crops to find the most interesting region, then provides graphical prompts that guide the user to aim the camera at this crop before snapping the photo automatically. Furthermore, on this year's Samsung Galaxy S20, the Single Take mode exploits Polarr's Best Moment and AI filter recommendation engines: instead of the user manually pressing the shutter to capture the best frame, deep learning models take in one continuous shot, then sort and rank the individual frames by aesthetics and interestingness. The highest-ranked frames are in turn analyzed for scene recognition, and recommended filters are automatically applied to them.
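The Shot Suggestion behavior described above, scoring candidate crops of the wide-angle frame and steering the user toward the best one, boils down to a search over crop positions. A minimal sketch, not Polarr's actual implementation: `best_crop` and `score_fn` are hypothetical names, and `score_fn` stands in for the learned composition/saliency model.

```python
def best_crop(score_fn, frame_w, frame_h, crop_w, crop_h, stride):
    """Slide a crop_w x crop_h window over a frame_w x frame_h frame,
    score every candidate position, and return the top-left corner
    and score of the best crop. score_fn(x, y) is a stand-in for a
    learned model evaluated on the crop anchored at (x, y)."""
    best_xy, best_score = None, float("-inf")
    for y in range(0, frame_h - crop_h + 1, stride):
        for x in range(0, frame_w - crop_w + 1, stride):
            s = score_fn(x, y)
            if s > best_score:
                best_xy, best_score = (x, y), s
    return best_xy, best_score

# Toy model: the "most interesting" region is anchored near (8, 4).
xy, score = best_crop(lambda x, y: -abs(x - 8) - abs(y - 4),
                      frame_w=16, frame_h=8, crop_w=4, crop_h=4, stride=2)
# xy == (8, 4); the UI would then prompt the user to aim at this crop.
```

A real on-device model would score pixel data rather than coordinates, and a coarse-to-fine search would replace the exhaustive grid, but the control flow is the same.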
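The Best Moment pipeline described above, one continuous capture whose frames are sorted and ranked by a per-frame score, reduces to scoring plus sorting. A hedged sketch, with the hypothetical `score_fn` standing in for the aesthetics/interestingness model (the toy model here simply prefers brighter frames):

```python
def rank_frames(frames, score_fn, top_k=3):
    """Score each frame of a burst and return the top_k frames,
    best first. In production the score would come from a
    lightweight on-device neural network; here score_fn is any
    callable mapping a frame to a float."""
    order = sorted(range(len(frames)),
                   key=lambda i: score_fn(frames[i]), reverse=True)
    return [frames[i] for i in order[:top_k]]

# Toy burst: each "frame" is a flat list of pixel intensities.
burst = [[0.2, 0.2], [0.9, 0.8], [0.5, 0.4]]
mean_brightness = lambda frame: sum(frame) / len(frame)
top = rank_frames(burst, mean_brightness, top_k=2)
# Best two frames by mean brightness: [0.9, 0.8] then [0.5, 0.4].
```

The selected frames would then feed the scene-recognition stage, which picks the filters to apply.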