FlashBlade is the industry’s first data hub, built to consolidate data-intensive workloads. In this video, we’ll demo how streaming analytics and AI can be consolidated on a single storage platform. Here we’re showing a Twitter feed being analyzed with various tools like Hadoop, Spark, and AI applications, all on a single data platform: FlashBlade.

Here I’m using NiFi to manage all the pipelines. First, we start collecting data from
Twitter. We’re telling NiFi to collect any tweets that include keywords like “dogs” or “puppies”, then send the response to another process for simple JSON processing, and then send the results to Solr for real-time indexing. Now if I go to my dashboard built on top of Solr, we can see real-time histograms of tweet data, such as who the most active Twitter users are, what languages they used, and of course the messages these top users tweeted. And here you can see traffic being
generated in real time by the Twitter feed. We are also telling NiFi to send all the raw data to the FlashBlade S3 bucket. To do this, I’m telling NiFi that this is the endpoint to connect to FlashBlade.

So with this ingestion, we can now go to a Zeppelin notebook to analyze the data. Here we’re telling HDFS to look at this S3 bucket, and we’re telling HDFS to do some simple read operations. And below it you can see the raw JSON
responses. You’re also able to use Apache Spark to read the JSON data and run various queries. You can also convert the JSON data to Spark DataFrames and print the schema here. But maybe we’re more interested in analyzing who the most influential users that like dogs are, and what they tweeted. You can easily do this with Spark DataFrames. Now you can do the same query using SQL. Here, using Spark SQL, you can see the results visualized in the bar graphs. So as you can see, with a couple of clicks and a couple of lines of code, we’ve done real-time data ingestion, dashboarding, and deep analytics on FlashBlade.

Now let’s move over to our AI for dog lovers. If I go back to my NiFi flow and click
into these groups, I’m basically telling NiFi to look at the tweet. If there are any pictures, then go and download them and store the pictures on an NFS mount. And just for demo purposes, I am also sending the pictures to an S3 bucket. Now let me start this sub-flow. Now that it is collecting dog pictures, let’s move over to the S3 browser. And as you see, we have the images of dogs from tweets being collected and stored on FlashBlade.

Now we can start applying the AI model to detect dogs. Here we’re using a Jupyter notebook, and we’re using the YOLO algorithm to help build an AI model that detects objects. We’ll use a pre-trained model, which we’re loading in here. As you can see, this model has over 60 million parameters. That’s a lot, and that’s why you need GPUs to train these models.

Now we’re ready to deploy the model into production and expose it as a REST API for our application to consume. So I have my script to start the API server. I’ve also downloaded several images for the test. Let’s start with the simple picture here. I’m using the script to send this picture to the REST API server to the left. It’ll take a few seconds to recognize the dogs in the pictures. The results are stored in another directory over here. I forgot to put this picture. This is the original photo, and this actually did a really good job
of detecting the dogs. It not only tells me that it’s a dog but also where the dog is. Let me try another picture, and I’ll send this to the API. As you can see, the picture really isn’t that high-resolution, but let’s see how the AI handles it. Not bad! It tells me there’s a dog here and a person here and another person here. Unfortunately, it doesn’t tell me that there’s a small dog right here.

Well, that concludes the demo. To summarize what we covered in this video, we showed how a single data hub, FlashBlade, can consolidate all analytics applications, from NiFi to Spark to AI. Head over to purestorage.com/datahub for more information. Thanks for watching.
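
Companion sketch for the "most influential users" query shown in the notebook. The demo runs that query with Spark DataFrames and Spark SQL against the JSON stored in the S3 bucket; the video does not show the code, so the following is only a rough, plain-Python illustration of the same idea on a few in-memory records. The sample data is made up, the function name `top_influencers` is hypothetical, and treating follower count as "influence" is an assumption. The field names `user.screen_name`, `user.followers_count`, and `text` follow the Twitter v1.1 tweet JSON layout.

```python
import json

# Made-up sample records shaped like Twitter v1.1 tweet JSON (illustrative only).
SAMPLE_LINES = [
    '{"text": "Look at my puppies!", "user": {"screen_name": "alice", "followers_count": 1200}}',
    '{"text": "Dogs are the best", "user": {"screen_name": "bob", "followers_count": 5000}}',
    '{"text": "More puppies today", "user": {"screen_name": "alice", "followers_count": 1250}}',
    '{"text": "I like dogs", "user": {"screen_name": "carol", "followers_count": 300}}',
    '{"text": "record without a user object"}',
]


def top_influencers(lines, limit=3):
    """Rank users by follower count (an assumed proxy for influence).

    Returns a list of (screen_name, followers_count, text) tuples, keeping
    for each user the tweet seen with that user's highest follower count.
    """
    best = {}
    for line in lines:
        tweet = json.loads(line)
        user = tweet.get("user") or {}
        name = user.get("screen_name")
        if name is None:
            continue  # skip records that carry no user information
        followers = user.get("followers_count", 0)
        if name not in best or followers > best[name][0]:
            best[name] = (followers, tweet.get("text", ""))
    ranked = sorted(best.items(), key=lambda item: item[1][0], reverse=True)
    return [(name, followers, text) for name, (followers, text) in ranked[:limit]]


if __name__ == "__main__":
    for name, followers, text in top_influencers(SAMPLE_LINES):
        print(f"{name:10s} {followers:8d}  {text}")
```

In the demo itself, the same grouping and ordering would be expressed as a DataFrame operation or a SQL statement in the notebook rather than as a Python loop.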