Jaspersoft CEO: We're driving pervasive business intelligence

Opening business intelligence to more users has long been a goal -- and Jaspersoft CEO Brian Gentile claims his cloud-based tools offer a breakthrough solution

Page 2 of 3

The technologist who's setting it up on behalf of that end user is saying that user only needs this type of data and this type of dataset; I don't want to confuse them with a lot of other things. I want to give them an exploratory environment, but it's constrained based on that report dataset that's now delivered to them in the context that they need. That's very different from a power user who might need our full multidimensional analytic environment, which, by the way, is delivered in that very same browser with the exact same techniques, just a lot more capability, all exposed through HTML5, all scalable. The cost-value curve is identical; there's just a lot more functionality exposed. That person appreciates that functionality because they know what to do with a pivot table.

Q: So it's one license for all this technology. You use different parts of it in different parts of your organization depending on what the roles are?

A: This is correct. How do you reach a very broad audience, satisfy them so they are having a great BI experience and, more important, helping to ensure they make decisions based on data? That's fundamentally what the BI equation has missed for the last 20 years: how do you get to the 85 percent of the organization that's been forsaken with BI? How do you reach them with a tool that is context sensitive, delivers just the right amount of BI, within an environment that's comfortable to them, that's affordable? Anybody who doesn't understand that is really missing the boat. If you're going to deliver BI to 100,000 employees inside of Procter & Gamble, it's got to be affordable. No CIO in his right mind is going to deliver something that's tens of millions of dollars. That's the mission we've been on for seven years, and that's what distinguishes us. I don't know anybody else in BI who's after this mission.

Q: SAP provided a significant chunk of your funding?

A: They are a multi-time investor; they began investing in about 2007.

Q: So you listed them as a competitor. Why would they fund a competitor?

A: They actually made their first investment in Jaspersoft just prior to the acquisition of Business Objects. But I believe that even if they had acquired Business Objects and Crystal Reports as part of that acquisition they would have made the investment anyway. Why? We're fundamentally different. Anybody who's evaluating Jaspersoft versus Business Objects has probably made a mistake somewhere along the line. It was a good investment before or after that acquisition because we're solving very different BI problems.

Q: Going back to the big guys, how would you compare the depth of analytics capabilities you provide versus the traditional players?

A: Seven years ago we made this decision to deliver everything inside of a Web browser, and that was a fundamental constraint, especially back then. It is still today, but less so.

Most of the companies we compete with are many years older than we are, and they got quite a head start in building features and functionality. Our goal today is to deliver a very high percentage of those features and functions at a fraction of the cost: 70 to 80 percent of the functionality, the vast majority of what anybody really needs at that level of BI, for 10 to 20 percent of the cost. We believe that for a lot of use cases, customers are going to say, "I'll take that."

Similarly, in the visualization world, where you're talking about visualizing a lot of data very quickly and easily, our goal is to provide 70 to 80 percent of Tableau at 10 to 20 percent of the cost and deliver it entirely inside of a Web browser, fundamentally changing that cost curve as well. In fact, there's a percentage of functionality that we don't even want to go after. I don't know if it's 20 to 30 percent, but I don't think I ever want 100 percent of the functionality. As with Microsoft Word, such a small percentage of the audience uses that last slice, and it makes your product so much more complicated, that I don't want to go there.

Q: How does big data change the market for you?

A: Big data changes a bunch of things, and it's so important that I've said in a few years we'll just call it data again, because it will be so primary, so elemental. Being able to analyze variable data types at a higher speed, higher velocity, is really important. The fact that there are bigger volumes of data isn't so important, it's really about the variability of data, new data types, and the velocity at which they must be used before the data goes stale. Those are the two distinguishing characteristics of big data.

When we built our server architecture starting seven years ago, we never assumed that the world would be full of exclusively structured relational data. We always knew that data would have greater variety rather than less. So we made it easy right from the very start to connect to, and analyze or report on, any type of data. As these big data types have become more pronounced over the last three years, we've been the first not only to embrace them, but to make tools and native connections available to them, so that you can really exploit them.

Our first connector into Hadoop, which was little known back then, used both Hive and HBase. We even dabbled in HDFS and Avro in those days, not knowing what customers would prefer. Now we have not only all those different pieces of the Hadoop framework, but all these different NewSQL and NoSQL data types as well. We have advanced our connectivity and intelligence into these relatively abstract data types; we need to connect to them and treat them as if they were structured and relational, with enough intelligence. We've made these native connections available -- and by native I mean directly from our BI server, you don't have to use our ETL [Extract, Transform, Load] tool. ETL is another option, and it's a prominent option for a lot of customers, but you don't have to use it, because ETL introduces latency by definition.
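The point about ETL introducing latency by definition can be made concrete with a toy sketch. This is not Jaspersoft's code; the sources, functions, and data here are all hypothetical stand-ins for a live operational store versus a scheduled warehouse copy.

```python
# Hypothetical sketch: a report over a native (live) connection versus the
# same report over a scheduled ETL snapshot. Names and data are illustrative.
import copy

# A "live" operational data source, e.g. a table of orders in an HBase-like store.
live_source = {"orders": [{"id": 1, "amount": 100}]}

def etl_snapshot(source):
    """Simulate a scheduled ETL run: copy the data into a warehouse."""
    return copy.deepcopy(source)

warehouse = etl_snapshot(live_source)

# New data arrives after the ETL run has finished...
live_source["orders"].append({"id": 2, "amount": 250})

def report_total(source):
    """A report that sums order amounts from whichever source it is pointed at."""
    return sum(o["amount"] for o in source["orders"])

print(report_total(live_source))  # native connection sees the new order: 350
print(report_total(warehouse))    # ETL copy is stale until the next run: 100
```

The staleness window is exactly the interval between ETL runs, which is the latency the native connection avoids.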

Q: How does using unstructured data change the users and usage of the product?

A: Maybe the most encouraging thing about the whole big data movement is that it will force the business side and IT to come together. There's no way you can create a really compelling application of big data without the two groups working in concert. The domain knowledge that comes from the business side is the catalyst; it's what's so vital. They need to know what data exists out there, whether it can be properly structured and analyzed, what business insight it would yield, and what advantage the company would gain. Business has to drive this.

These technologies are not straightforward. There is a lot going on here, and anybody who tries to oversimplify it is doing a disservice. It requires a technologist to sit down, whether it's Cassandra or Mongo or Couch or Volt or any of the others, or Hadoop ... that's a fundamental decision that should be made between both the business users and the technical team. The choice there is elemental to whether or not you're going to be successful.

The biggest mistake we see in big data projects is choosing the wrong underlying data store. They choose Hadoop because that's what they know or have heard of, when it really would have been better to use a document-oriented data store or a key-value store. Now they're in a corner, and they look like they're failing, but they wouldn't be failing if they had just started by asking what the business requirements are. What amount of latency is acceptable? What type of data? Now it's just more important -- because the volume is so much greater and the business insight can be so much more valuable -- to bring business and IT together to make the right fundamental choices. It's really a powerful time again in this world of data and BI, because it's another fundamental reason the two teams have to sit down, come to agreement, and think about this from one viewpoint rather than mixed viewpoints.

Q: What about real-time analytics?

A: It's an overused term. I mean, "real time" means something different to different groups. If you're sitting down with a business audience and they're trying to solve a problem, they might call something real time that has a three-second delay in the data from capture to display. Technically that's not real time, but for them it's real time. I just try to be precise about it. Real time literally means there's zero delay, and there are very few applications of that. It's not good or bad, it all depends on the business application.

Q: You mentioned your latest release, 5.0. What's new and great about it?

A: It builds on the foundation I've been talking about. There's also cloud-based BI, which I didn't mention: we exist in the cloud today in all the major environments, whether it's the Amazon Web Services Marketplace or Cloud Foundry or Red Hat's OpenShift. We have made announcements with all those and more. We're in the cloud. We've been in the cloud longer than anybody. We've been doing the big data dance longer than anybody. Our architecture has adhered to all this, and we want to amplify the architectural choices we've made to the benefit of our customers. Everything must continue to be in the Web browser, everything must be server-based and scalable, at Web proportions, and affordable.

Version 5 brings together a vision we had three or more years ago. It's just that HTML5 wasn't available then. We started playing with the early versions of HTML5, but it wasn't anywhere near ready for usage. Now, Version 5 brings together a brand new graphing and visualization engine that is entirely built on HTML5. That gives us a level of interaction, animation, and visualization inside the Web browser that would be very similar to a Tableau-like experience, and we're just starting. So the first major feature is a consistent and high-performing visualization engine that's entirely built on HTML5.

It lets you build beautiful animated charts, with all sorts of interaction: mouse-over feedback, zooming in. We built a new tool we call the dimensional zoom tool, which looks like a slider bar, and when you slide, it's like zooming in on a camera. You're actually zooming into more detail on the data, technically exposing more dimensions. When you zoom out, you're aggregating, or summarizing, over fewer dimensions. Remember my spectrum earlier? For some percentage of end users, this is data analysis. All you're doing is providing a highly interactive visual experience inside the browser, and for them, that's analyzing data. It's not OLAP, but for them, it is analyzing data. For another user, we'll provide that same experience wrapped around an OLAP engine on the back end. So they're getting that same beautiful visualization, the ability to explore in a cross-tab environment, they're charting it, and it's identical; it's just a different, very rich underlying dataset.
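The dimensional zoom idea, more dimensions when you zoom in and fewer when you zoom out, can be sketched as grouping over a variable set of dimensions. This is a minimal illustration, not the product's implementation; the dimensions and figures are made up.

```python
# Hypothetical sketch of "dimensional zoom": the zoom level is simply the set
# of dimensions the data is grouped by. Data and dimension names are made up.
rows = [
    {"region": "East", "product": "A", "sales": 10},
    {"region": "East", "product": "B", "sales": 20},
    {"region": "West", "product": "A", "sales": 5},
    {"region": "West", "product": "B", "sales": 15},
]

def zoom(rows, dims):
    """Aggregate sales over the given dimensions (the current zoom level)."""
    out = {}
    for r in rows:
        key = tuple(r[d] for d in dims)
        out[key] = out.get(key, 0) + r["sales"]
    return out

print(zoom(rows, ["region"]))             # zoomed out: {('East',): 30, ('West',): 20}
print(zoom(rows, ["region", "product"]))  # zoomed in: four finer-grained cells
```

Sliding the zoom control just adds or removes a dimension from the grouping key, which is why zooming out looks like summarization.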

Data virtualization is number two. We've built a complete virtualized data environment that allows you to leave the data where it is, in all of its various sources scattered across the enterprise, but structure a query using our virtualization engine that goes after all that data, pulls back the results, aggregates and displays the results in-memory for the user to explore inside of that visualization environment I described. In fact, we have the ability to describe that virtual data semantically. We have the ability to capture metadata about those virtualized data sources. Not only can you do what I just described by leaving the data where it is, but you can express those data views in English-like terms, where a novice user can now drag and drop on a canvas their own reports that are built on these virtualized data views, not knowing or caring where that data actually exists. Maybe there's some from Hadoop, maybe there's a bunch of stuff in Cassandra, there's an Oracle database, there's a Teradata warehouse, whatever, and all those views are being aggregated by our virtualization engine, expressed in English-like terms, visualized in an in-memory engine that I'm going to get to as my third feature. But that's data virtualization.
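The virtualization idea, leave the data where it is and join at query time, can be sketched with two in-memory stand-ins for heterogeneous sources. This is a conceptual toy, not Jaspersoft's engine; the source names, fields, and figures are all hypothetical.

```python
# Hypothetical sketch of data virtualization: one virtual query pulls from two
# separate sources and aggregates the results in memory, copying nothing.
customers_db = [  # stands in for, say, rows in an Oracle table
    {"cust_id": 1, "name": "Acme"},
    {"cust_id": 2, "name": "Globex"},
]
orders_store = [  # stands in for, say, rows in a Cassandra column family
    {"cust_id": 1, "amount": 100},
    {"cust_id": 1, "amount": 50},
    {"cust_id": 2, "amount": 75},
]

def virtual_view():
    """Join the two sources at query time; nothing is staged or persisted."""
    totals = {}
    for o in orders_store:
        totals[o["cust_id"]] = totals.get(o["cust_id"], 0) + o["amount"]
    # Express the result in "English-like" terms for the end user,
    # who neither knows nor cares where the underlying rows live.
    return {c["name"]: totals.get(c["cust_id"], 0) for c in customers_db}

print(virtual_view())  # {'Acme': 150, 'Globex': 75}
```

The semantic layer described above is the mapping from internal keys like `cust_id` to business-friendly names, built over these virtual views rather than over any one physical source.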

The third feature is the in-memory capability. We've long had an in-memory engine; in Version 5, we've vastly expanded it. It's our own columnar in-memory engine. It's non-persistent, so it exists as long as the end user's session state exists. What it allows is exploration of that data at memory speeds, and the dataset can be extremely large -- with Version 5, a full terabyte of data. If your server is properly configured, you could return billions of rows of data from any data source using that virtualizer technology, hold it in memory, and express it inside of the visualization I described earlier. All done at sub-second speeds, because it's all held in memory and addressed with a columnar orientation.
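The columnar orientation mentioned here is why aggregates are fast: a sum over one measure scans a single contiguous column instead of touching every field of every row. A minimal sketch of that layout, with invented data and class names:

```python
# Hypothetical sketch of a columnar, non-persistent in-memory layout: rows are
# pivoted into one array per column, so an aggregate reads only one column.
class ColumnStore:
    def __init__(self, rows):
        # Pivot row-oriented input into a list per column.
        self.columns = {k: [r[k] for r in rows] for k in rows[0]}

    def sum(self, column):
        # Touches only the requested column, not every field of every row.
        return sum(self.columns[column])

store = ColumnStore([
    {"region": "East", "sales": 30},
    {"region": "West", "sales": 20},
])
print(store.sum("sales"))  # 50
```

Being non-persistent, such a store would simply be discarded when the user's session ends, as the interview describes.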

The fourth is advancements to our analytic capability. If you are a power user and you want to use richer multidimensional analytics, we've increased the feature set, getting closer to that 80 percent, if you will. We've increased the mathematics behind it, the performance of it, and we've added the HTML5 engine I mentioned earlier so you get this consistent experience from one end of the product to the other for visualizing it. Out of the box, we provide a backend OLAP engine called Mondrian, which is open source, very capable. But some of our customers have investments in other OLAP environments, and one of the most popular is Analysis Services from Microsoft. Our tool now natively supports Microsoft SQL Server Analysis Services. You now can point our visualization environment, our analytic front-end at Analysis Services from Microsoft and you get the same results, the same front end experience, which is a pretty clever technique to help our customers maintain investments in those.
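The claim that the same front end can point at Mondrian or at Microsoft Analysis Services amounts to programming against an engine interface rather than a specific backend. A hedged sketch with fake engine classes standing in for the real OLAP servers:

```python
# Hypothetical sketch: one analytic front end, interchangeable OLAP backends.
# The engine classes are fakes; real backends would speak MDX over the wire.
class BundledEngine:        # stands in for the open source Mondrian engine
    def query(self, measure):
        return {"East": 30, "West": 20}

class AnalysisServices:     # stands in for an existing Microsoft investment
    def query(self, measure):
        return {"East": 30, "West": 20}

def render(engine, measure):
    """The front end speaks only to the engine interface, never its internals."""
    return sorted(engine.query(measure).items())

# Same front-end experience, same results, different backend:
print(render(BundledEngine(), "sales"))
print(render(AnalysisServices(), "sales"))
```

Because the front end depends only on the `query` interface, a customer can keep an existing Analysis Services cube and still get the same visualization layer, which is the investment-preserving technique described above.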
