Groundbreaking Parallel Data Mining Results with The University of Texas Spotlighted in Breakout Session
Oct. 15, 2009 (Business Wire) -- Pervasive Software® Inc. (NASDAQ: PVSW), an emerging leader in enabling next-generation data-intensive applications and data analytics, announced today that it will be demonstrating its recommender system at Predictive Analytics World (PAW) October 20-21 at the Hilton Alexandria Mark Center in Alexandria, Virginia.
“One of the most interesting benchmarks today is the Netflix dataset, with more than 100 million movie reviews created by more than 480,000 users and covering a set of more than 17,000 movies,” said Pervasive Innovation Labs Director Nena Marín, Ph.D. “In the process of experimentation and research on parallel data mining, we discovered compelling results with Pervasive DataRush. We ran a k-means clustering algorithm on the entire Netflix dataset with k=30 in just 17 seconds, and, significantly, that was on a $2,500 commodity 8-core server. These experiments evolved into a robust, high-performance, collaborative filtering-based recommender system, which we will demonstrate at Predictive Analytics World.”
Srivatsava Daruru of the IDEAL Laboratory at The University of Texas at Austin will also present joint research conducted with Pervasive at the conference in a presentation titled “Churn, Baby, Churn: Fast Scoring on Large Telecom Dataset - KDD Cup 2009 Competition Results: Orange Labs (France Telecom).” Daruru’s presentation, on October 20 at 4:55 p.m. in Walnut A/B, details the application of the dataflow computational model to data-intensive and computationally intensive high-performance parallel data mining. The Pervasive-University of Texas team created a scalable and robust model capable of scoring “propensity to churn” for 50,000 customers in a 1.6GB test set (Orange Labs France Telecom, KDD Cup) in just three minutes on commodity 16-core CPUs, yielding an effective scoring runtime of 3.6 milliseconds per customer, orders of magnitude faster than many systems.
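The per-customer figure follows directly from the totals quoted above. As a quick check, here is a minimal Python sketch of the arithmetic; the variable names are illustrative and do not come from Pervasive or the KDD Cup materials:

```python
# Figures quoted in the release; variable names are illustrative only.
customers_scored = 50_000      # customers in the test set
total_seconds = 3 * 60         # "three minutes" of total scoring time

# Effective scoring time per customer, in milliseconds.
ms_per_customer = total_seconds * 1000 / customers_scored

# The same figure expressed as throughput.
customers_per_second = customers_scored / total_seconds

print(f"{ms_per_customer:.1f} ms per customer")            # 3.6 ms per customer
print(f"{customers_per_second:.0f} customers per second")  # 278 customers per second
```

This is an effective (amortized) rate: total wall-clock time divided by the number of customers, not the latency of scoring a single customer in isolation.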
“Both within Pervasive and in joint efforts with partners and with The University of Texas, we continue to deliver astounding parallel data mining and predictive analytics performance using Pervasive DataRush,” said Mike Hoskins, Pervasive CTO.