Novelty:
An algorithm that approximates large frequency moments of a data stream with pick-and-drop sampling, for the analysis of big data.
Value Proposition:
As the volume of data grows, the ability to analyze the data becomes compromised. In some cases the data is generated by a single event and stored for analysis, such as the output of a large simulation. In others, the data is generated by many concurrent events, such as daily sales data from Amazon. While each day's data may be analyzed efficiently, the size of the aggregate data may be too big for practical analysis. Approximate frequency moments could be used to analyze Amazon's weekly or yearly sales figures when exact analysis of the full data set becomes impractical. The proposed invention focuses on frequency moments of order three and higher, improving on existing approaches for these higher-order moments.
Technical Details:
Johns Hopkins researchers have developed an algorithm to approximately calculate the higher-order (n >= 3) frequency moments of a data stream. Given a data stream, the nth frequency moment is the sum, over the unique elements, of each element's occurrence frequency raised to the nth power. Frequency moments are useful for computing statistics on a data set as a whole when the incoming data is too big to store, or too big to analyze efficiently once stored. The algorithm applies to the streaming setting, in which each element can be examined exactly once, as it appears in the stream.
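As an illustration of the definition, the sketch below first computes the nth frequency moment exactly (which requires memory proportional to the number of distinct elements) and then estimates it in one pass with the classic Alon-Matias-Szegedy sampling estimator. The estimator is shown only as a baseline for the streaming setting; it is not the pick-and-drop sampling method of the invention, whose details are not given in this summary, and the function names and parameters are illustrative.

    from collections import Counter
    import random

    def frequency_moment_exact(stream, n):
        # F_n = sum over distinct elements i of f_i ** n,
        # where f_i is the number of occurrences of i.
        counts = Counter(stream)
        return sum(f ** n for f in counts.values())

    def frequency_moment_ams(stream, n, trials=1000):
        # Classic AMS estimator (not the pick-and-drop method): sample a
        # position p uniformly at random via reservoir sampling, count the
        # occurrences r of the sampled element from p onward, and output
        # m * (r**n - (r-1)**n), an unbiased estimate of F_n. Averaging
        # independent trials reduces the variance; in a true one-pass
        # setting the trials would run in parallel over a single pass.
        estimates = []
        for _ in range(trials):
            sampled = None  # element at the sampled position
            r = 0           # occurrences of `sampled` since it was chosen
            m = 0           # stream length seen so far
            for x in stream:
                m += 1
                if random.randrange(m) == 0:  # keep x with probability 1/m
                    sampled, r = x, 0
                if x == sampled:
                    r += 1
            estimates.append(m * (r ** n - (r - 1) ** n))
        return sum(estimates) / trials

    # Stream (a, a, b, c) has frequencies (2, 1, 1), so F_3 = 8 + 1 + 1 = 10.
    data = ["a", "a", "b", "c"]
    print(frequency_moment_exact(data, 3))  # 10
    print(frequency_moment_ams(data, 3))    # close to 10 on average

The exact version illustrates why approximation matters: its memory cost grows with the number of distinct elements, which is exactly what a streaming algorithm must avoid.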
Looking for Partners:
To develop and commercialize the technology as an algorithm for large-scale data analysis.
Stage of Development:
Algorithm
Data Availability:
Under CDA / NDA
Publications/Associated Cases:
Not available at this time