Powered by OpenAIRE graph

Sublinear Algorithms for Approximating Probability Distributions

Funder: UK Research and Innovation
Project code: EP/L021749/1
Funded under: EPSRC
Funder Contribution: 98,776 GBP

Description

The goal of this proposal is to advance a research program of developing sublinear-time algorithms for estimating a wide range of natural and important classes of probability distributions.

We live in an era of "big data," where the amount of data that can be brought to bear on questions of biology, climate, economics, etc., is vast and expanding rapidly. Much of this raw data consists of example points without corresponding labels. The challenge of making sense of this unlabeled data has immediate relevance and has rapidly become a bottleneck in scientific understanding across many disciplines.

An important class of big data is most naturally modeled as samples from a probability distribution over a very large domain. The difficulty is that the domains of these distributions are immense, which typically makes standard algorithms unacceptably slow. Scaling up a computational framework to deal comfortably with ever-larger data presents a series of algorithmic challenges. This prompts the basic question: given samples from some unknown distribution, what can we infer? While this question has been studied for several decades by various communities of researchers, both the number of samples and the running time required for such estimation tasks are not yet well understood, even for some surprisingly simple types of discrete distributions.

The proposed research focuses on sublinear-time algorithms, that is, algorithms whose running time is significantly smaller than the size of the domain of the underlying distribution. In this project we will develop sublinear-time algorithms for estimating various classes of discrete distributions over very large domains. Specific problems we will address include:

(1) Developing sublinear algorithms to estimate probability distributions that satisfy various natural types of "shape restrictions" on the underlying probability density function.
(2) Developing sublinear algorithms for estimating complex distributions that result from the aggregation of many independent simple sources of randomness.

We believe that highly efficient algorithms for these estimation tasks may play an important role in the next generation of large-scale machine learning applications.
