Project Icon

pymc

Python贝叶斯统计建模与概率编程框架

PyMC是一个Python贝叶斯统计建模框架,专注于高级马尔可夫链蒙特卡洛和变分推断算法。它提供直观的模型语法、强大的采样算法和推断功能,可处理复杂模型。PyMC利用PyTensor优化计算,支持缺失值处理,并提供丰富的示例资源。作为一个灵活的概率编程工具,PyMC适用于广泛的统计建模任务。

.. image:: https://cdn.rawgit.com/pymc-devs/pymc/main/docs/logos/svg/PyMC_banner.svg :height: 100px :alt: PyMC logo :align: center

|Build Status| |Coverage| |NumFOCUS_badge| |Binder| |Dockerhub| |DOIzenodo| |Conda Downloads|

PyMC (formerly PyMC3) is a Python package for Bayesian statistical modeling focusing on advanced Markov chain Monte Carlo (MCMC) and variational inference (VI) algorithms. Its flexibility and extensibility make it applicable to a large suite of problems.

Check out the PyMC overview <https://docs.pymc.io/en/latest/learn/core_notebooks/pymc_overview.html>, or one of the many examples <https://www.pymc.io/projects/examples/en/latest/gallery.html>! For questions on PyMC, head on over to our PyMC Discourse <https://discourse.pymc.io/>__ forum.

Features

  • Intuitive model specification syntax, for example, x ~ N(0,1) translates to x = Normal('x',0,1)
  • Powerful sampling algorithms, such as the No U-Turn Sampler <http://www.jmlr.org/papers/v15/hoffman14a.html>__, allow complex models with thousands of parameters with little specialized knowledge of fitting algorithms.
  • Variational inference: ADVI <http://www.jmlr.org/papers/v18/16-107.html>__ for fast approximate posterior estimation as well as mini-batch ADVI for large data sets.
  • Relies on PyTensor <https://pytensor.readthedocs.io/en/latest/>__ which provides:
    • Computation optimization and dynamic C or JAX compilation
    • NumPy broadcasting and advanced indexing
    • Linear algebra operators
    • Simple extensibility
  • Transparent support for missing value imputation

Linear Regression Example

Plant growth can be influenced by multiple factors, and understanding these relationships is crucial for optimizing agricultural practices.

Imagine we conduct an experiment to predict the growth of a plant based on different environmental variables.

.. code-block:: python

import pymc as pm

Taking draws from a normal distribution

seed = 42 x_dist = pm.Normal.dist(shape=(100, 3)) x_data = pm.draw(x_dist, random_seed=seed)

Independent Variables:

Sunlight Hours: Number of hours the plant is exposed to sunlight daily.

Water Amount: Daily water amount given to the plant (in milliliters).

Soil Nitrogen Content: Percentage of nitrogen content in the soil.

Dependent Variable:

Plant Growth (y): Measured as the increase in plant height (in centimeters) over a certain period.

Define coordinate values for all dimensions of the data

coords={ "trial": range(100), "features": ["sunlight hours", "water amount", "soil nitrogen"], }

Define generative model

with pm.Model(coords=coords) as generative_model: x = pm.Data("x", x_data, dims=["trial", "features"])

  # Model parameters
  betas = pm.Normal("betas", dims="features")
  sigma = pm.HalfNormal("sigma")

  # Linear model
  mu = x @ betas

  # Likelihood
  # Assuming we measure deviation of each plant from baseline
  plant_growth = pm.Normal("plant growth", mu, sigma, dims="trial")

Generating data from model by fixing parameters

fixed_parameters = { "betas": [5, 20, 2], "sigma": 0.5, } with pm.do(generative_model, fixed_parameters) as synthetic_model: idata = pm.sample_prior_predictive(random_seed=seed) # Sample from prior predictive distribution. synthetic_y = idata.prior["plant growth"].sel(draw=0, chain=0)

Infer parameters conditioned on observed data

with pm.observe(generative_model, {"plant growth": synthetic_y}) as inference_model: idata = pm.sample(random_seed=seed)

  summary = pm.stats.summary(idata, var_names=["betas", "sigma"])
  print(summary)

From the summary, we can see that the mean of the inferred parameters are very close to the fixed parameters

===================== ====== ===== ======== ========= =========== ========= ========== ========== ======= Params mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat ===================== ====== ===== ======== ========= =========== ========= ========== ========== ======= betas[sunlight hours] 4.972 0.054 4.866 5.066 0.001 0.001 3003 1257 1 betas[water amount] 19.963 0.051 19.872 20.062 0.001 0.001 3112 1658 1 betas[soil nitrogen] 1.994 0.055 1.899 2.107 0.001 0.001 3221 1559 1 sigma 0.511 0.037 0.438 0.575 0.001 0 2945 1522 1 ===================== ====== ===== ======== ========= =========== ========= ========== ========== =======

.. code-block:: python

Simulate new data conditioned on inferred parameters

new_x_data = pm.draw( pm.Normal.dist(shape=(3, 3)), random_seed=seed, ) new_coords = coords | {"trial": [0, 1, 2]}

with inference_model: pm.set_data({"x": new_x_data}, coords=new_coords) pm.sample_posterior_predictive( idata, predictions=True, extend_inferencedata=True, random_seed=seed, )

pm.stats.summary(idata.predictions, kind="stats")

The new data conditioned on inferred parameters would look like:

================ ======== ======= ======== ========= Output mean sd hdi_3% hdi_97% ================ ======== ======= ======== ========= plant growth[0] 14.229 0.515 13.325 15.272 plant growth[1] 24.418 0.511 23.428 25.326 plant growth[2] -6.747 0.511 -7.740 -5.797 ================ ======== ======= ======== =========

.. code-block:: python

Simulate new data, under a scenario where the first beta is zero

with pm.do( inference_model, {inference_model["betas"]: inference_model["betas"] * [0, 1, 1]}, ) as plant_growth_model: new_predictions = pm.sample_posterior_predictive( idata, predictions=True, random_seed=seed, )

pm.stats.summary(new_predictions, kind="stats")

The new data, under the above scenario would look like:

================ ======== ======= ======== ========= Output mean sd hdi_3% hdi_97% ================ ======== ======= ======== ========= plant growth[0] 12.149 0.515 11.193 13.135 plant growth[1] 29.809 0.508 28.832 30.717 plant growth[2] -0.131 0.507 -1.121 0.791 ================ ======== ======= ======== =========

Getting started

If you already know about Bayesian statistics:

  • API quickstart guide <https://www.pymc.io/projects/examples/en/latest/introductory/api_quickstart.html>__
  • The PyMC tutorial <https://docs.pymc.io/en/latest/learn/core_notebooks/pymc_overview.html>__
  • PyMC examples <https://www.pymc.io/projects/examples/en/latest/gallery.html>__ and the API reference <https://docs.pymc.io/en/stable/api.html>__

Learn Bayesian statistics with a book together with PyMC

  • Bayesian Analysis with Python <http://bap.com.ar/>__ (third edition) by Osvaldo Martin: Great introductory book.
  • Probabilistic Programming and Bayesian Methods for Hackers <https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers>__: Fantastic book with many applied code examples.
  • PyMC port of the book "Doing Bayesian Data Analysis" by John Kruschke <https://github.com/cluhmann/DBDA-python>__ as well as the first edition <https://github.com/aloctavodia/Doing_bayesian_data_analysis>__.
  • PyMC port of the book "Statistical Rethinking A Bayesian Course with Examples in R and Stan" by Richard McElreath <https://github.com/pymc-devs/resources/tree/master/Rethinking>__
  • PyMC port of the book "Bayesian Cognitive Modeling" by Michael Lee and EJ Wagenmakers <https://github.com/pymc-devs/resources/tree/master/BCM>__: Focused on using Bayesian statistics in cognitive modeling.

Audio & Video

  • Here is a YouTube playlist <https://www.youtube.com/playlist?list=PL1Ma_1DBbE82OVW8Fz_6Ts1oOeyOAiovy>__ gathering several talks on PyMC.
  • You can also find all the talks given at PyMCon 2020 here <https://discourse.pymc.io/c/pymcon/2020talks/15>__.
  • The "Learning Bayesian Statistics" podcast <https://www.learnbayesstats.com/>__ helps you discover and stay up-to-date with the vast Bayesian community. Bonus: it's hosted by Alex Andorra, one of the PyMC core devs!

Installation

To install PyMC on your system, follow the instructions on the installation guide <https://www.pymc.io/projects/docs/en/latest/installation.html>__.

Citing PyMC

Please choose from the following:

  • |DOIpaper| PyMC: A Modern and Comprehensive Probabilistic Programming Framework in Python, Abril-Pla O, Andreani V, Carroll C, Dong L, Fonnesbeck CJ, Kochurov M, Kumar R, Lao J, Luhmann CC, Martin OA, Osthege M, Vieira R, Wiecki T, Zinkov R. (2023)
  • |DOIzenodo| A DOI for all versions.
  • DOIs for specific versions are shown on Zenodo and under Releases <https://github.com/pymc-devs/pymc/releases>_

.. |DOIpaper| image:: https://img.shields.io/badge/DOI-10.7717%2Fpeerj--cs.1516-blue.svg :target: https://doi.org/10.7717/peerj-cs.1516 .. |DOIzenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.4603970.svg :target: https://doi.org/10.5281/zenodo.4603970

Contact

We are using discourse.pymc.io <https://discourse.pymc.io/>__ as our main communication channel.

To ask a question regarding modeling or usage of PyMC we encourage posting to our Discourse forum under the “Questions” Category <https://discourse.pymc.io/c/questions>. You can also suggest feature in the “Development” Category <https://discourse.pymc.io/c/development>.

You can also follow us on these social media platforms for updates and other announcements:

  • LinkedIn @pymc <https://www.linkedin.com/company/pymc/>__
  • YouTube @PyMCDevelopers <https://www.youtube.com/c/PyMCDevelopers>__
  • X @pymc_devs <https://x.com/pymc_devs>__
  • Mastodon @pymc@bayes.club <https://bayes.club/@pymc>__

To report an issue with PyMC please use the issue tracker <https://github.com/pymc-devs/pymc/issues>__.

Finally, if you need to get in touch for non-technical information about the project, send us an e-mail <info@pymc-devs.org>__.

License

Apache License, Version 2.0 <https://github.com/pymc-devs/pymc/blob/main/LICENSE>__

Software using PyMC

General purpose

  • Bambi <https://github.com/bambinos/bambi>__: BAyesian Model-Building Interface (BAMBI) in Python.
  • calibr8 <https://calibr8.readthedocs.io>__: A toolbox for constructing detailed observation models to be used as likelihoods in PyMC.
  • gumbi <https://github.com/JohnGoertz/Gumbi>__: A high-level interface for building GP models.
  • SunODE <https://github.com/aseyboldt/sunode>__: Fast ODE solver, much faster than the one that comes with PyMC.
  • pymc-learn <https://github.com/pymc-learn/pymc-learn>__: Custom PyMC models built on top of pymc3_models/scikit-learn API

Domain specific

  • Exoplanet <https://github.com/dfm/exoplanet>__: a toolkit for modeling of transit and/or radial velocity observations of exoplanets and other astronomical time series.
  • beat <https://github.com/hvasbath/beat>__: Bayesian Earthquake Analysis Tool.
  • CausalPy <https://github.com/pymc-labs/CausalPy>__: A package focussing on causal inference in quasi-experimental settings.

Please contact us if your software is not listed here.

Papers citing PyMC

See Google Scholar here <https://scholar.google.com/scholar?cites=6357998555684300962>__ and here <https://scholar.google.com/scholar?cites=6936955228135731011>__ for a continuously updated list.

Contributors

See the GitHub contributor page <https://github.com/pymc-devs/pymc/graphs/contributors>. Also read our Code of Conduct <https://github.com/pymc-devs/pymc/blob/main/CODE_OF_CONDUCT.md> guidelines for a better contributing experience.

Support

PyMC is a non-profit project under NumFOCUS umbrella. If you want to support PyMC financially, you can donate here <https://numfocus.salsalabs.org/donate-to-pymc3/index.html>__.

Professional Consulting Support

You can get professional consulting support from PyMC Labs <https://www.pymc-labs.io>__.

Sponsors

|NumFOCUS|

|PyMCLabs|

|Mistplay|

|ODSC|

Thanks to our contributors

|contributors|

.. |Binder| image:: https://mybinder.org/badge_logo.svg :target: https://mybinder.org/v2/gh/pymc-devs/pymc/main?filepath=%2Fdocs%2Fsource%2Fnotebooks .. |Build Status| image:: https://github.com/pymc-devs/pymc/workflows/pytest/badge.svg :target: https://github.com/pymc-devs/pymc/actions .. |Coverage| image:: https://codecov.io/gh/pymc-devs/pymc/branch/main/graph/badge.svg :target: https://codecov.io/gh/pymc-devs/pymc .. |Dockerhub| image:: https://img.shields.io/docker/automated/pymc/pymc.svg :target: https://hub.docker.com/r/pymc/pymc .. |NumFOCUS_badge| image:: https://img.shields.io/badge/powered%20by-NumFOCUS-orange.svg?style=flat&colorA=E1523D&colorB=007D8A :target: http://www.numfocus.org/ .. |NumFOCUS| image:: https://github.com/pymc-devs/brand/blob/main/sponsors/sponsor_logos/sponsor_numfocus.png?raw=true :target: http://www.numfocus.org/ .. |PyMCLabs| image:: https://github.com/pymc-devs/brand/blob/main/sponsors/sponsor_logos/sponsor_pymc_labs.png?raw=true :target: https://pymc-labs.io .. |Mistplay| image:: https://github.com/pymc-devs/brand/blob/main/sponsors/sponsor_logos/sponsor_mistplay.png?raw=true :target: https://www.mistplay.com/ .. |ODSC| image:: https://github.com/pymc-devs/brand/blob/main/sponsors/sponsor_logos/odsc/sponsor_odsc.png?raw=true :target: https://odsc.com/california/?utm_source=pymc&utm_medium=referral .. |contributors| image:: https://contrib.rocks/image?repo=pymc-devs/pymc :target: https://github.com/pymc-devs/pymc/graphs/contributors .. |Conda Downloads| image:: https://anaconda.org/conda-forge/pymc/badges/downloads.svg :target:

项目侧边栏1项目侧边栏2
推荐项目
Project Cover

豆包MarsCode

豆包 MarsCode 是一款革命性的编程助手,通过AI技术提供代码补全、单测生成、代码解释和智能问答等功能,支持100+编程语言,与主流编辑器无缝集成,显著提升开发效率和代码质量。

Project Cover

AI写歌

Suno AI是一个革命性的AI音乐创作平台,能在短短30秒内帮助用户创作出一首完整的歌曲。无论是寻找创作灵感还是需要快速制作音乐,Suno AI都是音乐爱好者和专业人士的理想选择。

Project Cover

有言AI

有言平台提供一站式AIGC视频创作解决方案,通过智能技术简化视频制作流程。无论是企业宣传还是个人分享,有言都能帮助用户快速、轻松地制作出专业级别的视频内容。

Project Cover

Kimi

Kimi AI助手提供多语言对话支持,能够阅读和理解用户上传的文件内容,解析网页信息,并结合搜索结果为用户提供详尽的答案。无论是日常咨询还是专业问题,Kimi都能以友好、专业的方式提供帮助。

Project Cover

阿里绘蛙

绘蛙是阿里巴巴集团推出的革命性AI电商营销平台。利用尖端人工智能技术,为商家提供一键生成商品图和营销文案的服务,显著提升内容创作效率和营销效果。适用于淘宝、天猫等电商平台,让商品第一时间被种草。

Project Cover

吐司

探索Tensor.Art平台的独特AI模型,免费访问各种图像生成与AI训练工具,从Stable Diffusion等基础模型开始,轻松实现创新图像生成。体验前沿的AI技术,推动个人和企业的创新发展。

Project Cover

SubCat字幕猫

SubCat字幕猫APP是一款创新的视频播放器,它将改变您观看视频的方式!SubCat结合了先进的人工智能技术,为您提供即时视频字幕翻译,无论是本地视频还是网络流媒体,让您轻松享受各种语言的内容。

Project Cover

美间AI

美间AI创意设计平台,利用前沿AI技术,为设计师和营销人员提供一站式设计解决方案。从智能海报到3D效果图,再到文案生成,美间让创意设计更简单、更高效。

Project Cover

AIWritePaper论文写作

AIWritePaper论文写作是一站式AI论文写作辅助工具,简化了选题、文献检索至论文撰写的整个过程。通过简单设定,平台可快速生成高质量论文大纲和全文,配合图表、参考文献等一应俱全,同时提供开题报告和答辩PPT等增值服务,保障数据安全,有效提升写作效率和论文质量。

投诉举报邮箱: service@vectorlightyear.com
@2024 懂AI·鲁ICP备2024100362号-6·鲁公网安备37021002001498号