
numalogic

An open-source framework for time series analysis and anomaly detection

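numalogic is an open-source machine learning framework focused on operational data analytics and AIOps. It bundles a range of ML models and algorithms and provides predictive data analytics, model selection, data processing, and feature extraction. numalogic suits scenarios such as deployment failure detection, system failure identification, and fraud detection. It supports real-time training, updating models automatically from incoming data, which makes it a fit for building always-on ML platforms. numalogic is simple by design, easy to use and extend, and offers a flexible approach to data analysis.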


Background

Numalogic is a collection of ML models and algorithms for operational data analytics and AIOps. At Intuit, we use Numalogic at scale for continuous real-time data enrichment, including anomaly scoring. We assign an anomaly score (ML inference) to every time-series datum, event, or message we receive on our streaming platform (say, Kafka). 95% of our data sets are time series, and we run a complex flow to execute ML inference on our high-throughput sources. We run multiple models on the same datum: say, one model that is sensitive to positive sentiment, another tuned towards negative sentiment, and another optimized for neutral sentiment. We also keep a couple of ML models trained for the same data source, so that we can provide more accurate scores based on the data density in our model store. An ensemble of models is required because some composite keys in the data tend to be less dense than others; e.g., a forgot-password interaction is less frequent than a status-check interaction. At runtime, for each datum that arrives, models are picked by a conditional forwarding filter set on the data density. ML engineers need to worry only about their inference container; they do not have to worry about data movement and quality assurance.
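The density-based model selection described above can be illustrated with a minimal sketch. This is not numalogic's API: the `ModelRoute` class, the `pick_model` function, the thresholds, and the scoring lambdas are all hypothetical names and values chosen only to show how a conditional filter on data density might choose between a model for sparse keys and one for dense keys.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ModelRoute:
    """Hypothetical pairing of a model with the data density it needs."""
    name: str
    min_density: int                 # minimum stored data points for this model
    score: Callable[[float], float]  # stand-in for the real inference call


def pick_model(routes: List[ModelRoute], density: int) -> ModelRoute:
    """Pick the eligible route with the highest density threshold."""
    eligible = [r for r in routes if density >= r.min_density]
    if not eligible:
        raise LookupError(f"no model accepts density {density}")
    return max(eligible, key=lambda r: r.min_density)


# Assumed thresholds and toy scoring functions, for illustration only.
routes = [
    ModelRoute("sparse-model", 0, lambda x: abs(x - 1.0)),
    ModelRoute("dense-model", 1000, lambda x: 2.0 * abs(x - 1.0)),
]

# Assumed densities: forgot-password events are rarer than status checks.
density_by_key = {"forgot-password": 120, "status-check": 50000}

for key, density in density_by_key.items():
    route = pick_model(routes, density)
    print(key, "->", route.name, "score:", route.score(1.5))
```

Choosing the highest threshold that the key satisfies is one simple policy; a real filter could just as well forward the datum to several models at once.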

Numalogic realtime training

For an always-on ML platform, the key requirement is the ability to train or retrain models automatically based on the incoming messages. A composite key is built at runtime for each message and used to look up a matching model; if that model turns out to be stale or missing, retraining is triggered automatically. The conditional forwarding feature of the platform improves the ML developer's velocity when they must decide whether to forward a result or drop it after a trigger request.
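The lookup-and-retrigger logic can be sketched roughly as follows. This is an illustration under stated assumptions, not numalogic's implementation: the `StoredModel` class, the `build_key` and `route` functions, the key fields (`service`, `metric`), and the 24-hour staleness limit are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

MAX_AGE_SECONDS = 24 * 3600  # assumed staleness limit, not a numalogic default


@dataclass
class StoredModel:
    trained_at: float  # Unix timestamp of the last training run


def build_key(message: dict, fields=("service", "metric")) -> Tuple[str, ...]:
    """Build the composite key for one message from the chosen fields."""
    return tuple(str(message[f]) for f in fields)


def route(message: dict,
          store: Dict[Tuple[str, ...], StoredModel],
          now: Optional[float] = None) -> str:
    """Decide what to do with a message: 'infer', 'train', or 'retrain'."""
    now = time.time() if now is None else now
    model = store.get(build_key(message))
    if model is None:
        return "train"      # no model for this key yet
    if now - model.trained_at > MAX_AGE_SECONDS:
        return "retrain"    # model exists but is stale
    return "infer"          # fresh model: forward to inference


store = {("login", "latency"): StoredModel(trained_at=0.0)}
msg = {"service": "login", "metric": "latency", "value": 0.42}
print(route(msg, store, now=100.0))
```

Returning a label rather than acting directly keeps the forwarding decision separate from the training and inference steps, so each branch can be forwarded or dropped independently.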

Key Features

  1. Ease of use: simple and efficient tools for predictive data analytics
  2. Reusability: all the functionalities can be re-used in various contexts
  3. Model selection: easy to compare, validate, fine-tune and choose the model that works best with each data set
  4. Data processing: readily available feature extraction, scaling, transforming and normalization tools
  5. Extensibility: adding your own functions or extending over the existing capabilities
  6. Model storage: out-of-the-box support for MLflow, plus support for other ML model lifecycle management tools

Use Cases

  1. Deployment failure detection
  2. System failure detection for node failures or crashes
  3. Fraud detection
  4. Network intrusion detection
  5. Forecasting on time series data

Getting Started

For set-up information and running your first pipeline using numalogic, please see our getting started guide.

Installation

Numalogic requires Python 3.8 or higher.

Prerequisites

Numalogic needs PyTorch and PyTorch Lightning to work. Because these packages are platform dependent, they are not included in the numalogic package itself; please install them first.

Numalogic supports PyTorch versions 2.0.0 and above.

numalogic can be installed using pip.

pip install numalogic

If using MLflow as the model registry, install with:

pip install numalogic[mlflow]
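Before installing, it can help to confirm that the prerequisites are in place. The following is a small sketch using only the Python standard library; `check_prerequisites` is a hypothetical helper, and the import names `torch` and `pytorch_lightning` are assumptions about how the two packages are installed in your environment. It does not verify the PyTorch version.

```python
import sys
from importlib.util import find_spec


def check_prerequisites(modules=("torch", "pytorch_lightning")):
    """Report whether the Python version and the given modules are available."""
    report = {"python>=3.8": sys.version_info >= (3, 8)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found
        report[name] = find_spec(name) is not None
    return report
```

Calling `check_prerequisites()` returns a dictionary mapping each requirement to `True` or `False`, which can be printed or asserted on in a setup script.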

Build locally

  1. Install Poetry:
    curl -sSL https://install.python-poetry.org | python3 -
    
  2. To activate virtual env:
    poetry shell
    
  3. To install dependencies:
    poetry install --with dev,torch
    
    If extra dependencies are needed:
    poetry install --all-extras
    
  4. To run unit tests:
    make test
    
  5. To format code style using black and ruff:
    make lint
    
  6. To set up pre-commit hooks:
    pre-commit install
    

Contributing

We would love contributions to the numalogic project in areas including, but not limited to, the following:

  • Adding new time series anomaly detection models
  • Making it easier to add users' custom models
  • Support for additional model registry frameworks

For contribution guidelines, please refer here.

