Project Icon

MiniExcel

轻量高效的 .NET Excel 处理工具

MiniExcel 是一款轻量级的 .NET Excel 处理工具,通过流式处理算法大幅降低内存占用,有效避免内存溢出问题。该库支持实时处理数据,兼容 LINQ 延迟执行,可实现低消耗、快速分页等复杂查询。MiniExcel 无需依赖 Microsoft Office,DLL 大小仅 150KB,API 简洁易用,适合处理大型 Excel 文件。其高效率和低内存消耗特性使其成为 .NET 平台上理想的 Excel 处理解决方案。

NuGet Build status star GitHub stars version


This project is part of the .NET Foundation and operates under their code of conduct.



Your Star and Donate can make MiniExcel better

Introduction

MiniExcel is simple and efficient to avoid OOM's .NET processing Excel tool.

At present, most popular frameworks need to load all the data into the memory to facilitate operation, but it will cause memory consumption problems. MiniExcel tries to use algorithm from a stream to reduce the original 1000 MB occupation to a few MB to avoid OOM(out of memory).

image

Features

  • Low memory consumption, avoid OOM (out of memory) and full GC
  • Support real-time operation of each row of data
  • Support LINQ deferred execution, it can do low-consumption, fast paging and other complex queries
  • Lightweight, without Microsoft Office installed, no COM+, DLL size is less than 150KB
  • Easy API style to read/write/fill excel

Get Started

Installation

You can install the package from NuGet

Release Notes

Please Check Release Notes

TODO

Please Check TODO

Performance

Benchmarks logic can be found in MiniExcel.Benchmarks , and test cli

dotnet run -p .\benchmarks\MiniExcel.Benchmarks\ -c Release -f netcoreapp3.1 -- -f * --join

Output from the latest run is :

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
  [Host]     : .NET Framework 4.8 (4.8.4341.0), X64 RyuJIT
  Job-ZYYABG : .NET Framework 4.8 (4.8.4341.0), X64 RyuJIT
IterationCount=3  LaunchCount=3  WarmupCount=3

Benchmark History : Link

Import/Query Excel

Logic : Test1,000,000x10.xlsx as performance test basic file, 1,000,000 rows * 10 columns "HelloWorld" cells, 23 MB file size

LibraryMethodMax Memory UsageMean
MiniExcel'MiniExcel QueryFirst'0.109 MB0.0007264 sec
ExcelDataReader'ExcelDataReader QueryFirst'15.24 MB10.66421 sec
MiniExcel'MiniExcel Query'17.3 MB14.17933 sec
ExcelDataReader'ExcelDataReader Query'17.3 MB22.56508 sec
Epplus'Epplus QueryFirst'1,452 MB18.19801 sec
Epplus'Epplus Query'1,451 MB23.64747 sec
OpenXmlSDK'OpenXmlSDK Query'1,412 MB52.00327 sec
OpenXmlSDK'OpenXmlSDK QueryFirst'1,413 MB52.34865 sec
ClosedXml'ClosedXml QueryFirst'2,158 MB66.18897 sec
ClosedXml'ClosedXml Query'2,184 MB191.43412 sec

Export/Create Excel

Logic : create a total of 10,000,000 "HelloWorld" excel

LibraryMethodMax Memory UsageMean
MiniExcel'MiniExcel Create Xlsx'15 MB11.53181 sec
Epplus'Epplus Create Xlsx'1,204 MB22.50971 sec
OpenXmlSdk'OpenXmlSdk Create Xlsx'2,621 MB42.47399 sec
ClosedXml'ClosedXml Create Xlsx'7,141 MB140.93992 sec

Excel Query/Import

1. Execute a query and map the results to a strongly typed IEnumerable [Try it]

Recommand to use Stream.Query because of better efficiency.

public class UserAccount
{
    public Guid ID { get; set; }
    public string Name { get; set; }
    public DateTime BoD { get; set; }
    public int Age { get; set; }
    public bool VIP { get; set; }
    public decimal Points { get; set; }
}

var rows = MiniExcel.Query<UserAccount>(path);

// or

using (var stream = File.OpenRead(path))
    var rows = stream.Query<UserAccount>();

image

2. Execute a query and map it to a list of dynamic objects without using head [Try it]

  • dynamic key is A.B.C.D..
MiniExcel1
Github2

var rows = MiniExcel.Query(path).ToList();

// or
using (var stream = File.OpenRead(path))
{
    var rows = stream.Query().ToList();

    Assert.Equal("MiniExcel", rows[0].A);
    Assert.Equal(1, rows[0].B);
    Assert.Equal("Github", rows[1].A);
    Assert.Equal(2, rows[1].B);
}

3. Execute a query with first header row [Try it]

note : same column name use last right one

Input Excel :

Column1Column2
MiniExcel1
Github2

var rows = MiniExcel.Query(useHeaderRow:true).ToList();

// or

using (var stream = File.OpenRead(path))
{
    var rows = stream.Query(useHeaderRow:true).ToList();

    Assert.Equal("MiniExcel", rows[0].Column1);
    Assert.Equal(1, rows[0].Column2);
    Assert.Equal("Github", rows[1].Column1);
    Assert.Equal(2, rows[1].Column2);
}

4. Query Support LINQ Extension First/Take/Skip ...etc

Query First

var row = MiniExcel.Query(path).First();
Assert.Equal("HelloWorld", row.A);

// or

using (var stream = File.OpenRead(path))
{
    var row = stream.Query().First();
    Assert.Equal("HelloWorld", row.A);
}

Performance between MiniExcel/ExcelDataReader/ClosedXML/EPPlus queryfirst

5. Query by sheet name

MiniExcel.Query(path, sheetName: "SheetName");
//or
stream.Query(sheetName: "SheetName");

6. Query all sheet name and rows

var sheetNames = MiniExcel.GetSheetNames(path);
foreach (var sheetName in sheetNames)
{
    var rows = MiniExcel.Query(path, sheetName: sheetName);
}

7. Get Columns

var columns = MiniExcel.GetColumns(path); // e.g result : ["A","B"...]

var cnt = columns.Count;  // get column count

8. Dynamic Query cast row to IDictionary<string,object>

foreach(IDictionary<string,object> row in MiniExcel.Query(path))
{
    //..
}

// or
var rows = MiniExcel.Query(path).Cast<IDictionary<string,object>>();
// or Query specified ranges (capitalized)
// A2 represents the second row of column A, C3 represents the third row of column C
// If you don't want to restrict rows, just don't include numbers
var rows = MiniExcel.QueryRange(path, startCell: "A2", endCell: "C3").Cast<IDictionary<string, object>>();

9. Query Excel return DataTable

Not recommended, because DataTable will load all data into memory and lose MiniExcel's low memory consumption feature.

var table = MiniExcel.QueryAsDataTable(path, useHeaderRow: true);

image

10. Specify the cell to start reading data

MiniExcel.Query(path,useHeaderRow:true,startCell:"B3")

image

11. Fill Merged Cells

Note: The efficiency is slower compared to not using merge fill

Reason: The OpenXml standard puts mergeCells at the bottom of the file, which leads to the need to foreach the sheetxml twice

    var config = new OpenXmlConfiguration()
    {
        FillMergedCells = true
    };
    var rows = MiniExcel.Query(path, configuration: config);

image

support variable length and width multi-row and column filling

image

12. Reading big file by disk-base cache (Disk-Base Cache - SharedString)

If the SharedStrings size exceeds 5 MB, MiniExcel default will use local disk cache, e.g, 10x100000.xlsx(one million rows data), when disable disk cache the maximum memory usage is 195MB, but able disk cache only needs 65MB. Note, this optimization needs some efficiency cost, so this case will increase reading time from 7.4 seconds to 27.2 seconds, If you don't need it that you can disable disk cache with the following code:

var config = new OpenXmlConfiguration { EnableSharedStringCache = false };
MiniExcel.Query(path,configuration: config)

You can use SharedStringCacheSize to change the sharedString file size beyond the specified size for disk caching

var config = new OpenXmlConfiguration { SharedStringCacheSize=500*1024*1024 };
MiniExcel.Query(path, configuration: config);

image

image

Create/Export Excel

  1. Must be a non-abstract type with a public parameterless constructor .

  2. MiniExcel support parameter IEnumerable Deferred Execution, If you want to use least memory, please do not call methods such as ToList

e.g : ToList or not memory usage image

1. Anonymous or strongly type [Try it]

var path = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.xlsx");
MiniExcel.SaveAs(path, new[] {
    new { Column1 = "MiniExcel", Column2 = 1 },
    new { Column1 = "Github", Column2 = 2}
});

2. IEnumerable<IDictionary<string, object>>

var values = new List<Dictionary<string, object>>()
{
    new Dictionary<string,object>{{ "Column1", "MiniExcel" }, { "Column2", 1 } },
    new Dictionary<string,object>{{ "Column1", "Github" }, { "Column2", 2 } }
};
MiniExcel.SaveAs(path, values);

Create File Result :

Column1Column2
MiniExcel1
Github2

3. IDataReader

  • Recommended, it can avoid to load all data into memory
MiniExcel.SaveAs(path, reader);

image

DataReader export multiple sheets (recommand by Dapper ExecuteReader)

using (var cnn = Connection)
{
    cnn.Open();
    var sheets = new Dictionary<string,object>();
    sheets.Add("sheet1", cnn.ExecuteReader("select 1 id"));
    sheets.Add("sheet2", cnn.ExecuteReader("select 2 id"));
    MiniExcel.SaveAs("Demo.xlsx", sheets);
}

4. Datatable

  • Not recommended, it will load all data into memory

  • DataTable use Caption for column name first, then use columname

var path = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.xlsx");
var table = new DataTable();
{
    table.Columns.Add("Column1", typeof(string));
    table.Columns.Add("Column2", typeof(decimal));
    table.Rows.Add("MiniExcel", 1);
    table.Rows.Add("Github", 2);
}

MiniExcel.SaveAs(path, table);

5. Dapper Query

Thanks @shaofing #552 , please use CommandDefinition + CommandFlags.NoCache

using (var connection = GetConnection(connectionString))
{
    var rows = connection.Query(
        new CommandDefinition(
            @"select 'MiniExcel' as Column1,1 as Column2 union all select 'Github',2"
            , flags: CommandFlags.NoCache)
        );
    // Note: QueryAsync will throw close connection exception
    MiniExcel.SaveAs(path, rows);
}

Below code will load all data into memory

using (var connection = GetConnection(connectionString))
{
    var rows = connection.Query(@"select 'MiniExcel' as Column1,1 as Column2 union all select
项目侧边栏1项目侧边栏2
推荐项目
Project Cover

豆包MarsCode

豆包 MarsCode 是一款革命性的编程助手,通过AI技术提供代码补全、单测生成、代码解释和智能问答等功能,支持100+编程语言,与主流编辑器无缝集成,显著提升开发效率和代码质量。

Project Cover

AI写歌

Suno AI是一个革命性的AI音乐创作平台,能在短短30秒内帮助用户创作出一首完整的歌曲。无论是寻找创作灵感还是需要快速制作音乐,Suno AI都是音乐爱好者和专业人士的理想选择。

Project Cover

有言AI

有言平台提供一站式AIGC视频创作解决方案,通过智能技术简化视频制作流程。无论是企业宣传还是个人分享,有言都能帮助用户快速、轻松地制作出专业级别的视频内容。

Project Cover

Kimi

Kimi AI助手提供多语言对话支持,能够阅读和理解用户上传的文件内容,解析网页信息,并结合搜索结果为用户提供详尽的答案。无论是日常咨询还是专业问题,Kimi都能以友好、专业的方式提供帮助。

Project Cover

阿里绘蛙

绘蛙是阿里巴巴集团推出的革命性AI电商营销平台。利用尖端人工智能技术,为商家提供一键生成商品图和营销文案的服务,显著提升内容创作效率和营销效果。适用于淘宝、天猫等电商平台,让商品第一时间被种草。

Project Cover

吐司

探索Tensor.Art平台的独特AI模型,免费访问各种图像生成与AI训练工具,从Stable Diffusion等基础模型开始,轻松实现创新图像生成。体验前沿的AI技术,推动个人和企业的创新发展。

Project Cover

SubCat字幕猫

SubCat字幕猫APP是一款创新的视频播放器,它将改变您观看视频的方式!SubCat结合了先进的人工智能技术,为您提供即时视频字幕翻译,无论是本地视频还是网络流媒体,让您轻松享受各种语言的内容。

Project Cover

美间AI

美间AI创意设计平台,利用前沿AI技术,为设计师和营销人员提供一站式设计解决方案。从智能海报到3D效果图,再到文案生成,美间让创意设计更简单、更高效。

Project Cover

AIWritePaper论文写作

AIWritePaper论文写作是一站式AI论文写作辅助工具,简化了选题、文献检索至论文撰写的整个过程。通过简单设定,平台可快速生成高质量论文大纲和全文,配合图表、参考文献等一应俱全,同时提供开题报告和答辩PPT等增值服务,保障数据安全,有效提升写作效率和论文质量。

投诉举报邮箱: service@vectorlightyear.com
@2024 懂AI·鲁ICP备2024100362号-6·鲁公网安备37021002001498号