Office Automation with Python: Seven Examples of Excel Operations

2024.05.29

In daily work, Excel is a powerful tool for data analysis and reporting that almost everyone knows. However, when you face large volumes of repetitive data entry, sorting, and analysis, manual work is time-consuming, tedious, and error-prone. This is where Python and the openpyxl library come in, making it straightforward to automate Excel tasks.

Python, with its concise, easy-to-learn syntax and rich data-processing libraries, handles complex data tasks efficiently. openpyxl is a library dedicated to working with Excel files: it lets you create, read, and edit them without opening Excel.

Introducing and installing the openpyxl library

openpyxl is an open-source Python library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files. It supports features such as data validation, charts, hyperlinks, and comments, and it offers optimized read-only and write-only modes for very large files. To install it, run the following command at the command line:

pip install openpyxl
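To confirm that the installation worked, a quick check like the one below can help. It is only a minimal sketch; the variable name `installed_version` is chosen here for illustration.

```python
import openpyxl

# __version__ holds the installed release as a string
installed_version = openpyxl.__version__
print(f"openpyxl {installed_version} is installed")
```

If the import fails with ModuleNotFoundError, pip most likely installed the package into a different Python environment than the one running the script.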

Example 1: Create a new Excel file and write data

Want to create an Excel file from scratch in Python? openpyxl can do that. The following code creates a file named "my_data.xlsx" and fills it with some data:

from openpyxl import Workbook

# Create a Workbook object, the in-memory equivalent of a new Excel file
wb = Workbook()

# Get the default active worksheet
ws = wb.active

# Write data to cells A1 through C3
for row in range(1, 4):
    for col in 'ABC':
        ws[col+str(row)] = f"Data {row} {col}"

# Save to disk
wb.save("my_data.xlsx")
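When the data already exists as rows, writing it row by row is often more convenient than addressing each cell by name. The sketch below uses `ws.append()` and `ws.cell()` for that; the sample data and the sheet title "Sales" are made up for illustration.

```python
from openpyxl import Workbook

# Hypothetical sample data: a header row followed by two records
rows = [
    ("Name", "Quantity", "Price"),
    ("Apple", 3, 1.5),
    ("Pear", 5, 2.0),
]

wb = Workbook()
ws = wb.active
ws.title = "Sales"  # rename the default sheet (illustrative name)

# append() writes each sequence into the next free row, starting at column A
for row in rows:
    ws.append(row)

# cell() addresses a cell by 1-based row and column numbers
ws.cell(row=4, column=1, value="Total items")
ws.cell(row=4, column=2, value=sum(r[1] for r in rows[1:]))

# Calling wb.save("some_name.xlsx") at this point would write the file to disk
```

Numbers are stored as numbers and strings as text, so Excel can sort and sum the Quantity column directly.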

Example 2: Read data from an existing Excel file

Now that we have data, we need to be able to read it. This code opens "my_data.xlsx", reads its contents, and prints them:

from openpyxl import load_workbook

# Load the existing Excel file
wb = load_workbook("my_data.xlsx")

# Get the first worksheet (index 0)
ws = wb.worksheets[0]

# Iterate over every cell and print its value
for row in ws.iter_rows():
    for cell in row:
        print(cell.value)
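Often only the values are needed, or only part of the sheet. As a rough sketch, `iter_rows()` accepts `values_only=True` and range limits for this. The workbook here is built in memory so the snippet runs on its own; with a real file you would start from `load_workbook("my_data.xlsx")` instead.

```python
from openpyxl import Workbook

# Rebuild the same 3x3 data in memory so the sketch is self-contained
wb = Workbook()
ws = wb.active
for row in range(1, 4):
    for col in "ABC":
        ws[f"{col}{row}"] = f"Data {row} {col}"

# values_only=True yields plain tuples of values instead of Cell objects
all_rows = list(ws.iter_rows(values_only=True))

# min_row / max_row / max_col restrict the range that is scanned
first_two_cols = list(
    ws.iter_rows(min_row=2, max_row=3, max_col=2, values_only=True)
)

# max_row and max_column report the extent of the used area
print(ws.max_row, ws.max_column)
for values in first_two_cols:
    print(values)
```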

Example 3: Modify cell content and style

Is a value wrong, or are you unhappy with the color? openpyxl makes both easy to change:

from openpyxl import load_workbook
from openpyxl.styles import Font

# Load the existing Excel file
wb = load_workbook("my_data.xlsx")
ws = wb.active

# Change the content of cell A1 to "Updated Data"
ws["A1"] = "Updated Data"

# Make the font of cell A1 red and bold (the color is a hex RGB string)
font = Font(color="FF0000", bold=True)
ws["A1"].font = font

# Save the changes
wb.save("my_data.xlsx")
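The same idea extends to fills, alignment, and column widths. The following is a minimal sketch of styling a header row; the header texts and colors are arbitrary choices for illustration.

```python
from openpyxl import Workbook
from openpyxl.styles import Alignment, Font, PatternFill

wb = Workbook()
ws = wb.active
ws.append(["Name", "Quantity", "Price"])  # hypothetical header row
ws.append(["Apple", 3, 1.5])

# Style objects can be created once and assigned to many cells
header_font = Font(bold=True, color="FFFFFF")
header_fill = PatternFill(start_color="4F81BD", end_color="4F81BD", fill_type="solid")
centered = Alignment(horizontal="center")

for cell in ws[1]:  # ws[1] is the tuple of cells in row 1
    cell.font = header_font
    cell.fill = header_fill
    cell.alignment = centered

# Column widths are set per column letter
ws.column_dimensions["A"].width = 20
```

Assigning a new style object replaces the cell's previous one, so build the complete Font or PatternFill you want rather than trying to modify an attribute of the existing style in place.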

Example 4: Add/delete rows and columns

Need to adjust the structure of the data? openpyxl lets you insert and delete rows and columns freely:

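from openpyxl import load_workbook

# Load the existing Excel file
wb = load_workbook("my_data.xlsx")
ws = wb.active

# Insert one empty row before row 2
ws.insert_rows(2)

# Delete the third column (column C)
ws.delete_cols(3)

# Save the changes
wb.save("my_data.xlsx")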
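To see exactly how the existing cells shift, it can help to run the same two calls on a tiny in-memory sheet and inspect the result. This is only an illustrative sketch with made-up data.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(["id", "name", "score"])
ws.append([1, "Ann", 90])
ws.append([2, "Bob", 85])

# Insert one empty row before row 2: the old rows 2-3 move down to 3-4
ws.insert_rows(2)

# Delete column C ("score")
ws.delete_cols(3)

after = [list(values) for values in ws.iter_rows(values_only=True)]
for values in after:
    print(values)
```

Both methods take an optional second argument for the number of rows or columns, for example `ws.insert_rows(2, 3)`. Note that openpyxl only moves the cells: formulas, charts, and tables that refer to the shifted ranges are not adjusted for you.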

Example 5: Merge cells and set conditional formatting

Need a better-looking table? Merged cells and conditional formatting can help:

from openpyxl import load_workbook
from openpyxl.formatting.rule import CellIsRule
from openpyxl.styles import PatternFill

# Load the existing Excel file
wb = load_workbook("my_data.xlsx")
ws = wb.active

# Merge cells A1 through C1
ws.merge_cells('A1:C1')

# Create a conditional formatting rule: fill green when the cell value is greater than 10
green_fill = PatternFill(start_color="00FF00", end_color="00FF00", fill_type="solid")
rule = CellIsRule(operator='>', formula=['10'], stopIfTrue=False, fill=green_fill)

# Apply the rule to the range A1:C5
ws.conditional_formatting.add('A1:C5', rule)

# Save the changes
wb.save("my_data.xlsx")
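One detail of merging is easy to miss: only the top-left cell of a merged range keeps its value. The sketch below, which assumes a reasonably recent openpyxl release and uses made-up cell contents, shows this along with how to list and undo merges.

```python
from openpyxl import Workbook
from openpyxl.styles import Alignment

wb = Workbook()
ws = wb.active
ws["A1"] = "Quarterly report"
ws["B1"] = "lost after merging"

ws.merge_cells("A1:C1")
ws["A1"].alignment = Alignment(horizontal="center")  # center the title across the merge

# Only the top-left cell keeps a value; the others become empty merged cells
b1_after_merge = ws["B1"].value

# merged_cells.ranges lists every merged range on the sheet
merged = [rng.coord for rng in ws.merged_cells.ranges]
print(merged, b1_after_merge)

# unmerge_cells() reverses the merge (the discarded values do not come back)
ws.unmerge_cells("A1:C1")
remaining = len(ws.merged_cells.ranges)
```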

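Example 6: Use formulas for calculations and statistics

Do you need to calculate totals or statistics from the data? openpyxl lets you write formulas directly into cells. It stores the formula text but does not evaluate it; Excel computes the result when the file is opened: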

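from openpyxl import load_workbook

# Load the existing Excel file
wb = load_workbook("my_data.xlsx")
ws = wb.active

# Write a SUM formula to cell D1
ws["D1"] = "=SUM(A1:C1)"

# Write an AVERAGE formula to cell D2
ws["D2"] = "=AVERAGE(A2:C2)"

# Save the changes (Excel calculates the formula results when it opens the file)
wb.save("my_data.xlsx")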
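Because openpyxl never evaluates formulas itself, reading a formula cell back can be surprising. The sketch below illustrates the difference between the default load and `data_only=True`; it saves to an in-memory buffer purely so the example needs no file on disk.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook

wb = Workbook()
ws = wb.active
ws.append([1, 2, 3])
ws["D1"] = "=SUM(A1:C1)"

buffer = BytesIO()  # in-memory stand-in for a file on disk
wb.save(buffer)

# Default load: the cell holds the formula text
buffer.seek(0)
formula_view = load_workbook(buffer).active["D1"].value

# data_only=True: the cell holds the value Excel cached when it last saved the file.
# This workbook was never opened in Excel, so no cached value exists.
buffer.seek(0)
cached_view = load_workbook(buffer, data_only=True).active["D1"].value

print(formula_view)
print(cached_view)
```

If a script needs the computed number right away, one option is to calculate it in Python (for example with `sum()`) and write the plain value instead of a formula.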

Example 7: Batch processing of multiple worksheets

openpyxl also makes it easy to process every worksheet in a workbook:

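from openpyxl import load_workbook

# Load the existing Excel file
wb = load_workbook("my_data.xlsx")

# Iterate over all worksheets
for sheet in wb.worksheets:
    print(f"Processing sheet: {sheet.title}")

    # Perform the same operation on each worksheet, such as data cleaning or formatting

# Save the changes
wb.save("my_data.xlsx")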
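As one possible way to fill in that loop, the sketch below applies the same header formatting to every sheet. The sheet names, the sample rows, and the helper `format_header` are all hypothetical and exist only for this illustration.

```python
from openpyxl import Workbook
from openpyxl.styles import Font

# Build a workbook with three sheets (names chosen for illustration)
wb = Workbook()
wb.active.title = "January"
wb.create_sheet("February")
wb.create_sheet("March")

for ws in wb.worksheets:
    ws.append(["Item", "Amount"])
    ws.append([f"{ws.title} sample", 100])


def format_header(ws):
    """Hypothetical helper: bold the first row and keep it visible while scrolling."""
    for cell in ws[1]:
        cell.font = Font(bold=True)
    ws.freeze_panes = "A2"  # rows above A2 stay fixed


for ws in wb.worksheets:
    format_header(ws)

print(wb.sheetnames)
```

Keeping the per-sheet work in a function makes it easy to reuse the same step across many workbooks, for example in a loop over the files of a folder.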

Conclusion: Next steps and application scenarios

Congratulations: after working through the seven examples above, you have covered the basic operations of openpyxl. Next, you can combine it with the pandas library for more complex data analysis, or use scheduled tasks to generate reports automatically. Whether you are dealing with sales data, financial statements, or scientific research data, openpyxl can be a capable assistant that improves your efficiency.