EXCEL-TO-MARKDOWN

EXCEL-TO-MARKDOWN is a Python tool that converts Excel files (.xlsx and .xls) into Markdown tables. It is split into separate modules for detection, parsing, and Markdown generation, detects table headers automatically, falls back to interactive prompts for complex Excel layouts, and can be installed with pip or Poetry for use in larger projects.

🛠️ Features

Automated Table Detection: Identifies the first fully populated row as the table header, ensuring accurate Markdown conversion.
Interactive Mode: Prompts users to specify table regions when automatic detection fails, handling complex and irregular Excel structures.
Modular Design: Organized into distinct modules for detection, parsing, Markdown generation, and utilities, promoting maintainability and scalability.
Supports Multiple Sheets: Processes all sheets within an Excel file, generating separate Markdown files for each.
Flexible Column Specification: Allows users to define column ranges using both letter-based (e.g., A:D) and number-based (e.g., 1-4) inputs.
Unit Tested: Each module has a corresponding unit test file in the tests/ directory, which makes future changes safer.
Easy Integration: Compatible with Poetry for dependency management and can be integrated into larger projects or CI/CD pipelines.
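
The automated table detection listed above can be illustrated with a minimal sketch. It assumes a sheet has already been loaded as a list of rows, where blank cells are None or empty strings. The names is_blank and find_header_row are hypothetical and are not taken from the project's detector.py.

```python
from typing import Optional, Sequence


def is_blank(cell: object) -> bool:
    """Treat None and whitespace-only strings as empty cells (an assumption)."""
    return cell is None or (isinstance(cell, str) and cell.strip() == "")


def find_header_row(rows: Sequence[Sequence[object]]) -> Optional[int]:
    """Return the 0-based index of the first row that has no blank cells.

    Returns None when no row is fully populated, which is the situation
    where a tool would fall back to asking the user.
    """
    for index, row in enumerate(rows):
        if row and not any(is_blank(cell) for cell in row):
            return index
    return None


# A title row with gaps, followed by the real header and one data row.
sheet = [
    ["Quarterly report", None, None],
    ["Region", "Units", "Revenue"],
    ["North", 120, 3400.0],
]
header_index = find_header_row(sheet)
```

Returning None rather than raising keeps the "detection failed" case explicit, so the caller can decide whether to prompt.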

📁 Project Structure

EXCEL-TO-MARKDOWN
│
├── .venv
├── data
│   ├── input
│   └── output
├── docs
├── excel_to_markdown
│   ├── __init__.py
│   ├── main.py
│   ├── detector.py
│   ├── parser.py
│   ├── markdown_generator.py
│   └── utils.py
├── src
├── tests
│   ├── test_detector.py
│   ├── test_parser.py
│   ├── test_markdown_generator.py
│   └── test_main.py
├── .gitignore
├── LICENSE
├── poetry.lock
├── pyproject.toml
└── README.md

Module Breakdown

excel_to_markdown/
- main.py: Entry point of the application. Handles argument parsing, orchestrates the workflow, and manages file I/O.
- detector.py: Contains functions related to detecting the table start within Excel sheets.
- parser.py: Handles parsing user inputs, such as column specifications.
- markdown_generator.py: Responsible for converting pandas DataFrames to Markdown format.
- utils.py: Utility functions like column letter to index conversion and filename sanitization.
tests/
- test_detector.py
- test_parser.py
- test_markdown_generator.py
- test_main.py
Each test file contains the unit tests for its corresponding module.
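
To make the role of markdown_generator.py concrete, here is a dependency-free sketch of rendering a header row and data rows as a Markdown table. The project itself is described as converting pandas DataFrames; this version uses plain lists instead, and the names escape_cell and rows_to_markdown are hypothetical.

```python
from typing import Sequence


def escape_cell(value: object) -> str:
    """Render a cell as text, escaping pipes and flattening newlines."""
    text = "" if value is None else str(value)
    return text.replace("|", "\\|").replace("\n", " ")


def rows_to_markdown(
    headers: Sequence[object], rows: Sequence[Sequence[object]]
) -> str:
    """Build a Markdown table from a header row and a sequence of data rows."""
    lines = [
        "| " + " | ".join(escape_cell(h) for h in headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(escape_cell(c) for c in row) + " |")
    return "\n".join(lines)


table = rows_to_markdown(["Region", "Units"], [["North", 120], ["South", None]])
print(table)
```

Escaping the pipe character matters because an unescaped | inside a cell would be read as a column separator.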

🚀 Installation

You can install excel-to-markdown directly from this repository using pip:

pip install git+https://github.com/devin-liu/excel-to-markdown.git

Note: This assumes the repository URL is github.com/devin-liu/excel-to-markdown.

For Development

If you want to contribute to the project, it is recommended to use Poetry for managing dependencies and the development environment.

Clone the repository:

git clone https://github.com/devin-liu/excel-to-markdown.git
cd excel-to-markdown

Install dependencies with Poetry:
```
poetry install
```

This will create a virtual environment and install all the necessary dependencies.

📋 Usage

Preparing Your Data

Input Directory: Place all your Excel files (.xlsx or .xls) in the data/input directory.
Output Directory: (Optional) The converted Markdown files are saved in the directory you pass as the second argument, such as data/output. If that directory doesn't exist, the script creates it. If no output directory is specified, an output folder is created inside the input directory.
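
The directory handling described above could look roughly like the following sketch. It only uses pathlib, and the names EXCEL_SUFFIXES, find_excel_files, and ensure_output_dir are hypothetical rather than the project's own.

```python
from pathlib import Path
from typing import List

# File extensions treated as Excel workbooks (matches the formats named above).
EXCEL_SUFFIXES = {".xlsx", ".xls"}


def find_excel_files(input_dir: Path) -> List[Path]:
    """Return the Excel files directly inside input_dir, sorted by path."""
    return sorted(
        path
        for path in input_dir.iterdir()
        if path.is_file() and path.suffix.lower() in EXCEL_SUFFIXES
    )


def ensure_output_dir(output_dir: Path) -> Path:
    """Create the output directory (and any parents) if it is missing."""
    output_dir.mkdir(parents=True, exist_ok=True)
    return output_dir
```

This sketch does not recurse into subdirectories; whether the real tool does is not stated in this document.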

Running the Localhost Server

You can also start a localhost server for real-time editing using the app command:

app

This starts a server on localhost, so you can edit your spreadsheets locally and see the results update immediately.

Running the CLI Script

Run the converter from the command line with the excel-to-markdown command, passing the input directory and, optionally, the output directory:

excel-to-markdown data/input data/output
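
For readers curious how a two-argument command like this might be parsed, here is a sketch using argparse. The argument names input_dir and output_dir, and the fallback to an output folder inside the input directory, are assumptions drawn from the description under Preparing Your Data; the project's main.py may differ.

```python
import argparse
from pathlib import Path
from typing import List, Optional, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Create a parser with a required input directory and an optional output one."""
    parser = argparse.ArgumentParser(
        prog="excel-to-markdown",
        description="Convert Excel sheets to Markdown tables.",
    )
    parser.add_argument("input_dir", type=Path, help="directory containing Excel files")
    parser.add_argument(
        "output_dir",
        type=Path,
        nargs="?",  # optional positional argument
        default=None,
        help="directory for Markdown files (default: <input_dir>/output)",
    )
    return parser


def resolve_dirs(argv: Optional[List[str]] = None) -> Tuple[Path, Path]:
    """Parse argv and apply the assumed default for the output directory."""
    args = build_parser().parse_args(argv)
    output_dir = args.output_dir
    if output_dir is None:
        output_dir = args.input_dir / "output"
    return args.input_dir, output_dir
```

Calling resolve_dirs(["data/input"]) would yield data/input and data/input/output under these assumptions.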

Interactive Prompts

For each sheet in each Excel file:

Automatic Detection:
- The script attempts to detect the header row by looking for the first fully populated row.
- If successful, it proceeds to convert without prompts.
Manual Specification:
- If automatic detection fails, you'll be prompted to enter:
  - Header Row Number: The row where your table headers are located (1-based index).
  - Columns to Include: Specify the range of columns, e.g., A:D or 1-4.
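
The two accepted column formats can be handled as in the sketch below. It assumes number-based input is 1-based, like the header row prompt, and returns an inclusive pair of 0-based indices. The function names are hypothetical and not taken from the project's parser.py or utils.py.

```python
import re
from typing import Tuple


def column_letter_to_index(letters: str) -> int:
    """Convert an Excel column label such as 'A' or 'AA' to a 0-based index."""
    index = 0
    for char in letters.upper():
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index - 1


def parse_column_spec(spec: str) -> Tuple[int, int]:
    """Parse 'A:D' or '1-4' into an inclusive (start, end) pair of 0-based indices."""
    spec = spec.strip()
    letter_match = re.fullmatch(r"([A-Za-z]+)\s*:\s*([A-Za-z]+)", spec)
    number_match = re.fullmatch(r"(\d+)\s*-\s*(\d+)", spec)
    if letter_match:
        start, end = (column_letter_to_index(g) for g in letter_match.groups())
    elif number_match:
        # Assumption: numeric input is 1-based, so shift to 0-based.
        start, end = (int(g) - 1 for g in number_match.groups())
    else:
        raise ValueError(f"Unrecognized column specification: {spec!r}")
    if start < 0 or end < start:
        raise ValueError(f"Invalid column range: {spec!r}")
    return start, end
```

With this sketch, A:D and 1-4 both map to the same range, so the rest of a pipeline would not need to know which format the user typed.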

Sample Interaction:

Processing sheet: 'Sales Data' in file 'report1.xlsx'
Automatically detected table starting at row 2.
Markdown file 'report1_Sales_Data.md' for sheet 'Sales Data' has been created successfully.

Processing sheet: 'Summary' in file 'report1.xlsx'
Automatic table detection failed.
Enter the header row number (1-based index): 5
Enter the columns to include (e.g., A:D or 1-4): B:E
Markdown file 'report1_Summary.md' for sheet 'Summary' has been created successfully.
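
The output names in the sample above (for example report1_Sales_Data.md) suggest that the workbook name and the sheet name are joined with an underscore after sanitization. The sketch below reproduces that pattern; the exact sanitization rules are an assumption, and sanitize_filename and markdown_filename are hypothetical names.

```python
import re
from pathlib import Path


def sanitize_filename(name: str) -> str:
    """Replace runs of characters other than letters, digits, '-' and '_' with '_'."""
    return re.sub(r"[^A-Za-z0-9_-]+", "_", name).strip("_")


def markdown_filename(excel_path: str, sheet_name: str) -> str:
    """Combine the workbook's stem and the sanitized sheet name into a .md filename."""
    stem = Path(excel_path).stem
    return f"{sanitize_filename(stem)}_{sanitize_filename(sheet_name)}.md"


name = markdown_filename("report1.xlsx", "Sales Data")
```

Sanitizing the sheet name matters because Excel allows characters in sheet names, such as spaces, that are awkward in filenames.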

🧩 Contributing

Contributions are welcome! To contribute:

Fork the Repository
Create a Feature Branch
```
git checkout -b feature/YourFeatureName
```
Commit Your Changes
```
git commit -m "Add some feature"
```
Push to the Branch
```
git push origin feature/YourFeatureName
```
Open a Pull Request

Please ensure that your contributions adhere to the existing code style and include relevant tests.

🧪 Testing

Unit tests are located in the tests/ directory. To run the tests, first install the development dependencies:

pip install -e .[dev]

Then run pytest:

pytest

For contributors using Poetry, you can still run the tests with:

poetry run pytest

📜 License

This project is licensed under the GPLv3.

📧 Contact

For any inquiries or support, please contact [email protected].

Happy Converting! 🚀
